Hacker News

Thoughts on DSPy

Hacker News - 4 hours 45 min ago

Been tinkering with DSPy and thought I'd share my thoughts. Let's start by setting some context.

Let’s say you are building a RAG (Retrieval-Augmented Generation) pipeline.

The typical approach is to:

1. Chunk the documents
2. Insert the chunks into a vector store
3. Given a query, retrieve the context(s) by kNN search
4. Give the context to a language model for answer generation
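As a rough illustration, those four steps boil down to something like this (a toy, dependency-free sketch; the character-count “embedding”, in-memory store, and the `llm` callable are stand-ins for a real embedding model, vector database, and language model):

```python
import math
from typing import List, Tuple

def embed(text: str) -> List[float]:
    # Toy stand-in for a real embedding model: bag of character codes.
    vec = [0.0] * 64
    for ch in text:
        vec[ord(ch) % 64] += 1.0
    return vec

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0

def chunk(doc: str, size: int = 500) -> List[str]:
    # 1. Chunk the documents (naive fixed-size splitting).
    return [doc[i:i + size] for i in range(0, len(doc), size)]

class VectorStore:
    def __init__(self):
        self.items: List[Tuple[List[float], str]] = []

    def insert(self, text: str) -> None:
        # 2. Insert chunk embeddings into the store.
        self.items.append((embed(text), text))

    def knn(self, query: str, k: int = 3) -> List[str]:
        # 3. Retrieve the k nearest chunks for the query.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(item[0], q), reverse=True)
        return [text for _, text in ranked[:k]]

def answer(query: str, store: VectorStore, llm) -> str:
    # 4. Hand the retrieved context to a language model for generation.
    context = "\n".join(store.knn(query))
    return llm(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
```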

This gives a great starting point, but obviously it does not guarantee 100% accuracy on the first iteration. The typical approach is then to iterate and make it better. Generally this involves tweaking:

1. Chunking and insertion strategy
2. Retrieval strategy
3. Prompt engineering

After some trial and error, you arrive at the combination of hyperparameter settings that gives you the best accuracy. Then, a few days later, you realize the RAG’s accuracy has regressed, and you are back to square one. This is a monolithic and unsustainable way to optimize, and it is what DSPy tries to solve. DSPy is primarily two things:

1. Separate programming from prompt engineering using signatures, and give greater control to program language models using traditional programming techniques.
2. Incorporate best-practice prompting techniques into the framework to optimize the LM’s responses.

With DSPy, you can implement and iterate on improving your app with additional tools at your disposal. First, you can separate out your prompts and write a signature instead. A signature looks like this: “context, question -> answer”. This signature gets “compiled” into an actual prompt.
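For example, a minimal signature-driven predictor might look like this (a sketch using DSPy's string signatures; the model name and example inputs are illustrative):

```python
import dspy

# Configure a language model (model name is illustrative).
lm = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=lm)

# The string signature "context, question -> answer" is compiled into a prompt.
qa = dspy.Predict("context, question -> answer")

prediction = qa(context="DSPy separates programming from prompt engineering.",
                question="What does DSPy separate?")
print(prediction.answer)
```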

In addition, you can also break this problem into two modules:

1. Retrieval module
2. Generation module

Modules are classes that take a signature with inputs and generate an output, and you can start iterating on each module separately. With the retrieval module, for instance, you can do iterative query generation and context retrieval before passing the results to the generation module. This is a technique used to improve RAG performance: you build up additional context by incrementally asking questions based on the retrieved context and using the generated queries to retrieve more context.

With DSPy, this becomes straightforward because DSPy exposes constructs like:

- Settings, where you can define the number of hops
- Signatures like “context, question -> query” that get compiled into a proper prompt
- Optimizers, so the framework automatically and incrementally appends retrieved contexts to the prompt
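Put together, a multi-hop retrieval-plus-generation pipeline might look roughly like this (a sketch based on DSPy's public modules; it assumes a retrieval model has been configured in dspy.settings, and the hop count and passages-per-hop values are illustrative):

```python
import dspy

class MultiHopRAG(dspy.Module):
    def __init__(self, num_hops=2, passages_per_hop=3):
        super().__init__()
        self.num_hops = num_hops
        self.retrieve = dspy.Retrieve(k=passages_per_hop)
        # "context, question -> query" compiles into a query-generation prompt.
        self.generate_query = dspy.ChainOfThought("context, question -> query")
        # "context, question -> answer" compiles into an answer-generation prompt.
        self.generate_answer = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        context = []
        for _ in range(self.num_hops):
            # Generate a new search query from the context gathered so far.
            query = self.generate_query(context=context, question=question).query
            # Retrieve passages for that query and append them to the context.
            context += self.retrieve(query).passages
        return self.generate_answer(context=context, question=question)
```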

Coming to the answer generation module, it’s not atypical to define a set of rules while prompt engineering. DSPy addresses this with runtime assertions and backtracking: you define an assertion instead of a rule, and if the assertion fails, DSPy automatically backtracks and prefixes the prompt with additional rules to optimize it further. Obviously, you can do this with prompt engineering and program composition. But where DSPy really shines is when you are optimizing a large pipeline that is prone to brittleness with prompt-engineering techniques. DSPy helps you compose the pipeline into an isolated set of modules and lets you apply best-practice prompting techniques without having to think much about the prompt.
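In code, that might look something like this (a sketch of DSPy's assertion mechanism; the length constraint is just an illustrative rule):

```python
import dspy

class GenerateAnswer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, context, question):
        prediction = self.generate(context=context, question=question)
        # If this fails, DSPy backtracks and retries with the rule added to the prompt.
        dspy.Assert(len(prediction.answer) <= 200,
                    "Keep the answer under 200 characters.")
        return prediction

# Activating assertions turns on the backtracking behaviour.
answerer = GenerateAnswer().activate_assertions()
```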

There are two implementations of the framework today:

- Python: https://github.com/stanfordnlp/dspy
- TypeScript: https://github.com/dosco/llm-client/

One specific feature I would like to see is exposing the prompt templates as config, giving developers greater control to incorporate their own “versions” of the templates for CoT, few-shot prompting, etc. This would also help with adopting DSPy for languages other than English.

Finally, I was able to understand what DSPy is adding to its prompts and what sort of optimizations it is doing by using Langtrace, an open-source tracing tool I am building that can be set up with just two lines of code.

https://github.com/Scale3-Labs/langtrace
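For reference, the two-line setup looks roughly like this (a sketch based on the project's Python SDK; the package name and init call follow its README, and the API key is a placeholder):

```python
# Assumed package name per the Langtrace README; the API key is a placeholder.
from langtrace_python_sdk import langtrace

langtrace.init(api_key="<your-api-key>")
```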

Comments URL: https://news.ycombinator.com/item?id=40386667

Points: 1

# Comments: 0

Categories: Hacker News
