From RAGs to riches.
For those who live under a rock, LLMs have taken over. They haven’t taken over most use cases but they’ve certainly taken over all tech conversation.
Here’s how every conversation goes:
Me: I do X
Them: Yeah, but have you tried doing X using LLMs?
LLMs are nice because they function well as reasoning agents. In the computing world, we’ve never had such a well-generalized system capable of reasoning over an input and producing useful output. Existing ML models do this to an extent, but they usually need to be trained and tuned on use-case-specific data. LLMs, on the other hand, offer quite sophisticated reasoning ability with just an API call. They are useful when used correctly, within a system that actually needs one.
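To make “just an API call” concrete, here’s a minimal sketch using the OpenAI Python SDK as one example provider. The model name is illustrative; any hosted LLM works the same way:

```python
# A minimal "reasoning via API call" sketch using the OpenAI Python SDK.
# The provider and model name are illustrative; any hosted LLM works similarly.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "In one sentence, why is the sky blue?"}],
)
print(response.choices[0].message.content)
```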
However, there are a few issues with LLMs:
- Hallucination: When LLMs don’t have a good answer, they tend to make sh*t up. They act pretty confident about it too.
- Live data: LLMs are trained on large amounts of text. But once training stops, there’s no easy way to feed in live data and keep the LLM’s knowledge base up to date.
- Proprietary data: Most businesses have data that isn’t publicly available. This limits LLM usage for business use cases, since the LLM was never trained on their proprietary data.
Enterprises ❤️ AI. Open the website of any large enterprise right now; AI will 100% be mentioned on the landing page. Business leaders want AI, and they want AI that magically fixes customer issues without any of the problems listed above.
Enter RAG. RAG stands for retrieval-augmented generation. RAG is essentially a software system built around using an LLM as a reasoning agent.
Here’s how RAG works: for each query, a retrieval system is queried to get relevant context. The retrieval system could be anything: an SQL database, a vector DB, a search engine like Elastic or Solr, or even John the librarian. The relevant context is then fed to an LLM, alongside a surrounding prompt and the original query. The LLM takes all this information, applies its “reasoning” ability, and generates a nice text response that answers the user’s query using the provided context.
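Here’s a toy end-to-end sketch of that loop. The keyword-overlap retriever is a deliberately naive stand-in for whatever retrieval system you actually use (SQL, a vector DB, Elastic, or John), and the prompt wording and model name are assumptions:

```python
# Toy RAG loop: retrieve -> build prompt -> generate.
# The keyword-overlap retriever stands in for a real retrieval system;
# the prompt wording and model name are illustrative.
from openai import OpenAI

client = OpenAI()

DOCS = [
    "Acme's refund window is 30 days from the date of purchase.",
    "Acme support is available Monday through Friday, 9am-5pm EST.",
    "Acme ships to the US, Canada, and the EU.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank docs by how many words they share with the query (deliberately naive).
    words = set(query.lower().split())
    ranked = sorted(DOCS, key=lambda d: len(words & set(d.lower().split())), reverse=True)
    return ranked[:k]

def rag_answer(query: str) -> str:
    # Stuff the retrieved context into the prompt alongside the original query.
    context = "\n".join(retrieve(query))
    prompt = (
        "Answer the question using ONLY the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(rag_answer("What is the refund window?"))
```

Swap the retriever and the provider for whatever your stack uses; the shape of the loop stays the same.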
So… is RAG the solution?
Well, RAG addresses all three problems above:
- Hallucinations: Through prompt engineering, you can instruct the LLM to answer only from the provided context and admit when it can’t, which sharply cuts down on made-up answers (see the prompt sketch after this list).
- Live data: As long as your retrieval system is kept up to date, the LLM gets fresh context at query time, no retraining needed.
- Proprietary data: Since you control the retrieval system, your data is never made public. Plus, major LLM providers state that they don’t train on data sent through their APIs by default.
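One illustrative grounding prompt for the first bullet might look like this. The wording is an assumption, and no prompt guarantees zero hallucination; it only reduces it:

```python
# Illustrative grounding prompt: tell the model to refuse rather than guess.
GROUNDED_PROMPT = """Answer the user's question using ONLY the context below.
If the context does not contain the answer, reply exactly:
"I don't know based on the available information."

Context:
{context}

Question: {question}"""

# Example with toy values:
print(GROUNDED_PROMPT.format(
    context="Our refund window is 30 days from purchase.",
    question="Can I get a refund after 45 days?",
))
```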
RAG has emerged as the easiest way for enterprises to use LLMs and offer “AI-driven solutions” to customers.
BUT, beware!
Is RAG useful? Yes. Does everyone need RAG? No. Is RAG easy to implement? Hell no. RAG is a chain of systems with complexities at every stage: data indexing, retrieval, and generation. On top of that, reliability and evaluation of RAG applications remain ongoing challenges.
In the coming days, I will be explaining each component of a RAG system in depth, with examples. Hopefully, these articles give you insight into how difficult it is to build awesome RAG systems. Follow for more!
— —
I am looking for a job. If you’re hiring, reach out on LinkedIn.