RAG Fusion with a grain of salt
If you’ve built a chatbot that uses Retrieval Augmented Generation (RAG), you’ve probably come across RAG Fusion, a method advertised to improve the performance of such chatbots. In theory, using Reciprocal Rank Fusion (RRF) to improve retrieval in RAG makes complete sense. However, there are cases where this approach doesn’t really fit into a production system.

What is RAG?

Retrieval Augmented Generation is a technique used to improve the accuracy of LLMs by providing them with an external knowledge base for response generation. While the user’s query is being processed, relevant information is retrieved from this external data source and passed to the LLM as a source of truth for generating the response. The data source is generally divided into documents and stored in a vector database as embeddings. ...
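
To make the retrieval step described above concrete, here is a minimal sketch in plain Python. It is not a production setup: the `embed` function is a toy word-count stand-in for a real embedding model, an in-memory list stands in for the vector database, and the names `VOCAB`, `DOCS`, `retrieve` and `build_prompt` are hypothetical, chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embed(text, vocab):
    """Toy embedding (assumption): word counts over a fixed vocabulary.

    A real system would call an embedding model here instead.
    """
    words = text.lower().split()
    return [words.count(term) for term in vocab]


def retrieve(query, docs, vocab, top_k=2):
    """Return the top_k documents most similar to the query."""
    query_vec = embed(query, vocab)
    scored = [(cosine(query_vec, embed(doc, vocab)), doc) for doc in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, context_docs):
    """Place the retrieved documents in the prompt as the source of truth."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base; in practice this lives in a vector database.
VOCAB = ["refund", "shipping", "password", "reset", "days"]
DOCS = [
    "Refund requests are accepted within 30 days",
    "Shipping takes 5 days",
    "Use the reset link to change your password",
]

if __name__ == "__main__":
    question = "how do I reset my password"
    print(build_prompt(question, retrieve(question, DOCS, VOCAB, top_k=1)))
```

The string returned by `build_prompt` is what would be sent to the LLM. Swapping the toy `embed` for a real embedding model and the list scan for a vector database query would not change the overall shape of the flow.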