Zero Budget RAG (Part 1)
What is RAG (Retrieval-Augmented Generation) in Large Language Models?
RAG stands for Retrieval-Augmented Generation, a technique that improves the accuracy of Large Language Model (LLM) responses. Unlike a plain language model, which generates text based solely on the knowledge it learned during training, a RAG system retrieves external knowledge and feeds it into the generation process.
How RAG Works
When a user asks a question or provides a prompt, the RAG system first retrieves relevant information from an external source such as a database or knowledge graph. This retrieved information augments the model's internal knowledge, enabling it to generate more accurate and informative responses. RAG is especially effective for tasks that require specific domain knowledge, such as answering complex questions, generating summaries, or creating content that requires expertise.
Simplified RAG Process
The RAG process can be divided into three steps:
- Text Embedding: Transform Q&A texts into vectors and store them in a database.
- Retrieval: Use semantic or machine learning search to find relevant answers based on the user query.
- Generation: Send the question and retrieved answer to the LLM with an instructional prompt.
However, for simplicity, we won't build our own embedding pipeline or vector database. Instead, a sentence transformer model called "all-MiniLM-L6-v2" handles the embedding and similarity comparison for us. This model maps sentences and paragraphs to a 384-dimensional dense vector space and is useful for tasks like clustering or semantic search.
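To make the idea of semantic search a little more concrete, here is a minimal sketch of cosine similarity, a common way to compare two embedding vectors. The toy 3-dimensional vectors are made up for illustration (real "all-MiniLM-L6-v2" embeddings have 384 dimensions), and `cosineSimilarity` is our own helper name, not part of any library.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Values near 1 mean "pointing the same way", values near 0 mean "unrelated".
// Note: a zero vector would produce NaN; this sketch does not guard against it.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / Math.sqrt(normA * normB);
}

// Toy "embeddings", invented for this example only.
const toyQueryVector = [1, 0, 1];
const toyCandidateVectors = [
  [1, 0, 1], // same direction as the query
  [0, 1, 0], // orthogonal to the query
];

const toyScores = toyCandidateVectors.map((v) =>
  cosineSimilarity(toyQueryVector, v)
);
console.log(toyScores);
```

The candidate with the highest score is treated as the best match for the query.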
Using "all-MiniLM-L6-v2" with Hugging Face API
For this example, we'll use the "all-MiniLM-L6-v2" model via the Hugging Face inference API. This method is ideal for small datasets, like our example dataset about August Host.
Dataset:
const sentences = [
"August Host is a tech company based in Taunggyi, Myanmar, uses advanced technology to help clients achieve their software and hardware solutions.",
"August Host is located at No 13/13, Myat Lay Street, Taunggyi; Myanmar.",
"August Host Dev Team use Git, Trello, React, Javascript, Typescript, GraphQL, Laravel, Nextjs, Wordpress, Vue, Cockpit CMS, Headless CMS, Serverless, SSG, SSR.",
"August Host was founded in 2018 by tech enthusiasts Ronald Aug and Kyaw Swar Aye, August Host emerged as the first web agency in Taunggyi, Myanmar. Ronald, with a background in tech and a passion for remote work culture, decided to establish the agency in his hometown, where he met Kyaw Swar Aye, a budding programmer eager to learn app development. Together, they envisioned filling the gap in the market by becoming pioneers in web development in the region.",
"Success Story: While August Host is still on its journey to success, the agency has made significant strides in the industry. Despite facing challenges such as the political instability in Myanmar, the team remains dedicated to their vision of delivering top-quality digital solutions to clients across South East Asia.",
"Vision: August Host aspires to become a renowned web agency not only in Myanmar but also in the wider South East Asia region. By consistently providing innovative and exceptional services, the agency aims to establish itself as a leader in the industry, setting new standards for digital excellence.",
"The mission of August Host is to help clients transform their tasks from paper to digital, enhancing efficiency and speeding up daily operations. By offering high-quality digital services from Taunggyi to clients across South East Asia, the agency aims to make a positive impact on businesses seeking to thrive in the digital age.",
"August Host has had the privilege of working with esteemed clients such as Plan International, Myanmar Private Equity and Venture Capital Association (MPEVCA), and others. One notable project includes the development of the Yangon Evacuation Map for Plan International, providing vital safety information for residents in Myanmar.",
"The talented team at August Host includes Ronald Aug, Kyaw Swar Aye, Poe Eain, and Thura, each bringing a unique set of skills and expertise to the table. From full stack development to app programming and design, the team collaborates to deliver tailored digital solutions to meet client needs.",
"Challenges: Despite facing challenges in the form of political instability and other obstacles, August Host remains committed to pushing boundaries and overcoming hurdles to achieve success. The agency continues to strive for excellence in the face of adversity, demonstrating resilience and determination in their pursuit of digital innovation.",
"August Host offers a range of services including web development, app programming, digital marketing, SEO, and tech consulting. Using modern tools and technologies such as PHP, JS, Typescript, and Serverless, the agency creates custom websites and applications for clients, while also providing consultation on advertising, SEO, and domain email creation."
];
Example Question: "Does August Host team offer SEO?"
Using the Hugging Face inference API, the "all-MiniLM-L6-v2" model will find the most similar sentence in our dataset, which is:
August Host offers a range of services including web development, app programming, digital marketing, SEO, and tech consulting. Using modern tools and technologies such as PHP, JS, Typescript, and Serverless, the agency creates custom websites and applications for clients, while also providing consultation on advertising, SEO, and domain email creation.
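A rough sketch of this retrieval call might look like the following. The endpoint URL, the `source_sentence`/`sentences` payload shape, and the array-of-scores response are assumptions based on the Hugging Face sentence-similarity task, so check the current API documentation before relying on them. The function names and the `apiToken` parameter are ours, and the code assumes a runtime with a global `fetch` (for example Node 18+).

```javascript
// Assumed endpoint for the hosted model; it may differ for your account or API version.
const SIMILARITY_URL =
  "https://api-inference.huggingface.co/models/sentence-transformers/all-MiniLM-L6-v2";

// Ask the model to score every candidate sentence against the question.
// Assumed to resolve to an array of numbers, one score per candidate sentence.
async function fetchSimilarityScores(question, candidates, apiToken) {
  const response = await fetch(SIMILARITY_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: { source_sentence: question, sentences: candidates },
    }),
  });
  if (!response.ok) {
    throw new Error(`Similarity request failed with status ${response.status}`);
  }
  return response.json();
}

// Pure helper: return the candidate with the highest similarity score.
function pickBestMatch(scores, candidates) {
  if (scores.length === 0 || scores.length !== candidates.length) {
    throw new Error("Expected one score per candidate sentence");
  }
  let bestIndex = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[bestIndex]) {
      bestIndex = i;
    }
  }
  return { sentence: candidates[bestIndex], score: scores[bestIndex] };
}
```

With the dataset above, we would pass the example question and the `sentences` array to `fetchSimilarityScores`, then hand the returned scores to `pickBestMatch` to get the found answer.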
Sending to LLM
Next, we send the question and found answer to the LLM "Mistral-7B-Instruct-v0.2" with the following payload:
const payload2 = {
  // `question` is the user's query; `found_answer` is the best-matching sentence from the dataset.
  inputs: `You are August AI, AI assistant. Your role is to provide human-like responses to address the following "Data from database" as a helpful assistant would.
User Prompt: ${question}
Data from database: ${found_answer}
Your Response:`,
};
Question: "Does August Host team offer SEO?"
Found Answer: "August Host offers a range of services including web development, app programming, digital marketing, SEO, and tech consulting."
The LLM will generate a human-like response:
Absolutely, August Host's team does offer Search Engine Optimization (SEO) services. They employ modern tools and technologies to optimize websites for search engines, ensuring they rank high in search engine results pages and attract more organic traffic.
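The generation call can be sketched in a similar way. As before, the endpoint URL and the `[{ generated_text }]` response shape are assumptions based on the Hugging Face text-generation task, and `buildPrompt`, `generateResponse`, and `apiToken` are hypothetical names of our own. The `return_full_text: false` parameter asks the API to return only the newly generated text instead of echoing the prompt back.

```javascript
// Assumed endpoint for the hosted model; verify it against the current API docs.
const GENERATION_URL =
  "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.2";

// Build the instructional prompt from the user's question and the retrieved sentence.
function buildPrompt(question, foundAnswer) {
  return `You are August AI, AI assistant. Your role is to provide human-like responses to address the following "Data from database" as a helpful assistant would.
User Prompt: ${question}
Data from database: ${foundAnswer}
Your Response:`;
}

// Send the prompt to the LLM and return the generated text.
async function generateResponse(question, foundAnswer, apiToken) {
  const response = await fetch(GENERATION_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: buildPrompt(question, foundAnswer),
      // Return only the completion and cap its length (the value 200 is arbitrary).
      parameters: { return_full_text: false, max_new_tokens: 200 },
    }),
  });
  if (!response.ok) {
    throw new Error(`Generation request failed with status ${response.status}`);
  }
  const result = await response.json();
  // Assumed response shape: [{ generated_text: "..." }]
  return result[0].generated_text.trim();
}
```

Keeping the prompt construction in its own function makes it easy to inspect or adjust the wording without touching the network code.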
So, with these two inference API calls, we were able to retrieve the answer we wanted. Let's continue in Part 2.