AI
Retrieval-Augmented Generation (RAG)
An AI architecture that retrieves relevant documents from a knowledge base and feeds them to an LLM as context to ground its generated answer in factual sources.
Full definition
Retrieval-Augmented Generation (RAG) is the dominant architecture for AI search engines and chatbots that need factual accuracy. Instead of relying purely on the LLM's training data, which can be outdated and can leave the model prone to hallucination, RAG retrieves relevant documents from a knowledge base or live web index, passes them to the LLM as context, and asks the model to generate an answer grounded in those sources. Perplexity, ChatGPT Search and Google AI Overviews all use variations of RAG. From a creator's perspective, RAG is why Answer Engine Optimization (AEO) matters: for LLMs to surface your content, it must be retrievable (good metadata, schema, llms.txt) and citable (atomic facts, direct answers, freshness).
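The retrieve, pass as context, then generate loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any of the named products work: the knowledge base, the function names (`retrieve`, `build_prompt`, `answer`) and the word-overlap scoring are all hypothetical, and the `generate` argument stands in for whatever LLM call a real system would make. Production systems typically replace the toy scorer with a search index or embedding-based retrieval.

```python
import re
from collections import Counter

# Hypothetical in-memory knowledge base; a real system would query
# a search index or vector store instead.
KNOWLEDGE_BASE = [
    {"id": "doc-1", "text": "The Pro plan includes 20 hours of voice synthesis per month."},
    {"id": "doc-2", "text": "Refunds are available within 30 days of purchase."},
    {"id": "doc-3", "text": "The mobile app supports offline playback on iOS and Android."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, text):
    """Toy relevance score: number of query tokens that also occur in the text."""
    q = Counter(tokenize(query))
    d = Counter(tokenize(text))
    return sum(min(q[t], d[t]) for t in q)


def retrieve(query, docs, k=2):
    """Return up to k documents with a non-zero score, best match first."""
    ranked = sorted(docs, key=lambda doc: score(query, doc["text"]), reverse=True)
    return [doc for doc in ranked[:k] if score(query, doc["text"]) > 0]


def build_prompt(query, passages):
    """Place the retrieved passages in the prompt so the answer can cite them."""
    context = "\n".join(f"[{p['id']}] {p['text']}" for p in passages)
    return (
        "Answer the question using only the sources below. "
        "Cite source ids in brackets.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


def answer(query, docs, generate):
    """Full loop: retrieve, build the grounded prompt, then call the model.

    `generate` is an assumed callable that takes a prompt string and
    returns the model's text; it is a placeholder for a real LLM client.
    """
    passages = retrieve(query, docs)
    return generate(build_prompt(query, passages))
```

Calling `retrieve("Are refunds available after purchase?", KNOWLEDGE_BASE)` selects only the refund document, and `build_prompt` then labels it with its id so the model can cite it. The same split shows why content needs to be both retrievable (it has to score well in the first step) and citable (it has to read as a clear, self-contained passage in the second).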
Examples
- Perplexity using RAG to answer 'best AI voice tool' by retrieving recent review sites and summarizing their findings.
- An internal customer support bot using RAG over a knowledge base to answer product questions accurately.
- ChatGPT Search retrieving fresh content from indexed sources to answer time-sensitive queries.