What is RAG (Retrieval-Augmented Generation)?

Definition

A technique that augments LLM generation by retrieving relevant context from a vector store before generating a response. For local LLMs with limited context windows, RAG makes large codebases workable: the code is embedded and stored as vectors, and only the snippets most relevant to a query are retrieved and added to the prompt. The model can then work with the meaning of the code without exceeding its context limit.
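
The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, and `VectorStore` and `build_prompt` are hypothetical names invented for this example. The final call to an LLM is left out; the sketch stops at the prompt that would be sent.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Hypothetical in-memory store holding (vector, text) pairs."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, k=2):
        # Rank every stored text by similarity to the query, keep the top k.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(question, store, k=2):
    """Retrieve the k most relevant snippets and place them before the question."""
    context = "\n".join(store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


store = VectorStore()
store.add("def parse_config(path): reads the yaml config file")
store.add("def send_email(to, body): sends a message over smtp")
store.add("def load_config_defaults(): returns default config values")

# Only the two config-related snippets end up in the prompt.
prompt = build_prompt("where is the config file parsed", store, k=2)
print(prompt)
```

In a real setup the toy `embed` would be replaced by an embedding model and the list scan by an indexed vector store, but the shape of the flow stays the same: embed, store, retrieve, then generate.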

Examples in the Wild

Example 1: Storing millions of lines of code as vectors to enable semantic search
Example 2: Retrieving relevant code snippets before generating fixes
Example 3: Loading large repositories without blowing up the context window
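
To illustrate how retrieved snippets can be kept within a context window, here is a small sketch that packs ranked snippets into a fixed token budget. The names `estimate_tokens` and `pack_context` are hypothetical, and counting one token per whitespace-separated word is a rough assumption; a real system would use the model's own tokenizer.

```python
def estimate_tokens(text):
    # Rough assumption: one token per whitespace-separated word.
    return len(text.split())


def pack_context(ranked_snippets, budget):
    """Keep the highest-ranked snippets that fit within the token budget."""
    chosen, used = [], 0
    for snippet in ranked_snippets:
        cost = estimate_tokens(snippet)
        if used + cost > budget:
            continue  # would overflow; a smaller, lower-ranked snippet may still fit
        chosen.append(snippet)
        used += cost
    return chosen, used


# Snippets are assumed to arrive already sorted by relevance, best first.
ranked = [
    "def load(path): return open(path).read()",
    "class Repo: walks every file in the tree and yields source text",
    "MAX_FILES = 1000",
]

chosen, used = pack_context(ranked, budget=8)
print(chosen, used)
```

Skipping an oversized snippet instead of stopping at it is one possible design choice; it fills the budget more fully at the cost of occasionally including a lower-ranked snippet.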

See it in action

View local-rag-knowledge-graph-agent Template →