Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) is a pattern where a language model retrieves relevant documents from a knowledge store and grounds its answer in that retrieved context.
In plain English
RAG is how you get a model to answer with knowledge it does not have in its weights. Instead of "please know everything," you keep the knowledge in a search index (vector database, Postgres full-text, Elastic), and on each question the system retrieves the most relevant chunks and drops them into the model's context. The answer is then generated from both the model's pretrained knowledge and the retrieved documents.
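The retrieve-then-generate loop above can be sketched end to end. This is a toy, not a real implementation: a bag-of-words term-frequency vector stands in for a learned embedding model, and a plain Python list stands in for a vector database; `embed`, `retrieve`, and `build_prompt` are illustrative names, not any library's API.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy "embedding": a term-frequency bag of words.
    # A real system would call an embedding model here.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=2):
    # Rank every document by similarity to the query; keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs):
    # Drop the retrieved chunks into the context the model will see.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Refunds are issued within 5 business days of approval.",
    "Our office is closed on public holidays.",
    "To request a refund, open a ticket with your order number.",
]
print(build_prompt("How do I get a refund?", docs))
```

The grounding happens in `build_prompt`: the model only ever sees the top-ranked chunks, so answer quality is capped by what `retrieve` surfaces.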
RAG caught on because it solved two real problems: hallucinated answers about private or recent data the model never trained on, and the sheer cost of fine-tuning a model for every new corpus. Modern agent stacks often combine RAG with long context: retrieve the top candidates, stuff them in, and let the model reason over them. The quality of retrieval (embeddings, chunking, reranking) matters as much as the quality of generation.
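Of those retrieval levers, chunking is the easiest to show in isolation. A minimal sketch of fixed-size chunking with overlap, so a sentence cut at one chunk boundary still appears whole in the neighboring chunk; the sizes here are arbitrary, and real pipelines often split on sentence or token boundaries instead of raw character counts.

```python
def chunk(text, size=200, overlap=40):
    # Slide a fixed-size window over the text, stepping by
    # (size - overlap) so consecutive chunks share `overlap` chars.
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # last window already reached the end of the text
    return chunks
```

Each chunk is what gets embedded and indexed, so chunk size trades off precision (small chunks match queries tightly) against context (large chunks keep surrounding sentences together).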
Why it matters for Black Box
Black Box uses RAG over the owner's own knowledge base — docs, prior drafts, company memory — so agents ground answers in real context instead of guessing. Skill Packs ship with their own retrieval indexes so installing a pack instantly gives specialists domain knowledge.
Examples
- Answering a support question by retrieving three relevant help-center articles first.
- Drafting a newsletter by pulling five prior issues as voice references.
- Answering a legal question by retrieving the relevant contract clauses.