Glossary · Concept · Updated May 2026

RAG

noun · also: vector-store, embedding, openai

What is RAG?

RAG (Retrieval-Augmented Generation) is the technique of retrieving relevant documents at query time and giving them to an LLM as context to ground its answer, instead of relying on its training knowledge alone.

Definition

Full definition of RAG

In a standard LLM call, the user asks a question and the model answers from its training data. With RAG, you first retrieve relevant documents (typically from a vector store) and pass them to the model as context, and the model answers from them. RAG can substantially reduce hallucination on domain-specific questions, and it lets you keep proprietary knowledge out of the model: the documents stay in your own store and are supplied only at query time. Tiny Command reduces a RAG flow to two steps: query the vector store, then pass the results into the AI step.

In practice

RAG examples

RAG flow
User question → embed → search Pinecone → top-3 docs → prompt Claude with docs as context → grounded answer
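The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Pinecone, Claude, or Tiny Command implement it: `embed` is a toy word-count stand-in for a real embedding model, `retrieve` stands in for the vector-store search, and `build_prompt` only assembles the text that would be sent to the model. All function names and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: word counts instead of a dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, docs: list[str], k: int = 3) -> list[str]:
    """Stand-in for the vector-store search: return the k most similar docs."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(question: str, context_docs: list[str]) -> str:
    """Place the retrieved docs ahead of the question so the model can answer from them."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical sample corpus, for illustration only.
DOCS = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes 3 to 5 business days.",
    "Refunds require the original receipt.",
]

question = "How do refunds work?"
top_docs = retrieve(question, DOCS, k=3)
prompt = build_prompt(question, top_docs)  # this string would go to the model
print(prompt)
```

In a real flow, `embed` would call an embedding model and `retrieve` would query the vector store; the shape of the pipeline stays the same.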
Used by

Apps that exemplify RAG

See RAG in action across real integrations.

FAQ

Common questions about RAG

Do I need a vector database for RAG?
For more than about 1,000 documents, yes. For a small corpus, a simple keyword search works, as does loading every document into the prompt.
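As a rough sketch of the small-corpus case, the function below loads every document when the corpus fits a size budget and otherwise falls back to a plain keyword match. The name `select_context` and the `max_chars` budget are assumptions made for illustration; a real budget depends on the model's context window.

```python
import re


def keywords(text: str) -> set[str]:
    """Lowercased word set used for keyword matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_context(question: str, docs: list[str], max_chars: int = 2000) -> list[str]:
    """Pick the docs to place in the prompt without using a vector database."""
    # Small corpus: everything fits, so skip retrieval entirely.
    if sum(len(d) for d in docs) <= max_chars:
        return list(docs)

    # Otherwise rank docs by how many words they share with the question.
    q = keywords(question)
    ranked = sorted(docs, key=lambda d: len(q & keywords(d)), reverse=True)

    picked, used = [], 0
    for doc in ranked:
        if not q & keywords(doc):
            break  # remaining docs share no words with the question
        if used + len(doc) > max_chars:
            continue  # skip docs that would exceed the budget
        picked.append(doc)
        used += len(doc)
    return picked
```

Keyword matching misses synonyms and paraphrases, which is the gap that embeddings and a vector store are meant to close as the corpus grows.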
Can RAG fully prevent hallucination?
No. Even with retrieved context, models sometimes invent details. Instruct the model explicitly: 'Answer only from the sources; say "I don't know" if the answer is not found.'
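One way to apply that instruction is to put it at the top of the prompt and number the sources so the model has something concrete to answer from. The sketch below only builds the prompt text; `GROUNDING_RULE` and `grounded_prompt` are hypothetical names, and the exact wording of the rule is an assumption rather than a tested recipe.

```python
GROUNDING_RULE = (
    "Answer only from the sources below. "
    "If the answer is not in the sources, say \"I don't know\"."
)


def grounded_prompt(question: str, sources: list[str]) -> str:
    """Combine the grounding rule, numbered sources, and the user question."""
    numbered = "\n".join(f"[{i}] {s}" for i, s in enumerate(sources, start=1))
    return f"{GROUNDING_RULE}\n\nSources:\n{numbered}\n\nQuestion: {question}"


# Hypothetical example source, for illustration only.
print(grounded_prompt(
    "What is the refund window?",
    ["Refunds are issued within 14 days of purchase."],
))
```

This lowers the chance of invented details but does not remove it, so answers that matter may still need to be checked against the numbered sources.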