RAG pipeline — full visual walkthrough

Frequently Asked Questions

What does RAG stand for in AI?

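RAG stands for Retrieval-Augmented Generation, a technique that improves a language model's output by first retrieving relevant documents from an external knowledge base and then supplying those documents as context when the model generates a response. The term was introduced by Patrick Lewis et al. in the 2020 paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks", written by researchers at Facebook AI Research (now Meta AI), University College London, and New York University.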
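The retrieve-then-generate flow can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the `DOCUMENTS` list, the function names, and the word-overlap scorer are all assumptions made for the example, and the scorer stands in for the embedding-based retrieval a real system would use. The generation step is left as the assembled prompt, since the choice of model is outside what this FAQ covers.

```python
from collections import Counter

# Hypothetical in-memory knowledge base; a real system would query an external store.
DOCUMENTS = [
    "RAG retrieves documents before generating an answer.",
    "Fine-tuning updates model weights on new training data.",
    "Vector databases store embeddings for similarity search.",
]


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,?!").lower() for word in text.split()]


def overlap_score(query, doc):
    """Count the words shared by query and document (a stand-in for semantic similarity)."""
    shared = Counter(tokenize(query)) & Counter(tokenize(doc))
    return sum(shared.values())


def retrieve(query, documents, k=2):
    """Step 1: return the k documents that score highest against the query."""
    ranked = sorted(documents, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, context_docs):
    """Step 2: place the retrieved documents in the prompt as context for the model."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


question = "What do vector databases store?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS, k=1))
print(prompt)  # this string is what would be sent to the language model
```

Because the knowledge lives in `DOCUMENTS` rather than in model weights, updating an answer only means editing the stored documents.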

When should I use RAG instead of fine-tuning?

Use RAG when your knowledge base changes frequently, when you need to cite specific source documents, when you want to avoid the cost of retraining, or when your data is confidential and cannot be used in a training dataset. Choose fine-tuning when you need to change the model's style, tone, or output format, and when the knowledge to add is stable and non-sensitive.

What is a vector database and why does RAG need one?

A vector database stores document embeddings (numerical vector representations of text) and supports fast approximate nearest-neighbour search. RAG pipelines typically rely on one because semantic retrieval ranks documents by the similarity between query and document embeddings, commonly cosine similarity, rather than by keyword matching. Popular vector databases for RAG include Pinecone, Weaviate, Qdrant, ChromaDB, and pgvector (a PostgreSQL extension).
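To make the similarity search concrete, here is a small sketch of nearest-neighbour lookup over embeddings using cosine similarity. The three-dimensional vectors and document IDs in `INDEX` are made up for illustration; real embeddings come from an embedding model and have hundreds or thousands of dimensions. The search here is an exact brute-force scan, whereas a vector database would normally use an approximate index to stay fast at scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings keyed by document ID (values invented for this example).
INDEX = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}


def nearest(query_vec, index, k=2):
    """Score every stored vector against the query and return the top-k document IDs."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# A query embedding that points in roughly the same direction as the pet documents.
print(nearest([1.0, 0.2, 0.0], INDEX, k=2))
```

The query shares no keywords with anything here; the ranking depends only on how closely the vectors point in the same direction, which is the property the answer above describes.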

See Also