Overview
Retrieval-Augmented Generation (RAG) is a common AI pattern that combines document retrieval with LLM generation. Tracing RAG pipelines helps you debug retrieval quality and generation issues.

Basic RAG with Tracing
Here’s a complete RAG pipeline with comprehensive tracing:
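The example below is a minimal sketch rather than a full implementation: it uses OpenTelemetry spans as a stand-in for whichever tracing SDK you use, and `fake_search` / `fake_llm` are toy placeholders for a real vector store and LLM client.

```python
from opentelemetry import trace

tracer = trace.get_tracer("rag-example")

# Tiny in-memory stand-ins so the sketch runs end to end; swap in your
# real vector store and LLM client.
CORPUS = [
    {"id": "doc-1", "text": "RAG combines document retrieval with LLM generation."},
    {"id": "doc-2", "text": "Tracing records what happened at each pipeline step."},
]

def fake_search(query: str, top_k: int) -> list[dict]:
    # Naive keyword-overlap scoring, purely for illustration.
    scored = []
    for doc in CORPUS:
        score = float(sum(w in doc["text"].lower() for w in query.lower().split()))
        scored.append({**doc, "score": score})
    scored.sort(key=lambda d: d["score"], reverse=True)
    return scored[:top_k]

def fake_llm(prompt: str) -> str:
    return "stub answer"

def retrieve(query: str, top_k: int = 3) -> list[dict]:
    with tracer.start_as_current_span("retrieve") as span:
        span.set_attribute("query", query)
        span.set_attribute("top_k", top_k)
        chunks = fake_search(query, top_k)
        # Record what came back so bad answers can be traced to retrieval.
        span.set_attribute("retrieved_chunk_ids", [c["id"] for c in chunks])
        span.set_attribute("retrieval_scores", [c["score"] for c in chunks])
        return chunks

def generate(query: str, chunks: list[dict]) -> str:
    with tracer.start_as_current_span("generate") as span:
        context = "\n\n".join(c["text"] for c in chunks)
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
        # Record the exact prompt sent to the model.
        span.set_attribute("prompt", prompt)
        answer = fake_llm(prompt)
        span.set_attribute("completion", answer)
        return answer

def rag_pipeline(query: str) -> str:
    with tracer.start_as_current_span("rag_pipeline") as span:
        span.set_attribute("query", query)
        chunks = retrieve(query)
        answer = generate(query, chunks)
        span.set_attribute("answer", answer)
        return answer

print(rag_pipeline("What does tracing record?"))
```

Each stage gets its own span, so a single trace shows the query, the retrieved chunks with scores, the rendered prompt, and the completion.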
RAG with Reranking

Improve retrieval quality with two-stage retrieval: fetch a broad candidate set first, then let a reranker pick the best few:
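A sketch of the two-stage version, reusing `tracer`, `retrieve`, and `generate` from the previous example; `fake_rerank_score` is a placeholder for a real cross-encoder or reranking API:

```python
def fake_rerank_score(query: str, text: str) -> float:
    # Placeholder for a cross-encoder or hosted reranking model.
    return float(sum(w in text.lower() for w in query.lower().split()))

def rerank(query: str, candidates: list[dict], keep: int = 3) -> list[dict]:
    with tracer.start_as_current_span("rerank") as span:
        span.set_attribute("candidate_ids", [c["id"] for c in candidates])
        span.set_attribute("keep", keep)
        rescored = [
            {**c, "rerank_score": fake_rerank_score(query, c["text"])}
            for c in candidates
        ]
        rescored.sort(key=lambda c: c["rerank_score"], reverse=True)
        top = rescored[:keep]
        # Record both stages so you can see what the reranker discarded.
        span.set_attribute("kept_ids", [c["id"] for c in top])
        span.set_attribute("rerank_scores", [c["rerank_score"] for c in top])
        return top

def rag_with_reranking(query: str) -> str:
    with tracer.start_as_current_span("rag_with_reranking") as span:
        span.set_attribute("query", query)
        candidates = retrieve(query, top_k=20)   # stage 1: broad recall
        top = rerank(query, candidates, keep=3)  # stage 2: precise ranking
        return generate(query, top)
```

Tracing candidate IDs and kept IDs separately lets you see whether poor answers come from recall (the right document never retrieved) or from the reranker dropping it.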
Hybrid RAG (BM25 + Vector)

Combine multiple retrieval strategies, such as keyword (BM25) search and vector search, and merge their rankings:
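A sketch of hybrid retrieval, again building on the first example: a toy `keyword_search` stands in for a BM25 index and `fake_search` for vector search, and the two result lists are merged with reciprocal rank fusion. Both candidate lists and the fused ranking are traced.

```python
def keyword_search(query: str, top_k: int) -> list[dict]:
    # Toy BM25 stand-in; swap in a real BM25 index in practice.
    terms = query.lower().split()
    scored = [
        {**d, "score": float(sum(d["text"].lower().count(t) for t in terms))}
        for d in CORPUS
    ]
    scored.sort(key=lambda d: d["score"], reverse=True)
    return scored[:top_k]

def hybrid_retrieve(query: str, top_k: int = 3) -> list[dict]:
    with tracer.start_as_current_span("hybrid_retrieve") as span:
        bm25_hits = keyword_search(query, top_k=10)
        vector_hits = fake_search(query, top_k=10)  # vector-search stand-in
        span.set_attribute("bm25_ids", [d["id"] for d in bm25_hits])
        span.set_attribute("vector_ids", [d["id"] for d in vector_hits])

        # Reciprocal rank fusion: documents ranked highly by either
        # retriever float to the top of the merged list.
        fused: dict[str, float] = {}
        docs: dict[str, dict] = {}
        for hits in (bm25_hits, vector_hits):
            for rank, doc in enumerate(hits):
                fused[doc["id"]] = fused.get(doc["id"], 0.0) + 1.0 / (60 + rank)
                docs[doc["id"]] = doc
        ranked_ids = sorted(fused, key=fused.get, reverse=True)[:top_k]
        span.set_attribute("fused_ids", ranked_ids)
        span.set_attribute("fused_scores", [fused[i] for i in ranked_ids])
        return [docs[i] for i in ranked_ids]

def hybrid_rag(query: str) -> str:
    with tracer.start_as_current_span("hybrid_rag") as span:
        span.set_attribute("query", query)
        return generate(query, hybrid_retrieve(query))
```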
What to Trace

Document Corpus
Capture which documents were available at inference time:
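Continuing the sketch above (same `tracer` and `CORPUS`), one way to do this is to record the corpus size, the document IDs, and a content hash:

```python
import hashlib

with tracer.start_as_current_span("rag_pipeline") as span:
    span.set_attribute("corpus_size", len(CORPUS))
    span.set_attribute("corpus_doc_ids", [d["id"] for d in CORPUS])
    # A content hash makes it easy to tell later whether the corpus
    # changed between two otherwise identical runs.
    corpus_hash = hashlib.sha256(
        "".join(d["text"] for d in CORPUS).encode("utf-8")
    ).hexdigest()
    span.set_attribute("corpus_hash", corpus_hash)
```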
Retrieved Chunks
Store what the retriever found:
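For example, inside the `retrieve` span from the earlier sketch you might also store the chunk text itself, serialized as JSON since span attributes are flat values:

```python
import json

query = "What does tracing record?"
with tracer.start_as_current_span("retrieve") as span:
    chunks = fake_search(query, top_k=3)
    span.set_attribute("retrieved_chunk_ids", [c["id"] for c in chunks])
    span.set_attribute("retrieval_scores", [c["score"] for c in chunks])
    # Serialize the chunks themselves so you can inspect exactly what
    # text the LLM will receive, not just which documents matched.
    span.set_attribute("retrieved_chunks", json.dumps(chunks))
```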
LLM Input
Record what you send to the LLM:
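A sketch, again reusing the toy pipeline: record the fully rendered prompt or message list, not just the template, so you can confirm the retrieved context actually made it in.

```python
import json

query = "What does tracing record?"
chunks = fake_search(query, top_k=3)
context = "\n\n".join(c["text"] for c in chunks)
messages = [
    {"role": "system", "content": "Answer using only the provided context."},
    {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
]
with tracer.start_as_current_span("generate") as span:
    # Store the rendered messages: this is what the model actually sees.
    span.set_attribute("prompt", json.dumps(messages))
```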
Configuration
Capture settings that affect behavior:
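For instance (the values below are purely illustrative):

```python
CONFIG = {
    "model": "gpt-4o-mini",   # illustrative values, not recommendations
    "temperature": 0.2,
    "retrieval_top_k": 3,
    "chunk_size": 512,
    "prompt_version": "v3",
}

with tracer.start_as_current_span("rag_pipeline") as span:
    # One attribute per setting keeps traces filterable by configuration.
    for key, value in CONFIG.items():
        span.set_attribute(f"config_{key}", value)
```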
Debugging RAG with Traces
Use traces to debug common RAG issues; a programmatic version of these checks is sketched after the table:

| Issue | What to Check in Trace |
|---|---|
| Wrong answer | Check `retrieved_chunks`: are the right documents there? |
| Hallucination | Check the `prompt`: is the retrieved context actually included? |
| Low quality | Check `retrieval_scores`: are the scores too low? |
| Inconsistent answers | Check the corpus state: did the documents change between runs? |
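If you adopted the OpenTelemetry stand-in from the sketches above, you can automate some of these checks by exporting spans in memory and asserting on their attributes; the span names, attribute names, and threshold below all come from those sketches and are illustrative only.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter

# Route spans to an in-memory exporter so they can be inspected directly.
# Do this once, before running the pipeline.
exporter = InMemorySpanExporter()
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
trace.set_tracer_provider(provider)

rag_pipeline("What does tracing record?")

for span in exporter.get_finished_spans():
    if span.name == "retrieve":
        scores = list(span.attributes.get("retrieval_scores", ()))
        if not scores or max(scores) < 1.0:  # illustrative threshold
            print("Low retrieval scores:", scores)
    if span.name == "generate" and not span.attributes.get("prompt"):
        print("Empty prompt: retrieved context never reached the LLM")
```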