
RAG Systems

Build Retrieval-Augmented Generation (RAG) pipelines, from understanding embeddings to building and evaluating production RAG systems.

Intermediate · 4 stages · 15 hours

Prerequisites: Prompt Engineering

Claude API · Node.js · Vector Database

Stage 1: What is RAG?

You’ll know this when… you can explain why RAG exists, when it’s better than fine-tuning or long context windows, and sketch out how a basic RAG pipeline works.

Key Concepts

  • The problem RAG solves: giving LLMs access to your specific data without retraining
  • RAG vs. fine-tuning vs. large context windows — when to use each
  • The basic pipeline: chunk → embed → store → retrieve → generate
  • Why “garbage in, garbage out” applies to retrieval even more than to generation: the model can only ground its answer in the chunks it is given
  • Real-world use cases: documentation search, customer support, knowledge bases
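
To make the chunk → embed → store → retrieve → generate flow concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the word-count "embedding" stands in for a real embedding model, a plain array stands in for the vector database, and the generate step only assembles the prompt an LLM would receive.

```typescript
// Toy RAG pipeline: chunk -> embed -> store -> retrieve -> generate.
// All names here are hypothetical; the embedding is a word-count stand-in.

type SparseVector = Record<string, number>;
type StoredChunk = { text: string; vector: SparseVector };

// chunk: split a document into paragraphs separated by blank lines.
function chunk(doc: string): string[] {
  return doc.split(/\n\s*\n/).map((p) => p.trim()).filter((p) => p.length > 0);
}

// embed: count lowercase words (a real system would call an embedding API).
function embed(text: string): SparseVector {
  const counts: SparseVector = Object.create(null); // no inherited keys
  const words = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  for (const word of words) counts[word] = (counts[word] || 0) + 1;
  return counts;
}

// Cosine similarity between two sparse vectors.
function cosine(a: SparseVector, b: SparseVector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const key of Object.keys(a)) {
    dot += a[key] * (b[key] || 0);
    normA += a[key] * a[key];
  }
  for (const key of Object.keys(b)) normB += b[key] * b[key];
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// store: embed every chunk and keep it in an array.
function store(docs: string[]): StoredChunk[] {
  const index: StoredChunk[] = [];
  for (const doc of docs) {
    for (const text of chunk(doc)) index.push({ text, vector: embed(text) });
  }
  return index;
}

// retrieve: rank stored chunks against the query and keep the top k.
function retrieve(index: StoredChunk[], query: string, k: number): string[] {
  const q = embed(query);
  return index
    .map((c) => ({ text: c.text, score: cosine(q, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((c) => c.text);
}

// generate: assemble the prompt; sending it to a model is left out.
function buildPrompt(question: string, context: string[]): string {
  return `Context:\n${context.join("\n---\n")}\n\nQuestion: ${question}`;
}
```

Swapping the toy `embed` for a real model and the array for a database changes the quality of retrieval, not the shape of the pipeline.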

Practice Project

Analyze your own data for RAG potential. Take your Intelligence Hub briefings from src/content/intelligence/ and answer: How would you chunk them? By article? By section? By sentence? Write out a chunking strategy document with examples from 3 real briefings, explaining your rationale.
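
When you compare granularities, it can help to see what each one produces. The sketch below assumes your briefings are Markdown files with `## ` section headings, which may not match your actual format; `chunkBySection` and `chunkBySentence` are hypothetical helpers, and the sentence splitter is deliberately naive (it will break on abbreviations such as "e.g.").

```typescript
// Hypothetical chunk shape: keep the source file and heading as metadata.
type Chunk = { source: string; heading: string; text: string };

// One chunk per "## " section; text before the first heading becomes "(intro)".
function chunkBySection(source: string, markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let heading = "(intro)";
  let lines: string[] = [];
  const flush = (): void => {
    const text = lines.join("\n").trim();
    if (text.length > 0) chunks.push({ source, heading, text });
    lines = [];
  };
  for (const line of markdown.split("\n")) {
    const match = /^##\s+(.*)$/.exec(line);
    if (match) {
      flush(); // close the previous section
      heading = match[1].trim();
    } else {
      lines.push(line);
    }
  }
  flush(); // close the final section
  return chunks;
}

// Split each section into sentence-level chunks, preserving its metadata.
function chunkBySentence(sections: Chunk[]): Chunk[] {
  const out: Chunk[] = [];
  for (const s of sections) {
    const sentences = s.text.match(/[^.!?]+(?:[.!?]+|$)/g) || [s.text];
    for (const sentence of sentences) {
      const text = sentence.trim();
      if (text.length > 0) out.push({ source: s.source, heading: s.heading, text });
    }
  }
  return out;
}
```

Running both on the same three briefings gives you concrete examples for your strategy document: section chunks carry more context per hit, while sentence chunks match narrow queries more precisely but lose surrounding meaning.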


Stage 2: Embeddings & Vector Databases

You’ll know this when… you understand what embeddings are, can generate them via API, and can store/query them in a vector database.

Key Concepts

  • Embeddings: turning text into numbers that capture meaning
  • Similarity search: cosine similarity, nearest neighbors
  • Vector database options: Neon (Postgres with pgvector), Pinecone, Chroma, Weaviate
  • Embedding models: Voyage AI, OpenAI, Cohere
  • Chunking strategies: fixed-size, semantic, recursive splitting
  • Metadata filtering — combining vector search with traditional filters
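
Of the chunking strategies listed, fixed-size chunking with overlap is the simplest to sketch. The version below is a minimal illustration that measures size in characters; real pipelines often count tokens instead, and the function name `chunkFixedSize` is hypothetical.

```typescript
// Fixed-size chunking with overlap: a sliding window over the text.
// size and overlap are in characters (an assumption; tokens are also common).
function chunkFixedSize(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("Require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this window already reaches the end
  }
  return chunks;
}

// Each chunk repeats the last 2 characters of the previous one.
const example = chunkFixedSize("abcdefghij", 4, 2);
// example is ["abcd", "cdef", "efgh", "ghij"]
```

The overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.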

Practice Project

Embed your briefings into Neon. Use Voyage AI (or another embedding API) to generate embeddings for each briefing item across all your Intelligence Hub data. Store them in a Neon Postgres database with pgvector. Write a script that takes a natural language query and returns the 5 most relevant briefing items. Test with: “What happened with Claude recently?” and “MCP server updates.”
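
For the query script, one possible shape is a function that builds a parameterized pgvector query. This is a sketch under assumptions: the table `briefing_items` and its columns (`id`, `title`, `published_at`, `embedding`) are hypothetical, and the query embedding is expected to come from whichever embedding API you chose. The `<=>` operator is pgvector's cosine distance, so `1 - distance` gives cosine similarity.

```typescript
// A query in the { text, values } shape that node-postgres-style clients accept.
type SqlQuery = { text: string; values: (string | number)[] };

// Build a top-k similarity query, optionally filtered by publication date.
// Table and column names are hypothetical; adjust them to your schema.
function buildSimilarityQuery(
  queryEmbedding: number[],
  k: number,
  publishedAfter?: string
): SqlQuery {
  // pgvector accepts a vector as a text literal such as "[0.1,0.2,0.3]".
  const values: (string | number)[] = ["[" + queryEmbedding.join(",") + "]", k];
  let where = "";
  if (publishedAfter !== undefined) {
    values.push(publishedAfter);
    where = "WHERE published_at >= $3::date"; // metadata filter
  }
  const text = [
    "SELECT id, title, 1 - (embedding <=> $1::vector) AS similarity",
    "FROM briefing_items",
    where,
    "ORDER BY embedding <=> $1::vector", // smallest cosine distance first
    "LIMIT $2",
  ]
    .filter((line) => line.length > 0)
    .join("\n");
  return { text, values };
}
```

Calling `buildSimilarityQuery(embedding, 5)` gives the 5 nearest items; passing a date as the third argument combines vector search with a traditional filter.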


Stage 3: Building a RAG Pipeline

You’ll know this when… you can build a complete RAG system: user asks a question, you retrieve relevant context, and Claude generates an answer grounded in your data.

Key Concepts

  • The retrieval step: query → embed → search → rank → select top-k
  • Context window management: how much retrieved context to include
  • Prompt design for RAG: instructing Claude to use only the provided context
  • Handling “I don’t know” — when retrieved context doesn’t answer the question
  • Citation and source attribution in generated answers
  • Hybrid search: combining vector similarity with keyword matching
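
One common way to combine vector and keyword results is reciprocal rank fusion (RRF), which merges ranked lists using only each item's position, so the two scoring scales never need to be reconciled. The sketch below is one option among several, not a prescribed method; the function name is hypothetical and `k = 60` is a conventional default.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
// where rank starts at 1. Items ranked well in several lists rise to the top.
function reciprocalRankFusion(rankings: string[][], k: number = 60): string[] {
  const scores: Record<string, number> = Object.create(null);
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores[id] = (scores[id] || 0) + 1 / (k + index + 1);
    });
  }
  // Highest fused score first.
  return Object.keys(scores).sort((a, b) => scores[b] - scores[a]);
}

// Hypothetical result ids from a vector search and a keyword search.
const vectorHits = ["a", "b", "c"];
const keywordHits = ["c", "a", "d"];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
// fused is ["a", "c", "b", "d"]: "a" and "c" appear in both lists.
```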

Practice Project

Build “Ask the Intelligence Hub.” Create an endpoint or CLI tool where you can ask questions about AI trends and get answers grounded in your briefing history. Pipeline: embed the question → search your Neon vector DB → take top 5 results → send to Claude with the prompt “Answer based only on these briefing items, cite your sources.” Test with 10 questions and evaluate answer quality.
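
The prompt-assembly step might look like the sketch below. It numbers each retrieved item so the model can cite it and adds an explicit fallback for unanswerable questions. The `BriefingItem` shape and function name are hypothetical. The returned object carries only `system` and `messages`; a real Claude API request also needs fields such as the model name and `max_tokens`, which are left to the caller.

```typescript
// Hypothetical shape of a retrieved briefing item.
type BriefingItem = { id: string; title: string; text: string };

// The prompt-related part of a request; model and max_tokens are not set here.
type GroundedRequest = {
  system: string;
  messages: { role: "user"; content: string }[];
};

function buildGroundedRequest(question: string, items: BriefingItem[]): GroundedRequest {
  // Number each source so answers can cite it as [1], [2], ...
  const sources = items
    .map((item, i) => `[${i + 1}] ${item.title} (${item.id})\n${item.text}`)
    .join("\n\n");
  const system =
    "Answer based only on these briefing items, cite your sources as [n]. " +
    "If the items do not contain the answer, say you don't know.";
  const content = `Briefing items:\n\n${sources}\n\nQuestion: ${question}`;
  return { system, messages: [{ role: "user", content }] };
}
```

Numbered sources make the evaluation in your 10-question test easier: you can check each cited `[n]` against the item it points to.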


Stage 4: Evaluation & Optimization

You’ll know this when… you can measure RAG system quality, identify failure modes, and systematically improve retrieval and generation.

Key Concepts

  • RAG evaluation metrics: retrieval precision/recall, answer faithfulness, relevance
  • Common failure modes: wrong chunks retrieved, hallucination despite context, “lost in the middle” (relevant passages placed mid-prompt get overlooked)
  • Chunking optimization: testing different sizes and overlap strategies
  • Re-ranking: using a second model to re-score retrieved results
  • Contextual retrieval: prepending chunk-level context before embedding
  • Evaluation frameworks: RAGAS, manual scoring rubrics
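
Retrieval precision and recall are straightforward to compute once you have labelled which chunks are relevant to each question. A minimal sketch, assuming chunk ids are unique strings and using the common definitions precision@k = hits / k and recall@k = hits / (number of relevant chunks); the function name is hypothetical.

```typescript
// Precision@k and recall@k for one query.
// retrieved: chunk ids in ranked order; relevant: ids labelled as relevant.
function precisionRecallAtK(
  retrieved: string[],
  relevant: string[],
  k: number
): { precision: number; recall: number } {
  const topK = retrieved.slice(0, k);
  const hits = topK.filter((id) => relevant.indexOf(id) !== -1).length;
  return {
    precision: k <= 0 ? 0 : hits / k, // share of the top k that is relevant
    recall: relevant.length === 0 ? 0 : hits / relevant.length, // share of relevant items found
  };
}

// Two of the five retrieved chunks are relevant; one relevant chunk was missed.
const metrics = precisionRecallAtK(["a", "b", "c", "d", "e"], ["a", "c", "x"], 5);
// metrics.precision is 0.4; metrics.recall is 2 / 3
```

Low recall points at retrieval or chunking problems; acceptable recall with poor answers points at the prompt or the generation step.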

Practice Project

Evaluate and improve your “Ask the Intelligence Hub” system. Create a test set of 20 questions with known answers (manually verified from your briefings). Run your RAG pipeline on all 20 and score each answer for correctness and faithfulness. Identify the worst 5 answers, diagnose why each failed (bad retrieval? bad chunking? bad prompt?), fix the root cause, and re-test. Track your score improvement.
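
To track scores across runs, a small summary helper is enough. The sketch below assumes you score correctness and faithfulness by hand on a 0 to 1 scale and weight them equally; both choices, and all names, are assumptions you can change.

```typescript
// One manually scored answer; both scores are assumed to be in [0, 1].
type ScoredAnswer = { question: string; correctness: number; faithfulness: number };

// Equal weighting of the two scores (an arbitrary choice for this sketch).
function combinedScore(a: ScoredAnswer): number {
  return (a.correctness + a.faithfulness) / 2;
}

// Mean score for a run plus the lowest-scoring answers to diagnose first.
function summarize(
  run: ScoredAnswer[],
  worstN: number = 5
): { mean: number; worst: ScoredAnswer[] } {
  const total = run.reduce((sum, a) => sum + combinedScore(a), 0);
  const mean = run.length === 0 ? 0 : total / run.length;
  const worst = run
    .slice() // copy so the caller's array keeps its order
    .sort((x, y) => combinedScore(x) - combinedScore(y))
    .slice(0, worstN);
  return { mean, worst };
}
```

Calling `summarize` on the run before and after a fix, then comparing the two `mean` values, gives you the improvement figure; re-checking the previous `worst` list shows whether the specific failures were resolved.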