Retrievers in LangChain | Generative AI using LangChain | Video 13 | CampusX
Based on CampusX's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Retrievers are runnable LangChain components that take a user query and return multiple LangChain Document objects from a data source.
Briefing
RAG systems live or die by retrieval quality, and LangChain’s retrievers are the modular “search engines” that pull the most relevant documents from a data source in response to a user query. In this CampusX walkthrough, retrievers are framed as runnable components: they take a query as input and return multiple LangChain Document objects, while internally searching a data store (vector database, API, or other sources). That modularity matters because it lets developers swap retrieval strategies and plug retrievers into larger chains without rewriting the whole pipeline.
The session begins by placing retrievers as the fourth core RAG component after document loading, text splitting, and vector stores. A retriever is defined as a LangChain component that fetches relevant documents from a data source for a given user query. The walkthrough emphasizes that LangChain doesn’t rely on a single retriever type: multiple retrievers exist for different use cases, and all are “runnable,” meaning they can be composed into chains for end-to-end RAG workflows.
From there, retrievers are categorized in two practical ways. First, by the data source they query: examples include a Wikipedia retriever that calls the Wikipedia API and selects relevant articles (using keyword-based matching rather than semantic search), and a vector-store retriever that performs semantic similarity search using vector embeddings. Second, by retrieval strategy: the lecture previews advanced approaches such as MMR (Maximal Marginal Relevance), Multi-Query retrieval, and Contextual Compression.
The code demos start with the Wikipedia retriever. A retriever object is created with parameters such as the number of top results (top_k_results) and the language (lang). Calling the retriever's invoke method sends the query to the Wikipedia API and returns a list of Document objects, each with page content and metadata. A key clarification follows: this is not merely a document loader that bulk-fetches everything; it behaves like a search mechanism that decides which articles to return based on relevance.
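A minimal sketch of that demo, assuming the langchain-community package provides WikipediaRetriever (the query string is illustrative, not the lecture's exact code):

```python
from langchain_community.retrievers import WikipediaRetriever

# top_k_results caps how many articles come back; lang selects the wiki edition.
retriever = WikipediaRetriever(top_k_results=2, lang="en")

# invoke() hits the Wikipedia API and returns a list of Document objects,
# each carrying the article text in page_content plus metadata like the title.
docs = retriever.invoke("history of artificial intelligence")

for doc in docs:
    print(doc.metadata.get("title"), "->", doc.page_content[:120])
```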
Next comes the vector-store retriever using Chroma and OpenAI embeddings. Documents are embedded into dense vectors, stored in a vector database, and retrieved by comparing the query embedding against stored document embeddings. The lecture also addresses why a retriever wrapper can still be useful even when a vector store can run similarity search directly: the retriever becomes a standardized runnable interface that enables swapping in more advanced retrieval strategies later.
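A minimal sketch of the vector-store retriever, assuming the langchain-chroma and langchain-openai packages are installed and OPENAI_API_KEY is set; the sample documents are invented for illustration:

```python
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings

docs = [
    Document(page_content="LangChain provides building blocks for LLM apps."),
    Document(page_content="Chroma is a vector database that stores embeddings."),
    Document(page_content="Embeddings map text into dense numeric vectors."),
]

# Embed the documents with OpenAI embeddings and index them in Chroma.
vectorstore = Chroma.from_documents(documents=docs, embedding=OpenAIEmbeddings())

# as_retriever() wraps the store in the standard runnable retriever interface;
# k controls how many documents a query returns.
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

for doc in retriever.invoke("What does Chroma do?"):
    print(doc.page_content)
```

Because as_retriever() returns a standard runnable, the same invoke() call keeps working if Chroma is later swapped for another store or the plain similarity search is upgraded to an advanced strategy.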
Three advanced retrievers are then unpacked conceptually and with examples; a combined sketch follows below.

MMR tackles redundancy: instead of returning the top-k most similar documents, which may all repeat the same idea, it selects documents that are both relevant to the query and diverse from each other.

The Multi-Query retriever handles ambiguous questions by sending the original query to an LLM to generate multiple focused sub-queries, retrieving for each, then merging and deduplicating the results. This improves coverage when a single query could mean several things.

The Contextual Compression retriever improves answer quality by trimming retrieved documents: it first retrieves candidate documents with a base retriever, then uses an LLM-based compressor to keep only the parts relevant to the query, discarding unrelated sections to reduce noise and context length.
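A sketch of how these three strategies can be wired up, under the same package assumptions as above; the model name, sample documents, and query are placeholders rather than the lecture's exact code:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

docs = [
    Document(page_content="Regular exercise improves sleep and energy levels."),
    Document(page_content="A balanced diet supports focus and concentration."),
    Document(page_content="Exercise also boosts energy by improving circulation."),
]
vectorstore = Chroma.from_documents(documents=docs, embedding=OpenAIEmbeddings())
llm = ChatOpenAI(model="gpt-4o-mini")

# 1) MMR: lambda_mult trades off relevance (1.0 = pure similarity)
#    against diversity among the selected documents.
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 2, "lambda_mult": 0.5},
)

# 2) Multi-Query: the LLM rewrites one ambiguous query into several focused
#    sub-queries, retrieves for each, and merges the deduplicated results.
multi_query = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(search_kwargs={"k": 2}),
    llm=llm,
)

# 3) Contextual Compression: a base retriever fetches candidates, then an
#    LLM-based extractor keeps only the query-relevant passages.
compression = ContextualCompressionRetriever(
    base_compressor=LLMChainExtractor.from_llm(llm),
    base_retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
)

for r in (mmr_retriever, multi_query, compression):
    print([d.page_content for d in r.invoke("How can I boost my energy?")])
```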
The takeaway is forward-looking: many retrievers exist because RAG performance often needs iterative upgrades. When a baseline RAG system underperforms, swapping in advanced retrievers—rather than changing the entire architecture—can meaningfully improve relevance, diversity, and user experience.
Cornell Notes
Retrievers are LangChain components that fetch relevant documents from a data source in response to a user query, returning multiple LangChain Document objects. They're runnable, so they can be plugged into chains and swapped to change retrieval behavior without rebuilding the whole RAG pipeline. The lecture groups retrievers by (1) data source, such as the Wikipedia API vs. vector stores, and (2) retrieval strategy, such as MMR, Multi-Query, and Contextual Compression. MMR reduces redundancy by selecting relevant yet diverse documents. Multi-Query improves ambiguous queries by generating multiple sub-queries with an LLM, retrieving for each, then merging results. Contextual Compression trims retrieved documents to keep only query-relevant content, reducing noise and context length.
- What exactly does a retriever do inside a RAG pipeline, and what does it return?
- How can retrievers be categorized in LangChain?
- Why does MMR matter when similarity search returns redundant results?
- How does the Multi-Query retriever handle ambiguous user questions?
- What problem does the Contextual Compression retriever solve, and how?
Review Questions
- In what ways can retrievers be swapped to improve a RAG system without changing the rest of the pipeline?
- Describe how MMR differs from standard similarity search in terms of relevance and diversity.
- Explain the end-to-end flow of Multi-Query retrieval from an ambiguous user query to merged final results.
Key Points
1. Retrievers are runnable LangChain components that take a user query and return multiple LangChain Document objects from a data source.
2. LangChain retrievers can be categorized by data source (e.g., Wikipedia API vs. vector store) and by retrieval strategy (e.g., MMR, Multi-Query, Contextual Compression).
3. The Wikipedia retriever behaves like a search: it queries the Wikipedia API and selects relevant articles rather than loading everything.
4. Vector-store retrievers use embeddings for semantic similarity search by converting both documents and queries into dense vectors.
5. MMR reduces redundancy by selecting documents that are relevant to the query while also being diverse from each other.
6. The Multi-Query retriever improves ambiguous queries by using an LLM to generate multiple sub-queries, retrieving for each, then merging and deduplicating.
7. The Contextual Compression retriever improves answer quality by trimming retrieved documents to only query-relevant content, reducing noise and context length.