6 - Building an Advanced RAG Q&A Project with Multiple Data Sources Using LangChain
Based on Krish Naik's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
A multi-source RAG Q&A setup becomes practical by combining LangChain “tools” with an agent that routes questions to the right retrieval backend. Instead of forcing a single knowledge source, the workflow builds separate retrieval tools for Wikipedia, a research-paper repository (rendered as “RVE/RF” in the transcript, most likely the Arxiv tool), and a LangSmith documentation search. An agent then decides which tool to call (first trying Wikipedia, then falling back to the research repository), so a single chat interface can answer questions across different domains.
The build starts with LangChain tools as the integration layer. Tools are described as interfaces an LLM can use to interact with external systems. The transcript lists many built-in options (including search and finance-related tools), but the implementation focuses on three retrieval-oriented tools. For Wikipedia, it uses LangChain’s Wikipedia query runner and Wikipedia API wrapper to create a configurable “top-k” document retriever, with a character limit (e.g., 200 characters) to control how much context gets returned.
For custom content, the process shifts to document loading, chunking, and vector indexing. A web page is fetched using a web-based loader, then split into overlapping chunks using a recursive character text splitter (example settings include chunk size around 1000 characters with overlap around 200). Those chunks are embedded with OpenAI embeddings and stored in a vector database. The vector store is then converted into a retriever interface, and wrapped into a LangChain retrieval tool using create_retriever_tool so it can be invoked by the agent with a natural-language instruction like “search for information about LangSmith.”
The agent layer is where routing happens. An OpenAI tools agent is created with the combined tool list (Wikipedia tool, the custom/research retriever tool, and the LangSmith retrieval tool). A prompt is pulled from LangChain Hub (after installing the required LangChain Hub dependency), and an AgentExecutor is used to run the system end-to-end. With agent_executor.invoke, the same user question triggers tool selection and retrieval: a query like “tell me about LangSmith” routes to the LangSmith search tool, while a broader question like “tell me about machine learning” can route to Wikipedia, and a research-paper-related prompt routes to the custom repository retriever.
The practical takeaway is that “advanced RAG” here isn’t a new model—it’s orchestration. By wrapping each data source as a tool and letting an agent choose the sequence of tool calls, the system delivers multi-source Q&A without hardcoding one retrieval path. The transcript also emphasizes debugging via verbose execution: enabling verbose output reveals which tool the agent invoked and helps validate the routing logic.
Cornell Notes
The project builds a multi-source RAG Q&A system by turning each knowledge backend into a LangChain “tool” and using an agent to route questions to the right tool. Wikipedia is wrapped using LangChain’s Wikipedia query runner/API wrapper with configurable top-k and context length. A custom web/research source is loaded, chunked with overlap, embedded using OpenAI embeddings, stored in a vector database, and converted into a retriever tool via create_retriever_tool. A combined tool list is passed into an OpenAI tools agent, with a prompt sourced from LangChain Hub. AgentExecutor then runs the pipeline so queries like “LangSmith” hit the LangSmith retriever, while other questions can fall back to Wikipedia or the custom repository.
- What does LangChain mean by “tools,” and why are they central to multi-source RAG?
- How is the Wikipedia retrieval tool constructed in the transcript?
- What are the key steps to convert a custom web page into a retriever tool?
- How does the agent decide which source to query?
- Why does AgentExecutor matter, and what does verbose output help with?
Review Questions
- If a user asks a question that could be answered by both Wikipedia and the custom PDF/web corpus, what mechanisms in this design influence which tool gets called?
- Where in the pipeline do chunk size and chunk overlap affect retrieval quality, and how would you expect changing them to impact answers?
- What is the role of create_retriever_tool compared with directly using a vector store retriever?
Key Points
1. Wrap each knowledge source as a LangChain tool so an LLM can call it consistently during Q&A.
2. Use the Wikipedia query runner/API wrapper to create a configurable Wikipedia retrieval tool with top-k and context-length limits.
3. For custom sources, load content, split it into overlapping chunks, embed with OpenAI embeddings, store in a vector database, then convert to a retriever.
4. Use create_retriever_tool to expose a retriever as an agent-callable tool with an instruction prompt.
5. Combine multiple tools into a single tool list and let an OpenAI tools agent route questions to the most relevant backend.
6. Run everything through AgentExecutor and enable verbose output to verify which tool was invoked for each query.