2-Building Multi Agentic AI RAG With Vector Database
Based on Krish Naik's video on YouTube. If you like this material, support the original creator by watching, liking, and subscribing.
Briefing
Agentic AI can answer questions by pulling knowledge from a vector database populated from PDFs, turning raw documents into a searchable "knowledge base" the assistant can query on demand. The core build wires an assistant to a local pgvector instance running in Docker, then loads PDF content (via a URL) into vector embeddings so the assistant can retrieve relevant passages and generate grounded responses.
The workflow starts with a practical fix: in Groq-based setups, the code must explicitly provide a Groq API key rather than relying on defaults that may fall back to OpenAI. With that environment configuration in place, the project shifts to infrastructure: running pgvector locally through Docker Desktop. Once the database is up, the system uses a "knowledge base" layer that accepts one or more PDF URLs, extracts text from those PDFs, converts the text into vector embeddings, and stores them in a named collection inside pgvector.
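In code, that setup might look like the sketch below. It assumes the phidata library used in the video and a pgvector container already running via Docker Desktop; the API key, connection string, collection name, and PDF URL are placeholders, not values from the video:

```python
import os

from phi.knowledge.pdf import PDFUrlKnowledgeBase
from phi.vectordb.pgvector import PgVector2

# Set the Groq key explicitly so the stack does not silently fall back to OpenAI.
os.environ["GROQ_API_KEY"] = "your-groq-api-key"  # placeholder

# Assumes a pgvector container started from Docker Desktop and exposing Postgres
# locally; the user, password, database name, and port here are illustrative.
db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

# The knowledge base downloads the PDF, extracts its text, embeds the chunks,
# and stores them in a named collection inside pgvector.
knowledge_base = PDFUrlKnowledgeBase(
    urls=["https://example-bucket.s3.amazonaws.com/recipes.pdf"],  # placeholder URL
    vector_db=PgVector2(collection="recipes", db_url=db_url),
)
knowledge_base.load(recreate=False)  # ingest once; skips work if already loaded
```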
From there, the assistant is created inside a function (named consistently with the configuration) and connected to three capabilities: searching the knowledge base, reading chat history, and showing tool calls in its responses. The assistant is configured with a run ID (initially none, then assigned after the first start), a user identifier, and the knowledge base object. Key toggles, such as enabling knowledge search and chat-history reading, allow the assistant to both retrieve relevant document chunks and maintain conversational context.
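A sketch of that assistant wiring, under the same phidata assumption; the storage table name, default user id, and model name are illustrative, and the run-ID handling mirrors the "initially none, then assigned" behavior described above:

```python
from phi.assistant import Assistant
from phi.llm.groq import Groq
from phi.storage.assistant.postgres import PgAssistantStorage

# Chat history is persisted in Postgres next to the vectors (table name is a placeholder).
storage = PgAssistantStorage(table_name="pdf_assistant", db_url=db_url)

def pdf_assistant(new: bool = False, user: str = "user"):
    run_id = None  # None on the first start; the framework assigns one afterwards
    if not new:
        existing_run_ids = storage.get_all_run_ids(user)
        if len(existing_run_ids) > 0:
            run_id = existing_run_ids[0]  # resume the most recent run

    assistant = Assistant(
        run_id=run_id,
        user_id=user,
        llm=Groq(model="llama3-70b-8192"),  # model name is illustrative
        knowledge_base=knowledge_base,
        storage=storage,
        search_knowledge=True,    # tool: query the vector store
        read_chat_history=True,   # tool: recall earlier conversation turns
        show_tool_calls=True,     # surface tool calls in the responses
    )
    assistant.cli_app(markdown=True)

if __name__ == "__main__":
    pdf_assistant()
```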
A concrete example uses a recipe PDF hosted on Amazon S3 (the URL points to a "recipes.pdf" document). After the program runs, it reports that documents have been added to the vector database. The assistant then answers questions about ingredients and preparation steps, for example listing ingredient quantities (e.g., chicken, roasted peanuts) and returning directions for making specific dishes. The accuracy comes from retrieval: responses are generated from the most relevant chunks of the embedded PDF content stored in pgvector.
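To reproduce that check, a question can be put to the assistant directly; the wording below is just an example, not the exact prompt from the video:

```python
# Streams a markdown-formatted answer grounded in the retrieved recipe chunks.
assistant.print_response(
    "List the ingredients and directions for the chicken dish.", markdown=True
)
```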
Implementation details matter because the build depends on several libraries. The setup installs dependencies such as sqlalchemy, pgvector, psycopg (binary), and pypdf for database access and PDF reading. The same pattern can be repeated with other vector databases (the discussion names alternatives like Qdrant, Pinecone, LanceDB, ChromaDB, and SingleStore), but the walkthrough focuses on pgvector, accessed through the PgVector2 integration.
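In practice that corresponds to something like `pip install sqlalchemy pgvector "psycopg[binary]" pypdf`; exact package names and extras may vary by version, and phidata itself plus a Groq client would also be needed.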
The takeaway is less about a single chatbot and more about building an end-to-end agentic RAG pipeline: Docker-hosted vector storage → PDF ingestion via URL → embeddings + collection creation → assistant configured to search and respond. The assignment at the end pushes the same pipeline into a user-facing app using Streamlit, turning the backend workflow into an interactive front end for inputs and chat-style answers.
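As a starting point for that assignment, a minimal Streamlit wrapper might look like the following sketch; it assumes the same phidata pieces as above, a collection that has already been loaded, and placeholder names throughout:

```python
import streamlit as st

from phi.assistant import Assistant
from phi.knowledge.pdf import PDFUrlKnowledgeBase
from phi.llm.groq import Groq
from phi.vectordb.pgvector import PgVector2

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"  # placeholder, as above

st.title("Agentic RAG over PDFs")

@st.cache_resource
def get_assistant() -> Assistant:
    # Points at the already-loaded collection; no re-ingestion on each rerun.
    knowledge_base = PDFUrlKnowledgeBase(
        urls=["https://example-bucket.s3.amazonaws.com/recipes.pdf"],  # placeholder
        vector_db=PgVector2(collection="recipes", db_url=db_url),
    )
    return Assistant(
        llm=Groq(model="llama3-70b-8192"),  # model name is illustrative
        knowledge_base=knowledge_base,
        search_knowledge=True,
        read_chat_history=True,
    )

question = st.text_input("Ask a question about the PDF:")
if question:
    # stream=False returns the complete answer as a single string
    st.markdown(get_assistant().run(question, stream=False))
```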
Cornell Notes
This build creates an agentic RAG system where an assistant answers questions by searching a vector database populated from PDFs. A local pgvector instance runs in Docker, and a knowledge-base component ingests PDF URLs, extracts text, converts it into embeddings, and stores it in a named collection. The assistant is then configured with tools to (1) search the knowledge base, (2) read chat history, and (3) generate responses grounded in retrieved document chunks. A Groq API key is required to avoid defaulting to OpenAI models. The result is a chatbot that can answer questions such as ingredients and directions from a recipe PDF, and the same pattern can be extended to other vector databases and wrapped in Streamlit.
Why does the code need a Groq API key instead of relying on defaults?
How does the system turn a PDF URL into something the assistant can search?
What role does Docker play in this setup?
What does the assistant configuration enable for retrieval and conversation?
What libraries are required to make the pipeline work end-to-end?
How is the recipe example validated in practice?
Review Questions
- What steps are required to go from a PDF URL to a searchable vector collection in pgvector?
- Which assistant settings are necessary to enable knowledge-base retrieval and chat-history context?
- How would you adapt the same pipeline if you swapped pgvector for another vector database mentioned in the walkthrough?
Key Points
1. Set the Groq API key explicitly to prevent unintended fallback to OpenAI models in multi-agent configurations.
2. Run pgvector locally using Docker Desktop and capture the correct DB URL (including the exposed port).
3. Create a knowledge base that ingests PDF URLs, extracts text, generates embeddings, and stores them in a named pgvector collection.
4. Instantiate an assistant wired to the knowledge base with retrieval enabled and chat-history reading turned on.
5. Install required dependencies (sqlalchemy, pgvector, psycopg binary, pypdf) to support database access and PDF ingestion.
6. Use targeted questions (e.g., ingredients, directions) to verify that answers are grounded in retrieved PDF chunks.
7. Extend the backend RAG pipeline into an end-to-end app by wrapping it with Streamlit for a front-end chat experience.