How to build MCP Clients | MCP Trilogy | CampusX
Based on CampusX's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
The core takeaway is a working blueprint for building an MCP-powered chat client that can automatically discover tools from one or more MCP servers, call the right tool with the right arguments, and then return the result to the user—optionally through a Streamlit GUI. Instead of limiting the client to “normal chat,” the setup turns the LLM into a tool-using agent: it receives a question, decides whether a tool call is needed, executes the tool on the connected MCP server(s), and feeds the tool output back into the model to produce the final answer.
The walkthrough starts by validating a local MCP “maths” server with MCP Inspector, confirming that arithmetic tools such as add, subtract, multiply, divide, power, and modulus are available. From there, the client is built in Python using the LangChain MCP adapters. The client uses an asynchronous main function and a MultiServerMCPClient that connects to MCP servers through a transport configuration (stdio for local servers). Once connected, the client fetches the server’s tools and converts the returned list into a dictionary keyed by tool name, making it easy to look up and invoke the correct tool later.
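A minimal sketch of that discovery step is shown below, assuming the langchain-mcp-adapters API in which get_tools() is awaited directly (older versions use an async context manager); the server name and script path are placeholders, not the exact values from the video.

```python
import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

async def main():
    # One client, potentially many servers; each entry names a server and its transport.
    client = MultiServerMCPClient(
        {
            "maths": {
                "command": "python",
                "args": ["maths_server.py"],  # placeholder path to the local maths server
                "transport": "stdio",         # local servers communicate over stdio
            },
        }
    )

    tools = await client.get_tools()                      # fetch tool definitions from the server
    tools_by_name = {tool.name: tool for tool in tools}   # dictionary keyed by tool name
    print(list(tools_by_name))                            # e.g. ['add', 'subtract', 'multiply', ...]

if __name__ == "__main__":
    asyncio.run(main())
```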
Next comes the agent loop. An OpenAI Chat model is instantiated (via ChatOpenAI) and bound to the discovered tools. When the model is prompted (e.g., “What is 12 * 15?”), it may respond with tool-call metadata rather than a direct answer. The client extracts the tool call(s), identifies which tool the model selected (including the tool name and the tool-call arguments), and invokes that tool directly from the named dictionary. The tool result is wrapped in a LangChain ToolMessage, and the client re-invokes the LLM with the original prompt, the model’s intermediate response, and the tool message, so the model can generate the final natural-language answer. A guard clause handles non-tool questions (like “What is the capital of India?”) by checking whether tool-call attributes exist; if not, the client returns the model’s content immediately.
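A hedged sketch of that single-tool-call flow, reusing the tools_by_name dictionary from the previous step; the model name and the answer() function are illustrative choices, not the video's exact code.

```python
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_openai import ChatOpenAI

async def answer(question: str, tools_by_name: dict) -> str:
    llm = ChatOpenAI(model="gpt-4o-mini")                      # any tool-calling chat model works here
    llm_with_tools = llm.bind_tools(list(tools_by_name.values()))

    messages = [HumanMessage(content=question)]
    response = await llm_with_tools.ainvoke(messages)

    # Guard clause: no tool_calls means the model answered directly (e.g. a general-knowledge question).
    if not getattr(response, "tool_calls", None):
        return response.content

    messages.append(response)                                  # keep the intermediate tool-call message
    tool_call = response.tool_calls[0]                         # e.g. {'name': 'multiply', 'args': {...}, 'id': ...}
    result = await tools_by_name[tool_call["name"]].ainvoke(tool_call["args"])

    # Feed the tool output back so the model can phrase the final answer.
    messages.append(ToolMessage(content=str(result), tool_call_id=tool_call["id"]))
    final = await llm_with_tools.ainvoke(messages)
    return final.content
```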
The client is then upgraded to handle multiple tool calls in one response by moving tool execution into a loop. After that, the client connects to additional MCP servers: a remote “expense tracking” server is added over the Streamable HTTP transport, enabling tools such as add expense, list expenses, and summarize expenses. The same client can also attach a third-party Manim MCP server to generate animations; the prompt is routed to the Manim tool, and the resulting video is produced.
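One way to sketch the upgraded loop together with a second server; the server names, the remote URL, and the chat() coroutine name are assumptions made for illustration.

```python
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_openai import ChatOpenAI

# One client, two servers: the local maths server over stdio plus a remote
# expense-tracking server over Streamable HTTP (the URL is a placeholder).
SERVERS = {
    "maths": {"command": "python", "args": ["maths_server.py"], "transport": "stdio"},
    "expenses": {"url": "https://example.com/mcp", "transport": "streamable_http"},
}

async def chat(question: str) -> str:
    client = MultiServerMCPClient(SERVERS)
    tools = await client.get_tools()
    tools_by_name = {tool.name: tool for tool in tools}

    llm_with_tools = ChatOpenAI(model="gpt-4o-mini").bind_tools(tools)
    messages = [HumanMessage(content=question)]
    response = await llm_with_tools.ainvoke(messages)

    if not response.tool_calls:                 # plain chat question: answer directly
        return response.content

    messages.append(response)
    for tool_call in response.tool_calls:       # loop: execute every requested tool call, not just the first
        result = await tools_by_name[tool_call["name"]].ainvoke(tool_call["args"])
        messages.append(ToolMessage(content=str(result), tool_call_id=tool_call["id"]))

    final = await llm_with_tools.ainvoke(messages)
    return final.content
```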
Finally, the logic is wrapped into a Streamlit app (client2.py), replacing terminal-only interaction with a simple GUI. The demo shows expense queries and a Manim-based animation prompt working through the interface. The result is a flexible MCP client architecture where any number of MCP servers—local or remote—can be attached, and the LLM can dynamically choose the correct tool(s) to fulfill user requests.
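A minimal Streamlit wrapper around that logic might look like the sketch below; chat() is assumed to be the coroutine from the previous sketch, and the module name and widget layout are assumptions rather than the video's exact client2.py.

```python
# client2.py (sketch): same agent loop, but driven from a Streamlit chat UI.
import asyncio
import streamlit as st

from client import chat    # the chat() coroutine sketched above (module name assumed)

st.title("MCP Chat Client")

question = st.chat_input("Ask something (maths, expenses, Manim)...")
if question:
    with st.chat_message("user"):
        st.write(question)

    reply = asyncio.run(chat(question))     # run the async agent loop for this prompt
    with st.chat_message("assistant"):
        st.write(reply)
```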
Cornell Notes
The build focuses on an MCP client that can connect to one or more MCP servers, fetch their available tools, and let an OpenAI Chat model decide when to call those tools. After connecting (local via stdio, remote via Streamable HTTP), the client retrieves the server’s tool definitions and stores them in a dictionary keyed by tool name. The model is bound to these tools; when a user asks for something requiring computation or data access, the model returns tool-call metadata instead of a direct answer. The client then executes the selected tool with the model-provided arguments, wraps the output in a ToolMessage, and re-invokes the model with the full history to generate the final response. A Streamlit wrapper turns the same logic into a GUI.
How does the client discover what it can do on an MCP server?
What is the end-to-end flow when the LLM decides a tool call is required?
How does the client handle questions that don’t require tool usage?
How can one client connect to both local and remote MCP servers?
Why was a loop added for tool execution?
What does the Streamlit GUI change in the architecture?
Review Questions
- Describe how the client converts fetched MCP tools into a structure that makes tool invocation straightforward.
- Walk through the sequence of calls needed to turn an LLM tool_call response into a final natural-language answer.
- What changes are required to add a second MCP server, and how do local stdio and remote Streamable HTTP differ in configuration?
Key Points
1. Build the MCP client around MultiServerMCPClient so one client can connect to multiple MCP servers.
2. Fetch tools from the MCP server and store them in a dictionary keyed by tool name for easy lookup.
3. Bind an OpenAI Chat model to the discovered tools so the model can return tool_calls when needed.
4. Execute the selected tool using the model-provided arguments, then wrap the result in a ToolMessage and re-invoke the model with full history.
5. Add a guard for non-tool questions by checking whether tool_calls exist in the initial LLM response.
6. Support multiple tool calls by iterating over tool_calls items rather than assuming only one.
7. Use stdio for local MCP servers and Streamable HTTP with a URL for remote MCP servers; the same client logic works for both.