Build MCP Servers With Tools From Scratch Using LangChain
Based on Krish Naik's video on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.
Briefing
MCP servers can be built from scratch as tool backends—then wired into a single LangGraph/LangChain-powered agent that decides when to call those tools. The practical takeaway is that the same agent can talk to multiple MCP servers, while each server can expose its tools over different transport layers: local stdio for quick testing and Streamable HTTP for running as a reachable API.
The setup starts with three components: one or more MCP servers (each bundling tools plus the context/prompts needed to use them), an MCP client (which maintains a one-to-one connection to each server inside the host app), and the host application (a chatbot-style system). In the example, the host chatbot uses an LLM to interpret user input and decide whether it needs external tool calls. When the user asks for something like “weather in New York or Bangalore,” the LLM can’t answer that from its training data, so it triggers a tool call through MCP. The MCP server then runs the appropriate tool, such as a weather function, and returns the result to the agent.
From there, the tutorial shifts into implementation details using LangChain adapters and LangGraph. A project is initialized with uv, a virtual environment is created, and dependencies are installed from a requirements.txt file. The core libraries are langchain-mcp-adapters (to integrate MCP tools into LangChain/LangGraph) and FastMCP (a Pythonic way to build MCP servers and clients). Two MCP servers are created.
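The dependency file would plausibly list packages along these lines (names as published on PyPI; the exact entries and pins used in the video may differ):

```
langchain
langgraph
langchain-groq
langchain-mcp-adapters
mcp
```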
The first server, math_server.py, exposes two tools—add and multiply—typed as integers. It runs with transport set to stdio, meaning the server communicates via standard input/output. That design makes local testing straightforward: the server runs from the command line, and a client can interact with it without needing an HTTP endpoint.
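A minimal sketch of math_server.py, assuming the FastMCP class bundled with the official mcp Python SDK (the video's exact code may differ slightly):

```python
# math_server.py -- two typed tools exposed over stdio.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Math")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

@mcp.tool()
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

if __name__ == "__main__":
    # stdio transport: requests arrive on stdin and responses go to
    # stdout, so no HTTP endpoint is needed for local testing.
    mcp.run(transport="stdio")
```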
The second server, weather.py, exposes a weather tool that accepts a location string and returns a string. It runs with transport set to Streamable HTTP, turning the MCP server into an HTTP-accessible service. When started, it binds to a local URL (defaulting to localhost:8000 in the example), so the client can reach it over HTTP routes such as /mcp.
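weather.py would follow the same pattern; in this sketch the weather lookup is stubbed out, since the source only describes a tool that takes a location string and returns a string:

```python
# weather.py -- a single tool served over Streamable HTTP.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Weather")

@mcp.tool()
async def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    # Stubbed response; a real server would call a weather API here.
    return f"It's always sunny in {location}."

if __name__ == "__main__":
    # Serves MCP over HTTP, by default at http://localhost:8000/mcp.
    mcp.run(transport="streamable-http")
```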
A multi-server MCP client is then assembled in client.py using langchain_mcp_adapters. The client is configured with two entries: one for math_server.py over stdio, and one for the weather server over Streamable HTTP (pointing at the localhost URL and the /mcp endpoint). LangGraph's prebuilt create_react_agent is then constructed with a Groq chat model (loaded via an environment variable for the Groq API key) and the combined tool list from both servers.
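A sketch of client.py along those lines, assuming the MultiServerMCPClient API from langchain-mcp-adapters and ChatGroq from langchain-groq; the model name is illustrative:

```python
# client.py -- wires both MCP servers into a single LangGraph agent.
from langchain_groq import ChatGroq
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent

async def build_agent():
    client = MultiServerMCPClient(
        {
            "math": {
                "command": "python",
                "args": ["math_server.py"],  # path to the stdio server script
                "transport": "stdio",
            },
            "weather": {
                "url": "http://localhost:8000/mcp",  # Streamable HTTP endpoint
                "transport": "streamable_http",
            },
        }
    )
    tools = await client.get_tools()  # combined tool list from both servers
    model = ChatGroq(model="qwen-qwq-32b")  # expects GROQ_API_KEY in the environment
    return create_react_agent(model, tools)
```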
Finally, the agent is invoked with user messages like “what is 3 + 5 * 2” and “what is the weather in NYC or California.” The agent routes the request to the correct MCP tool based on the message, returning computed math results from the stdio-backed server and the weather string from the HTTP-backed server. The end result is a working pattern for interview-ready MCP integration: multiple MCP servers, one agent, and transport choices tailored to local development versus API-style deployment.
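Continuing the client sketch above, the invocation step might look like this (prompts taken from the examples in the text):

```python
import asyncio

async def main():
    agent = await build_agent()

    # The agent routes this to the stdio-backed math server's tools.
    math_response = await agent.ainvoke(
        {"messages": [{"role": "user", "content": "what is 3 + 5 * 2?"}]}
    )
    print(math_response["messages"][-1].content)

    # This one is routed to the HTTP-backed weather server.
    weather_response = await agent.ainvoke(
        {"messages": [{"role": "user", "content": "what is the weather in NYC?"}]}
    )
    print(weather_response["messages"][-1].content)

asyncio.run(main())
```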
Cornell Notes
The core pattern is building MCP servers as tool providers, then connecting them to a single LangGraph/LangChain agent that decides when to call which tool. The tutorial constructs two FastMCP servers: a math server (add/multiply) using stdio transport for local command-line testing, and a weather server using Streamable HTTP so it runs as an API at a localhost URL. A multi-server MCP client is configured with both transports, then a LangGraph create_react_agent is created with a Groq chat model and the combined tool set. Invoking the agent with user prompts triggers the correct MCP tool calls and returns results back to the user.
- What are the three main components in an MCP-based chatbot setup, and how do they interact?
- Why does transport matter, and what’s the difference between stdio and Streamable HTTP in this example?
- How are the math tools implemented and exposed by the first MCP server?
- How does the weather MCP server differ from the math server in both functionality and transport?
- How does the multi-server client route requests to the correct MCP server?
- What does the agent invocation demonstrate about end-to-end MCP integration?
Review Questions
- How would you decide whether to use stdio transport or Streamable HTTP transport for a new MCP server?
- Describe the steps required to connect two MCP servers (one stdio, one HTTP) to a single LangGraph agent.
- What role do tool docstrings and type hints play in enabling the agent to choose the correct MCP tool?
Key Points
1. MCP servers package tools plus the context/prompts needed for tool use, while the host app relies on an MCP client to connect to those servers.
2. A single LangGraph/LangChain agent can integrate multiple MCP servers by using a multi-server MCP client and passing the combined tool set to the agent.
3. Use stdio transport for fast local development and testing because the server communicates via standard input/output from the command line.
4. Use Streamable HTTP transport when the MCP server should behave like a reachable API service with a URL and HTTP-based discovery.
5. FastMCP provides a straightforward Pythonic way to define MCP servers and tools, including typed tool signatures and docstrings.
6. When invoking the agent, tool selection happens based on the user prompt and the tool metadata, routing requests to the correct MCP server automatically.