
Build MCP Servers With Tools From Scratch With Langchain

Krish Naik · 5 min read

Based on Krish Naik's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

MCP servers package tools plus the context/prompts needed for tool use, while the host app relies on an MCP client to connect to those servers.

Briefing

MCP servers can be built from scratch as tool backends—then wired into a single LangGraph/LangChain-powered agent that decides when to call those tools. The practical takeaway is that the same agent can talk to multiple MCP servers, while each server can expose its tools over different transport layers: local stdio for quick testing and Streamable HTTP for running as a reachable API.

The setup starts with three components: one or more MCP servers (each bundling tools plus the context/prompts needed to use them), an MCP client (maintaining a one-to-one connection to each server inside the host app), and the host application (a chatbot-style system). In the example, the host chatbot uses an LLM to interpret user input and decide whether it needs external tool calls. When the user asks for something like “weather in New York or Bangalore,” the LLM can’t answer from its static training knowledge, so it triggers a tool call through MCP. The MCP server then runs the appropriate tool—such as a weather function—and returns the result to the agent.

From there, the tutorial shifts into implementation details using LangChain adapters and LangGraph. A project is initialized with uv, a virtual environment is created, and dependencies are installed via a requirements.txt file. The core libraries used are langchain-mcp-adapters (to integrate MCP into LangChain/LangGraph) and FastMCP (a Pythonic way to build MCP servers and clients). Two MCP servers are created.

The first server, math_server.py, exposes two tools—add and multiply—typed as integers. It runs with transport set to stdio, meaning the server communicates via standard input/output. That design makes local testing straightforward: the server runs from the command line, and a client can interact with it without needing an HTTP endpoint.

The second server, weather.py, exposes a weather tool that accepts a location string and returns a string. It runs with transport set to Streamable HTTP, turning the MCP server into an HTTP-accessible service. When started, it binds to a local URL (defaulting to localhost:8000 in the example), so the client can reach it over HTTP routes such as /mcp.

A multi-server MCP client is then assembled in client.py using langchain-mcp-adapters. The client is configured with two entries: one for math_server.py over stdio, and one for the weather server over HTTP (pointing at the localhost URL and the MCP endpoint). A LangGraph pre-built “create react agent” is created with a Groq chat model (loaded via an environment variable for the Groq API key) and the combined tool list from both servers.
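A sketch of what client.py might look like. The `MultiServerMCPClient` config shape and `get_tools()` call are taken from the langchain-mcp-adapters package (whose API has varied across versions), and the Groq model name is a placeholder, not necessarily the one used in the video:

```python
# client.py — hypothetical sketch of the multi-server MCP client
import asyncio
import os

SERVERS = {
    "math": {
        "command": "python",
        "args": ["math_server.py"],  # launched as a stdio subprocess
        "transport": "stdio",
    },
    "weather": {
        "url": "http://localhost:8000/mcp",  # weather.py must already be running
        "transport": "streamable_http",
    },
}

async def main() -> None:
    # Third-party imports are kept local so the config above can be
    # inspected without the optional dependencies installed.
    from langchain_groq import ChatGroq  # reads GROQ_API_KEY from the env
    from langchain_mcp_adapters.client import MultiServerMCPClient
    from langgraph.prebuilt import create_react_agent

    client = MultiServerMCPClient(SERVERS)
    tools = await client.get_tools()  # combined tool list from both servers
    agent = create_react_agent(ChatGroq(model="qwen-qwq-32b"), tools)

    reply = await agent.ainvoke(
        {"messages": [{"role": "user", "content": "what is 3 + 5 * 2?"}]}
    )
    print(reply["messages"][-1].content)

# Only runs when a Groq key is configured, since it needs live servers + LLM.
if __name__ == "__main__" and os.environ.get("GROQ_API_KEY"):
    asyncio.run(main())
```

The agent never needs to know which transport backs a tool: the client normalizes both servers' tools into one list, and routing happens on tool names and descriptions alone.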

Finally, the agent is invoked with user messages like “what is 3 + 5 * 2” and “what is the weather in NYC or California.” The agent routes the request to the correct MCP tool based on the message, returning computed math results from the stdio-backed server and the weather string from the HTTP-backed server. The end result is a working pattern for interview-ready MCP integration: multiple MCP servers, one agent, and transport choices tailored to local development versus API-style deployment.

Cornell Notes

The core pattern is building MCP servers as tool providers, then connecting them to a single LangGraph/LangChain agent that decides when to call which tool. The tutorial constructs two FastMCP servers: a math server (add/multiply) using stdio transport for local command-line testing, and a weather server using Streamable HTTP so it runs as an API at a localhost URL. A multi-server MCP client is configured with both transports, then a LangGraph “create react agent” is created with a Groq chat model and the combined tool set. Invoking the agent with user prompts triggers the correct MCP tool calls and returns results back to the user.

What are the three main components in an MCP-based chatbot setup, and how do they interact?

The architecture has MCP servers, an MCP client, and an app (chatbot/host application). MCP servers bundle tools plus the context/prompts needed for tool use. The MCP client maintains a one-to-one connection to each MCP server inside the host app. When a user sends input, the LLM decides whether it needs a tool call; if so, the agent routes the request through the MCP client to the correct server, which executes the tool and returns the response.

Why does transport matter, and what’s the difference between stdio and Streamable HTTP in this example?

Transport defines how the MCP client talks to the MCP server. With stdio transport, the server communicates via standard input/output, so it’s easy to run locally from a terminal and test tool calls without an HTTP endpoint. With Streamable HTTP transport, the server runs as an HTTP service with a URL (localhost:8000 by default in the example), so the client can reach it like an API and discover tools via MCP routes such as /mcp.

How are the math tools implemented and exposed by the first MCP server?

The math_server.py FastMCP server defines two integer-typed tools: add(a: int, b: int) returns a + b, and multiply(a: int, b: int) returns a * b. Each tool includes a docstring so the LLM/agent can understand what the tool does. The server is started with mcp.run(transport="stdio"), enabling local stdio-based tool execution.

How does the weather MCP server differ from the math server in both functionality and transport?

The weather server (weather.py) defines a tool that takes a location string and returns a string (the example uses a constant response like “always rainy in California” to demonstrate the flow). Unlike the math server, it runs with transport="streamable-http", so it becomes an HTTP-accessible service. When running, it appears at a local URL, and the client can query it to retrieve MCP tool information.

How does the multi-server client route requests to the correct MCP server?

client.py creates a multi-server MCP client configured with two server entries: one for math_server.py using stdio transport (running the Python command with the math server file), and one for the weather server using an HTTP URL (including the MCP endpoint path). The agent then uses the combined tool list returned by client.get_tools(). When invoked, the agent selects the appropriate tool based on the user’s message and the tool descriptions.

What does the agent invocation demonstrate about end-to-end MCP integration?

Invoking the agent with prompts like “what is 3 + 5 * 2” triggers the math tools and returns the computed result. Invoking with prompts like “what is the weather in NYC or California” triggers the weather tool and returns the server’s weather string. Running both prompts in the same client session demonstrates that one agent can call tools across multiple MCP servers with different transports.

Review Questions

  1. How would you decide whether to use stdio transport or Streamable HTTP transport for a new MCP server?
  2. Describe the steps required to connect two MCP servers (one stdio, one HTTP) to a single LangGraph agent.
  3. What role do tool docstrings and type hints play in enabling the agent to choose the correct MCP tool?

Key Points

  1. MCP servers package tools plus the context/prompts needed for tool use, while the host app relies on an MCP client to connect to those servers.

  2. A single LangGraph/LangChain agent can integrate multiple MCP servers by using a multi-server MCP client and passing the combined tool set to the agent.

  3. Use stdio transport for fast local development and testing because the server communicates via standard input/output from the command line.

  4. Use Streamable HTTP transport when the MCP server should behave like a reachable API service with a URL and HTTP-based discovery.

  5. FastMCP provides a straightforward Pythonic way to define MCP servers and tools, including typed tool signatures and docstrings.

  6. When invoking the agent, tool selection happens based on the user prompt and the tool metadata, routing requests to the correct MCP server automatically.

Highlights

  - Transport is the key operational difference: stdio keeps MCP servers local and terminal-driven, while Streamable HTTP turns them into URL-addressable services.
  - One agent can call tools from multiple MCP servers at once by building a multi-server MCP client and feeding its tools into a LangGraph “create react agent.”
  - Tool definitions (typed inputs/outputs plus docstrings) are what let the agent reliably choose between math operations and an external “weather” tool call.
