Tool Calling in LangChain | Generative AI using LangChain | Video 17 | CampusX

CampusX · 6 min read

Based on CampusX's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Tools are explicit Python functions wrapped with metadata so an LLM can use external capabilities without directly executing code.

Briefing

LangChain tool calling turns an LLM from a text-only assistant into a system that can use external functions safely: the model *suggests* which tool to use and with what structured inputs, while the application code performs the actual execution. The core workflow has four linked steps: tool creation (Python functions wrapped with metadata), tool binding (registering the tools with an LLM so it knows what tools exist and what their input schemas look like), tool calling (the LLM emits a structured "tool call" with arguments when a task requires it), and tool execution (the program runs the chosen tool and returns the result to the model). This separation matters because the LLM never directly runs code or hits APIs on its own; the developer retains control over what gets executed and with which parameters.

The transcript begins by revisiting why tool use is necessary for AI agents. LLMs are strong at reasoning and generating outputs, but they can’t directly perform actions like updating databases, posting on social platforms, or querying live services. Tools provide the missing “hands and feet”: explicit functions such as a DuckDuckGo search tool or a shell command tool. The missing piece is how to connect an LLM to those tools and how the LLM decides when to use them.

Tool binding is introduced as the registration step where each tool’s name, description, and input schema are provided to the LLM. Once bound, the model can format future tool calls correctly—sending inputs in the exact structure the tool expects. The transcript then demonstrates tool calling using a simple multiply tool. When asked a normal conversational question (“how are you?”), the LLM responds directly without tool use. When asked a computation (“what is 8 * 7?”), it generates a structured output indicating the tool name (e.g., “multiply”) and the argument values (A=8, B=7). A key clarification follows: tool calling does not execute the tool. The LLM only suggests the tool and arguments; LangChain (and the developer’s code) handles execution.
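A framework-free sketch of these mechanics can make the multiply example concrete. LangChain's `@tool` decorator and `bind_tools` method do the equivalent bookkeeping automatically; the dictionary structures below are illustrative, not the real API:

```python
# Illustrative sketch of tool creation, binding, and a tool call.
# In LangChain, @tool and llm.bind_tools([...]) handle this bookkeeping;
# the data structures here are hypothetical stand-ins.

def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

# "Binding" amounts to handing the model each tool's name, description,
# and input schema so it can format future tool calls correctly.
bound_tools = {
    "multiply": {
        "fn": multiply,
        "description": multiply.__doc__,
        "schema": {"a": "int", "b": "int"},
    }
}

# For "what is 8 * 7?" the LLM emits a structured tool call.
# This is a suggestion only -- nothing has been executed yet.
tool_call = {"name": "multiply", "args": {"a": 8, "b": 7}}
```

The key point the sketch encodes: `tool_call` is just data. Producing it does not run `multiply`; that happens only when application code looks up the bound function and invokes it.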

Tool execution is shown next. The application takes the tool call structure, extracts the arguments, and invokes the underlying Python function. The result is returned as a special “tool message,” which can then be appended to the conversation history. That conversation history—human message, AI message, and tool message—is sent back to the LLM so it can produce the final answer grounded in the tool’s output.
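Continuing the sketch, the execution-and-history step looks roughly like this. LangChain models the three message types as `HumanMessage`, `AIMessage`, and `ToolMessage` objects; plain dictionaries stand in for them here to keep the example self-contained:

```python
# Illustrative sketch: the application executes the suggested tool call
# and appends the result to the conversation as a "tool message".

def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

tools = {"multiply": multiply}
tool_call = {"name": "multiply", "args": {"a": 8, "b": 7}, "id": "call_1"}

# Tool execution: the program, not the LLM, runs the function.
result = tools[tool_call["name"]](**tool_call["args"])

# Human -> AI (tool call) -> tool message; sending this full history back
# to the LLM lets it produce the final, tool-grounded answer.
history = [
    {"role": "human", "content": "what is 8 * 7?"},
    {"role": "ai", "tool_calls": [tool_call]},
    {"role": "tool", "tool_call_id": tool_call["id"], "content": str(result)},
]
```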

To make the concepts concrete, the transcript builds a small real-world application: real-time currency conversion. Because LLMs lack live exchange rates, a tool is created to call an external Exchange Rate API to fetch the conversion factor between a base currency and a target currency. A second tool multiplies the user’s amount by that factor. The LLM is bound to both tools and asked questions like converting 10 USD to INR.
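The two tools might be sketched as follows. The endpoint comment and the fixed stub rate are assumptions made so the example runs offline; the real tool would issue an HTTP request to the Exchange Rate API:

```python
# Sketch of the two currency tools. A stubbed rate table stands in for the
# real Exchange Rate API call so the example is self-contained; the endpoint
# shape in the comment is an assumption, not verified.

def get_conversion_factor(base: str, target: str) -> float:
    """Fetch the conversion rate from base currency to target currency."""
    # Real version would call something like:
    #   https://v6.exchangerate-api.com/v6/<API_KEY>/pair/{base}/{target}
    stub_rates = {("USD", "INR"): 85.0}  # hypothetical fixed rate
    return stub_rates[(base, target)]

def convert(amount: float, conversion_rate: float) -> float:
    """Multiply the user's amount by the fetched conversion rate."""
    return amount * conversion_rate

# Converting 10 USD to INR takes both tools in sequence.
rate = get_conversion_factor("USD", "INR")
converted = convert(10, rate)
```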

A subtle failure mode appears when the LLM tries to answer two related questions in sequence: it may call the "get conversion factor" tool correctly, but then guess the conversion rate for the "convert" tool instead of using the freshly fetched value. The fix is LangChain's Injected Tool Arguments (the `InjectedToolArg` annotation), with which the developer marks the conversion-rate parameter as injected so the LLM does not fill it. Instead, the application extracts the conversion rate from the first tool's result and injects it into the second tool call. The final flow loops through the tool calls, executes them in order, appends tool messages to the history, and returns a grounded final conversion.
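The injection step can be sketched without the framework. In LangChain the parameter is annotated with `InjectedToolArg` so it is hidden from the schema the model sees; here a plain set of injected parameter names plays that role:

```python
# Framework-free sketch of the injected-argument fix. In LangChain,
# Annotated[float, InjectedToolArg] hides the parameter from the LLM's
# schema; the INJECTED set below is an illustrative stand-in.

def convert(amount: float, conversion_rate: float) -> float:
    """Multiply the amount by the conversion rate."""
    return amount * conversion_rate

INJECTED = {"conversion_rate"}  # excluded from the schema shown to the LLM

# Suppose the first tool's tool message already returned this rate:
fetched_rate = 85.0

# The LLM's second tool call therefore omits the injected parameter...
llm_tool_call = {"name": "convert", "args": {"amount": 10}}

# ...and the application fills it in from the earlier tool result
# before executing, so the real fetched rate is always used.
args = dict(llm_tool_call["args"])
for name in INJECTED:
    args[name] = fetched_rate
result = convert(**args)
```

This is what guarantees the second step cannot use a hallucinated rate: the only path for `conversion_rate` into `convert` runs through the application code.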

The transcript closes by distinguishing this setup from a fully autonomous agent. The currency app uses tool calling and execution, but the developer still orchestrates the execution order and injection logic. A true agent would autonomously plan and execute the multi-step process without manual intervention—something promised for the next video.

Cornell Notes

The transcript explains how LangChain enables “tool calling” so an LLM can use external functions without directly executing code. The workflow has four parts: create tools (Python functions with name/description/input schema), bind tools to an LLM (so it knows what tools exist and how to call them), tool calling (the LLM emits structured tool-call instructions with arguments when needed), and tool execution (application code runs the tool and returns a tool message). A key safety point: the LLM suggests tools and arguments; it does not run the tools itself. A real currency-conversion app demonstrates the full loop, including a fix for incorrect intermediate values using Injected Tool Arguments.

Why can’t an LLM alone perform tasks like database updates or live API queries, and how do tools change that?

LLMs can reason and generate text, but they can’t directly take actions in external systems (e.g., updating a database, posting on LinkedIn/Twitter, or fetching live weather). Tools provide explicit “functions” the system can run. When the LLM decides a task requires external data or computation, it generates a structured tool call; the application then executes the tool and feeds the result back to the LLM.

What does tool binding accomplish, and what information must each tool provide?

Tool binding registers tools with the LLM so it knows what tools are available, what each tool does, and how to format inputs. Each tool needs a name, a description (so the model understands the tool’s purpose), and an input schema (so the model sends arguments in the correct structure). After binding, the LLM can produce tool calls that match those schemas.

What is the difference between tool calling and tool execution?

Tool calling is when the LLM outputs a structured instruction: the tool name plus the arguments to use. Tool execution is when the application code actually runs the underlying Python function (or API call) using those arguments. The transcript emphasizes that the LLM does not execute the tool; it only suggests which tool to use and with what inputs.

How does the conversation history work after tool execution?

After execution, the tool result is wrapped as a special “tool message.” The system appends the human message, the AI message (which contained the tool call), and the tool message into a messages list. That full context is then sent back to the LLM so it can generate a final, tool-grounded answer (e.g., computing “product of 3 and 10” after the multiply tool runs).

Why did the currency-conversion logic fail initially, and how do Injected Tool Arguments fix it?

The LLM sometimes called the first tool (get conversion factor) correctly, but then guessed the conversion rate for the second tool (convert) instead of using the freshly returned value. Injected Tool Arguments mark a parameter (the conversion-rate input) so the LLM must not fill it during tool calling. The developer extracts the conversion rate from the first tool’s tool message and injects it into the second tool call, ensuring the second step uses the real fetched rate.

Review Questions

  1. In the four-step workflow (tool creation, tool binding, tool calling, tool execution), which step actually runs the code, and which step only produces structured instructions?
  2. When using multiple tools in sequence, what problem can occur if the LLM guesses intermediate values, and what LangChain mechanism prevents that?
  3. How does the system ensure the LLM sends arguments in the correct format for a tool call?

Key Points

  1. Tools are explicit Python functions wrapped with metadata so an LLM can use external capabilities without directly executing code.

  2. Tool binding registers each tool’s name, description, and input schema with the LLM so it can generate correctly structured tool calls.

  3. Tool calling produces a structured “tool call” (tool name + arguments) only when the user’s request requires it; normal questions may not trigger tool use.

  4. Tool execution is performed by application code (e.g., via an invoke call) using the LLM-suggested arguments, and the result is returned as a tool message.

  5. Appending human, AI, and tool messages to a shared conversation history lets the LLM generate a final answer grounded in tool outputs.

  6. Injected Tool Arguments prevent the LLM from filling critical intermediate parameters (like conversion rates) so the developer can inject values extracted from earlier tool results.

  7. A multi-tool currency converter demonstrates the full loop and highlights the difference between tool-using pipelines and fully autonomous agents.

Highlights

  • Tool calling is not execution: the LLM emits a structured tool call, while LangChain and the developer run the tool and return a tool message.
  • Conversation history becomes the bridge between tool results and final answers: human → AI (tool call) → tool message → final AI response.
  • Currency conversion fails when the LLM guesses intermediate values; Injected Tool Arguments force the system to inject the real conversion rate from the first tool’s output.
  • The transcript’s currency app uses two tools—one to fetch a conversion factor via an Exchange Rate API and another to multiply by that factor—then orchestrates them in sequence.

Topics

  • Tool Calling
  • Tool Binding
  • Tool Execution
  • Injected Tool Arguments
  • Currency Conversion
