
Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)

Sam Witteveen · 5 min read

Based on Sam Witteveen's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Custom tools in LangChain are callable functions wrapped with a name, description, and input hint so a ReAct-style agent can choose when to use them.

Briefing

Custom tools are the key lever for making LangChain conversational agents more useful—and the biggest practical lesson is that tool use often requires careful prompt and tool-description tuning, especially with chat-based models like gpt-3.5-turbo. The walkthrough starts with the simplest possible tools: a “meaning of life” function that returns a fixed string (42.17658) and a “random number” function that returns a random value. Both are wrapped as LangChain tools so a ReAct-style agent can decide when to call them, based on the tool name, description, and expected input.
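As a rough sketch of what those two toy tools look like in code (the function bodies and descriptions here are paraphrased from the video, not copied verbatim):

```python
import random

from langchain.agents import Tool

def meaning_of_life(input: str = "") -> str:
    # Always returns the same canned answer, regardless of input
    return "The meaning of life is 42 if rounded but is actually 42.17658"

def random_num(input: str = "") -> str:
    # Returns a random integer as a string so the agent can read it
    return str(random.randint(0, 5))

tools = [
    Tool(
        name="Meaning of Life",
        func=meaning_of_life,
        description="Useful for when you need to answer questions "
                    "about the meaning of life. Input should be MOL",
    ),
    Tool(
        name="Random number",
        func=random_num,
        description="Useful for when you need to get a random number. "
                    "Input should be 'random'",
    ),
]
```

The description strings do double duty: the agent reads them to decide which tool matches the user's request, which is why vague descriptions lead directly to missed tool calls.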

When the agent is asked “What is the time in London?”, it correctly selects an external search tool (DuckDuckGo in the earlier context) and returns search results. Asking for “a random number” also works as expected: the agent chooses the random-number tool, receives the output (e.g., 4), and then responds in a chatty style that includes the tool result in its message history. But the “meaning of life” tool reveals a core failure mode: even when a tool exists and the question matches it, the chat model may still answer directly instead of calling the tool. In this run, the agent produced a philosophical explanation rather than returning the tool’s exact value.

The fix is not in the tool logic but in the instructions. The transcript treats prompt engineering as a requirement: the system prompt is overwritten to explicitly tell the assistant it "doesn't know" about meaning-of-life answers and must use the meaning-of-life tool for those topics. After that adjustment, the agent reliably calls the tool, uses the action input ("MOL"), and returns the tool's full string. That also surfaces another nuance: the tool output includes more than the user asked for (it returns both the rounded and unrounded value), and the model may not automatically trim it to match the user's wording. The takeaway is that tool descriptions and prompts must be precise about both when to call tools and what formatting the tool should return.

Next comes a more practical custom tool: a webpage "page getter" that fetches HTML, strips tags using BeautifulSoup, and truncates output to avoid context limits (the transcript notes an input limit of roughly 4,000 tokens). The tool also acknowledges messy real-world text extraction—like excessive newlines—and suggests filtering them. Implemented either as a simple function tool or as a LangChain BaseTool subclass with a run method, it returns cleaned text to the agent.
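A minimal sketch of that tool as a BaseTool subclass (assuming requests and beautifulsoup4 are installed; the names and the 4,000-character cutoff follow the transcript, but the exact code is an approximation):

```python
import requests
from bs4 import BeautifulSoup
from langchain.tools import BaseTool

def stripped_webpage(url: str) -> str:
    response = requests.get(url, timeout=10)
    # Strip HTML tags, then drop the blank lines the markup leaves behind
    text = BeautifulSoup(response.text, "html.parser").get_text()
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    cleaned = "\n".join(lines)
    # Truncate so the observation fits the model's context window
    return cleaned[:4000]

class WebPageTool(BaseTool):
    name: str = "Get Webpage"
    description: str = "Useful for when you need to get the content from a specific webpage"

    def _run(self, url: str) -> str:
        return stripped_webpage(url)

    async def _arun(self, url: str) -> str:
        raise NotImplementedError("This tool does not support async")
```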

With the updated prompt (now also instructing the assistant to use the webpage tool when asked about web content), the agent can answer questions like "Is there an article about Clubhouse on TechCrunch today?" and "What are the titles of the top stories on CBSnews.com?" In both cases, the agent fetches the relevant page (sometimes via an intermediate search step), extracts enough text to identify the requested items, and returns titles and dates. The overall message is clear: custom tools make agents dynamic, but reliable tool use depends on aligning prompts, tool descriptions, and output formatting with the model's chat behavior and context constraints.

Cornell Notes

LangChain custom tools are implemented as callable functions that a ReAct-style agent selects based on tool names, descriptions, and prompts. Simple tools (like a fixed “meaning of life” and a “random number” generator) demonstrate that tool calling can work reliably for some requests but fail for others—especially when gpt-3.5-turbo answers directly instead of using the tool. The transcript shows that overwriting the system prompt to explicitly say the assistant lacks knowledge about certain topics and must use specific tools can force correct tool usage. For real utility, a custom webpage tool strips HTML with BeautifulSoup and truncates content to fit context limits, enabling the agent to answer questions about current articles and top stories on sites like TechCrunch and CBS News.

How are custom tools structured in LangChain for a tool-using agent?

A custom tool is essentially a function the language model can call. In the examples, each tool is wrapped by instantiating a LangChain Tool with (1) a name, (2) a description that tells the model when it's useful, and (3) an input schema hint (e.g., "input should be MOL" for the meaning-of-life tool). Those tools are then passed into an agent configured with a ReAct-style decision process (the transcript mentions the ReAct framework for choosing which tool to use).
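Wiring those tools into an agent might look like the following — a sketch assuming the initialize_agent API and the chat-conversational ReAct agent type from LangChain at the time of the video, plus the tools list from the earlier sketch:

```python
from langchain.agents import initialize_agent
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferWindowMemory

llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
memory = ConversationBufferWindowMemory(
    memory_key="chat_history", k=5, return_messages=True
)

# "chat-conversational-react-description" selects tools ReAct-style,
# based on their names and descriptions
agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,  # the Tool list built earlier
    llm=llm,
    memory=memory,
    verbose=True,
)

agent.run("Can you give me a random number?")
```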

Why did the “meaning of life” tool fail at first, even though the question matched it?

The chat model sometimes answered directly instead of calling the tool. After asking “What is the meaning of life?”, it returned a philosophical explanation rather than the tool’s exact output (42.17658). The transcript attributes this to chatty behavior and the model’s tendency to “think it knows the answer,” skipping tool calls even when a tool exists.

What change made the agent reliably call the “meaning of life” tool?

Prompt tuning. The system prompt was overwritten to explicitly instruct that the assistant doesn’t know about meaning-of-life answers and should use the meaning-of-life tool for those topics. After that, the agent produced an action input (“MOL”), received the tool’s observation string, and returned it.
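In code, the overwrite might look like this (a sketch: the attribute path into the agent's prompt reflects LangChain internals at the time and may differ across versions, and the prompt text is paraphrased from the transcript):

```python
fixed_prompt = """Assistant is a large language model trained by OpenAI.

Unfortunately, Assistant is terrible at answering questions about the
meaning of life. When asked about the meaning of life, Assistant doesn't
know the answer and should always use the Meaning of Life tool."""

# Replace the system message template of the conversational agent's prompt
agent.agent.llm_chain.prompt.messages[0].prompt.template = fixed_prompt
```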

What practical constraints shape custom tool design for web content?

Context limits and messy extraction. The transcript notes an input limit of roughly 4,000 tokens; fetching full HTML can exceed it. The webpage tool therefore strips HTML tags with BeautifulSoup, returns text with many newlines (and suggests filtering them), and truncates output to the first 4,000 characters when content is too large.
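A quick usage check of the WebPageTool sketched earlier (the URL is only an example):

```python
page_getter = WebPageTool()
text = page_getter.run("https://techcrunch.com/")
print(len(text))   # at most 4000 characters after truncation
print(text[:200])  # cleaned text with blank lines filtered out
```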

How does the webpage tool enable answers about current articles and headlines?

The agent uses the webpage getter to fetch and clean page text, then extracts relevant information from that text. Examples include checking TechCrunch for an article about Clubhouse “today” and listing top story titles on CBSnews.com. The agent sometimes attempts a search first, then falls back to fetching the correct page when the initial page attempt fails.
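With the webpage tool registered and the prompt updated, queries like these exercise that fetch-and-extract path (illustrative calls, not a replay of the transcript):

```python
agent.run("Is there an article about Clubhouse on TechCrunch today?")
agent.run("What are the titles of the top stories on CBSnews.com?")
```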

Review Questions

  1. What specific prompt instruction was added to force the agent to call the meaning-of-life tool, and what was the observed effect?
  2. Describe how the webpage getter tool handles both HTML parsing and context-length constraints.
  3. Give one example of a tool-calling success and one example of a tool-calling failure from the transcript, and explain the difference in outcome.

Key Points

  1. Custom tools in LangChain are callable functions wrapped with a name, description, and input hint so a ReAct-style agent can choose when to use them.
  2. Chat-based models may answer directly instead of calling an available tool, even when the user's question matches the tool's purpose.
  3. Overwriting the system prompt to explicitly state what the assistant does not know, and which tool must be used for certain topics, can significantly improve tool reliability.
  4. Tool output formatting matters: if a tool returns extra details (like both rounded and unrounded values), the model may not automatically trim to the user's exact request.
  5. For web-based tools, stripping HTML and truncating content are essential to avoid context limits and to keep extracted text usable.
  6. Implementing tools as BaseTool subclasses (with run/async run) provides a structured way to reuse logic like webpage fetching and cleaning.
  7. Dynamic agents become practical when tools fetch targeted external information (web pages, docs, or APIs) and return cleaned, bounded text for downstream reasoning.

Highlights

The “meaning of life” tool initially failed because gpt-3.5-turbo answered directly instead of using the tool, despite the tool being available.
A targeted system-prompt overwrite—telling the assistant it lacks knowledge and must use the tool—made the agent call the correct tool and return the tool’s exact value.
A webpage getter tool using BeautifulSoup plus truncation (to ~4,000 characters) turns raw HTML into agent-consumable text for questions about live sites like TechCrunch and CBS News.
