Build Hour: Built-In Tools
Based on OpenAI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Built-in tools let large language models search the web, query private files, and call external services—without developers writing the usual glue code to run those actions. Instead of manually executing function calls and feeding results back into the model, the tools run on OpenAI infrastructure and the outcomes are automatically inserted into the conversation so the model can produce a final answer in the same flow. That shift matters because it removes a major source of engineering overhead (and state-management complexity) when building agentic apps that need live information or access to business systems.
The session breaks “built-in tools” into two parts. “Tools” are required when LLMs must interact with data or take actions—LLMs can generate text, but they can’t natively search internal databases, retrieve current events, or operate on third-party platforms. “Built-in” means developers don’t have to code the tool plumbing themselves; they can enable a set of hosted capabilities and grant the model access. The team contrasts this with traditional function calling: function calling requires three explicit steps—declare functions, execute them in your code, then return results to the model. Built-in tools collapse that middle execution step by running the tool automatically and appending results to the context.
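To make the contrast concrete, here is a minimal sketch using the OpenAI Python SDK and the Responses API. Treat it as illustrative rather than the session's own code: the exact tool-type names (e.g., web_search_preview), model name, and field names can vary by API version.

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# --- Traditional function calling: declare, execute in your code, return results ---
def get_weather(city: str) -> dict:
    """Your own implementation, e.g. a call to a weather API."""
    return {"city": city, "temp_c": 21}

function_tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

first = client.responses.create(
    model="gpt-4.1",
    input="What's the weather in Paris?",
    tools=function_tools,
)

# Step 2: execute the call in your own code; step 3: return the result to the model.
outputs = []
for item in first.output:
    if item.type == "function_call":
        result = get_weather(**json.loads(item.arguments))
        outputs.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": json.dumps(result),
        })

final = client.responses.create(
    model="gpt-4.1",
    previous_response_id=first.id,
    tools=function_tools,
    input=outputs,
)
print(final.output_text)

# --- Built-in tool: the execution step runs on OpenAI infrastructure ---
hosted = client.responses.create(
    model="gpt-4.1",
    input="What happened in AI news this week? Search once and summarize.",
    tools=[{"type": "web_search_preview"}],
)
print(hosted.output_text)
```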
As of the time of the demo, six tools are available; four of them are covered here. Web search addresses the models’ knowledge cutoff (described as May 2024 for current models), giving real-time access to public information via OpenAI’s own services and index. File search provides retrieval over an uploaded knowledge base using the RAG pattern (retrieval-augmented generation), handling preprocessing like chunking and embedding as well as retrieval and reranking so developers don’t need to build a full retrieval pipeline. The MCP tool (Model Context Protocol) is positioned as a gateway to hundreds of remote tools: by connecting to an MCP server, models can access provider-specific capabilities exposed by that server. Code interpreter executes Python on OpenAI infrastructure for deterministic computation, data analysis, and chart generation, and it can also work with files uploaded into the tool environment.
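A hedged sketch of how these hosted tools are enabled on a single-request basis, again assuming the Responses API; the vector store ID, store URL, and model name below are placeholders.

```python
from openai import OpenAI

client = OpenAI()

# File search: hosted RAG over a previously created vector store (ID is a placeholder).
file_answer = client.responses.create(
    model="gpt-4.1",
    input="Summarize the termination clauses in the uploaded contracts.",
    tools=[{"type": "file_search", "vector_store_ids": ["vs_abc123"]}],
)

# Code interpreter: deterministic Python execution on OpenAI infrastructure.
calc_answer = client.responses.create(
    model="gpt-4.1",
    input="Compute the compound annual growth rate for revenues of 120, 150, and 210.",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
)

# MCP: a gateway to a remote tool server (label and URL are placeholders).
mcp_answer = client.responses.create(
    model="gpt-4.1",
    input="What are the three best-selling products in the store?",
    tools=[{
        "type": "mcp",
        "server_label": "shopify",
        "server_url": "https://example-store.myshopify.com/api/mcp",
        "require_approval": "never",
    }],
)

for resp in (file_answer, calc_answer, mcp_answer):
    print(resp.output_text)
```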
A live walkthrough uses the OpenAI Playground as an experimentation lab. File search is demonstrated by uploading two PDFs and asking a question; the model returns bullet-point answers with citations to the specific files it used. Web search is demonstrated with a current-events query (including a note that web search can “spiral” when it keeps searching, so prompts should instruct it to search once). MCP is demonstrated with Shopify: the model fetches the tools available on a specific store, asks for approval when configured to do so, and then returns product results without developers manually wiring Shopify APIs. Code interpreter is shown both alone and in combination with web search, including generating a comparison chart from historical weather data.
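The approval step from the Shopify demo can be sketched roughly as follows, assuming the Responses API’s MCP approval items (mcp_approval_request / mcp_approval_response); these field names come from current documentation and may differ from what the session showed, and the store URL is a placeholder.

```python
from openai import OpenAI

client = OpenAI()

shopify_mcp = {
    "type": "mcp",
    "server_label": "shopify",
    "server_url": "https://example-store.myshopify.com/api/mcp",  # placeholder URL
    "require_approval": "always",  # every MCP call pauses for explicit approval
}

# The prompt tells the model to look things up once, echoing the session's advice
# about web search "spiraling" when it keeps issuing new searches.
first = client.responses.create(
    model="gpt-4.1",
    input="Show me your three best-selling snowboards. Look this up once and summarize.",
    tools=[shopify_mcp],
)

# Approve any pending MCP calls, then let the model finish in the same conversation.
approvals = [
    {"type": "mcp_approval_response", "approval_request_id": item.id, "approve": True}
    for item in first.output
    if item.type == "mcp_approval_request"
]

final = first
if approvals:
    final = client.responses.create(
        model="gpt-4.1",
        previous_response_id=first.id,
        tools=[shopify_mcp],
        input=approvals,
    )
print(final.output_text)
```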
The second half turns the demos into an application pattern. A sample dashboard app is built by gradually adding tools—Stripe via an MCP server, web search, and code interpreter—so users can ask questions like “What’s my Stripe balance?” and get both answers and visual components. The app uses a custom “generate component” function with structured outputs to render cards, tables, and charts, while hosted tools handle data retrieval and computation. The practical takeaway: built-in tools reduce time-to-prototype, offload retrieval and execution complexity, and support multi-turn tool use where the model can chain actions to solve multi-step tasks.
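One way to reproduce the dashboard pattern is a custom function tool with a strict JSON schema: hosted tools fetch the data, and the model calls the function to describe the UI element to render. The generate_component name and its fields are hypothetical stand-ins for the app’s own component layer, and the Stripe MCP endpoint and credential are illustrative placeholders.

```python
import json
from openai import OpenAI

client = OpenAI()

# Hypothetical "generate_component" function tool: structured outputs describe the UI.
generate_component = {
    "type": "function",
    "name": "generate_component",
    "description": "Render a dashboard component (card, table, or chart) from tool results",
    "strict": True,
    "parameters": {
        "type": "object",
        "properties": {
            "kind": {"type": "string", "enum": ["card", "table", "chart"]},
            "title": {"type": "string"},
            "data": {"type": "string", "description": "JSON-encoded rows or series"},
        },
        "required": ["kind", "title", "data"],
        "additionalProperties": False,
    },
}

resp = client.responses.create(
    model="gpt-4.1",
    input="What's my Stripe balance? Show it as a card.",
    tools=[
        generate_component,
        {
            "type": "mcp",
            "server_label": "stripe",
            "server_url": "https://mcp.stripe.com",               # illustrative endpoint
            "headers": {"Authorization": "Bearer sk_test_..."},   # placeholder credential
            "require_approval": "never",
        },
    ],
)

# The app maps function_call items onto actual UI components (React, HTML, etc.).
for item in resp.output:
    if item.type == "function_call" and item.name == "generate_component":
        args = json.loads(item.arguments)
        print(args["kind"], args["title"])
```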
A customer spotlight from Hebbia ties the approach to financial and legal workflows. Hebbia uses web search to overcome knowledge cutoffs and to support an explore/exploit research strategy, then deepens analysis using sources like SEC filings and private datasets. In its products, web search is used at scale (e.g., adding columns across thousands of companies) and as part of long-running research plans that can span up to an hour, with MCP used selectively when off-the-shelf servers don’t meet enterprise indexing needs.
Cornell Notes
Built-in tools give LLM-powered apps access to live web information, private documents, and third-party systems without developers manually executing tool calls. Instead of the three-step function-calling loop (declare → execute in code → return results), hosted tools run automatically and their outputs are added to the model’s context. The available tools include web search, file search (RAG with automatic chunking/embedding and retrieval optimization), MCP for connecting to remote tool servers, and code interpreter for deterministic Python execution and chart generation. In practice, the Playground helps prototype quickly, and a dashboard app shows how tool outputs can be turned into UI components via structured outputs. This matters because it cuts engineering overhead and enables multi-step, agentic workflows that combine multiple tools in sequence.
How do built-in tools differ from traditional function calling in an app’s control flow?
Why is web search a key built-in tool given LLM knowledge cutoffs?
What does file search automate for RAG pipelines?
What does MCP add beyond built-in tools like web search and file search?
Why combine web search with code interpreter?
How does the dashboard app turn tool results into visual components?
Review Questions
- What are the three steps required for traditional function calling, and which step is handled automatically by built-in tools?
- In what situations should a prompt instruct the model to “search once,” and why does that matter when chaining tools?
- How do file search and code interpreter each reduce different types of engineering work (retrieval vs computation) in a RAG-and-analytics workflow?
Key Points
1. Built-in tools run on OpenAI infrastructure, so developers typically don’t need to execute tool calls in their own code or manually feed results back into the model’s context.
2. Web search addresses LLM knowledge cutoffs by enabling real-time access to public information through OpenAI’s indexing services.
3. File search provides hosted RAG by automating chunking, embedding, retrieval, and reranking over uploaded files, reducing the need to build a custom retrieval pipeline.
4. MCP connects models to remote tool servers, letting apps access provider-specific capabilities (e.g., Shopify store actions) through a standardized protocol.
5. Code interpreter executes Python deterministically for data analysis and chart generation, and it can operate on files available in its tool environment.
6. Tool chaining is supported: combining web search with code interpreter enables multi-step workflows like fetching data and then computing comparisons and visualizations (see the sketch after this list).
7. When building UIs, structured outputs can bridge tool results to dashboard components (cards, tables, charts) via a custom “generate component” function.
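A rough sketch of the chaining idea in point 6, assuming both tools can be enabled on a single Responses API request; tool type names and the model name may differ by API version.

```python
from openai import OpenAI

client = OpenAI()

# Web search fetches the raw figures; code interpreter computes the comparison
# and renders the chart, all within one request.
resp = client.responses.create(
    model="gpt-4.1",
    input=(
        "Search once for last year's average monthly temperatures in Paris and Rome, "
        "then compute the month-by-month difference and plot it as a bar chart."
    ),
    tools=[
        {"type": "web_search_preview"},
        {"type": "code_interpreter", "container": {"type": "auto"}},
    ],
)

print(resp.output_text)
# Any generated chart appears as a file/annotation among the output items,
# which the app can download and display.
```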