Build Hour: Built-In Tools
Based on OpenAI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Built-in tools let large language models search the web, query private files, and call external services—without developers writing the usual glue code to run those actions. Instead of manually executing function calls and feeding results back into the model, the tools run on OpenAI infrastructure and the outcomes are automatically inserted into the conversation so the model can produce a final answer in the same flow. That shift matters because it removes a major source of engineering overhead (and state-management complexity) when building agentic apps that need live information or access to business systems.
The session breaks “built-in tools” into two parts. “Tools” are required when LLMs must interact with data or take actions—LLMs can generate text, but they can’t natively search internal databases, retrieve current events, or operate on third-party platforms. “Built-in” means developers don’t have to code the tool plumbing themselves; they can enable a set of hosted capabilities and grant the model access. The team contrasts this with traditional function calling: function calling requires three explicit steps—declare functions, execute them in your code, then return results to the model. Built-in tools collapse that middle execution step by running the tool automatically and appending results to the context.
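To make the contrast concrete, here is a minimal sketch using the OpenAI Python SDK and the Responses API. Treat it as illustrative rather than the session's own code: the exact tool-type names (e.g., web_search_preview), model name, and field names can vary by API version.

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# --- Traditional function calling: declare, execute in your code, return results ---
def get_weather(city: str) -> dict:
    """Your own implementation, e.g. a call to a weather API."""
    return {"city": city, "temp_c": 21}

function_tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

first = client.responses.create(
    model="gpt-4.1",
    input="What's the weather in Paris?",
    tools=function_tools,
)

# Step 2: execute the call in your own code; step 3: return the result to the model.
outputs = []
for item in first.output:
    if item.type == "function_call":
        result = get_weather(**json.loads(item.arguments))
        outputs.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": json.dumps(result),
        })

final = client.responses.create(
    model="gpt-4.1",
    previous_response_id=first.id,
    tools=function_tools,
    input=outputs,
)
print(final.output_text)

# --- Built-in tool: the execution step runs on OpenAI infrastructure ---
hosted = client.responses.create(
    model="gpt-4.1",
    input="What happened in AI news this week? Search once and summarize.",
    tools=[{"type": "web_search_preview"}],
)
print(hosted.output_text)
```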
As of the time of the demo, six tools are available; four of them are covered here. Web search addresses the models’ knowledge cutoff (described as May 2024 for current models), giving real-time access to public information via OpenAI’s own services and index. File search provides retrieval over an uploaded knowledge base using the RAG pattern (retrieval-augmented generation), handling preprocessing like chunking and embedding as well as retrieval and reranking so developers don’t need to build a full retrieval pipeline. The MCP tool (Model Context Protocol) is positioned as a gateway to hundreds of remote tools: by connecting to an MCP server, models can access provider-specific capabilities exposed by that server. Code interpreter executes Python on OpenAI infrastructure for deterministic computation, data analysis, and chart generation, and it can also work with files uploaded into the tool environment.
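A hedged sketch of how these hosted tools are enabled on a single-request basis, again assuming the Responses API; the vector store ID, store URL, and model name below are placeholders.

```python
from openai import OpenAI

client = OpenAI()

# File search: hosted RAG over a previously created vector store (ID is a placeholder).
file_answer = client.responses.create(
    model="gpt-4.1",
    input="Summarize the termination clauses in the uploaded contracts.",
    tools=[{"type": "file_search", "vector_store_ids": ["vs_abc123"]}],
)

# Code interpreter: deterministic Python execution on OpenAI infrastructure.
calc_answer = client.responses.create(
    model="gpt-4.1",
    input="Compute the compound annual growth rate for revenues of 120, 150, and 210.",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
)

# MCP: a gateway to a remote tool server (label and URL are placeholders).
mcp_answer = client.responses.create(
    model="gpt-4.1",
    input="What are the three best-selling products in the store?",
    tools=[{
        "type": "mcp",
        "server_label": "shopify",
        "server_url": "https://example-store.myshopify.com/api/mcp",
        "require_approval": "never",
    }],
)

for resp in (file_answer, calc_answer, mcp_answer):
    print(resp.output_text)
```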
A live walkthrough uses the OpenAI Playground as an experimentation lab. File search is demonstrated by uploading two PDFs and asking a question; the model returns bullet-point answers with citations to the specific files it used. Web search is demonstrated with a current-events query (including a note that web search can “spiral” when it keeps searching, so prompts should instruct it to search once). MCP is demonstrated with Shopify: the model fetches the tools available on a specific store, asks for approval when configured to do so, and then returns product results without developers manually wiring Shopify APIs. Code interpreter is shown both alone and in combination with web search, including generating a comparison chart from historical weather data.
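The approval step from the Shopify demo can be sketched roughly as follows, assuming the Responses API’s MCP approval items (mcp_approval_request / mcp_approval_response); these field names come from current documentation and may differ from what the session showed, and the store URL is a placeholder.

```python
from openai import OpenAI

client = OpenAI()

shopify_mcp = {
    "type": "mcp",
    "server_label": "shopify",
    "server_url": "https://example-store.myshopify.com/api/mcp",  # placeholder URL
    "require_approval": "always",  # every MCP call pauses for explicit approval
}

# The prompt tells the model to look things up once, echoing the session's advice
# about web search "spiraling" when it keeps issuing new searches.
first = client.responses.create(
    model="gpt-4.1",
    input="Show me your three best-selling snowboards. Look this up once and summarize.",
    tools=[shopify_mcp],
)

# Approve any pending MCP calls, then let the model finish in the same conversation.
approvals = [
    {"type": "mcp_approval_response", "approval_request_id": item.id, "approve": True}
    for item in first.output
    if item.type == "mcp_approval_request"
]

final = first
if approvals:
    final = client.responses.create(
        model="gpt-4.1",
        previous_response_id=first.id,
        tools=[shopify_mcp],
        input=approvals,
    )
print(final.output_text)
```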
The second half turns the demos into an application pattern. A sample dashboard app is built by gradually adding tools—Stripe via an MCP server, web search, and code interpreter—so users can ask questions like “What’s my Stripe balance?” and get both answers and visual components. The app uses a custom “generate component” function with structured outputs to render cards, tables, and charts, while hosted tools handle data retrieval and computation. The practical takeaway: built-in tools reduce time-to-prototype, offload retrieval and execution complexity, and support multi-turn tool use where the model can chain actions to solve multi-step tasks.
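One way to reproduce the dashboard pattern is a custom function tool with a strict JSON schema: hosted tools fetch the data, and the model calls the function to describe the UI element to render. The generate_component name and its fields are hypothetical stand-ins for the app’s own component layer, and the Stripe MCP endpoint and credential are illustrative placeholders.

```python
import json
from openai import OpenAI

client = OpenAI()

# Hypothetical "generate_component" function tool: structured outputs describe the UI.
generate_component = {
    "type": "function",
    "name": "generate_component",
    "description": "Render a dashboard component (card, table, or chart) from tool results",
    "strict": True,
    "parameters": {
        "type": "object",
        "properties": {
            "kind": {"type": "string", "enum": ["card", "table", "chart"]},
            "title": {"type": "string"},
            "data": {"type": "string", "description": "JSON-encoded rows or series"},
        },
        "required": ["kind", "title", "data"],
        "additionalProperties": False,
    },
}

resp = client.responses.create(
    model="gpt-4.1",
    input="What's my Stripe balance? Show it as a card.",
    tools=[
        generate_component,
        {
            "type": "mcp",
            "server_label": "stripe",
            "server_url": "https://mcp.stripe.com",               # illustrative endpoint
            "headers": {"Authorization": "Bearer sk_test_..."},   # placeholder credential
            "require_approval": "never",
        },
    ],
)

# The app maps function_call items onto actual UI components (React, HTML, etc.).
for item in resp.output:
    if item.type == "function_call" and item.name == "generate_component":
        args = json.loads(item.arguments)
        print(args["kind"], args["title"])
```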
A customer spotlight from Hebbia ties the approach to financial and legal workflows. Hebbia uses web search to overcome knowledge cutoffs and to support an explore/exploit research strategy, then deepens analysis using sources like SEC filings and private datasets. In its products, web search is used at scale (e.g., adding columns across thousands of companies) and as part of long-running research plans that can span up to an hour, with MCP used selectively when off-the-shelf servers don’t meet enterprise indexing needs.
Cornell Notes
Built-in tools give LLM-powered apps access to live web information, private documents, and third-party systems without developers manually executing tool calls. Instead of the three-step function-calling loop (declare → execute in code → return results), hosted tools run automatically and their outputs are added to the model’s context. The available tools include web search, file search (RAG with automatic chunking/embedding and retrieval optimization), MCP for connecting to remote tool servers, and code interpreter for deterministic Python execution and chart generation. In practice, the Playground helps prototype quickly, and a dashboard app shows how tool outputs can be turned into UI components via structured outputs. This matters because it cuts engineering overhead and enables multi-step, agentic workflows that combine multiple tools in sequence.
How do built-in tools differ from traditional function calling in an app’s control flow?
Why is web search a key built-in tool given LLM knowledge cutoffs?
What does file search automate for RAG pipelines?
What does MCP add beyond built-in tools like web search and file search?
Why combine web search with code interpreter?
How does the dashboard app turn tool results into visual components?
Review Questions
- What are the three steps required for traditional function calling, and which step is handled automatically by built-in tools?
- In what situations should a prompt instruct the model to “search once,” and why does that matter when chaining tools?
- How do file search and code interpreter each reduce different types of engineering work (retrieval vs computation) in a RAG-and-analytics workflow?
Key Points
1. Built-in tools run on OpenAI infrastructure, so developers typically don’t need to execute tool calls in their own code or manually feed results back into the model’s context.
2. Web search addresses LLM knowledge cutoffs by enabling real-time access to public information through OpenAI’s indexing services.
3. File search provides hosted RAG by automating chunking, embedding, retrieval, and reranking over uploaded files, reducing the need to build a custom retrieval pipeline.
4. MCP connects models to remote tool servers, letting apps access provider-specific capabilities (e.g., Shopify store actions) through a standardized protocol.
5. Code interpreter executes Python deterministically for data analysis and chart generation, and it can operate on files available in its tool environment.
6. Tool chaining is supported: combining web search with code interpreter enables multi-step workflows like fetching data and then computing comparisons and visualizations (see the sketch after this list).
7. When building UIs, structured outputs can bridge tool results to dashboard components (cards, tables, charts) via a custom “generate component” function.
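A rough sketch of the chaining idea in point 6, assuming both tools can be enabled on a single Responses API request; tool type names and the model name may differ by API version.

```python
from openai import OpenAI

client = OpenAI()

# Web search fetches the raw figures; code interpreter computes the comparison
# and renders the chart, all within one request.
resp = client.responses.create(
    model="gpt-4.1",
    input=(
        "Search once for last year's average monthly temperatures in Paris and Rome, "
        "then compute the month-by-month difference and plot it as a bar chart."
    ),
    tools=[
        {"type": "web_search_preview"},
        {"type": "code_interpreter", "container": {"type": "auto"}},
    ],
)

print(resp.output_text)
# Any generated chart appears as a file/annotation among the output items,
# which the app can download and display.
```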