OpenAI Parallel Function Calling with Assistants API - WOW!!
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their channel.
Parallel function calling lets one assistant trigger multiple tool calls at the same time, then merge results into a single response.
Briefing
Parallel function calling with the Assistants API lets one assistant handle a single user request by triggering multiple external tools at the same time—then merging the results into one coherent answer. Instead of running tasks in a strict sequence (search first, then fetch weather, then generate an image), the assistant can decide which tools are needed, launch them concurrently, and return a combined response faster. That matters because it turns “one prompt, many actions” into a practical workflow for real applications like travel planning, research, and content generation.
A simplified example starts with a prompt like “find the best pizza place in New York and the weather.” The assistant selects two tools: a Google search function for the pizza recommendation and a weather function for current conditions. With parallel execution, both tool calls run simultaneously, and the assistant collects their outputs before responding. The results come back quickly enough to feel interactive: links for pizza places appear alongside a weather summary (including temperature and conditions), and the assistant can even follow up by presenting the relevant search results.
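The concurrency in that example can be sketched with plain Python. This is a minimal illustration of running two independent tool calls at the same time, not the Assistants API itself; `search_pizza` and `get_weather` are hypothetical stand-ins returning placeholder data.

```python
from concurrent.futures import ThreadPoolExecutor

def search_pizza(query: str) -> list[str]:
    # Stand-in for a Google search tool (e.g. SerpAPI); returns placeholder results.
    return ["Joe's Pizza", "Prince Street Pizza"]

def get_weather(city: str) -> dict:
    # Stand-in for a weather tool (e.g. OpenWeatherMap); returns placeholder data.
    return {"city": city, "conditions": "clear", "temp_c": 8}

# Launch both tool calls concurrently, then gather both results
# before composing a single combined answer.
with ThreadPoolExecutor() as pool:
    pizza_future = pool.submit(search_pizza, "best pizza place in New York")
    weather_future = pool.submit(get_weather, "New York")
    pizza, weather = pizza_future.result(), weather_future.result()
```

In the real workflow the model decides which tools to call; the client-side code only has to execute whatever batch of calls comes back, which is what makes the parallelism cheap to exploit.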
A Python implementation makes the mechanism concrete. Four tools are defined: an image generator (DALL·E 3), a weather lookup using OpenWeatherMap, a Google search tool built on SerpAPI, and a “get chat response” tool that uses GPT-4 for writing and reasoning. An assistant is created with instructions tailored to the task—such as fetching articles, searching, generating images, pulling weather, and writing based on user queries. A thread is started, the user message is added, and the assistant run is monitored until completion.
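Tools are registered as JSON schemas so the model knows what it can call and with which arguments. A sketch of two such definitions follows; the function names, descriptions, and parameters here are illustrative, not the video's exact code.

```python
# Illustrative function-calling schemas for two of the four tools.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city (e.g. via OpenWeatherMap).",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "google_search",
            "description": "Run a web search (e.g. via SerpAPI) and return top links.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"},
                },
                "required": ["query"],
            },
        },
    },
]

# With the OpenAI SDK, this list would then be passed when creating the
# assistant, roughly: client.beta.assistants.create(model=..., tools=tools, ...)
```

The model reads the `description` and `parameters` fields to decide when a tool fits the user's intent, so writing those carefully is most of the work of tool design.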
The key operational detail is how tool execution fits into the run lifecycle. When the run requires action, the system identifies which tool calls to make, executes them, and then submits the tool outputs back so the run can finish. A loop keeps checking run status—waiting while the assistant processes and while required tool calls complete—before printing the final response.
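That lifecycle can be sketched as a polling loop. To keep this runnable here, the `Fake*` classes stand in for the OpenAI SDK objects (in the real API: `client.beta.threads.runs.retrieve` and `submit_tool_outputs`, with a run whose `status` moves through `requires_action` to `completed`); tool names and payloads are illustrative.

```python
import json
import time
from dataclasses import dataclass, field

def dispatch_tool(name: str, args: dict) -> str:
    # Hypothetical local tool implementations, keyed by tool name.
    if name == "get_weather":
        return json.dumps({"city": args["city"], "temp_c": 21})
    if name == "google_search":
        return json.dumps({"links": ["https://example.com/best-pizza"]})
    return json.dumps({"error": f"unknown tool: {name}"})

@dataclass
class FakeToolCall:
    id: str
    name: str
    arguments: str  # JSON-encoded string, as in the real API

@dataclass
class FakeRun:
    status: str
    tool_calls: list = field(default_factory=list)

class FakeClient:
    """Simulates a run that needs two parallel tool calls, then finishes."""
    def __init__(self):
        self._run = FakeRun("requires_action", [
            FakeToolCall("call_1", "get_weather", '{"city": "New York"}'),
            FakeToolCall("call_2", "google_search", '{"query": "best pizza NYC"}'),
        ])
        self.submitted = []

    def retrieve(self):
        return self._run

    def submit_tool_outputs(self, outputs):
        self.submitted = outputs
        self._run = FakeRun("completed")

def run_until_complete(client, poll_seconds: float = 0.0):
    """Poll the run; when it requires action, execute every pending tool call."""
    while True:
        run = client.retrieve()
        if run.status == "completed":
            return run
        if run.status == "requires_action":
            # All pending tool calls arrive together in one batch; execute
            # each and submit every output in a single call so the run
            # can resume.
            outputs = [
                {"tool_call_id": c.id,
                 "output": dispatch_tool(c.name, json.loads(c.arguments))}
                for c in run.tool_calls
            ]
            client.submit_tool_outputs(outputs)
        time.sleep(poll_seconds)

client = FakeClient()
final = run_until_complete(client)
```

After completion, the real code would read the assistant's final message from the thread and print it.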
Several demonstrations highlight the flexibility. A travel prompt for Los Angeles asks for weather, an image of the Hollywood sign, best sushi places, and an email from “Chris” to “Julie.” The assistant issues tool calls in parallel for weather, image generation, and Google search, then drafts the email using the gathered context. Another test requests “the three most popular tourist attractions” in Paris and produces multiple images by calling the image tool multiple times. A research-style prompt about London in the year 1600 triggers writing plus an image. Finally, a prompt about Antarctica uses the weather tool for McMurdo Station and attempts a Google search for pizza there—showing both the capability and the limits of what external tools can verify.
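The multi-image case works because nothing stops the model from emitting several calls to the same tool in one batch. A minimal sketch, with a hypothetical `generate_image` stand-in for DALL·E 3 and illustrative prompts:

```python
# One prompt ("three most popular tourist attractions in Paris") can map
# to three calls of the same image tool, each with a different argument.
attraction_prompts = [
    "Photorealistic image of the Eiffel Tower at sunset",
    "Photorealistic image of the Louvre pyramid",
    "Photorealistic image of Notre-Dame Cathedral",
]

def generate_image(prompt: str) -> str:
    # Stand-in for a DALL·E 3 call; returns a fake URL instead of a real one.
    return f"https://images.example/{abs(hash(prompt)) % 1000}.png"

urls = [generate_image(p) for p in attraction_prompts]
```

From the run loop's point of view these are just three entries in the batch of pending tool calls, handled identically to calls of different tools.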
Overall, the workflow emphasizes adaptability: the assistant chooses tools based on the user’s intent, runs them concurrently, and stitches results into a single output. The practical takeaway is that developers can scale this pattern by adding more tools, enabling richer multi-step responses without manually orchestrating every step in code.
Cornell Notes
Parallel function calling with the Assistants API lets one assistant break a user request into multiple tool actions, run those actions at the same time, then combine the outputs into a final response. In the examples, prompts like “best pizza in New York and the weather” trigger a Google search tool and a weather tool concurrently, speeding up turnaround. The Python setup defines tools such as DALL·E 3 for images, OpenWeatherMap for weather, SerpAPI for Google search, and GPT-4 for writing. A run may enter a “requires action” state, prompting the system to execute tool calls and submit results back until the run completes. This pattern supports travel planning, research summaries, email drafting, and multi-image generation.
How does parallel function calling change the way a multi-part prompt is handled?
What tools are set up in the Python example, and what does each one do?
What does the “requires action” state mean during an assistant run?
How can one prompt trigger different categories of work at once?
What happens when the prompt asks for multiple images?
Where do the examples show limitations or uncertainty?
Review Questions
- Describe the lifecycle of an Assistants API run when tool calls are needed, including what triggers “requires action.”
- In the Los Angeles example, which tools are used for weather, images, and recommendations, and how are their outputs combined into the final response?
- Why does parallel execution matter for user experience in multi-step prompts like travel planning?
Key Points
1. Parallel function calling lets one assistant trigger multiple tool calls at the same time, then merge results into a single response.
2. A prompt is decomposed into tool actions by intent—e.g., weather requests map to a weather tool, while “best places” maps to search.
3. The Python workflow registers tools (DALL·E 3, OpenWeatherMap weather, SerpAPI search, GPT-4 writing) and creates an assistant with task-specific instructions.
4. Assistant runs may enter a “requires action” state, requiring the system to execute tool calls and submit outputs before completion.
5. Multi-modal outputs are supported: the assistant can generate images, fetch live data, and draft text (like emails) in one pass.
6. The assistant can call the same tool multiple times when the prompt demands multiple items (e.g., three tourist-attraction images).
7. Tool reliability depends on external sources; when search results are thin or ambiguous, the assistant’s output may be uncertain.