New Products: A Deep Dive
Based on OpenAI's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing.
Briefing
OpenAI presented a hands-on look at two building blocks for an agent-like future: GPTs inside ChatGPT and the new Assistants API for embedding agentic experiences into apps. The core message is that developers can now package instructions, external knowledge, and real-world actions into reusable assistants—then either share them as custom ChatGPTs (GPTs) or wire them directly into their own products (Assistants API). That shift matters because it reduces the glue code developers previously had to write for state, retrieval, tool use, and context management.
For GPTs, the demo centered on a new GPT creation workflow that starts conversationally and then exposes deeper controls. A builder can chat with a GPT-in-progress to iteratively shape its behavior, then switch to a configuration view to inspect and edit the underlying “GPT anatomy”: instructions (system message), knowledge, and custom actions/tools. The UI also includes a testing pane to see how the GPT responds to real user prompts before publishing. A pirate-themed GPT (“Captain Coder,” then “salty” variants) illustrated how instructions can define personality and conversation starters, while a “Tasky” GPT demonstrated how actions connect a GPT to external systems.
Tasky used actions wrapped around the Asana API via Retool, with OAuth and end-user confirmation built into the flow. In practice, the GPT could read a user’s to-dos and then create an actual Asana task after confirming the user’s intent—turning a chat interaction into a concrete workflow. A separate “Danny DevDay” GPT showcased knowledge: instead of relying on pretraining, it was given a PDF (Sam Altman’s keynote script) and could answer questions and summarize content using retrieval over the uploaded file. The demo emphasized that knowledge isn’t just for one-off summaries; the GPT can “talk to” the information as part of an ongoing conversation.
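To ground the Tasky example: a GPT action is configured with an OpenAPI description of an HTTP API, which the GPT then calls on the user’s behalf. The demo wrapped Asana behind Retool; the sketch below is a hypothetical stand-in backend of the same shape, where the endpoints, fields, and in-memory store are all invented for illustration.

```python
# Hypothetical stand-in for the Retool/Asana backend the Tasky action called.
# A GPT action only needs a reachable HTTP API plus an OpenAPI schema for it.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Tasky backend (illustrative)")

# In-memory store standing in for Asana.
TODOS: list[dict] = [{"id": 1, "title": "Rehearse keynote demo"}]

class NewTask(BaseModel):
    title: str

@app.get("/todos")
def list_todos() -> list[dict]:
    """Endpoint the GPT would call to read the user's to-dos."""
    return TODOS

@app.post("/todos")
def create_todo(task: NewTask) -> dict:
    """Endpoint the GPT would call (after confirmation) to create a task."""
    todo = {"id": len(TODOS) + 1, "title": task.title}
    TODOS.append(todo)
    return todo

# FastAPI serves the generated OpenAPI schema at /openapi.json; pasting a
# schema like that into the GPT's action configuration is what wires the
# GPT to this API.
```

In the demo, OAuth and end-user confirmation were handled by ChatGPT as part of the action flow itself, not by backend code like this.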
The most ambitious combined demo (“Mood Tunes”) stitched together instructions, knowledge, actions, and multimodal capabilities. It generated a mixtape concept from an image input, used browsing to fill in information missing from its knowledge files, produced album art via DALL·E, and then used an action connected to the Philips Hue API to change lighting based on the chosen mood. It also offered to play a track on Spotify, illustrating how GPTs can orchestrate multiple external tools in a single experience.
The second half of the session moved from ChatGPT-native GPTs to the Assistants API, designed to let developers build similar assistant experiences inside their own apps. The Assistants API introduces three stateful primitives—Assistant (stored instructions, selected model, and tools), Threads (conversation state), and Messages (user/assistant posts)—plus a Runs primitive that packages one invocation of an assistant. Behind the scenes, runs handle context truncation, tool calls, and saving outputs back into the thread, reducing the need for developers to manage message history and retrieval plumbing themselves.
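To make the primitives concrete, here is a minimal sketch using the OpenAI Python SDK’s beta Assistants endpoints as named at launch; the assistant name, instructions, prompt, and polling loop are illustrative assumptions, not the session’s exact code.

```python
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Assistant: stored instructions, selected model, and enabled tools.
assistant = client.beta.assistants.create(
    name="Data Helper",                        # illustrative name
    instructions="You are a concise data analyst.",
    model="gpt-4-1106-preview",                # model introduced at DevDay
    tools=[{"type": "code_interpreter"},       # sandboxed code execution
           {"type": "retrieval"}],             # built-in retrieval (launch-era name)
)

# Thread: conversation state held server-side.
thread = client.beta.threads.create()

# Message: a user post appended to the thread.
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Plot y = x**2 for x in 0..10 and describe the shape.",
)

# Run: one invocation of the assistant over the thread. The run handles
# context truncation, tool calls, and saving outputs back into the thread.
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)

# Runs are asynchronous; poll until this one finishes.
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# The assistant's reply was appended to the thread by the run itself.
for message in reversed(client.beta.threads.messages.list(thread_id=thread.id).data):
    for block in message.content:              # blocks can also be images
        if block.type == "text":
            print(message.role, ":", block.text.value)
```

Note that nothing here re-sends prior messages: the thread accumulates history server-side, which is exactly the plumbing the session said developers no longer need to build.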
Tooling is central. Code Interpreter lets assistants write and run code in a sandbox to analyze data and generate charts. Retrieval provides built-in document augmentation without developers manually computing embeddings or building semantic search. Function calling lets assistants invoke developer-defined functions with structured arguments. Two upgrades were highlighted: JSON mode for guaranteed syntactically valid JSON outputs, and parallel function calling so multiple functions (e.g., play music and set volume) can execute in one pass. The session closed with a roadmap: multimodal support by default, support for bringing your own code execution, and asynchronous real-time integration via WebSockets and webhooks.
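Both upgrades are also exposed through the Chat Completions endpoint. The sketch below shows JSON mode and a parallel tool-call request, reusing the session’s play-music/set-volume example; the function names, schemas, and prompts are illustrative assumptions.

```python
import json
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4-1106-preview"  # first model with JSON mode and parallel tool calls

# --- JSON mode: output is constrained to syntactically valid JSON. ---
# The prompt must mention JSON or the API rejects the request.
resp = client.chat.completions.create(
    model=MODEL,
    response_format={"type": "json_object"},
    messages=[{"role": "user",
               "content": "Reply in JSON with keys 'mood' and 'genre' for a rainy day."}],
)
print(json.loads(resp.choices[0].message.content))

# --- Parallel function calling: several functions requested in one pass. ---
# These function definitions are hypothetical, for illustration only.
tools = [
    {"type": "function", "function": {
        "name": "play_music",
        "description": "Start playback of a track",
        "parameters": {"type": "object",
                       "properties": {"track": {"type": "string"}},
                       "required": ["track"]}}},
    {"type": "function", "function": {
        "name": "set_volume",
        "description": "Set playback volume from 0 to 100",
        "parameters": {"type": "object",
                       "properties": {"level": {"type": "integer"}},
                       "required": ["level"]}}},
]

resp = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Play 'Sea Shanty' quietly."}],
    tools=tools,
)

# With parallel function calling, tool_calls can hold multiple entries at once.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```

Executing each returned call locally and appending the results as role="tool" messages completes the exchange in a single model round trip instead of one trip per function, which is where the latency win comes from.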
Cornell Notes
OpenAI presented GPTs and the Assistants API as two paths to build agent-like systems. GPTs let developers create custom ChatGPTs by combining instructions (system message), knowledge (uploaded files with retrieval), and actions (tool integrations like Asana via Retool). The Assistants API brings similar capabilities to developers’ own apps using stateful primitives: Assistant (instructions/tools/models), Threads (conversation state), Messages (posts), and Runs (one assistant invocation). Runs handle context truncation and saving outputs automatically, while tools like Code Interpreter, Retrieval, and Function Calling provide sandboxed code, built-in document search, and developer-defined function execution. JSON mode and parallel function calling improve reliability and reduce latency when multiple actions are needed.
- What are the three core components of a GPT, and how did the demos make each one concrete?
- How does the GPT builder UI change the workflow from “prompting” to “building”?
- What problem does the Assistants API solve compared with earlier chat-style APIs?
- How do Code Interpreter, Retrieval, and Function Calling differ as tools?
- What do JSON mode and parallel function calling change for developers?
- How did the “Mood Tunes” demo illustrate end-to-end orchestration across capabilities?
Review Questions
- In the Assistants API, what responsibilities are handled by Runs versus what must developers manage themselves?
- Compare how knowledge grounding works in GPTs versus retrieval in the Assistants API—what automation is emphasized in each?
- Why do JSON mode and parallel function calling matter for building reliable, low-latency assistant features?
Key Points
1. GPTs are built from instructions (system message), knowledge (uploaded files with retrieval), and actions/tools that connect to external systems.
2. A new GPT creation workflow supports iterative building via conversation, then deeper configuration and testing before publishing.
3. Actions can integrate real services (e.g., Asana via Retool) with OAuth and end-user confirmation before data is sent.
4. Knowledge grounding in GPTs can come from uploaded documents (like a keynote PDF), enabling retrieval-based answers and file-grounded conversation.
5. The Assistants API introduces stateful primitives—Assistant, Threads, Messages, and Runs—so developers don’t have to manually manage context truncation or message storage.
6. Code Interpreter, Retrieval, and Function Calling provide three distinct tool categories: sandboxed code execution, built-in document search, and developer-defined function execution.
7. Function calling improvements include JSON mode for valid JSON outputs and parallel function calling to execute multiple actions in one invocation.