OpenAI DevDay: Opening Keynote
Based on OpenAI's DevDay keynote video on YouTube.
Briefing
OpenAI’s DevDay keynote centers on a major shift from “chat” toward practical, agent-like AI—powered by a new GPT-4 Turbo model, new multimodal tools, and two developer platforms (GPTs and the Assistants API) designed to make complex workflows easier to build and safer to deploy.
Altman opened by recapping rapid product momentum: ChatGPT moved from a research preview to mainstream use; GPT-4 arrived as the company’s most capable model; and recent releases added voice and vision so ChatGPT can see, hear, and speak. DALL-E 3 expanded image generation, while ChatGPT Enterprise brought enterprise-grade security, privacy, higher-speed GPT-4 access, and longer context windows. Usage figures—2 million developers on the API, 92% of Fortune 500 companies using OpenAI products, and about 100 million weekly active users—were framed as evidence of real-world utility, reinforced by user stories showing confidence-building, tutoring-like explanations, accessibility benefits, and daily-life assistance.
The keynote then moved to the headline developer announcement: GPT-4 Turbo. The model’s biggest technical leap is a major context expansion to 128,000 tokens—described as roughly 300 pages—paired with improved accuracy over long inputs. Developers also get more control through JSON Mode (guaranteed valid JSON for API responses), better function calling (including the ability to call multiple functions at once), and reproducible outputs via a seed parameter. Retrieval arrives as a platform feature so apps can pull knowledge from external documents or databases, while the knowledge cutoff for GPT-4 Turbo is updated to April 2023. Multimodal capabilities are pushed into the API as well: DALL-E 3, vision-enabled GPT-4 Turbo, and a new text-to-speech model with six preset voices. OpenAI also announced Whisper V3 for speech recognition, with improved multilingual performance.
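The JSON Mode and seed controls described above map onto two request parameters in the OpenAI Python library (v1.x). The sketch below only assembles the request arguments rather than sending them; `build_request` is a hypothetical helper for illustration, and the preview model name reflects the DevDay-era release.

```python
# Sketch of a GPT-4 Turbo request using JSON Mode and a fixed seed.
# build_request() is an illustrative helper, not part of the OpenAI SDK.

def build_request(system_prompt: str, user_prompt: str, seed: int = 42) -> dict:
    """Assemble keyword arguments for client.chat.completions.create()."""
    return {
        "model": "gpt-4-1106-preview",               # GPT-4 Turbo preview name
        "response_format": {"type": "json_object"},  # JSON Mode: response is valid JSON
        "seed": seed,                                # reproducible outputs (beta)
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }

params = build_request(
    "Reply in JSON with keys 'city' and 'summary'.",
    "Summarize the weather in Paris.",
)
```

With JSON Mode the response body parses with `json.loads()` directly, and reusing the same seed keeps runs comparable across retries, which is what makes these two parameters integration-reliability features rather than quality features.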
Customization and scaling follow. Fine-tuning expands from the earlier GPT-3.5 models to the 16K-context version, and active fine-tuning users can apply for experimental GPT-4 fine-tuning access. A new “Custom Models” program offers deeper, company-specific training and RL post-training—positioned as expensive and limited at first. Rate limits for established GPT-4 customers are doubled, and developers can request further quota changes in API settings. Legal risk is addressed with “Copyright Shield”: OpenAI commits to defending customers and covering costs for certain copyright infringement claims, for both ChatGPT Enterprise and the API, paired with a reiterated commitment that OpenAI does not train on data from the API or ChatGPT Enterprise.
Pricing is the other major lever. GPT-4 Turbo is pitched as “considerably cheaper” than GPT-4: 3x lower cost for prompt tokens and 2x lower for completion tokens, with specific rates of 1¢ per 1,000 prompt tokens and 3¢ per 1,000 completion tokens. OpenAI also plans speed improvements and reduced costs for GPT-3.5 Turbo 16K.
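The quoted rates can be sanity-checked with simple arithmetic. This sketch assumes straightforward per-token billing at the listed prices; `request_cost` is an illustrative helper, not an SDK function.

```python
# Back-of-envelope cost check for the quoted GPT-4 Turbo rates:
# $0.01 per 1,000 prompt tokens and $0.03 per 1,000 completion tokens.

PROMPT_RATE = 0.01 / 1000       # dollars per prompt token
COMPLETION_RATE = 0.03 / 1000   # dollars per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated cost in dollars for one API call."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# Example: a 100,000-token prompt (near the 128K window) with a 1,000-token reply.
cost = request_cost(100_000, 1_000)
print(f"${cost:.2f}")  # → $1.03
```

The example also shows why the long context and the price cut go together: filling most of the 128K window on the old GPT-4 prompt rate would have cost roughly three times as much.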
Finally, the keynote reframes the developer roadmap around agent-like behavior. ChatGPT is updated to use GPT-4 Turbo with the latest knowledge cutoff and capabilities, and the model picker is removed to reduce friction. OpenAI introduces GPTs—tailored ChatGPT versions built with instructions, expanded knowledge, and actions—plus a GPT store launching later this month with revenue sharing for top builders. For developers building inside apps, the Assistants API goes to beta with persistent threads, built-in retrieval, Code Interpreter, and improved function calling. Demos showed assistants that manage state, parse uploaded documents, run code to compute travel costs, and use voice via Whisper and text-to-speech. The overall message: better tools lead to more capable automation, and OpenAI is moving toward agents through gradual, iterative deployment with safety as an ongoing constraint.
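The persistent-thread model described above can be sketched against the beta namespace of the OpenAI Python library. This is a minimal outline under stated assumptions: the call names follow the DevDay-era beta API, error handling and timeouts are omitted, the one-second polling interval is an arbitrary choice, and `ask_assistant` is a hypothetical helper.

```python
# Sketch of the Assistants API flow (beta): append a message to a persistent
# thread, start a run, and poll until it completes. The thread holds the
# conversation state server-side, so the caller never resends prior turns.
import time

def ask_assistant(client, assistant_id: str, thread_id: str, question: str):
    """Send one user turn on an existing thread and wait for the run to finish."""
    client.beta.threads.messages.create(
        thread_id=thread_id, role="user", content=question
    )
    run = client.beta.threads.runs.create(
        thread_id=thread_id, assistant_id=assistant_id
    )
    # Poll the run's status; "queued" and "in_progress" mean it is still working.
    while run.status in ("queued", "in_progress"):
        time.sleep(1)
        run = client.beta.threads.runs.retrieve(
            thread_id=thread_id, run_id=run.id
        )
    return run
```

In this flow the assistant itself would be created once, with built-in tools enabled (e.g. retrieval and Code Interpreter), which is where the API absorbs the state management, document parsing, and code execution the demos highlighted.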
Cornell Notes
The keynote argues that OpenAI is moving from conversational AI toward agent-like systems by combining a new GPT-4 Turbo model with developer platforms that support long context, structured outputs, retrieval, and multimodal interaction. GPT-4 Turbo adds 128,000-token context, JSON Mode for valid JSON responses, improved (and more controllable) function calling, reproducible outputs via a seed, and a knowledge cutoff updated to April 2023. The API also gains retrieval, DALL-E 3, vision, text-to-speech, and Whisper V3, alongside customization options like fine-tuning and “Custom Models.” On the product side, GPTs let users build tailored ChatGPT versions with instructions, knowledge, and actions, while the Assistants API (beta) provides persistent threads, retrieval, and Code Interpreter to build assistive agents inside apps. This matters because it lowers the engineering burden for complex workflows while aiming to keep safety and control central.
- What are the most important technical upgrades in GPT-4 Turbo, and why do they matter to developers building real apps?
- How does OpenAI plan to improve “world knowledge” and reduce the problem of outdated training data?
- What new multimodal capabilities are entering the API, and what are concrete examples of use?
- What does “more control” mean beyond JSON Mode—how do reproducibility and logging fit in?
- How do GPTs and the Assistants API differ, and what problem does each solve for builders?
- What safety and legal guardrails were emphasized alongside these new capabilities?
Review Questions
- Which GPT-4 Turbo features directly improve integration reliability (e.g., structured outputs and determinism), and what are their specific mechanisms?
- How do retrieval and the updated knowledge cutoff work together to keep answers accurate over time?
- What capabilities does the Assistants API add that would otherwise require significant engineering (state management, retrieval, and code execution)?
Key Points
1. GPT-4 Turbo raises context to 128,000 tokens and improves long-context accuracy, enabling longer documents and multi-step workflows.
2. JSON Mode and enhanced function calling aim to make API outputs more dependable and easier to wire into applications.
3. Reproducible outputs via a seed parameter (beta) give developers a way to reduce variability across runs.
4. Retrieval is added at the platform level, and GPT-4 Turbo’s knowledge cutoff is updated to April 2023 to reduce staleness.
5. New API multimodality includes DALL-E 3, image inputs via GPT-4 Turbo with vision, and text-to-speech with six preset voices, plus Whisper V3 for speech recognition.
6. OpenAI’s pricing shifts GPT-4 Turbo to 1¢ per 1,000 prompt tokens and 3¢ per 1,000 completion tokens, positioning it as 2x–3x cheaper than GPT-4 depending on token type.
7. GPTs and the Assistants API are positioned as the first steps toward agent-like systems, with persistent threads, retrieval, and Code Interpreter to lower the engineering burden for assistive apps.