Build Hour: Apps in ChatGPT

OpenAI · 6 min read

Based on OpenAI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

ChatGPT apps rely on the apps SDK for UI injection, MCP servers for tools/actions and contextual data, and optional web components for interactive, stateful experiences.

Briefing

ChatGPT apps are moving from “chat-only” to full, interactive experiences by combining three pieces: an apps SDK that injects UI into ChatGPT, an MCP server layer that supplies tools/actions and contextual data, and (optionally) a web component that can react in real time. The practical payoff is clear in the demos: users can ask for hiking recommendations that pull from AllTrails data, generate a styled flyer through Adobe Express templates, and take dynamically rendered quizzes—without the model needing direct access to those external systems on its own.

Since DevDay, OpenAI has shipped the building blocks for this ecosystem. The apps SDK defines how ChatGPT hosts UI widgets, while MCP servers let apps fetch context and perform actions on the user’s behalf, including read/write capabilities. Developers also get an apps docs site on developers.openai.com to learn the components needed to build experiences, plus a public app submission flow and an app marketplace for discovery and installation. A UI component library and sample repositories aim to reduce the time spent on styling and scaffolding, and a new “Docs MCP server” is positioned as a faster on-ramp for development.

The demos make the architecture tangible. In the AllTrails example, ChatGPT interprets intent (“find me dog friendly trails in Marin”), routes it to an AllTrails MCP app, and returns structured results that the UI presents as a refined, engaging experience. The Adobe Express demo shows how an MCP server can trigger template selection and then fill event details into a generated flyer. The quizzes app demonstrates a different pattern: UI content is dynamically rendered from structured tool output, so generation happens on demand and doesn’t require storing quiz content in a backend.

A key conceptual section explains what MCP contributes. MCP provides tools, actions, and resources so an AI client can call external logic—potentially returning a URI for a web component that ChatGPT renders. The most powerful setup pairs MCP-driven data and logic with an interactive web component that can use the window.openai capability API to update state, react to context, and call tools from inside the widget. Display modes like inline, picture-in-picture, and full-screen are highlighted as ways to keep users engaged.
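As a rough illustration of the server side of that flow, the sketch below models a tool that returns a short text summary, a structured payload, and an optional URI pointing at a web component. Everything here is an assumption made for illustration: the `ToolResult` shape, the `find_trails` tool name, the `ui://` URI, and the placeholder trail data are hypothetical and are not taken from the Apps SDK or any MCP SDK.

```typescript
// Hypothetical shapes for illustration; not the actual Apps SDK or MCP SDK types.
interface Trail {
  name: string;
  lengthKm: number;
  dogFriendly: boolean;
}

interface ToolResult<T> {
  text: string;       // short summary the model can read
  structured: T;      // machine-readable payload a widget can render
  widgetUri?: string; // optional pointer to a web component
}

// Placeholder data standing in for a real backend lookup.
const TRAILS: Record<string, Trail[]> = {
  marin: [
    { name: "Example Ridge Loop", lengthKm: 6.4, dogFriendly: true },
    { name: "Sample Creek Trail", lengthKm: 3.1, dogFriendly: false },
  ],
};

function findTrails(args: { region: string; dogFriendly?: boolean }): ToolResult<{ trails: Trail[] }> {
  const all = TRAILS[args.region.toLowerCase()] ?? [];
  const trails = args.dogFriendly ? all.filter((t) => t.dogFriendly) : all;
  return {
    text: `Found ${trails.length} trail(s) in ${args.region}.`,
    structured: { trails },
    widgetUri: "ui://widget/trail-list.html", // hypothetical URI
  };
}

// Minimal tool registry: the client selects a tool by name and passes JSON arguments.
const tools = { find_trails: findTrails };
const result = tools.find_trails({ region: "Marin", dogFriendly: true });
console.log(result.text);
```

Keeping the text summary and the structured payload separate lets the model reason over the former while the widget renders the latter.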

The build walkthrough then shifts from concepts to workflow. Developers are encouraged to use the traditional path—read docs, grab samples, code, deploy—but Codex plus the Docs MCP server is pitched as an AI-first alternative that can handle scaffolding and documentation ingestion. The live process installs Codex via npm, adds the OpenAI developer docs MCP server into Codex, and then prompts Codex to generate a “ping pong” ChatGPT app with a simple UI and MCP server. Codex produces a minimal app repo, which is run locally and then tunneled so ChatGPT can load it in developer mode.

The culminating demo upgrades the game into a real-time multiplayer experience with scores, a lobby, and postgame analysis powered by additional tool calls. After the match, the model provides stats like win rate and rally length along with targeted improvement tips, turning gameplay data into actionable coaching. The session closes with best-practice guidance: extract value instead of porting entire apps, treat ChatGPT as the “home” experience, and design for multi-turn refinement. A Q&A follows on MCP compatibility, local development tooling (tunneling with services like ngrok or Cloudflare), design freedom, authentication requirements, and monetization approaches (auth flows, in-app checkout links, and emerging agent commerce protocol/instant checkout).

Cornell Notes

ChatGPT apps become useful when UI widgets, MCP-powered tools/actions, and optional interactive web components work together. OpenAI’s apps SDK defines how ChatGPT hosts app UI, while MCP servers supply contextual data and can take actions on a user’s behalf. The Docs MCP server is designed to speed up development by giving Codex structured access to OpenAI developer documentation. In the live build, Codex scaffolds a ping pong app with an MCP server and web component, then the team extends it into a multiplayer game with real-time scoring and postgame analysis via additional tool calls. This matters because it turns conversational prompts into interactive, stateful experiences that can integrate external systems and produce personalized outcomes.

What are the three core components that make a ChatGPT app feel interactive rather than “chat-only”?

The apps SDK injects a UI widget into the ChatGPT client, MCP servers provide tools/actions and contextual data (including read/write capabilities), and a web component (optional but powerful) renders interactive UI inside ChatGPT. The web component can use window.openai capability APIs to update state, react to context, and trigger tool calls, enabling richer modes like inline, picture-in-picture, and full-screen.
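The widget side of that answer might look roughly like the sketch below. It does not use the real `window.openai` object: `WidgetHost` is a hypothetical interface modeled only on the capabilities named above (updating state, calling tools, switching display mode), and the actual API may use different names and signatures. The `analyze_match` tool name is also an assumption.

```typescript
// Hypothetical host interface; the real window.openai surface may differ.
type DisplayMode = "inline" | "picture-in-picture" | "full-screen";

interface WidgetHost<S> {
  toolOutput: unknown;
  setWidgetState(state: S): void;
  callTool(name: string, args: Record<string, unknown>): Promise<unknown>;
  requestDisplayMode(mode: DisplayMode): void;
}

interface ScoreState {
  player: number;
  opponent: number;
}

class ScoreWidget {
  private host: WidgetHost<ScoreState>;
  private state: ScoreState = { player: 0, opponent: 0 };

  constructor(host: WidgetHost<ScoreState>) {
    this.host = host;
  }

  // Update local state and hand it to the host so it can be persisted.
  recordPoint(winner: "player" | "opponent"): ScoreState {
    const next = { ...this.state };
    next[winner] += 1;
    this.state = next;
    this.host.setWidgetState(next);
    return next;
  }

  // Ask for more screen space, then trigger a tool call from inside the widget.
  async finishMatch(): Promise<unknown> {
    this.host.requestDisplayMode("full-screen");
    return this.host.callTool("analyze_match", { ...this.state });
  }
}

// A fake host that records calls, so the sketch runs outside ChatGPT.
const calls: string[] = [];
const fakeHost: WidgetHost<ScoreState> = {
  toolOutput: null,
  setWidgetState: (s) => { calls.push(`state:${s.player}-${s.opponent}`); },
  callTool: async (name, args) => { calls.push(`tool:${name}`); return { ok: true, args }; },
  requestDisplayMode: (mode) => { calls.push(`mode:${mode}`); },
};

const widget = new ScoreWidget(fakeHost);
widget.recordPoint("player");
void widget.finishMatch();
console.log(calls);
```

Passing the host in as a constructor argument, rather than reading a global, keeps the widget logic testable without a browser.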

How does MCP change what an app can do when a user asks for something?

When a user asks for an outcome (e.g., “find dog friendly trails in Marin”), ChatGPT routes the request to an MCP server tied to the app. That MCP server can pull data from backends or generate information on the fly, then return structured results (and optionally a URI) so ChatGPT can render a web component. The model can also refine the request across multiple turns using the app’s logic and data.

Why does the quizzes app emphasize structured output and dynamic rendering?

Instead of storing quiz content in a backend, the app generates quiz UI content dynamically based on structured tool output. The UI knows how to render the content directly from the tool response, so generation happens at runtime and the same chat history can be reused to regenerate new quizzes for learning.
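A minimal sketch of that pattern, assuming a hypothetical `QuizOutput` shape for the tool's structured output (the field names are invented for illustration and the real quizzes app may structure its data differently):

```typescript
// Hypothetical structured output of a quiz-generating tool.
interface QuizQuestion {
  prompt: string;
  choices: string[];
  answerIndex: number;
}

interface QuizOutput {
  title: string;
  questions: QuizQuestion[];
}

// Escape text before placing it in markup, since the content is generated at runtime.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render directly from the tool response; nothing is read from a quiz database.
function renderQuiz(quiz: QuizOutput): string {
  const items = quiz.questions.map((q, qi) => {
    const options = q.choices
      .map((c, ci) => `<label><input type="radio" name="q${qi}" value="${ci}">${escapeHtml(c)}</label>`)
      .join("");
    return `<fieldset><legend>${escapeHtml(q.prompt)}</legend>${options}</fieldset>`;
  });
  return `<section><h2>${escapeHtml(quiz.title)}</h2>${items.join("")}</section>`;
}

// Count how many picks match the answer key carried in the same payload.
function grade(quiz: QuizOutput, picks: number[]): number {
  return quiz.questions.filter((q, i) => picks[i] === q.answerIndex).length;
}

const sample: QuizOutput = {
  title: "MCP basics",
  questions: [
    { prompt: "Which layer supplies tools & data?", choices: ["MCP server", "CSS"], answerIndex: 0 },
  ],
};
const html = renderQuiz(sample);
console.log(html);
```

Because the renderer depends only on the payload shape, a fresh tool call with the same schema yields a new quiz without any change to the UI code.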

What workflow did the team demonstrate for building an app faster with Codex and the Docs MCP server?

They installed Codex with npm, added the OpenAI developer docs MCP server to Codex, and used an AGENTS.md file to instruct Codex to rely on that docs MCP server for OpenAI API and apps SDK guidance. Codex was then prompted to scaffold a ping pong ChatGPT app. It referenced the apps SDK quick start and produced a minimal repo with an MCP server and web component, which was run locally and tunneled for ChatGPT developer mode.

How did the ping pong demo turn into a multiplayer and coaching experience?

The initial scaffolded app supported basic gameplay. After further iteration, the app added scores, a lobby, and multiplayer so two users on separate screens could join the same match via a shared match code. Postgame analysis was implemented as an additional tool call that fed match stats back into the model, producing coaching tips like prioritizing defense on early returns.
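The lobby and postgame steps can be sketched as follows. This is not the demo's code: the in-memory `matches` map, the `Rally` record, and the function names are all hypothetical, and a real multiplayer server would need shared storage and real-time transport that are out of scope here.

```typescript
// Hypothetical data model for a two-player match identified by a shared code.
interface Rally {
  winner: "host" | "guest";
  shots: number; // number of hits in the rally
}

interface Match {
  code: string;
  players: string[];
  rallies: Rally[];
}

// In-memory lobby; a real server would use shared storage.
const matches = new Map<string, Match>();

function createMatch(code: string, hostPlayer: string): Match {
  const match: Match = { code, players: [hostPlayer], rallies: [] };
  matches.set(code, match);
  return match;
}

// A second user joins the same match by presenting the shared code.
function joinMatch(code: string, player: string): Match {
  const match = matches.get(code);
  if (!match) throw new Error(`No match with code ${code}`);
  if (match.players.length >= 2) throw new Error("Match is full");
  match.players.push(player);
  return match;
}

// Stats a postgame tool could return for the model to turn into coaching tips.
function summarize(match: Match, side: "host" | "guest") {
  const total = match.rallies.length;
  const wins = match.rallies.filter((r) => r.winner === side).length;
  const shots = match.rallies.reduce((sum, r) => sum + r.shots, 0);
  return {
    winRate: total === 0 ? 0 : wins / total,
    averageRallyLength: total === 0 ? 0 : shots / total,
  };
}

const m = createMatch("ABCD", "player-one");
joinMatch("ABCD", "player-two");
m.rallies.push(
  { winner: "host", shots: 4 },
  { winner: "guest", shots: 8 },
  { winner: "host", shots: 6 },
  { winner: "host", shots: 2 },
);
const stats = summarize(m, "host");
console.log(stats);
```

Returning plain numbers from the analysis tool leaves the interpretation, such as which habits to practice, to the model.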

What UX and product principles were recommended for ChatGPT apps?

Instead of porting an entire standalone app, developers should extract value and distill it into the conversational experience. ChatGPT should be treated as the “home” experience, with the model, UI, and MCP server working together for multi-turn refinement. The session also framed app value around three pillars: helping users know more, helping them do more (actions via MCP), and presenting information more engagingly and visually.

Review Questions

  1. How do apps SDK widgets and MCP servers complement each other in the request/response flow inside ChatGPT?
  2. Describe how a web component can use window.openai capability APIs to create a more immersive app experience.
  3. Why does dynamic UI rendering (as in the quizzes example) reduce backend storage needs, and how does it affect iteration?

Key Points

  1. ChatGPT apps rely on the apps SDK for UI injection, MCP servers for tools/actions and contextual data, and optional web components for interactive, stateful experiences.
  2. OpenAI shipped an apps SDK, apps docs site, public app submission flow, and an app marketplace, alongside a UI component library and sample repos to speed development.
  3. MCP enables apps to route user intent to external logic and return structured results (and optionally URIs) that ChatGPT can render as widgets.
  4. Codex can scaffold an app faster when paired with the OpenAI developer docs MCP server, which supplies documentation context directly to the coding agent.
  5. The live ping pong demo showed an end-to-end path: local MCP server + tunneling + developer mode sideloading, then upgrading to multiplayer and postgame analysis via tool calls.
  6. App UX guidance emphasized extracting value for conversational use, treating ChatGPT as the primary experience surface, and designing for multi-turn refinement between model, UI, and MCP logic.
  7. Monetization options discussed included authentication flows, in-app checkout links with redirect URLs back into ChatGPT, and emerging agent commerce protocol/instant checkout approaches.

Highlights

AllTrails and Adobe Express demos illustrated how MCP-backed apps can pull external context and trigger actions that ChatGPT presents through a dedicated UI layer.
The quizzes app demonstrated dynamic UI rendering driven by structured tool output, avoiding the need to store quiz content in a backend.
Codex + the Docs MCP server can scaffold a working ChatGPT app repo by ingesting the apps SDK quick start during generation.
The ping pong upgrade combined real-time multiplayer with postgame stats and model-generated coaching tips using additional tool calls.
Best-practice guidance centered on extracting conversational value rather than porting a full standalone app into ChatGPT.

Mentioned

  • Christine
  • Corey
  • MCP
  • UI
  • SDK
  • URI
  • HTML
  • npm
  • VS Code
  • API
  • Q&A
  • UX