
OpenAI on OpenAI: Applying AI to Our Own Workflows

OpenAI · 6 min read

Based on OpenAI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

OpenAI frames internal agent deployments as a way to amplify expertise by encoding top operators’ craft into reusable skills, not just automating tasks.

Briefing

OpenAI’s internal push for “agent-driven workflows” is being framed less as a quest for efficiency and more as a way to amplify company expertise—turning the best practices of top performers into systems that scale across entire teams. The core claim is straightforward: automation matters, but the bigger payoff comes from capturing the craft of practitioners and distributing it so every employee can operate like the organization’s best operator.

Three internal deployments illustrate how that idea works in practice: a Go-to-Market assistant for sales, Openhouse for HR and onboarding knowledge, and a support system that improves itself as ticket patterns change. Each system follows a similar architecture—connectors to pull in relevant data, an orchestration layer that routes tasks to specialized “skills,” and services that push actions back into tools employees already use. The difference is the center of gravity: sales skills are grounded in meeting prep, product knowledge, demos, and customer research; HR skills are grounded in people and policy context; support skills are grounded in standardized operating procedures (SOPs) derived from expert handling of real conversations.
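The shared architecture can be sketched in a few lines. This is an illustrative sketch only, not OpenAI's internal API: a task arrives via a connector, an orchestrator routes it to a registered "skill," and unmatched intents fall back to escalation. All class and function names here are hypothetical.

```python
# Hypothetical sketch of the connector -> orchestrator -> skill pattern
# described above. Names are illustrative, not OpenAI's internal code.

from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Task:
    intent: str   # e.g. "meeting_prep", "policy_question", "support_ticket"
    payload: str


class Orchestrator:
    def __init__(self) -> None:
        self.skills: Dict[str, Callable[[Task], str]] = {}

    def register(self, intent: str, skill: Callable[[Task], str]) -> None:
        self.skills[intent] = skill

    def route(self, task: Task) -> str:
        # Fall back to escalation when no skill owns the intent.
        skill = self.skills.get(task.intent, lambda t: f"escalate: {t.payload}")
        return skill(task)


orch = Orchestrator()
orch.register("meeting_prep", lambda t: f"brief for {t.payload}")
print(orch.route(Task("meeting_prep", "Acme renewal call")))  # brief for Acme renewal call
print(orch.route(Task("unknown", "odd request")))             # escalate: odd request
```

The point of the pattern is that the "center of gravity" differs per deployment only in which skills get registered; the routing shell stays the same.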

In sales, the Go-to-Market assistant was built after teams hit a breaking point and needed faster customer research and technical answers without sacrificing customer experience. Developers sat with a top rep, Sophie, and extracted her “formula” for winning—prepping for meetings, generating product champions, planning demos, and following up quickly. That craft was encoded into an agent system with a semantic data layer over customer and strategy documents, plus specialized skills (starting with four and growing to about 10) such as meetings, product knowledge, custom demos, and customer research. The assistant then operates inside existing workflows—primarily ChatGPT and Slack—producing meeting briefs, demo scripts, and follow-up answers. It also closes the loop: when a rep flags a missed takeaway, the system regenerates the output and triggers prompt optimization that flows to developers for approval, effectively distributing improvements across the org. Reps exchange about 20 messages per week with the assistant and report saving roughly one full day weekly.

For HR, Openhouse targets the knowledge gap created by rapid global hiring. It connects HR systems like Workday with internal announcements and role expectations, then surfaces answers through ChatGPT and Slack-based Q&A. A real example: during a New York trip, an employee uses Openhouse to get travel policy and office access details with citations, then extends the same question into a directory-driven search for a coworker who can help with a customer-facing use case. The result is a path from policy knowledge to human expertise without silo-hopping.

Support tackles a different kind of scale problem: hypergrowth and rapid product change can spike ticket volume by orders of magnitude, as seen around the launch of image gen. The support system defines expert “gold standards” for when to answer, when to escalate, and when to tag for audit, then codifies them into SOP knowledge. That SOP knowledge is tied to the knowledge base and to evals, so interactions update the system over time, especially when new ticket patterns cause automation failures. Reported outcomes are concrete: about 70% of tickets are deflected or handled autonomously, the system outperforms the legacy setup by about 30%, and roughly 80% of manually reviewed tickets earn highly positive QA ratings.
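The "gold standard" decision described above (answer, escalate, or tag for audit) can be sketched as a small triage function. The thresholds, fields, and rule order here are illustrative assumptions, not the actual SOP logic.

```python
# Hedged sketch of a support SOP's triage decision: answer autonomously,
# escalate to a human, or answer but tag for QA audit.
# Thresholds and inputs are made-up assumptions for illustration.

def triage(confidence: float, matches_known_pattern: bool, high_risk: bool) -> str:
    if high_risk:
        return "escalate"          # sensitive cases always go to experts
    if matches_known_pattern and confidence >= 0.8:
        return "answer"            # autonomous handling / deflection
    if matches_known_pattern:
        return "tag_for_audit"     # answer, but queue for QA review
    return "escalate"              # novel pattern: route to a human


print(triage(0.9, True, False))    # answer
print(triage(0.5, True, False))    # tag_for_audit
print(triage(0.9, False, False))   # escalate
```

Encoding the decision as an explicit function is what makes the "tie it to evals" step possible: each branch can be scored against expert-labeled conversations.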

The takeaway is a blueprint for internal builders: find the top operator to model, embed agents into familiar tools rather than creating new software, and use scaled platforms (OpenAI’s AgentKit, including Agent Builder, ChatKit, and Evals) to accelerate deployment. The pitch ends with a sprint-style challenge: build something teams can’t live without, starting this week.

Cornell Notes

OpenAI’s internal agent deployments aim to amplify expertise, not just automate tasks. Three examples—Go-to-Market assistant, Openhouse, and a support system—use a shared pattern: connect data, orchestrate specialized agent “skills,” and take actions inside tools employees already use (like ChatGPT and Slack). Sales work was modeled on a top rep’s meeting-prep and demo craft, then scaled via semantic customer data, skills for research and product knowledge, and a feedback loop that triggers prompt optimization for developers to approve. HR work uses people and policy context (including Workday-linked records and announcement content) to answer role-specific questions and route employees to the right coworkers. Support work codifies expert SOPs and ties them to eval-driven self-improvement so ticket handling improves as new product patterns emerge.

What does “amplify expertise” mean in these deployments, and why is it positioned as more than efficiency?

The framing distinguishes automation from craft capture. Instead of only streamlining workflows, the systems extract the best practices of top performers (e.g., a top salesperson’s meeting-prep and demo approach) and encode them into agent skills. Those skills are then distributed so the broader team can consistently perform at a higher baseline. The measurable outcomes cited—time saved for reps, high weekly usage by employees, and large ticket deflection rates—are used to argue that scaling expertise changes day-to-day performance, not just throughput.

How did the Go-to-Market assistant turn one rep’s “formula” into a repeatable system?

Developers started by sitting with Sophie, a top rep, and identifying her workflow for winning: prepping for customer meetings, generating product champions, planning product demos, and following up quickly. They built an agent specialized to that version of excellence, then repeated the approach for additional skills. The system uses a simplified data model plus a semantic layer so GPT-5 can understand customer context and strategy documents. It vectorizes key internal resources, then routes tasks through skills like meetings, product knowledge, custom demos, and customer research (growing from four to about 10). It runs inside Slack and ChatGPT, producing meeting briefs and demo scripts, and it improves via user feedback that triggers regeneration and prompt optimization.
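The "vectorize key resources, then retrieve by similarity" step can be sketched minimally. Real systems use learned embeddings; here a toy bag-of-words vector and cosine similarity stand in so the flow is runnable, and the document names are invented for illustration.

```python
# Toy sketch of vectorized retrieval over internal resources.
# Bag-of-words vectors stand in for real embeddings; names are invented.

from collections import Counter
from math import sqrt


def embed(text: str) -> Counter:
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


docs = {
    "acme_strategy": "Acme expansion strategy and renewal pricing notes",
    "demo_playbook": "How to plan a custom product demo for a champion",
}
index = {name: embed(text) for name, text in docs.items()}


def retrieve(query: str) -> str:
    q = embed(query)
    return max(index, key=lambda name: cosine(q, index[name]))


print(retrieve("plan a demo"))  # demo_playbook
```

A skill like "custom demos" would then ground its generation in whichever documents retrieval surfaces, which is what lets one rep's materials serve the whole team.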

What is the “feedback loop” mechanism in sales, and how does it reach developers?

When a rep notices the assistant missed a key next step, they can mark the response as not helpful. That feedback triggers backend regeneration for the specific task. Separately, the same feedback is sent to an eval platform that initiates a prompt optimization flow. The resulting optimization insight is delivered to developers in a developer-only channel, where they review scope and impact and approve the change. This creates a distribution mechanism: improvements made by top operators propagate across the organization over time.
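The two separate downstream actions can be made concrete in a short sketch: a "not helpful" flag both regenerates the answer for that rep and queues a prompt-optimization proposal for developer approval. Function names and the queue structure are illustrative assumptions, not OpenAI's pipeline.

```python
# Sketch of the dual feedback path: (1) regenerate the answer now,
# (2) queue a prompt-optimization proposal for developer sign-off.
# All names and structures are hypothetical illustrations.

review_queue: list = []  # stands in for the developer-only channel


def regenerate(task: str, feedback: str) -> str:
    return f"regenerated answer for {task!r} (addressing: {feedback})"


def propose_prompt_update(feedback: str) -> None:
    review_queue.append({
        "proposal": f"adjust prompt to cover: {feedback}",
        "status": "pending_approval",
    })


def on_not_helpful(task: str, feedback: str) -> str:
    propose_prompt_update(feedback)    # flows to developers for approval
    return regenerate(task, feedback)  # immediate fix for this rep


answer = on_not_helpful("Acme follow-up", "missed the next-step commitment")
print(review_queue[0]["status"])  # pending_approval
```

The key design point is the separation: the rep gets an immediate correction, while the systemic fix waits for human review before it propagates.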

How does Openhouse use HR and internal knowledge to answer questions and connect employees to people?

Openhouse is built on HR systems and internal content. It connects personnel records (including Workday-linked role expectations) and uses vectorized documents plus a CMS-like layer that captures ongoing organizational updates such as announcements and policy changes. Its skills include company knowledge, a people connector, and career-growth expectations. In practice, an employee can ask travel and office-access questions and receive chat-ready answers with citations, then pivot to a directory-based search to find a coworker with relevant expertise (e.g., someone who can help build a customer demo). The system supports follow-through by jumping into Slack to message the suggested coworker.
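That two-step flow (cite-backed policy answer, then a directory search for a coworker) can be sketched as follows. The policy entries, directory records, and names are made up for illustration and do not reflect Openhouse's actual data model.

```python
# Hedged sketch of Openhouse's flow: answer a policy question with a
# citation, then search a directory for a coworker with matching
# expertise. All data here is fabricated for illustration.

policies = {
    "travel": ("Book via the travel portal; a NY office badge is required.",
               "travel-policy-v3"),
}
directory = [
    {"name": "Jordan", "expertise": {"customer demos", "sales engineering"}},
    {"name": "Priya", "expertise": {"payroll", "benefits"}},
]


def answer_with_citation(topic: str) -> str:
    text, source = policies[topic]
    return f"{text} [source: {source}]"


def find_coworker(skill: str) -> str:
    for person in directory:
        if skill in person["expertise"]:
            return person["name"]
    return "no match"


print(answer_with_citation("travel"))
print(find_coworker("customer demos"))  # Jordan
```

The citation string is what keeps the answer auditable; the directory hop is what turns a knowledge query into a human connection.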

Why is the support system described as “self-improving,” and what inputs drive that improvement?

Self-improvement comes from codifying expert handling into SOP knowledge and then tying that knowledge to evals so new interactions update the system. Developers reviewed conversation logs with specialists, selected thousands of conversations, and defined gold standards for response timing, escalation, and audit tagging. Those standards became SOPs connected to the knowledge base and to evals. When novel ticket patterns appear, such as cases where automation fails or humans get involved, the system updates SOPs so future handling scales with changing product behavior.
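The update loop described above can be sketched minimally: a novel pattern escalates at first, a human resolution is folded back into the SOP store, and the same pattern is then handled automatically. The data structures are illustrative assumptions; in the real system the update passes through eval review rather than being written back directly.

```python
# Sketch of the self-improvement loop: escalate novel patterns, then
# fold the human resolution back into SOP knowledge. In practice the
# write-back would be eval-gated; this sketch skips that for brevity.

from typing import Optional

sops: dict = {"refund_request": "verify purchase, then issue refund"}


def handle(pattern: str, human_resolution: Optional[str] = None) -> str:
    if pattern in sops:
        return f"auto: {sops[pattern]}"
    if human_resolution is None:
        return "escalate: no SOP for this pattern"
    sops[pattern] = human_resolution   # learned SOP for future tickets
    return f"learned: {human_resolution}"


print(handle("image_gen_quota"))                          # escalates first
handle("image_gen_quota", "explain daily generation limit")
print(handle("image_gen_quota"))                          # auto thereafter
```

This is how a spike in novel tickets (like an image gen launch) converts one-off human handling into durable automated coverage.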

What outcomes are reported for support, and how do they relate to scale events like image gen?

The transcript cites three impact metrics: about 70% of tickets are deflected or handled autonomously; the system outperforms the legacy setup by about 30%; and about 80% of manually reviewed tickets receive highly positive QA ratings. The motivation is illustrated by a scale shock: after the launch of image gen, ticket volume spiked by several orders of magnitude within days as over 100 million users were added. The system is positioned as the only approach that can keep up with that kind of rapid change without collapsing under operational complexity.

Review Questions

  1. Which part of the architecture (data model/semantic layer, skills, or services) most directly enables the assistant to produce meeting prep and demo scripts in the sales example?
  2. In the sales workflow, what two separate downstream actions happen after a rep marks an answer as not helpful?
  3. How do SOPs and eval work together in the support system to handle new ticket patterns over time?

Key Points

  1. OpenAI frames internal agent deployments as a way to amplify expertise by encoding top operators’ craft into reusable skills, not just automating tasks.

  2. The Go-to-Market assistant was built by modeling a top rep’s (Sophie’s) meeting-prep and demo workflow and scaling it into about 10 specialized skills.

  3. A semantic data layer over customer and strategy documents helps the sales assistant generate tailored meeting briefs and demo scripts inside Slack and ChatGPT.

  4. Sales improvements propagate through a feedback loop: rep corrections trigger regeneration and prompt optimization, which developers review and approve via an eval workflow.

  5. Openhouse connects HR systems (including Workday-linked records) with a CMS-style stream of announcements and policies, then surfaces role-specific answers with citations and routes employees to the right coworkers.

  6. The support system codifies expert SOPs from thousands of specialist-reviewed conversations and ties them to the knowledge base and evals so ticket handling improves as new patterns emerge.

  7. OpenAI’s AgentKit (Agent Builder, ChatKit, and Evals) is positioned as a platform to accelerate internal deployment using familiar tools and repeatable building blocks.

Highlights

The sales assistant scales one rep’s “version of excellence” by turning meeting-prep and demo craft into specialized agent skills, then distributing improvements through eval-driven prompt optimization.
Openhouse uses HR-linked data plus an internal CMS of announcements to answer policy questions with citations and then connect employees to specific coworkers via the directory and Slack.
Support handling is described as self-improving: SOPs derived from expert standards are updated through eval when new ticket patterns break automation, enabling scale during events like image gen.
