
First Block: Interview with Daniela Amodei, Co-Founder & President of Anthropic

Notion · 5 min read

Based on Notion's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.

TL;DR

Anthropic built a research-first foundation for roughly the first year to year and a half, delaying go-to-market until 2023 to preserve a safety-centered culture.

Briefing

Daniela Amodei, co-founder and president of Anthropic, credits the company’s rapid progress in building Claude to a deliberate sequencing of priorities: heavy investment in research before go-to-market, then a second phase focused on bringing safe, high-performing systems to customers without losing the safety mission. She describes Anthropic’s early structure as “building the airplane while it’s taking off,” but with a distinctive twist—no dedicated go-to-market team until 2023—allowing the research group to establish a cohesive culture and shared goals before scaling product and distribution.

Amodei also frames Claude’s core design around the “HHH” standard—helpful, honest, and harmless—assigning a different internal team to each dimension while acknowledging that the three dimensions cannot be cleanly separated. “Honesty” targets hallucinations, “helpfulness” focuses on whether the model actually accomplishes the user’s intent, and “harmlessness” aims to reduce toxic, biased, or harmful outputs. Even when constitutional AI is used to push the model toward all three goals at once, she says businesses still face choices: a use case that rewards creativity may tolerate more risk than one that demands strict safety, as long as guardrails remain.
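For context, Anthropic's published constitutional-AI technique has the model critique and revise its own drafts against a list of written principles, with the revised outputs feeding further training. The following is a minimal sketch of that critique-and-revise loop, not Anthropic's implementation; complete() is a hypothetical stand-in for any chat-completion call, and the principles shown are illustrative.

    # Minimal sketch of the constitutional-AI critique-and-revise loop.
    # complete() is a hypothetical stand-in for a chat-completion call,
    # and these principles are illustrative, not Anthropic's constitution.

    PRINCIPLES = [
        "Identify ways the response is harmful, unethical, or toxic.",
        "Identify claims in the response that may be inaccurate or dishonest.",
    ]

    def complete(prompt: str) -> str:
        """Placeholder for a model call; swap in any chat-completion client."""
        raise NotImplementedError

    def critique_and_revise(user_prompt: str, rounds: int = 1) -> str:
        """Draft a response, then critique and revise it against each principle."""
        response = complete(user_prompt)
        for _ in range(rounds):
            for principle in PRINCIPLES:
                critique = complete(
                    f"Prompt: {user_prompt}\nResponse: {response}\n"
                    f"Critique request: {principle}"
                )
                response = complete(
                    f"Prompt: {user_prompt}\nResponse: {response}\n"
                    f"Critique: {critique}\n"
                    "Rewrite the response to address the critique while "
                    "staying helpful and honest."
                )
        return response

In the published recipe, revisions like these are used to generate training data; at inference time the deployed model simply answers directly.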

Her account of training also highlights how messy early behavior can be even when the destination is a polished product. She compares the pace of improvement across model generations, saying the gap between GPT-2 and Claude 1 feels smaller than the gap between Claude 1 and Claude 2. She recounts examples of early Claude oddities—getting stuck on bizarre “modes” and producing overly concerned responses—underscoring that evaluation is hard because different customers demand different behaviors.

Beyond model training, Amodei connects Anthropic’s approach to company-building and leadership. She says founding is fundamentally different from scaling: early-stage founders must learn operational basics quickly (payroll, office setup during Covid, fundraising, hiring), and with six co-founders the work can be distributed by time horizon—Dario Amodei focusing on a 5–10 year technical vision while she concentrates on 1–2 year execution that turns research into usable products. She describes the co-founder group as unusually pre-aligned, with long-standing relationships that made division of labor easier.

On hiring, she emphasizes Anthropic’s interdisciplinary talent pool, citing backgrounds spanning neuroscience and biology alongside ML/AI researchers, plus product engineering, policy leadership, and operators. As the company grows, she says it has shifted from generalist-heavy early hiring toward more specialists.

Finally, Amodei argues that safety and capability should not be treated as a zero-sum trade. She points to Anthropic’s “responsible scaling policy” as a set of public commitments about training, testing, and what the company will do if safety concerns arise. Looking ahead, she expects Claude to evolve from a “junior assistant” that summarizes and drafts into a more senior productivity partner across sectors like healthcare, legal services, and climate tech—while insisting that societal impact can’t be left solely to corporations, and that policymakers and civil society must stay in the loop.

Cornell Notes

Daniela Amodei describes Anthropic’s path to Claude as a two-stage build: first, invest deeply in research and safety culture before launching go-to-market, then bring the resulting systems to customers while preserving the mission. Claude’s training is organized around “HHH”—helpful, honest, and harmless—handled by different internal teams, with constitutional AI used to raise all three targets together despite unavoidable trade-offs. Amodei says evaluation is especially difficult because users want different behaviors, so “tunable” guardrails matter: some settings may prioritize creativity or usefulness while still maintaining safety constraints. She also ties model work to company leadership—co-founders split responsibilities by time horizon—and to hiring, emphasizing interdisciplinary talent and a shift from generalists to specialists as the company scales.

Why does Amodei say Anthropic delayed go-to-market, and what did that enable?

She says Anthropic invested in the research organization for the first year to year and a half, without a go-to-market team until 2023. That sequencing let the company build a cohesive culture with clear safety-centered goals before turning to commercialization—described as building one part of the “airplane” (research foundations) before adding “controls” and “wings” (product and market delivery).

How does the “HHH” framework map onto concrete training goals for Claude?

“Honesty” targets hallucinations by trying to reduce them as much as possible. “Helpfulness” focuses on whether the model answers the user’s question or completes the intended task, including when the user is searching for information or asking the model to take actions. “Harmlessness” aims to reduce toxic or biased outputs and to lower the chance that the model helps with crimes or produces violent content.

What trade-offs remain even when constitutional AI is used to optimize for helpful, honest, and harmless?

Amodei calls it more art than science. Even if Claude is pushed toward all three goals, businesses still choose how to weight them depending on the use case. For example, a creative partner might require more permissive language (a lower weight on harmlessness) while still keeping guardrails in place. She also notes that a perfectly harmless model could become unhelpful by refusing too much, so raising all three together is the core challenge.
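As a purely hypothetical illustration of this kind of tunable, per-use-case weighting (the profile names, weights, and threshold below are invented, not Anthropic settings), a deployment might rank candidate responses by weighted HHH scores while enforcing a hard safety floor that no profile can relax:

    from dataclasses import dataclass

    @dataclass
    class HHHProfile:
        """Hypothetical per-use-case weights over helpful/honest/harmless.
        Weights rank candidate responses; harmless_floor is a hard guardrail."""
        helpful: float
        honest: float
        harmless: float
        harmless_floor: float = 0.80  # invented minimum safety score

        def score(self, h: float, o: float, s: float) -> float:
            """Combine per-dimension scores in [0, 1] for one candidate;
            reject outright if it falls below the safety floor."""
            if s < self.harmless_floor:
                return float("-inf")  # guardrail: never traded away
            return self.helpful * h + self.honest * o + self.harmless * s

    # A creative-writing use case tolerates more risk than a medical one,
    # but both keep a hard floor.
    creative = HHHProfile(helpful=0.5, honest=0.2, harmless=0.3)
    medical = HHHProfile(helpful=0.3, honest=0.3, harmless=0.4, harmless_floor=0.95)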

What do Amodei’s training anecdotes reveal about evaluation and iteration?

They show that early model behavior can be “messy” and sometimes bizarre—like getting stuck on odd diet advice or inventing “creative modes.” She emphasizes that evaluation is hard because people have different use cases and expectations, so the system must be tuned and assessed across varied real-world demands rather than judged by a single universal standard.

How does Amodei describe the division of labor between co-founders, and why does it matter?

She says Dario Amodei tends to think in a 5–10 year technical horizon, while she focuses on a 1–2 year horizon—turning technical visionary ideas into tangible products people can use now. This split by time horizon helps keep alignment while still allowing disagreements to be managed through clear ownership boundaries.

What does Anthropic’s “responsible scaling policy” commit to?

Amodei describes it as public commitments covering how models are trained and tested and what Anthropic will do if safety concerns arise. She frames it as part of broader ecosystem accountability as models become more capable, not just a company-internal safety process.
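To make the "commitments plus escalation" idea concrete, here is a hypothetical sketch of a threshold-gated deployment check; the evaluation names and limits are invented for illustration and do not come from Anthropic's actual policy, which is a public document rather than code:

    # Hypothetical "responsible scaling"-style gate with invented
    # evaluation names and thresholds.

    SAFETY_LIMITS = {"bio_misuse": 0.02, "cyber_offense": 0.05}  # max tolerated scores

    def pre_deployment_gate(eval_results: dict[str, float]) -> bool:
        """Allow deployment only if every safety eval stays under its
        committed limit; otherwise pause and escalate to leadership."""
        for name, limit in SAFETY_LIMITS.items():
            if eval_results.get(name, 0.0) > limit:
                print(f"{name} exceeded its limit: pause deployment and escalate.")
                return False
        return True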

Review Questions

  1. How does Anthropic’s early sequencing (research-first, go-to-market later) influence its ability to maintain safety culture during scaling?
  2. In what ways can optimizing for helpfulness, honesty, and harmlessness conflict, and how does constitutional AI help manage those tensions?
  3. Why does Amodei argue that safety and capability should not be treated as inherently opposed goals?

Key Points

  1. Anthropic built a research-first foundation for roughly the first year to year and a half, delaying go-to-market until 2023 to preserve a safety-centered culture.

  2. Claude’s “HHH” targets are operationalized as separate goals: reduce hallucinations (honesty), improve task completion (helpfulness), and limit toxic or harmful outputs (harmlessness).

  3. Constitutional AI is used to push Claude toward helpful, honest, and harmless behavior simultaneously, but trade-offs still depend on customer use cases.

  4. Model evaluation is difficult because different users want different behaviors, so tunability and guardrails are essential rather than one-size-fits-all performance.

  5. Amodei describes co-founder alignment as strengthened by divided ownership: long-horizon technical vision versus short-horizon product execution.

  6. Founding differs sharply from scaling; early founders must rapidly learn operational basics (payroll, office setup, fundraising, hiring) that later become systems.

  7. Amodei links safety to capability and points to Anthropic’s responsible scaling policy as a public commitment for training, testing, and escalation when risks emerge.

Highlights

  • Amodei says Anthropic didn’t add go-to-market until 2023, using the early period to build a cohesive safety culture before commercialization.
  • Claude’s HHH framework is split across teams—honesty targets hallucinations, helpfulness targets task success, and harmlessness targets toxic, biased, and violent misuse.
  • She describes constitutional AI as a way to raise helpful, honest, and harmless together, while stressing that real-world deployments still require trade-off decisions.
  • Amodei recounts early Claude quirks—like inventing odd “modes” or giving overly concerned responses—to illustrate why evaluation and iteration are so hard.
  • She argues that responsible scaling isn’t just internal process: it’s backed by public commitments and ongoing leadership-level trade-off discussions.

Topics

  • Anthropic Founding
  • Claude HHH Training
  • Constitutional AI
  • Responsible Scaling Policy
  • Hiring and Team Structure
