Anthropic Does The Unthinkable with Haiku 3.5
Based on Sam Witteveen's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Claude 3.5 Haiku is priced at $1 per million tokens in and $5 per million tokens out, roughly a fourfold jump from Claude 3 Haiku.
Briefing
Claude 3.5 Haiku arrives with a major price jump—$1 per million tokens in and $5 per million tokens out—turning what used to be a budget-friendly “workhorse” model into a more premium option. The central question raised by the rollout is whether the intelligence gains are large enough to justify a fourfold cost increase, especially as competitors keep pushing faster, cheaper models.
The update is available not only on Anthropic’s platform but also from day one on Amazon Bedrock and Google Vertex AI. Early testing described in the transcript finds the model promising for agentic workflows, with the claim that it can handle roughly 90% of common LLM tasks. But the pricing change dominates community reaction because the prior Claude 3 Haiku was priced at 25 cents per million tokens in and $1.25 per million tokens out. Moving to Claude 3.5 Haiku is therefore framed as a steep step up rather than a minor adjustment.
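The fourfold framing can be checked with simple arithmetic on the prices quoted above. The sketch below compares the per-request cost of the two Haiku generations; the sample token counts (10k in, 2k out) are illustrative assumptions, not figures from the video.

```python
# Back-of-the-envelope cost comparison for the two Haiku price points.
# Prices are USD per million tokens, as cited in the text.

def request_cost(input_tokens, output_tokens, price_in, price_out):
    """Cost in USD for one request at the given per-million-token prices."""
    return (input_tokens / 1_000_000) * price_in + (output_tokens / 1_000_000) * price_out

# Claude 3 Haiku: $0.25 in / $1.25 out; Claude 3.5 Haiku: $1.00 in / $5.00 out
old = request_cost(10_000, 2_000, 0.25, 1.25)  # $0.0050
new = request_cost(10_000, 2_000, 1.00, 5.00)  # $0.0200

print(f"old: ${old:.4f}, new: ${new:.4f}, ratio: {new / old:.1f}x")
```

Because both the input and output prices rose by exactly 4x, the ratio holds for any mix of input and output tokens.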
Anthropic’s justification is that Claude 3.5 Haiku surpasses Claude 3 Opus on many benchmarks while doing so at a fraction of the cost. That framing implies a shift in how buyers should think about model upgrades: as models get smarter, customers may need to pay more rather than expecting prices to fall. The transcript contrasts this with the broader industry pattern where newer models often come with reduced pricing over time.
A second constraint is capability tradeoffs. Claude 3.5 Haiku cannot accept images, so Anthropic keeps the original Claude 3 Haiku available for image analysis use cases—suggesting the new model is optimized for text-heavy workloads.
Quality-versus-price comparisons add pressure to the pricing decision. In the third-party “analysis quality” and “quality vs. price” breakdowns cited in the transcript, Claude 3.5 Haiku lands just below Claude 3 Opus but above several open models (including Llama 3.1 and Mistral Nemo variants). Yet it still trails stronger closed models like GPT-4o and Gemini 1.5, and—surprisingly—GPT-4o-mini in some comparisons. The transcript also notes that speed alone won’t carry the economics: models such as GPT-4o-mini and Gemini 1.5 Flash are described as offering better price performance, with Flash variants potentially costing only a few cents per million tokens.
Finally, the transcript raises an uncertainty that matters for future purchasing: whether the “3.5” naming hides a larger internal compute jump. Because proprietary model families don’t clearly disclose size or architecture changes, the transcript speculates that Claude 3.5 Haiku may be closer to the old Sonnet tier, while the new Sonnet 3.5 may be closer to the old Opus tier—meaning the upcoming “3.5 Opus” could be substantially more expensive.
Over the next few days, the plan is to benchmark where Claude 3.5 Haiku is the best choice (coding and agent orchestration are highlighted) versus when cheaper alternatives—especially Gemini Flash for token-heavy workloads—should win. The broader takeaway is that the market may be entering a phase where new model generations can cost more upfront, even if they remain cheaper than last year’s weaker models.
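The workload-based strategy described above can be sketched as a simple routing rule. Everything here is an illustrative assumption: the price figures for Gemini Flash (the transcript only says “a few cents per million tokens”), the 50k-token threshold, and the task categories are placeholders, not recommendations from the video.

```python
# Illustrative sketch of routing work by task type and token volume,
# per the workload-based strategy described in the text.

PRICES = {  # USD per million tokens (input, output) -- illustrative values
    "claude-3.5-haiku": (1.00, 5.00),
    "gemini-1.5-flash": (0.075, 0.30),  # "a few cents per million tokens"
}

def pick_model(task_kind, expected_tokens):
    """Route orchestration/coding to the stronger model; bulk text to the cheaper one."""
    if task_kind in {"agent-orchestration", "coding"}:
        return "claude-3.5-haiku"
    # Token-heavy summarization, extraction, etc. go to the cheaper fast model.
    if expected_tokens > 50_000:
        return "gemini-1.5-flash"
    return "claude-3.5-haiku"

print(pick_model("coding", 5_000))           # claude-3.5-haiku
print(pick_model("summarization", 200_000))  # gemini-1.5-flash
```

In practice the threshold would come from benchmarking each model on your own workloads, which is exactly the exercise the transcript proposes.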
Cornell Notes
Claude 3.5 Haiku is positioned as a smarter, faster small model, but it comes with a fourfold price increase: $1 per million tokens in and $5 per million tokens out (up from Claude 3 Haiku’s $0.25 in and $1.25 out). Early tests suggest strong performance for agentic workflows and coding, but third-party quality/price comparisons indicate it still lags behind top competitors like GPT-4o and Gemini 1.5, and even behind GPT-4o-mini on some price-performance measures. The model also cannot process images, keeping that capability on the original Claude 3 Haiku. The key learning point is how to decide when the added intelligence is worth the added cost, especially as cheaper, fast alternatives improve.
What changed most with Claude 3.5 Haiku, and why does it matter for everyday use?
What capability tradeoff comes with the new Haiku model?
How does Claude 3.5 Haiku compare on quality and price to other models mentioned?
What does the transcript suggest about why pricing increased so much?
Where does Claude 3.5 Haiku appear to be a strong fit?
What strategy does the transcript propose for choosing models going forward?
Review Questions
- If Claude 3.5 Haiku is 4× more expensive than Claude 3 Haiku, what kinds of tasks would still justify switching to it?
- How does the “no images” limitation change the decision between Claude 3.5 Haiku and the original Claude 3 Haiku?
- What evidence in the transcript suggests that speed alone may not be enough to win on cost-performance?
Key Points
1. Claude 3.5 Haiku is priced at $1 per million tokens in and $5 per million tokens out, roughly a fourfold jump from Claude 3 Haiku.
2. Claude 3.5 Haiku is available on Anthropic’s platform and from day one on Amazon Bedrock and Google Vertex AI.
3. The new Haiku model is text-only: it cannot accept images, while the original Claude 3 Haiku remains the image-capable option.
4. Quality/price comparisons cited place Claude 3.5 Haiku below GPT-4o and Gemini 1.5, and sometimes even behind GPT-4o-mini on price-performance.
5. The transcript highlights coding and agentic workflows as early areas where Claude 3.5 Haiku can justify its higher cost.
6. Uncertainty remains about how much compute changed behind the “3.5” naming, raising questions about future pricing for a potential “3.5 Opus.”
7. A workload-based strategy is recommended: use cheaper fast models for high-token-volume tasks and reserve stronger (often pricier) models for orchestration and complex reasoning.