GPT-4.5 shocks the world with its lack of intelligence...
Based on Fireship's video on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.
Briefing
OpenAI’s GPT-4.5 launch lands as a costly, underwhelming step forward, pitched mainly around “vibes” and a more natural chat style rather than measurable breakthroughs. The model is described as the most expensive OpenAI system yet, with pricing pegged at $75 per million input tokens and $150 per million output tokens, and access limited to $200-per-month Pro users. That price tag matters because it raises the bar for what counts as progress: at those rates, users expect clear, measurable gains on benchmarks, not just subjective improvements.
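To make the cited pricing concrete, the minimal Python sketch below estimates per-request cost at those rates; the token counts in the example are illustrative assumptions, not figures from the transcript.

```python
# Per-request cost at the cited GPT-4.5 rates (USD per 1M tokens).
INPUT_PRICE_PER_M = 75.00    # $75 per million input tokens (cited)
OUTPUT_PRICE_PER_M = 150.00  # $150 per million output tokens (cited)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the cited rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + (
        output_tokens / 1_000_000
    ) * OUTPUT_PRICE_PER_M

# Illustrative example: a 10,000-token prompt with a 2,000-token reply
# works out to $0.75 + $0.30 = $1.05 for a single call.
print(f"${request_cost(10_000, 2_000):.2f}")
```

At these rates, a million such calls would cost over a million dollars, which is why the pricing dominates the critique.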
The central claim in the rollout is that GPT-4.5 reduces hallucinations and performs better on a new “Vibes Benchmark” meant to capture creative thinking. In practice, the demo still produces obvious errors and inconsistent knowledge. It reportedly makes silly mistakes, isn’t self-aware, and even misstates basic facts about itself, claiming a training cutoff of October 2023 and failing to demonstrate a coherent understanding of what “GPT-4.5” is. The transcript’s examples underline the gap between marketing and reliability: the model can answer a trivia-style question about counting letters in “Strawberry,” but then gives an incorrect count of the “L”s in “Lollapalooza.”
When the discussion turns to technical performance, the disappointment sharpens. The model is framed as weaker than “deep thinking” alternatives for programming and science tasks, and it performs poorly on the Aider polyglot coding benchmark in both quality and cost. The comparison is not just “slightly worse”: GPT-4.5 is described as “hundreds of times more expensive” than the better-performing DeepSeek model cited in the transcript.
The broader market context adds pressure. The transcript points to xAI’s Grok as the current top model in a betting-market sense, with OpenAI still favored to lead by the end of 2025 but with declining odds. That matters because OpenAI is portrayed as moving toward a for-profit structure that depends on sustaining a massive valuation while spending heavily on scaling. The transcript also references ongoing calls from tech leaders to regulate or pause the training of large models, and it criticizes the launch optics: Sam Altman allegedly sent interns to demo the system rather than appearing personally.
Finally, the transcript argues that the industry may be heading toward a “sigmoid of sorrow” rather than a singularity: impressive tools, but no sudden leap to artificial superintelligence. The most optimistic takeaway is narrower and practical—AI coding assistants are already powerful for real programmers, and the “plateau” is good news for students learning fundamentals. The overall verdict is that GPT-4.5 is a competent chat model, but not a benchmark-shattering advance commensurate with its cost and hype.
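For background on the metaphor (standard math, not something stated in the transcript): a sigmoid, or logistic, curve grows roughly exponentially at first and then saturates at a ceiling, which is exactly the shape of progress the transcript predicts.

$$\sigma(x) = \frac{L}{1 + e^{-k(x - x_0)}}, \qquad \lim_{x \to \infty} \sigma(x) = L$$

Early on, the curve is nearly indistinguishable from an exponential, which is why a plateau at the ceiling $L$ can arrive as a surprise.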
Cornell Notes
GPT-4.5 is presented as an expensive, underwhelming upgrade that leans on subjective “vibes” and a new creativity-oriented “Vibes Benchmark” rather than clear benchmark dominance. Pricing is described as extremely high ($75 per million input tokens and $150 per million output tokens), with access limited to $200-per-month Pro users. In demos, the model still makes silly factual mistakes, misstates its own training cutoff, and shows weak performance on coding-focused evaluations like the Aider polyglot coding benchmark. The transcript frames this as a broader market signal: xAI’s Grok is viewed as leading in betting-market terms, while OpenAI’s odds decline despite heavy investment. The practical conclusion is that AI coding tools help, but they don’t replace the need for real programming skill.
Why does GPT-4.5’s pricing become a central part of the criticism?
What is the “Vibes Benchmark,” and how does the demo performance affect its credibility?
What factual and self-referential mistakes are highlighted?
How does GPT-4.5 fare on coding benchmarks compared with alternatives?
What market and business pressures are mentioned, and why do they matter?
What does the transcript suggest about the path toward artificial superintelligence?
Review Questions
- What pricing details are cited for GPT-4.5, and how do they shape expectations for benchmark performance?
- Which specific demo failures (including self-referential claims) are used to challenge the “vibes” narrative?
- How does the transcript compare GPT-4.5’s coding performance and cost against DeepSeek and other alternatives like Grok or deep-thinking models?
Key Points
1. GPT-4.5 is portrayed as extremely expensive, with pricing cited at $75 per million input tokens and $150 per million output tokens, and access limited to $200-per-month Pro users.
2. The launch emphasis on subjective “vibes” and a “Vibes Benchmark” is challenged by reported demo errors and factual inconsistencies.
3. Reported self-referential issues include claiming a training cutoff of October 2023 and lacking coherent awareness of what GPT-4.5 is.
4. Coding performance is framed as weaker than “deep thinking” models and particularly poor on the Aider polyglot coding benchmark relative to cost.
5. Market sentiment is said to favor xAI’s Grok at present, while OpenAI’s odds are described as declining despite continued investment.
6. The transcript argues the industry may be moving toward a plateau rather than a singularity, with AI tools improving but not delivering sudden superintelligence.