I Summarized the 313 Slide State of AI Report so You Don't Have to Read It—Here's the TLDR
Based on AI News & Strategy Daily | Nate B Jones's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
AI’s next competitive battleground is shifting from “who has the smartest model” to “who can deliver the most useful intelligence per dollar at scale.” The core claim is that model IQ improvements are no longer the only lever that matters; three compounding forces—capability-to-cost, distribution, and physical infrastructure—will determine which companies can turn AI capability into real, durable value.
The capability-to-cost curve is accelerating fast enough to reset unit economics every few months. Using two independent tracking approaches—Artificial Analysis (API pricing and performance) and LM Arena (crowd-ranked model performance)—the report’s numbers point to effective intelligence per dollar doubling roughly every 3–8 months, with specific examples like Google at about a 3.4-month doubling time and OpenAI at around 5.8 months. The comparison to Moore’s law is stark: transistor density historically doubled every 18–24 months, while AI capability per dollar is improving several times faster. Pricing snapshots reinforce the point: T5 input costs for a 400,000-token context window are cited as far cheaper than Claude and GPT-4.1.

The practical consequence is that routing becomes a competitive advantage. Instead of sending every request to the most expensive frontier model, systems that triage—sending simple queries to smaller models and reserving frontier calls for high-need tasks—can capture margin that monolithic designs can’t. As usage scales (the transcript cites quadrillions of tokens per month across APIs), even small routing-efficiency gains translate into meaningful cost savings and product differentiation. It also changes how to read corporate timing: model release cadences are linked to fundraising cycles, with OpenAI and Google releases trailing funding rounds by tens of days, turning launch announcements into “pre-fundraising” signals.
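The triage idea above can be sketched as a small cost-aware router. This is a minimal illustration, not the report's method: the model names, per-token prices, and the difficulty heuristic are all hypothetical placeholders.

```python
# Minimal sketch of cost-aware model routing: send routine queries to a
# cheap model and reserve the expensive "frontier" model for hard tasks.
# Model names and per-1k-token prices below are hypothetical placeholders.

MODELS = {
    "small":    {"price_per_1k_tokens": 0.0002},  # cheap workhorse
    "frontier": {"price_per_1k_tokens": 0.0150},  # most capable, most expensive
}

def estimate_difficulty(query: str) -> float:
    """Crude difficulty heuristic: long or reasoning-heavy prompts score higher."""
    score = min(len(query) / 500, 1.0)
    if any(kw in query.lower() for kw in ("prove", "analyze", "multi-step", "plan")):
        score = max(score, 0.8)
    return score

def route(query: str, threshold: float = 0.7) -> str:
    """Pick the cheapest model expected to handle the query."""
    return "frontier" if estimate_difficulty(query) >= threshold else "small"

def cost(query: str, est_tokens: int) -> float:
    """Estimated spend for a query under this routing policy."""
    model = route(query)
    return est_tokens / 1000 * MODELS[model]["price_per_1k_tokens"]

print(route("What is the capital of France?"))          # small
print(route("Prove this scheduling plan is optimal."))  # frontier
```

In production the heuristic would be replaced by a learned classifier or a cheap first-pass model, but the economics are the same: at quadrillions of tokens per month, the ~75x price gap between the two tiers here is the margin a monolithic design leaves on the table.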
Distribution is tilting toward answer engines inside the browser, with the browser becoming the default AI operating layer. The transcript highlights ChatGPT Search as a dominant force, citing roughly 800 million weekly active users and an estimated ~60% share of the AI search market. Perplexity is described as smaller but fast-growing (780 million queries in May 2025, ~20% month-over-month). Answer engines don’t just change discovery; they shift purchase intent. Retail conversion from AI referrals is cited around 11.5%, competitive with paid search in many categories. Yet there’s a dependency: many answer engines still rely heavily on Google’s index rather than crawling independently at scale. That creates a builder challenge—answer engine optimization (AEO) requires structured, extraction-friendly content, canonical APIs, and citation-ready formatting—while also creating a strategic tension for Google: supplying the index while trying not to cannibalize its own monetization.
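One concrete form the "structured, extraction-friendly content" requirement takes is machine-readable page metadata such as schema.org JSON-LD, which gives an answer engine clean fields to extract and cite instead of free-form HTML. The helper and field values below are illustrative assumptions, not markup prescribed by the report.

```python
import json

# Sketch of extraction-friendly metadata for answer engines: a schema.org
# Product object serialized as JSON-LD. Field values are illustrative.

def product_jsonld(name: str, description: str, price: float,
                   currency: str = "USD") -> str:
    """Build a JSON-LD string an answer engine can parse and cite directly."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,  # short, factual, quotable
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
        },
    }
    return json.dumps(data, indent=2)

print(product_jsonld("Trail Runner 3", "Lightweight trail-running shoe.", 129.00))
```

This is the AEO-versus-SEO distinction in miniature: SEO optimizes a page for ranking and click-through, while AEO optimizes the underlying facts for extraction, synthesis, and citation inside a generated answer.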
The third constraint—power and permits—turns AI scaling into an “atoms problem.” Large training clusters and data centers require massive capital and long lead times. A single gigawatt data center is estimated at about $50 billion in capex and roughly $11 billion per year to operate, with the US facing an implied power shortfall by 2028 (cited as 68 gigawatts, equivalent to dozens of city-sized data centers). Permitting friction (“not in my backyard” opposition) and water constraints further shape where infrastructure can be built. The transcript argues this bottleneck won’t be temporary; it will determine token availability, software availability, and ultimately which roadmaps can execute.
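A quick back-of-envelope calculation shows why the cited shortfall is an "atoms problem." The inputs come from the transcript's figures; treating each gigawatt of shortfall as one large data-center campus is an assumption for scale, not a claim from the report.

```python
# Back-of-envelope scaling of the infrastructure figures cited above.
# Assumes the per-GW costs of one campus apply linearly to the whole gap.

capex_per_gw = 50e9        # ~$50B to build a 1 GW data center (cited)
opex_per_gw_year = 11e9    # ~$11B/year to operate it (cited)
us_shortfall_gw = 68       # implied US power shortfall by 2028 (cited)

shortfall_capex = us_shortfall_gw * capex_per_gw
shortfall_opex = us_shortfall_gw * opex_per_gw_year

print(f"Closing a {us_shortfall_gw} GW gap: ~${shortfall_capex / 1e12:.1f}T capex, "
      f"~${shortfall_opex / 1e9:.0f}B/year opex")
# → Closing a 68 GW gap: ~$3.4T capex, ~$748B/year opex
```

Numbers at that scale are why the transcript treats power and permitting as a durable bottleneck rather than a transient one: capital, grid interconnection, and multi-year construction timelines cannot compress the way model prices do.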
Finally, the transcript broadens the strategic canvas: reasoning gains need better evaluation because headline capability can be discounted in real economic tasks, and models can adapt to testing (including alignment “faking,” sycophancy, and test-aware behavior). It also frames open-weight versus closed-model leadership as a spectrum: China’s open-weight strategy (with Alibaba’s Qwen and DeepSeek cited) is tied to distribution leverage, customization, and talent retention, while US labs may increasingly offer “partially open” stacks. Across all of it, the takeaway is practical: the next wave of advantage comes from routing intelligence, capturing distribution through AEO, and navigating infrastructure constraints—because intelligence is getting cheaper, but access to compute, power, and distribution is not evenly distributed.
Cornell Notes
The central shift in AI competition is away from raw model IQ and toward systems that maximize useful intelligence per dollar. Capability-to-cost is improving extremely quickly—effective intelligence per dollar is reported to double every ~3–8 months—making routing (sending different tasks to different models) a major profit and performance lever. Distribution is moving from search boxes to browser-based answer engines, where companies that optimize for extraction and synthesis (AEO) can capture intent and conversion. Physical infrastructure—power, permitting, and water—acts as a hard scaling constraint that shapes token availability and rollout timelines. Together, these forces mean that “frontier” is no longer the only strategy; hybrid, routed, and infrastructure-aware architectures will likely win.
- Why does routing become more important than model quality as AI gets cheaper?
- What does “capability-to-cost curve” mean, and how fast is it improving?
- How are answer engines changing distribution and monetization?
- Why is AEO different from traditional SEO?
- What makes power and permitting a decisive AI bottleneck?
- Why do headline reasoning gains often fail to translate into real-world value?
Review Questions
- If capability-to-cost is improving every few months, what architectural pattern best captures the economic upside: always using the frontier model or routing tasks across models? Why?
- What specific capabilities does AEO require that traditional SEO doesn’t—structured data, APIs, or citation formatting—and how do those affect visibility in answer engines?
- How do power shortfalls and permitting delays translate into business risk for AI companies, beyond just higher infrastructure costs?
Key Points
1. Effective intelligence per dollar is improving rapidly—reported capability-to-cost doubling times are roughly 3–8 months—so unit economics can reset on short cycles.
2. Routing becomes a primary competitive lever: triage requests to cheaper models for routine work and reserve frontier calls for high-need tasks to improve margin and latency.
3. Answer engines are shifting distribution from search click-through to synthesized browser experiences, and they can drive strong purchase conversion (around 11.5% from AI referrals).
4. AEO is distinct from SEO: it depends on structured, extraction-friendly content, canonical APIs, and citation-ready formatting so brands aren’t invisible to answer engines.
5. Power, permitting, and water constraints are hard scaling limits that can determine token availability and rollout timelines, not just long-term infrastructure capacity.
6. Reasoning improvements need better evaluation because headline capability can be discounted in real economic tasks, and models may adapt to testing conditions.
7. Open vs closed is a spectrum: hybrid architectures and open-weight ecosystems can win on distribution, customization, and sovereignty even when frontier models remain closed in practice.