
The Improved Gemini 2.5 Pro - A Coding Powerhouse

Sam Witteveen · 5 min read

Based on Sam Witteveen's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Gemini 2.5 Pro’s preview is framed as a coding-focused upgrade, with improved translation of plans into working code.

Briefing

Google’s new Gemini 2.5 Pro preview version is being positioned as a major step up for coding—less about generic “reasoning” gains and more about turning large, messy real-world context (videos plus docs) into working software with minimal prompting. The practical takeaway: feed the model a long tutorial or reference material and it can generate a complete, runnable codebase—often including project structure, configuration files, and tool wiring—fast enough to feel like “one-shot” development.

Early tests focus on whether the model can translate game ideas into functioning code. A prior attempt to build an Angry Birds clone with Pygame reportedly failed to work reliably. With the updated model, the generated plan becomes more structured: it breaks the task into components, produces a single-file Pygame implementation, and includes setup instructions. After copying the code and running it, the game draws the ball and supports play, culminating in a win—an outcome framed as evidence of a more “polished” coding pipeline than before. A similar experiment with a Space Invaders game also lands on working gameplay, with minor visual issues (like shelters disappearing early) but the core loop functioning.
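The transcript doesn't show the generated game code itself, but the physics at the core of a single-file Pygame Angry Birds clone can be sketched as a fixed-timestep projectile update. Everything below — names, constants, the 60 FPS timestep — is an illustrative assumption, not the model's output; Pygame's rendering loop is omitted so the sketch stays self-contained.

```python
# Illustrative sketch (not the generated code): the kind of projectile
# update a single-file Pygame Angry Birds clone typically contains.
# Constants and names here are assumptions for demonstration only.

GRAVITY = 9.8   # px/s^2, downward (screen y grows downward)
DT = 1 / 60     # fixed timestep, matching a 60 FPS game loop

def step(x, y, vx, vy, dt=DT, gravity=GRAVITY):
    """Advance the ball one frame under constant gravity."""
    vy += gravity * dt
    return x + vx * dt, y + vy * dt, vx, vy

def simulate_launch(vx, vy, ground_y=0.0, max_steps=10_000):
    """Fly the ball from the origin until it falls back to ground level."""
    x, y = 0.0, 0.0
    for _ in range(max_steps):
        x, y, vx, vy = step(x, y, vx, vy)
        if y >= ground_y and vy > 0:   # descending and at/below ground
            break
    return x, y
```

In a real Pygame file this update would run once per frame inside the event loop, with the returned position fed to the draw call.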

The more consequential demonstration shifts from toy games to agentic software engineering using the Google Agent Development Kit (ADK). By combining a large video guide with multiple text documentation files, Gemini is prompted to build a customer-support chat agent for a limited-edition sneaker store. Requirements include a “sneaker head” personality, politeness, and guardrails to prevent promising sold-out inventory. Without the user explicitly specifying tool definitions, the model still produces an ADK project: it generates an LLM agent, selects tools, creates a directory structure, and writes files such as a requirements file, a tools file, and an agent file. It even fabricates dummy inventory data when real stock details aren’t provided.
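A sketch of the kind of tools file the transcript describes the model generating follows. The sneaker names, sizes, and function signatures are all illustrative assumptions; in Google ADK, tools like these are plain Python functions the agent calls, and the dummy inventory mirrors the fabricated stock data mentioned above.

```python
# Sketch of a generated tools file. All data and names are assumptions;
# the transcript only says the model wrote a tools file and invented
# dummy inventory when no real stock details were supplied.

INVENTORY = {
    "Air Jordan 1 Retro": {"sizes": {9: 3, 10: 0, 11: 5}},
    "Yeezy Boost 350":    {"sizes": {9: 0, 10: 2, 11: 1}},
}

def check_inventory(model: str, size: int) -> dict:
    """Report stock for a sneaker without over-promising availability."""
    item = INVENTORY.get(model)
    if item is None:
        return {"status": "unknown_model", "model": model}
    count = item["sizes"].get(size, 0)
    # Guardrail-friendly: never report sold-out stock as available.
    return {
        "status": "in_stock" if count > 0 else "sold_out",
        "model": model,
        "size": size,
        "count": count,
    }

def add_to_cart(cart: list, model: str, size: int) -> list:
    """Add an item to a cart-like structure only if it is in stock."""
    if check_inventory(model, size)["status"] == "in_stock":
        cart.append({"model": model, "size": size})
    return cart
```

In an ADK project these functions would be passed to the agent's tool list, letting the chat loop invoke them for the inventory checks and cart interactions described below.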

Once deployed through the ADK web interface, the agent behaves like a conversational assistant with personality (“Yo, what’s up?”) and uses tools for inventory checks. When asked about sizes and stock, it consults inventory and supports cart-like interactions. The guardrails also appear to be in place even though none were manually added in the prompt—refusing unrelated requests (e.g., politics) and steering users back toward sneaker releases.

The transcript then highlights the long-context angle: the model can ingest extremely large inputs—hundreds of thousands of tokens—so users can “stuff” a full learning or coding corpus into one run instead of relying on modular retrieval. One example uses a long video plus a large context payload to generate a learning plan app for ancient Rome, extracting facts from the video and mapping them to clickable timestamps. The creator frames this as a broader shift from “vibe coding” to “vibe anything,” including learning systems, marketing plans, and sales copy.

Overall, the preview is presented as a coding-focused upgrade that makes it practical to generate working projects from mixed media inputs—videos, docs, and references—while also hinting at future API features like summaries of the model’s internal reasoning.

Cornell Notes

Gemini 2.5 Pro’s new preview version is framed as a coding upgrade that turns long, mixed context (videos plus documentation) into runnable software with minimal prompting. Tests show improved reliability for code generation, moving from earlier failures (like a Pygame Angry Birds clone) to working gameplay and better-structured plans. The strongest example builds a sneaker-store customer support agent using Google ADK: the model generates project files, tool wiring, and guardrails while the user mainly provides requirements and context. The workflow emphasizes “one-shot” development enabled by a large context window, reducing the need for separate retrieval steps. The result is positioned as faster prototyping for agents, learning apps, and other practical software tasks.

What evidence is used to claim Gemini 2.5 Pro is better at coding than the prior version?

The transcript compares earlier and updated attempts at small games. An Angry Birds clone made with Pygame previously “didn’t work out that great,” but the new version produces a more structured plan and a single-file Pygame implementation with setup instructions. After copying and running the code, the game supports drawing the ball and playing to a win. A Space Invaders game similarly runs successfully, with only minor issues like shelters disappearing at the start.

How does the sneaker-store agent example demonstrate “agentic” coding rather than just code snippets?

The model generates a full ADK project rather than a single script. It outputs a directory structure plus files such as a requirements file, a tools file, and an agent file. It also invents dummy inventory data when real stock details aren’t supplied. In the chat interface, the agent uses tools to check inventory and supports cart-like interactions, showing tool use and stateful behavior beyond static responses.

What role does long-context input play in the workflow described?

Long context is treated as the enabler for one-shot project generation. The user loads a large video guide and multiple text documentation files into the model, and even with a video that consumes many tokens, the transcript says the context window isn’t fully used. Because the model has the docs and examples “in the prompt,” it can derive ADK concepts, select tools, and generate code without the user manually specifying tool definitions.
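The “stuffing” step itself is just prompt assembly: concatenating labeled sources into one payload. The separator format and the rough 4-characters-per-token heuristic below are assumptions for illustration, not Gemini's tokenizer or an ADK utility.

```python
# Minimal sketch of assembling a long-context prompt from several
# documentation sources. Labels and the chars-per-token heuristic are
# illustrative assumptions.

def build_context(sources: dict[str, str]) -> str:
    """Join labeled documents into one prompt payload."""
    parts = [f"===== {label} =====\n{text}" for label, text in sources.items()]
    return "\n\n".join(parts)

def rough_token_estimate(text: str) -> int:
    """Crude estimate: roughly 4 characters per token for English text."""
    return len(text) // 4
```

A quick estimate like this is enough to confirm that even a video transcript plus several docs can sit well inside a context window measured in hundreds of thousands of tokens.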

Where do guardrails come from in the sneaker agent demo?

Guardrails appear to be inherited from the included ADK documentation and examples rather than being explicitly requested in the prompt. The agent refuses an unrelated request (politics) and redirects toward sneaker releases. The transcript emphasizes that no follow-up prompt or manual guardrail code was added by the user after copying the generated output.
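The refusal behavior the demo shows could be expressed as a check like the one below. To be clear, in the demo this behavior emerged from the ADK documentation in the context rather than from explicit code; the keyword list and reply text here are purely assumptions.

```python
# Illustrative guardrail of the kind the generated agent exhibits:
# refuse off-topic requests and steer back toward sneaker releases.
# The keyword list and wording are assumptions, not the demo's code.

OFF_TOPIC = {"politics", "election", "stock market"}

def guarded_reply(user_message: str) -> str:
    """Redirect off-topic requests back toward sneaker releases."""
    lowered = user_message.lower()
    if any(term in lowered for term in OFF_TOPIC):
        return ("Sorry, I only talk sneakers here. "
                "Want to hear about the latest limited-edition drops?")
    return "HANDLE_NORMALLY"  # placeholder for the agent's normal flow
```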

How is the learning-plan app example used to broaden the coding claim?

Instead of building a software agent for support, the transcript uses a large context payload (including a long video) to generate a learning plan website/app. The model extracts facts from the video and aligns them with timestamps so users can click through specific moments. The implication is that the same long-context-to-code pipeline can support learning tools, not only coding tasks.
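The fact-to-timestamp mapping at the heart of that app can be sketched in a few lines. The video ID and facts below are made up; the only grounded detail is that YouTube-style links accept a `t=` query parameter in seconds.

```python
# Sketch of pairing extracted facts with clickable timestamps, as the
# learning-plan app is described doing. Video ID and facts are
# illustrative assumptions.

def timestamp_link(video_id: str, seconds: int) -> str:
    """Build a URL that jumps to a specific moment in a video."""
    return f"https://www.youtube.com/watch?v={video_id}&t={seconds}s"

def build_learning_plan(video_id: str, facts: list[tuple[str, int]]) -> list[dict]:
    """Pair each extracted fact with a link to its moment in the video."""
    return [
        {"fact": fact, "link": timestamp_link(video_id, secs)}
        for fact, secs in facts
    ]
```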

Review Questions

  1. In the Angry Birds and Space Invaders tests, what specific signs indicate improved coding reliability?
  2. What files and components does Gemini generate for the ADK sneaker agent, and how does the agent demonstrate tool use after deployment?
  3. Why does the transcript argue that large context can reduce the need for MCP-style retrieval, and what tradeoff does that imply?

Key Points

  1. Gemini 2.5 Pro’s preview is framed as a coding-focused upgrade, with improved translation of plans into working code.
  2. Game-generation tests suggest better structure and execution, moving from earlier non-working results to runnable Angry Birds and Space Invaders gameplay.
  3. A major demo uses Google ADK to generate a complete customer-support agent project from a single prompt plus long video and doc context.
  4. The model can produce project scaffolding (directory structure, requirements, tools, agent files) and wire tools without the user manually defining them.
  5. The sneaker agent demonstrates inventory-aware behavior by using tools during conversation and supporting cart-like interactions.
  6. Guardrails appear to be learned from the provided ADK documentation rather than manually implemented in the prompt.
  7. Large context enables “one-shot” generation from mixed media inputs, reducing reliance on separate retrieval steps.

Highlights

The Angry Birds Pygame clone shifts from a prior failure to a working game after the updated Gemini 2.5 Pro preview generates a more structured plan and runnable code.
The sneaker-store ADK agent is built end-to-end—project structure, tools, and agent logic—then runs in the ADK web interface with inventory tool use.
Even without explicit tool instructions, the model outputs an ADK project that includes guardrails, demonstrated by refusing unrelated requests.
A half-million-plus token workflow is used to generate a learning app from a video, extracting facts and linking them to clickable timestamps.
