AI Agents: Updates From Google, OpenAI, and Anthropic
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
AI agents are increasingly defined less by raw language ability and more by their ability to pursue goals through a loop of tool use—an approach Google lays out in a detailed 42-page framework and that OpenAI and Anthropic are now turning into working developer patterns.
Google’s paper frames a generative AI agent as an application that tries to achieve a goal by observing the world and acting on it using tools available to it. That “act” part is what distinguishes agents from standalone models: models are limited to what they learned during training, while agents extend knowledge and capability by connecting to external systems. Google argues that tools bridge the gap between impressive text generation and real-world interaction, enabling agents to pull in external data and trigger actions beyond the model’s native context.
The paper breaks the agent stack into an orchestration layer and a set of tool types. The orchestration layer governs the iterative process: take in information, perform internal reasoning, decide on the next action, and repeat until a goal is reached or a stopping condition triggers. Stopping can be handled by automated checks—such as an LLM judging whether an answer is good enough—or by escalation to a human for higher-stakes review. Google also separates “agents vs. models” explicitly: models rely on training-data knowledge, while agents gain extended knowledge through tool connections.
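The loop the orchestration layer runs can be sketched in a few lines. This is a hypothetical illustration, not Google's actual framework: `model_reason`, `take_action`, and `good_enough` are placeholder names standing in for the model's reasoning step, tool execution, and an LLM-judge or rule-based stopping check.

```python
# Hypothetical sketch of the orchestration loop described in Google's paper.
# All function names here are placeholders, not a real API.

def model_reason(observation):
    """Stand-in for the model's internal reasoning step."""
    return {"action": "search", "query": observation}

def take_action(decision):
    """Stand-in for executing the chosen tool."""
    return f"result for {decision['query']}"

def good_enough(result):
    """Stand-in for an LLM-judge or rule-based stopping check."""
    return result.startswith("result")

def run_agent(goal, max_steps=5):
    observation = goal
    for _ in range(max_steps):                # hard cap as a safety stop
        decision = model_reason(observation)  # 1. internal reasoning
        result = take_action(decision)        # 2. act via a tool
        if good_enough(result):               # 3. stopping condition
            return result
        observation = result                  # 4. feed the result back in
    return "escalate to human"                # higher-stakes fallback

print(run_agent("cheap flights to Zurich"))
```

Note the two distinct exits: an automated quality check that ends the loop early, and a step cap that falls through to human escalation, mirroring the paper's two stopping mechanisms.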
On the tool side, Google highlights three categories: extensions, function calling, and data stores. Extensions map to selecting the right external capability—analogous to choosing an API endpoint like a flights service or a maps service based on the user’s request. Function calling lets an agent choose among predefined, reusable code modules and supply the correct arguments according to a schema. Data stores provide runtime access to structured or unstructured information—often via vector databases for retrieval-augmented generation—so agents can query relevant documents instead of stuffing large corpora into the context window.
A key practical takeaway is that agent quality scales with model quality. Google’s framework implies that a strong orchestration layer and well-defined tools can’t compensate for a weak model that can’t reason, follow instructions, or select the right tools. The paper also points toward more complex “agent chaining” and multi-agent setups, where specialized agents can hand off tasks to one another and collectively solve harder problems.
The transcript then shifts from theory to implementation. OpenAI shares a reference implementation for orchestrating agentic patterns using its Realtime API, including sequential agent handoffs defined as an agent graph and escalation logic for high-stakes decisions. A live demo shows a voice-based authentication flow that spells out personal details letter by letter for confirmation.
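The handoff-plus-escalation pattern can be sketched without the Realtime API itself. The graph below is loosely in the spirit of OpenAI's reference implementation, but the node names, routing fields, and human-approval gate are all invented for illustration.

```python
# Hedged sketch of a sequential agent graph with escalation. Node names and
# the routing structure are invented, not taken from OpenAI's reference code.

AGENT_GRAPH = {
    "greeter":        {"next": "authentication"},
    "authentication": {"next": "account_agent", "high_stakes": True},
    "account_agent":  {"next": None},
}

def run_flow(start, approved_by_human=True):
    """Walk the graph, gating high-stakes steps on human approval."""
    trace, node = [], start
    while node is not None:
        step = AGENT_GRAPH[node]
        if step.get("high_stakes") and not approved_by_human:
            trace.append(f"{node}: escalated")
            break
        trace.append(node)
        node = step["next"]
    return trace

print(run_flow("greeter"))
print(run_flow("greeter", approved_by_human=False))
```

With approval the flow hands off greeter → authentication → account agent; without it, the high-stakes authentication step escalates instead of proceeding.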
Finally, Anthropic’s tool-use learning materials and a hands-on example demonstrate how to add tools to Claude via function calling and structured outputs. The walkthrough builds a simple “execute python file” tool by defining a JSON schema, wiring it into a Claude client, and having the model decide when to call the tool—producing deterministic results like returning the output of a test script. Together, the updates converge on the same theme: agents become useful when they can reliably choose tools, run them with correct inputs, and iterate toward a goal.
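A sketch of the "execute python file" tool makes the shape concrete. The schema below follows Anthropic's tool-use format (`name` / `description` / `input_schema`), but the Claude client call is omitted; a local dispatcher stands in for the model returning a tool-use block, so the example runs offline.

```python
import subprocess
import sys

# Sketch of the walkthrough's "execute python file" tool. The schema follows
# Anthropic's tool-use shape, but the API round-trip is simulated locally.

EXECUTE_PYTHON_TOOL = {
    "name": "execute_python_file",
    "description": "Run a Python file and return its stdout.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

def execute_python_file(path):
    """Run the script in a subprocess and capture its output."""
    done = subprocess.run([sys.executable, path],
                          capture_output=True, text=True, timeout=30)
    return done.stdout.strip()

def handle_tool_call(name, tool_input):
    """What the app does when the model returns a tool-use request."""
    if name == EXECUTE_PYTHON_TOOL["name"]:
        return execute_python_file(tool_input["path"])
    raise ValueError(f"unknown tool: {name}")

# Simulate the model calling the tool against a throwaway test script.
with open("test_script.py", "w") as f:
    f.write('print("all tests passed")\n')
print(handle_tool_call("execute_python_file", {"path": "test_script.py"}))
```

Because the tool's output is the script's actual stdout, the result is deterministic in exactly the sense the walkthrough highlights: the model decides *when* to call the tool, but the returned value comes from real execution, not generation.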
Cornell Notes
Google’s agent framework defines an AI agent as a goal-driven system that observes and acts using available tools, not just a model that generates text. The core mechanism is an orchestration layer that loops through reasoning and tool-based actions until a goal or stopping condition is reached, sometimes with LLM-based self-checks or human escalation. Google groups tools into extensions (choose the right API capability), function calling (invoke predefined functions with schema-defined arguments), and data stores (often vector databases for retrieval). The transcript emphasizes that agent performance scales with model quality: better reasoning and instruction-following improve tool selection and outcomes. OpenAI and Anthropic updates show how these ideas translate into working Realtime voice agent patterns and Claude tool use with structured outputs.
- How does Google define an “agent,” and what makes it different from a plain language model?
- What role does the orchestration layer play in agent behavior?
- What are Google’s three tool categories—extensions, function calling, and data stores—and how do they differ?
- Why does Google argue that agent usefulness depends on model quality, not just tooling?
- How does OpenAI’s Realtime API reference implementation demonstrate agentic patterns in practice?
- What does the Anthropic tool-use example add, beyond “just calling a model”?
Review Questions
- What components must exist for an AI system to behave like an agent under Google’s framework (and what does each component do)?
- How do extensions, function calling, and data stores each contribute to an agent’s ability to act in the real world?
- In the Claude tool-use walkthrough, what information does the model need to call the tool correctly, and how is that represented?
Key Points
1. Google defines an AI agent as a goal-driven system that observes and acts using available tools, not just a text generator.
2. An orchestration layer runs an iterative loop of reasoning and action selection until a goal is reached or a stopping condition triggers.
3. Google’s tool categories—extensions, function calling, and data stores—map to choosing external capabilities, invoking schema-defined functions, and retrieving runtime knowledge.
4. Agent performance depends on the underlying model’s reasoning and instruction-following; strong tooling can’t fix a weak model.
5. OpenAI’s Realtime API reference implementation demonstrates multi-agent voice flows with sequential handoffs and escalation for high-stakes decisions.
6. Anthropic’s tool-use workflow shows how structured outputs and JSON schemas can make tool calling more deterministic and reliable.