PydanticAI - The NEW Agent Builder on the Block
Based on Sam Witteveen's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
PydanticAI positions itself as a new, Pydantic-first agent and LLM application framework built to make model outputs reliably conform to structured schemas—so results can be used programmatically instead of treated as free-form text. The core idea is familiar from earlier “LLM + validation” patterns: define a data model, validate the model’s response against it, and—when needed—prompt the model to correct its output. What’s new is the shift from “plug Pydantic into an agent framework” to “build an LLM framework on top of Pydantic,” with the surrounding features—system prompts, tool use, chat history, and RAG-style workflows—designed to work with that validation layer.
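The define-validate-reprompt loop described above can be sketched in plain Python. This is a stand-alone illustration of the pattern, not the PydanticAI API: the stub model, prompt wording, and schema check are all hypothetical, and PydanticAI wires an equivalent loop up for you.

```python
import json

# Hypothetical stand-in for an LLM call: the first reply is free-form text,
# the retry (after seeing the validation error) is valid JSON.
_replies = iter([
    "Sure! The city is Paris, in France.",        # fails validation
    '{"city": "Paris", "country": "France"}',     # corrected structured output
])

def fake_model(prompt: str) -> str:
    return next(_replies)

def validate(raw: str) -> dict:
    """Minimal schema check: must be JSON with string 'city' and 'country'."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on non-JSON
    for field in ("city", "country"):
        if not isinstance(data.get(field), str):
            raise ValueError(f"missing or non-string field: {field!r}")
    return data

def run_with_retries(prompt: str, max_retries: int = 2) -> dict:
    for _ in range(max_retries + 1):
        raw = fake_model(prompt)
        try:
            return validate(raw)
        except ValueError as err:
            # Re-prompt with the validation error so the model can self-correct.
            prompt = f"{prompt}\n\nYour last reply failed validation ({err}); return valid JSON."
    raise RuntimeError("model never produced a valid response")

result = run_with_retries("Which city should I visit? Reply as JSON with city and country.")
print(result)  # {'city': 'Paris', 'country': 'France'}
```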
The pitch matters because most agent frameworks already rely on schema validation somewhere in the pipeline, but teams still face friction: keeping outputs consistent, wiring structured results into downstream code, and managing production control flow. PydanticAI claims to address these issues with model-agnostic support (OpenAI, Google Vertex AI/Gemini, Groq, and planned Anthropic support), type-safe design, and “vanilla Python” control flow for agent composition. That Pythonic approach is presented as a production advantage: rather than relying on complex orchestration abstractions, developers can structure state and workflow logic directly in ordinary Python.
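What “vanilla Python control flow” means in practice is that routing between agents is ordinary if/else rather than a graph DSL. A minimal sketch with hypothetical stub agents (plain callables standing in for LLM-backed agents, not PydanticAI objects):

```python
# Stub "agents": plain callables standing in for LLM-backed agents.
def classify_intent(message: str) -> str:
    return "weather" if "weather" in message.lower() else "chat"

def weather_agent(message: str) -> str:
    return "It is 18C and cloudy."

def chat_agent(message: str) -> str:
    return "Happy to help!"

def handle(message: str) -> str:
    # Ordinary Python branching instead of an orchestration graph:
    # state, retries, and routing live in code you can step through.
    if classify_intent(message) == "weather":
        return weather_agent(message)
    return chat_agent(message)

print(handle("What's the weather in London?"))  # It is 18C and cloudy.
```

Because the workflow is plain code, changing it for production (adding a branch, a timeout, or a fallback model) is a normal refactor rather than a framework configuration change.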
From there, the walkthrough demonstrates three practical capabilities. First is straightforward chat prompting: create an agent with a system prompt and a user prompt, then swap models (e.g., using Gemini 1.5 Flash) and adjust the system prompt dynamically via injection. Second is structured output. By defining a Pydantic class for the expected response (example fields include city, country, and a reason), the model returns neatly formatted data. The example extends the schema to include a “famous person from the city,” showing how adding fields to the schema changes the output contract while keeping the rest of the workflow intact. The results are also inspectable—down to token usage—and accessible as Python objects or JSON.
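The schema side of the structured-output example can be shown with plain Pydantic (the exact PydanticAI parameter names for attaching a result schema vary by version, so this sketch only demonstrates the contract; the `Capital` class name and sample JSON are illustrative):

```python
from pydantic import BaseModel, ValidationError

class Capital(BaseModel):
    city: str
    country: str
    reason: str
    famous_person: str  # added field: extends the output contract

# Pretend this JSON came back from the model; Pydantic validates it.
raw = ('{"city": "Paris", "country": "France", '
       '"reason": "Rich history and culture.", "famous_person": "Victor Hugo"}')
result = Capital.model_validate_json(raw)

print(result.city)               # access as a Python object
print(result.model_dump_json())  # or serialize back to JSON

# A response missing the new field now fails validation,
# which is what triggers a correction round-trip to the model.
try:
    Capital.model_validate_json('{"city": "Paris", "country": "France", "reason": "x"}')
except ValidationError as err:
    print("validation failed with", err.error_count(), "error(s)")
```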
Third is tool use / function calling. A weather agent is built with two tools: one tool resolves a location description into latitude and longitude, and a second tool uses those coordinates to fetch weather details (temperature and a natural-language description derived from API codes). When API keys are absent, the tools return dummy responses, but the tool-calling trace still shows the model deciding which tools to call and in what order. The example demonstrates multiple tool calls in one run (London and Singapore), followed by a final natural-language response that combines the tool outputs.
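The two-tool pipeline with dummy fallbacks can be sketched as below. The tool names, environment-variable names, and dummy values are all hypothetical; in PydanticAI the model itself decides which tool to call and in what order, whereas here the chaining is done by hand to show the data flow:

```python
import os

def get_lat_lng(location: str) -> dict:
    """Tool 1: resolve a location description to coordinates."""
    if not os.environ.get("GEO_API_KEY"):      # no key: return a dummy response
        return {"lat": 51.5, "lng": -0.1}
    ...  # a real geocoding API call would go here

def get_weather(lat: float, lng: float) -> dict:
    """Tool 2: fetch weather details for coordinates."""
    if not os.environ.get("WEATHER_API_KEY"):  # no key: return a dummy response
        return {"temperature": "21C", "description": "Mostly cloudy"}
    ...  # a real weather API call would go here

# Mirror the London/Singapore run from the video: one tool resolves the
# location, the second consumes its coordinates.
for city in ("London", "Singapore"):
    coords = get_lat_lng(city)
    weather = get_weather(coords["lat"], coords["lng"])
    print(f"{city}: {weather['temperature']}, {weather['description']}")
```

The key dependency is visible in the signatures: `get_weather` cannot run until `get_lat_lng` has produced `lat`/`lng`, which is exactly the ordering the tool-calling trace shows the model working out on its own.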
Overall, PydanticAI is framed as a compact alternative to heavier agent stacks: schema-driven reliability, dynamic prompt/tool injection, explicit message history management, and streaming-friendly structured output—implemented in a way that aims to stay readable and easy to adapt for real deployments. The accompanying mention of Logfire adds an observability option for tracking inputs and outputs, though it’s treated as optional rather than required.
Cornell Notes
PydanticAI builds an LLM agent framework around Pydantic so model outputs can be validated against explicit schemas. That approach targets a common pain point in agent development: free-form text is hard to use reliably in downstream code, so structured results need enforcement. The framework supports multiple model providers (OpenAI, Gemini/Vertex AI, Groq, with Anthropic support coming) and emphasizes type safety and “vanilla Python” control flow for easier production adaptation. In practice, it supports dynamic system prompts, structured output via Pydantic classes (including JSON/Python access), chat-style message history, and tool/function calling with dependency injection. The weather example shows the model making multiple tool calls and then producing a final consolidated response.
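The message-history mechanic summarized above (pass prior messages into the next call, even when switching models) can be sketched with hypothetical stub models; PydanticAI exposes the same idea through its result message accessors and a message-history argument, whose exact names vary by version:

```python
# Stub "models": each just reports how much context it received.
def model_a(messages: list) -> str:
    return f"model_a saw {len(messages)} messages"

def model_b(messages: list) -> str:
    return f"model_b saw {len(messages)} messages"

history: list = []

def run(model, user_text: str) -> str:
    # Append the user turn, call the model with the FULL history,
    # then record the assistant turn so later calls keep context.
    history.append({"role": "user", "content": user_text})
    reply = model(history)
    history.append({"role": "assistant", "content": reply})
    return reply

run(model_a, "Hi, my name is Sam.")
# Switch models mid-conversation; the accumulated history carries over.
print(run(model_b, "What is my name?"))  # model_b saw 3 messages
```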
Why does schema validation matter for LLM agents, and how does PydanticAI use it?
How does PydanticAI handle dynamic prompting and model switching during a conversation?
What does “structured output” look like in the examples, and what changes when the schema changes?
How does tool use / function calling work, and what is the role of dependencies?
What does the weather tool-calling trace reveal about multi-step reasoning in practice?
Review Questions
- How would you design a Pydantic response schema for an agent that must return both user-facing text and machine-readable fields?
- What are the practical benefits of using “vanilla Python” control flow for agent composition compared with more abstract orchestration layers?
- In the weather example, what information must flow from the first tool to the second tool, and how does dependency injection support that pipeline?
Key Points
1. PydanticAI aims to make LLM agent outputs reliably conform to Pydantic-defined schemas so downstream code can consume results safely.
2. The framework is built around Pydantic rather than treating validation as an add-on inside another agent framework.
3. Model support is presented as model-agnostic, with OpenAI, Google Vertex AI/Gemini, and Groq supported and Anthropic support described as forthcoming.
4. Structured outputs are produced by defining Pydantic classes for the expected response fields, and the results can be retrieved as Python objects or JSON.
5. System prompts and tools can be injected dynamically, enabling changes in behavior without rebuilding the entire agent graph.
6. Chat-style message history can be passed into subsequent calls, including switching LLMs mid-conversation while preserving context.
7. Tool/function calling is implemented with dependency injection, and multi-step tool sequences are visible through tool-call traces and token usage metadata.