What Are Deep Agents? Shallow Agents vs. Deep Agents
Based on Krish Naik's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Deep agents are built for complex, multi-step work that shallow agent loops struggle to handle—by adding explicit planning, task decomposition into sub-agents, shared system instructions, and persistent memory. Instead of sending a single query to an LLM and letting it decide whether to call a tool, deep agents run a structured workflow that can coordinate research, writing, verification, and other steps in parallel.
In the most basic “shallow agent” setup, an LLM acts like a decision layer: it either generates an answer directly or calls external tools (weather APIs, search APIs, database tools, and similar services) to fetch missing information. That produces a simple request → tool → output loop. The limitation is that there’s no explicit planning stage and no robust context retention across steps. For straightforward questions—like retrieving the current temperature of a city—this works well. But for complex requests (such as compiling today’s AI news and tying it to economics and physics), the task needs decomposition into sub-questions and coordinated execution, which the shallow loop doesn’t handle reliably.
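The request → tool → output loop can be sketched in a few lines. Everything here is a stand-in: `llm_decide` simulates the LLM's decision layer and `get_weather` stands in for a real weather API, so the point is only the shape of the loop, not a real implementation.

```python
# Minimal sketch of a shallow agent: one decision, at most one tool call,
# no planning stage and no context retained across requests.

def get_weather(city: str) -> str:
    """Stub tool: a real agent would call a weather API here."""
    return f"22°C in {city}"

def llm_decide(query: str):
    """Stub decision layer: route to a tool or answer directly."""
    if "temperature" in query or "weather" in query:
        return ("tool", "get_weather")
    return ("answer", "I can answer that directly.")

def shallow_agent(query: str, city: str = "Paris") -> str:
    # Single request -> tool -> output loop ends here, whatever the task needs.
    kind, value = llm_decide(query)
    if kind == "tool" and value == "get_weather":
        return get_weather(city)
    return value
```

Note that a multi-part request ("compile today's AI news and tie it to economics") would still take exactly one pass through this loop, which is the limitation described above.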
A step up is the ReAct-style agent pattern, where the LLM can alternate between reasoning and acting. The model receives a system prompt, chooses among multiple tools, and then uses tool observations as context for further tool calls. This enables multi-step problem solving, and the loop can run repeatedly until a final answer emerges. Yet it still lacks deeper structure: it’s essentially LLM + tools operating in a continuous cycle, without explicit planning artifacts, state management, or persistent memory shared across steps.
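The reason-act-observe cycle can be sketched as below. The model and tool are stubs (a real ReAct agent would call an LLM and a search API); what the sketch shows is the structural difference from the shallow loop: observations are fed back as context, and the cycle repeats until the model emits a final answer.

```python
# Hedged sketch of a ReAct-style loop: the (stubbed) model alternates between
# choosing an action and observing the result, until it produces a final answer.

def fake_model(context: list) -> dict:
    """Stub LLM: decides the next step from accumulated observations."""
    if not any(step["type"] == "observation" for step in context):
        return {"type": "action", "tool": "search", "input": "AI news"}
    return {"type": "final", "answer": "Summary based on: " + context[-1]["content"]}

def search(query: str) -> str:
    """Stub tool standing in for a real search API."""
    return f"results for '{query}'"

def react_agent(question: str, max_steps: int = 5) -> str:
    context = [{"type": "question", "content": question}]
    for _ in range(max_steps):
        step = fake_model(context)
        if step["type"] == "final":
            return step["answer"]
        # Act, then feed the observation back as context for the next turn.
        observation = search(step["input"])
        context.append({"type": "observation", "content": observation})
    return "max steps reached"
```

Even with the loop, there is no plan object, no task list, and no memory that outlives `context` — the gaps deep agents address.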
Deep agents change the architecture. They’re described as having four core components: (1) a planning tool that converts the user request into a to-do list, (2) sub-agents that execute each planned step, (3) a system prompt that governs how the agent(s) should behave, and (4) a file system that acts as persistent memory accessible to all sub-agents. The planning stage is the key shift—turning a vague goal into an ordered set of tasks that can be assigned and tracked.
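The four components can be laid out as a toy structure. All names here are illustrative stand-ins (a real deep agent would back `plan` and `sub_agent` with LLM calls), but the flow — plan first, then execute each step against shared memory under one system prompt — mirrors the architecture described above.

```python
# Illustrative sketch of the four deep-agent components, using plain-Python
# stand-ins for what would be LLM-driven pieces in a real system.

SYSTEM_PROMPT = "You are a careful research assistant."  # (3) system prompt

def plan(request: str) -> list:
    """(1) Planning tool: turn a vague request into an ordered to-do list."""
    return [f"research: {request}", f"draft: {request}", f"verify: {request}"]

def sub_agent(task: str, fs: dict) -> None:
    """(2) Sub-agent: execute one planned step, writing results to shared memory."""
    fs[task] = "done (guided by: " + SYSTEM_PROMPT + ")"

def run_deep_agent(request: str) -> dict:
    file_system = {}  # (4) shared file system acting as persistent memory
    for task in plan(request):
        sub_agent(task, file_system)
    return file_system
```

The same skeleton covers the travel example below: `plan` would emit the day-by-day to-do list, and each sub-agent would record its booking or research result in the shared file system.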
A concrete example uses a Claude Code deep research workflow. The system prompt instructs the assistant to behave as an interactive CLI tool for software engineering tasks and to refuse malicious code. When given a travel-planning request (e.g., booking a Paris trip within a budget and duration), the deep agent first produces a day-by-day to-do list (travel, lodging, activities, return). It then spawns sub-agents to execute each item. The shared file system lets sub-agents store and retrieve intermediate results so the overall effort stays coherent.
The same structure applies to content workflows. For a blog request, a deep research agent can plan tasks like researching the topic, doing additional research from papers or other sources, drafting the blog, and running a copyright check. Sub-agents can work in parallel—one focused on internet research, another on archive/paper sources, another on writing, and another on compliance checks—before producing a consolidated final output.
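The parallel fan-out described above can be sketched with a thread pool standing in for concurrent sub-agents. The three research/compliance functions are hypothetical stubs; the point is that independent sub-agents run side by side and a writer consolidates their outputs.

```python
# Sketch of parallel sub-agents for the blog workflow. A thread pool is a
# stand-in for real agent execution; every function here is a hypothetical stub.
from concurrent.futures import ThreadPoolExecutor

def web_research(topic: str) -> str:
    return f"web notes on {topic}"

def paper_research(topic: str) -> str:
    return f"paper notes on {topic}"

def copyright_check(topic: str) -> str:
    return f"no issues found for {topic}"

def blog_workflow(topic: str) -> str:
    with ThreadPoolExecutor() as pool:
        # Research and compliance sub-agents run in parallel...
        futures = [pool.submit(fn, topic)
                   for fn in (web_research, paper_research, copyright_check)]
        notes = [f.result() for f in futures]
    # ...then a writer sub-agent consolidates their outputs into the draft.
    return "Draft using: " + "; ".join(notes)
```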
Finally, the transcript outlines an implementation approach using LangChain examples and an open-source agents library (called "D agents" in the narration — most likely LangChain's `deepagents` package). It demonstrates creating a deep agent by defining tools (such as an internet search tool backed by Tavily), providing a system prompt, selecting a model, and invoking the agent while streaming the deep research results. The tradeoff is time: deep agents take longer, but they generate more complete multi-step outputs suited to complex tasks.
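The implementation pattern — define a tool, supply a system prompt, build the agent, stream results — can be sketched as follows. To keep the sketch self-contained, the real `deepagents` and Tavily calls are replaced with local stubs; `create_deep_agent` and `internet_search` below only mirror the assumed shape of the library API, so consult the actual library documentation before relying on these signatures.

```python
# Structural sketch of the transcript's implementation pattern, with local
# stubs in place of the real deepagents library and Tavily search client.

def internet_search(query: str) -> str:
    """Stub for a Tavily-backed web search tool."""
    return f"search results for: {query}"

def create_deep_agent(tools, instructions):
    """Stub mirroring the assumed create_deep_agent(tools, instructions) shape.

    Returns a callable that streams intermediate chunks, the way the real
    agent streams deep research output.
    """
    def stream(request: str):
        yield "[plan] derived from system prompt"
        for tool in tools:
            yield tool(request)
        yield "[final] consolidated research report"
    return stream

agent = create_deep_agent(
    tools=[internet_search],
    instructions="You are an expert deep researcher. Plan first, then execute.",
)
```

Usage would look like `for chunk in agent("latest AI news"): print(chunk)` — plan first, tool results next, consolidated report last, which is where the extra time goes.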
Cornell Notes
Deep agents are designed for complex tasks by adding structure that shallow agents lack. Instead of a single request-to-tool loop, deep agents first create an explicit plan (a to-do list), then assign each planned step to sub-agents. A shared system prompt guides behavior, while a file system provides persistent memory so sub-agents can coordinate and reuse intermediate results. This architecture enables workflows like deep research and content generation—researching from the internet and archives, drafting, and running checks—often with parallel execution. The payoff is better handling of multi-part requests that require decomposition, coordination, and continuity across steps.
What makes a “shallow agent” limited when tasks get complicated?
How does a ReAct-style agent improve over a shallow loop, and what still holds it back?
What are the four core components of a deep agent?
How does planning work in the deep agent example about booking a trip?
How can deep agents handle a blog-writing workflow end-to-end?
What implementation pattern is shown for building a deep agent with tools?
Review Questions
- Why does the transcript claim shallow agents struggle with complex tasks, even when they can call tools?
- Compare ReAct-style looping with deep agents in terms of planning, memory, and coordination.
- List the four core components of a deep agent and give one example of how each component is used in the travel or blog scenario.
Key Points
1. Shallow agents typically rely on a single request-to-tool loop, which limits decomposition and continuity for complex tasks.
2. ReAct-style agents improve multi-step behavior by feeding tool observations back into the LLM, but they still lack explicit planning artifacts and persistent shared state.
3. Deep agents add explicit planning (to-do lists) before tool use, turning vague goals into structured steps.
4. Sub-agents execute planned steps in parallel or in sequence, enabling specialized work like research, drafting, and compliance checks.
5. A shared file system functions as persistent memory so sub-agents can coordinate and reuse intermediate results.
6. System prompts constrain and guide agent behavior, including safety or refusal rules (as illustrated by the Claude Code prompt).