The 5 Types of LLM Apps
Based on Sam Witteveen's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
LLM apps can be organized into five categories: conversational agents, Copilots/duets, chat-with-data (RAG), traditional NLP task automation, and autonomous agents.
Briefing
LLM apps can be sorted into five practical categories—ranging from chat-style assistants to fully autonomous agents—so builders can more clearly match product design, data strategy, and defensibility to the kind of outcome they’re targeting. The biggest shift behind all five is that modern large language models can handle open-ended conversation and reasoning, making capabilities that were previously limited to narrow, rules-based systems suddenly usable at scale.
The first category is conversational chatbots and agents, which turn back-and-forth dialogue into a product feature. These range from virtual friends and dating partners to customer support, appointment setting, and sales outreach. Earlier attempts often worked only in closed domains and relied on rigid heuristics; the change came with broadly capable models such as ChatGPT and Google's LaMDA (an internal Google model at the time), which made coherent, general-domain conversation feel natural. For builders, the key design questions are what personality the bot should have, what it should and shouldn't discuss, what "success" looks like for each conversation, and whether it needs memory, especially for long-running relationships where follow-ups depend on storing prior user context.
The second category is Copilots and "duets," inspired by Microsoft and Google's framing of assistants that help users accomplish goals faster. Two subtypes dominate. One embeds into existing software and SaaS tools, using company-held customer data (analytics, knowledge, usage patterns) to guide users through specific, bounded tasks. This model tends to favor large incumbents because they already possess the data needed to personalize guidance and build interactive discovery flows. The other subtype focuses on education and learning: curating exercises and guidance as users progress through a topic. Khan Academy's Khanmigo is cited as an example that adapts to a learner's level, and the transcript suggests startups have room here because learning journeys can be guided without requiring the same depth of proprietary customer behavior data.
The third category is "chat with data" apps built on retrieval-augmented generation (RAG). These let users ask questions in natural language while a retrieval layer pulls relevant information from sources like PDFs or a SQL database. The promise is quick, conversational access to large internal or document-based datasets. But defensibility is hard: simple RAG wrappers can be replicated quickly, and higher-quality retrieval requires careful, topic-specific customization. The transcript points to a market lesson: chat-with-PDF apps lost momentum after OpenAI's custom GPTs made similar functionality broadly accessible.
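The retrieval layer described above can be sketched as a minimal pipeline: rank documents against the query, then splice the winners into the prompt. This toy version uses keyword overlap as a stand-in for the embedding-based similarity search a production RAG system would use; the document texts and prompt wording are illustrative, not from the transcript.

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (a toy stand-in
    for embedding-based vector search)."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Assemble the augmented prompt the LLM would receive."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "The refund policy allows returns within 30 days.",
    "Our office is open Monday through Friday.",
    "Refund requests require the original receipt.",
]
print(build_prompt("What is the refund policy?", docs))
```

Swapping `retrieve` for a real vector search (and sending the prompt to a model) yields the basic chat-with-data flow; the point is that the moat lives in the retrieval quality, not the wrapper.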
The fourth category moves beyond user-facing chat into traditional NLP tasks performed in the background. Named entity recognition, sentiment analysis, coreference resolution (tracking what "it/they/he" refers to), and data extraction are increasingly handled well by large language models, which often match or outperform specialized models, especially once fine-tuned for the task.
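In practice, replacing a dedicated NER model with an LLM amounts to a task-specific prompt plus a parser for the structured reply. A model-agnostic sketch (the model call itself is omitted; the reply string below is a hypothetical example of what a chat-completion API might return):

```python
import json

def extraction_prompt(text: str) -> str:
    """Frame entity extraction as an instruction the model can follow,
    in place of a purpose-built tagger."""
    return (
        "Extract every person and organization from the text below. "
        'Reply with JSON only, e.g. {"people": [...], "orgs": [...]}.\n\n'
        f"Text: {text}"
    )

def parse_entities(reply: str) -> dict:
    """Parse the model's JSON reply, tolerating any surrounding prose."""
    start, end = reply.find("{"), reply.rfind("}") + 1
    return json.loads(reply[start:end])

# Hypothetical model reply, standing in for a real API response:
reply = 'Here you go: {"people": ["Sam Witteveen"], "orgs": ["Google"]}'
print(parse_entities(reply))
```

The same prompt-plus-parser pattern covers sentiment, coreference, and other extraction tasks by changing only the instruction text.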
The fifth and most ambitious category is autonomous agents. Instead of merely answering, these systems autonomously execute multi-step tasks using reasoning and decision-making, often built from modular “mini agents” rather than one all-powerful unit. A notable technique is self-critique and recursive improvement: another agent (using the same model) reviews work and catches weaknesses, producing better results than a single pass. Production remains challenging because strong reasoning/decision-making often depends on state-of-the-art models (the transcript cites latest Gemini and GPT-4-class models) or heavy fine-tuning. Still, it’s framed as the area likely to see the fastest growth over the next year or two.
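The self-critique technique above can be sketched as a draft/critique/revise loop. The `generate`, `critique`, and `revise` callables are placeholders for calls to the same underlying model with different prompts; the toy critic and reviser below are invented for illustration.

```python
from typing import Callable

def refine(task: str,
           generate: Callable[[str], str],
           critique: Callable[[str, str], str],
           revise: Callable[[str, str, str], str],
           max_rounds: int = 3) -> str:
    """Draft an answer, then let a critic agent (same model, different
    prompt) flag weaknesses until it approves or rounds run out."""
    draft = generate(task)
    for _ in range(max_rounds):
        feedback = critique(task, draft)
        if feedback == "OK":  # critic found nothing left to fix
            break
        draft = revise(task, draft, feedback)
    return draft

# Toy stand-ins for the model calls:
def gen(task):
    return f"Answer to: {task}"

def crit(task, draft):
    # The "critic" demands a citation before approving.
    return "OK" if "[source]" in draft else "Add a citation."

def rev(task, draft, feedback):
    # The "reviser" applies the critic's feedback.
    return draft + " [source]"

print(refine("Summarize the report", gen, crit, rev))
```

Because each role is a separate callable, the loop naturally decomposes into the modular "mini agents" the transcript describes, rather than one all-powerful unit.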
Overall, the five categories—conversational agents, copilots/duets, chat-with-data RAG, NLP task automation, and autonomous agents—provide a roadmap for aligning product goals, data inputs, and competitive moats with the right LLM application pattern.
Cornell Notes
The transcript groups LLM applications into five categories: conversational chatbots, Copilots/duets, chat-with-data (RAG), traditional NLP task automation, and autonomous agents. Chatbots focus on interactive dialogue and may require personality and memory to sustain long-term usefulness. Copilots come in two forms: task-focused assistants embedded in software (often data-rich incumbents) and education-style guidance that curates learning paths (where startups can compete). RAG apps enable natural-language access to documents or databases, but they struggle with defensibility because simple implementations are easy to copy and high-quality retrieval needs customization. Traditional NLP tasks such as entity recognition, sentiment analysis, coreference resolution, and data extraction increasingly run in the background on LLMs rather than specialized models. Autonomous agents aim to automate outcomes through modular reasoning, self-critique, and recursive improvement, but they require top-tier reasoning models or specialized fine-tuning.
Why did conversational agents become broadly viable, and what design choices matter most?
How do Copilots differ from chatbots, and what two subtypes dominate?
What makes RAG (“chat with data”) useful, and what threatens its business moat?
Which traditional NLP tasks are increasingly handled by large language models, and why does that matter?
What defines autonomous agents, and how do self-critique and modularity improve results?
Review Questions
- Which category best fits an app that answers questions over a company’s internal documents, and what two issues make it hard to defend competitively?
- What are the two subtypes of Copilots, and why does one tend to favor incumbents while the other may be more startup-friendly?
- How do autonomous agents differ from conversational chatbots, and what role do modular mini agents and self-critique play?
Key Points
1. LLM apps can be organized into five categories: conversational agents, Copilots/duets, chat-with-data (RAG), traditional NLP task automation, and autonomous agents.
2. Conversational agents became practical at scale because modern LLMs can sustain open-domain dialogue, replacing earlier rules-based, closed-domain systems.
3. Copilots split into task-focused assistants embedded in software (often requiring proprietary customer data) and education-style guidance that curates learning paths.
4. RAG enables natural-language Q&A over documents or databases, but simple implementations are easy to copy and high-quality retrieval often needs topic-specific customization.
5. Traditional NLP tasks like sentiment analysis, coreference resolution, and data extraction increasingly benefit from LLMs, reducing the need for separate specialized models.
6. Autonomous agents target automation through modular reasoning and recursive self-improvement, but production readiness depends on strong reasoning models or heavy fine-tuning.