Most Popular Frameworks: LangChain vs LangGraph
Based on Krish Naik's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
LangChain and LangGraph both help build LLM-powered applications, but they’re optimized for different kinds of workflows: LangChain is built around a mostly linear, sequential pipeline for retrieval and response generation, while LangGraph is designed for stateful, multi-agent systems where tasks can branch, loop, and share memory.
In LangChain, the core structure is a three-part flow. First comes “retrieve,” which is where data is ingested and prepared for the model. Retrieval starts with data ingestion via document loaders that can pull content from many sources—PDFs, spreadsheets (Excel/CSV), websites (including web scraping), and even third-party sources like Wikipedia. Next is chunking through a text splitter, because LLMs have context-window limits and can’t ingest entire documents at once. The chunks are then converted into vector embeddings and stored in a vector database. From there, semantic search (and related retrieval methods) pulls the most relevant context back out so the LLM can generate an accurate answer.
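The retrieve steps can be sketched without LangChain itself. This is a minimal, self-contained illustration: `split_text` stands in for a real text splitter, and `embed` uses toy bag-of-words counts where a real pipeline would call a learned embedding model and a vector database.

```python
import math
from collections import Counter

def split_text(text, chunk_size=80, overlap=20):
    """Toy text splitter: fixed-size character chunks with overlap,
    mimicking the chunking step (real splitters are smarter)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def embed(text):
    """Toy embedding: bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, store, k=1):
    """Semantic-search stand-in: rank stored chunks by similarity."""
    q = embed(query)
    return sorted(store, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

doc = ("LangChain loads documents, splits them into chunks, "
       "embeds each chunk, and stores the vectors. "
       "Retrieval finds the chunks most relevant to a query.")
store = split_text(doc)          # load + chunk + "index"
print(retrieve("most relevant chunks for a query", store))
```

The retrieved chunk then becomes the context handed to the LLM in the next stage.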
After retrieval, LangChain moves to “summarize,” where the workflow runs in a sequential order: once execution moves forward, it doesn’t naturally return to earlier steps. This is implemented through chaining—prompt → LLM → context—where the context comes from the vector database. The output can also be enhanced with additional components such as persistent memory, and the design supports chaining multiple prompts or even using separate LLMs, but the overall execution remains one-way.
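The one-way chaining described above can be sketched as a few composed functions. The LLM is stubbed out here (`fake_llm` simply echoes its context), since the point is the control flow: each step feeds the next and execution never returns to an earlier step.

```python
def build_prompt(question, context):
    return f"Context: {context}\nQuestion: {question}\nAnswer:"

def fake_llm(prompt):
    # Stand-in for a real model call; echoes the context it was given.
    context_line = prompt.splitlines()[0]
    return context_line.removeprefix("Context: ")

def chain(question, retriever):
    context = retriever(question)             # 1. retrieve context
    prompt = build_prompt(question, context)  # 2. fill the prompt
    return fake_llm(prompt)                   # 3. LLM -> final output

answer = chain("What is chaining?", lambda q: "Chaining runs steps in order.")
print(answer)  # Chaining runs steps in order.
```

Swapping in a second prompt or a different model means adding another forward step; the pipeline stays sequential.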
LangGraph shifts the emphasis from linear chains to stateful multi-agent orchestration. Instead of a strictly sequential pipeline, it uses a graph of tasks made up of nodes and edges. Each node can represent a task executed by a separate AI agent, and edges define how information flows between tasks. Crucially, the graph can include conditional edges, loops, and backtracking—allowing the system to revisit earlier tasks when needed. Outputs from one node can feed into others in either direction, enabling feedback mechanisms and even human feedback.
A key differentiator is shared persistent memory across the graph. Because memory is shared among nodes, updates made in one part of the workflow can be accessed by other tasks, making complex coordination more efficient than a purely sequential setup. This makes LangGraph particularly suited to agentic workflows such as multi-step software development life cycle processes—requirements gathering, documentation, code creation, unit testing, peer review, quality checks, and implementation—where decisions and revisions are expected.
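A loop with a conditional edge and shared state can be sketched in plain Python (this is not the LangGraph API, just the control-flow idea): nodes are functions over a shared state dict, and one node's return value names the next edge to follow, including back to an earlier node.

```python
def write_code(state):
    # Node 1: produce a draft; state is the shared memory.
    state["attempts"] += 1
    state["code"] = f"draft v{state['attempts']}"
    return "test"                                   # edge: write -> test

def run_tests(state):
    # Node 2: pretend tests pass on the third attempt.
    state["passed"] = state["attempts"] >= 3
    return "done" if state["passed"] else "write"   # conditional edge

def run_graph(start):
    nodes = {"write": write_code, "test": run_tests}
    state = {"attempts": 0}   # shared persistent memory across nodes
    node = start
    while node != "done":
        node = nodes[node](state)
    return state

final = run_graph("write")
print(final)  # {'attempts': 3, 'code': 'draft v3', 'passed': True}
```

Because every node reads and writes the same state, a revision made in `write_code` is immediately visible to `run_tests`—the shared-memory property described above.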
The practical takeaway ties back to RAG and agentic RAG. Traditional RAG typically routes user input to an LLM that consults a retrieval database for context. Agentic RAG adds decision-making: agents determine whether to call the database, use tools, or take other actions automatically, turning retrieval and tool use into a workflow rather than a fixed pipeline. The result is that LangChain fits well when retrieval-to-answer can be handled in a straightforward sequence, while LangGraph fits when the application needs branching logic, iterative refinement, and coordinated multi-agent behavior with shared state.
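The decision-making step that separates agentic RAG from traditional RAG can be illustrated with a tiny router. The function name and keyword rules here are purely illustrative; a real agent would let an LLM make this choice.

```python
def route_query(query):
    """Illustrative agent step: decide whether to call a tool,
    consult the retrieval database, or answer from the model alone."""
    q = query.lower()
    if any(w in q for w in ("calculate", "compute")):
        return "tool"        # e.g. hand off to a calculator tool
    if "docs" in q or "document" in q:
        return "retrieve"    # consult the vector database for context
    return "llm"             # answer directly, no retrieval needed

for q in ("compute 2 + 2", "what do the docs say?", "say hello"):
    print(q, "->", route_query(q))
```

In a traditional RAG pipeline every query would go through retrieval; here retrieval is one of several actions the agent may choose.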
Cornell Notes
LangChain is geared toward LLM applications built as a mostly sequential pipeline. Retrieval in LangChain relies on document loaders for data ingestion, text splitters for chunking to fit LLM context windows, and vector embeddings stored in a vector database to enable semantic search and context retrieval. Summarization/answer generation then runs through chaining (prompt → LLM → context), typically moving forward without returning to earlier steps.
LangGraph is designed for stateful multi-agent workflows. It represents work as a graph of nodes and edges, where tasks can branch, loop, and revisit earlier steps via conditional edges. A shared persistent memory lets updates made in one node be accessible across the graph, supporting coordinated, feedback-driven workflows. This makes LangGraph especially useful for agentic RAG and complex multi-step processes like iterative software development.
What are the three main components of LangChain’s workflow, and what happens inside “retrieve”?
Why does LangChain require chunking before sending context to an LLM?
How does LangChain’s execution style differ from LangGraph’s in terms of control flow?
What does “stateful” mean in LangGraph, and how does persistent memory change coordination?
How do traditional RAG and agentic RAG differ in workflow design?
Why is LangGraph well-suited for complex processes like software development workflows?
Review Questions
- When building a RAG system in LangChain, which components handle (1) loading data from sources, (2) chunking for context limits, and (3) enabling semantic retrieval?
- In LangGraph, how do nodes, edges, and conditional edges enable backtracking or re-execution compared with LangChain’s sequential chaining?
- What practical capability does shared persistent memory provide across LangGraph tasks, and why does that matter for agentic workflows?
Key Points
1. LangChain structures LLM applications around a retrieve → summarize → output pipeline with retrieval driven by document loaders, text splitters, and vector databases.
2. Accurate retrieval in LangChain depends on correct parsing/loading and chunking strategies that respect each LLM’s context-window limits.
3. LangChain’s chaining execution is largely one-way and sequential, moving forward through prompt → LLM → context without natural backtracking.
4. LangGraph models workflows as a graph of nodes and edges, enabling branching, loops, and conditional re-execution for iterative problem-solving.
5. LangGraph supports stateful coordination through shared persistent memory, letting updates in one task be available to other tasks.
6. Agentic RAG differs from traditional RAG by adding agents that decide when to call the database or tools, turning retrieval into a dynamic workflow.