What are Runnables in LangChain | Generative AI using LangChain | Video 8 | CampusX
Based on CampusX's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
LangChain runnables standardize LLM workflow steps into reusable “units of work” with a common interface, enabling generic composition.
Briefing
LangChain’s “runnables” are the missing abstraction that turns a pile of LLM-related components into a composable system. Instead of manually wiring prompts, models, retrieval steps, and output parsing together with custom glue code, runnables provide a common interface so each step can be treated like a reusable “unit of work.” The payoff is the ability to build complex LLM workflows by connecting standardized blocks—much like assembling Lego—while keeping the integration logic consistent.
The need for runnables traces back to how LangChain grew. Early on, LangChain became popular by standardizing access to different LLM providers: each vendor’s API behaves differently, so LangChain wrapped them so developers could switch models with minimal code changes. But once teams started building real applications—like PDF question answering—the work didn’t stop at calling an LLM. Developers also had to load documents, split them into chunks, generate embeddings, store them in a vector database, run semantic retrieval, and then format the retrieved context into a prompt for the LLM. In other words, “LLM calls” were only one small part of the pipeline.
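To make the scale of that pipeline concrete, here is a rough sketch of a PDF question-answering setup built from LangChain components. It is illustrative only: it assumes an OpenAI API key, a local `sample.pdf`, and recent `langchain-*` packages (plus `pypdf` and `faiss-cpu`); import paths have moved between LangChain versions.

```python
# Rough sketch of a PDF QA pipeline; assumes an OpenAI API key, a local
# sample.pdf, and recent langchain-* packages (import paths vary by version).
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = PyPDFLoader("sample.pdf").load()                          # 1. load the document
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)                          # 2. split into chunks
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())   # 3. embed + store
retriever = vectorstore.as_retriever()                           # 4. semantic retrieval

question = "What is this document about?"
context = "\n\n".join(d.page_content for d in retriever.invoke(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
answer = ChatOpenAI().invoke(prompt)                             # 5. format prompt + call LLM
print(answer.content)
```

The LLM call is a single line at the end; everything before it is the surrounding plumbing the paragraph above describes.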
LangChain responded by creating many components for these recurring tasks: document loaders, text splitters, embedding models, vector stores, retrievers, and output parsers. Even with these building blocks, developers still had to manually connect them. A key insight emerged: across many workflows, the same pattern repeats—create a prompt (often from a template), send it to an LLM, and then pass the result onward. LangChain introduced “chains” to automate these repeated wiring steps. For example, an “LLM chain” can take an LLM plus a prompt template and handle formatting and prediction automatically. A “retrieval QA chain” can bundle the retrieval step (query → semantic search in a vector database → retrieved text) with prompt construction and LLM answering.
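As a self-contained conceptual sketch (plain Python, not LangChain's actual `LLMChain` class), this is the kind of wiring a chain hides: prompt formatting followed by prediction.

```python
# Conceptual sketch only (not LangChain's real LLMChain): the repeated
# "format the prompt, then predict" pattern that a chain automates.
class FakeLLM:
    def predict(self, prompt: str) -> str:
        return f"<model output for: {prompt!r}>"

class SimpleLLMChain:
    def __init__(self, llm, template: str):
        self.llm = llm
        self.template = template

    def run(self, **variables) -> str:
        prompt = self.template.format(**variables)  # prompt formatting step
        return self.llm.predict(prompt)             # LLM prediction step

chain = SimpleLLMChain(FakeLLM(), "Write a one-line joke about {topic}.")
print(chain.run(topic="cricket"))
```

Each such chain bakes one specific wiring pattern into a class, which is exactly what led to the proliferation problem described next.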
However, the chain approach ran into a scaling problem. As more chains were added to cover more use cases, the codebase became heavier and the learning curve steeper—new users had to memorize which chain to use when. The deeper issue was compatibility: components weren’t designed with uniform interfaces from the start. Different components used different method names and calling conventions (e.g., LLMs use predict-style calls, prompt templates use format-style calls, retrievers use their own retrieval functions). Because of that, developers needed custom glue code to connect components for each new workflow.
Runnables address this by standardizing the interface across components. Every runnable follows a common contract with methods like invoke (single input), batch (multiple inputs), and stream (streaming outputs). Once standardized, components can be connected generically: the output of one runnable becomes the input of the next. Crucially, a composed workflow is itself a runnable, enabling nested and arbitrarily long pipelines.
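A minimal sketch of that contract, following the video's description (LangChain's real Runnable interface is richer, with async variants, config handling, and more):

```python
from abc import ABC, abstractmethod
from typing import Any, Iterator, List

# Minimal sketch of the shared contract; LangChain's actual Runnable
# interface has many more methods, but the shape is the same.
class Runnable(ABC):
    @abstractmethod
    def invoke(self, input: Any) -> Any:
        """Process a single input and return a single output."""

    def batch(self, inputs: List[Any]) -> List[Any]:
        # Default behavior: invoke each input independently.
        return [self.invoke(x) for x in inputs]

    def stream(self, input: Any) -> Iterator[Any]:
        # Default behavior: yield the full result as a single chunk.
        yield self.invoke(input)
```

Because every component answers to `invoke`, the output of one step can be handed to the next without caring what kind of component it is.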
The transcript then demonstrates the idea from scratch: dummy LLM and dummy prompt-template classes are converted into runnable-compatible components via an abstract runnable base class. A runnable connector composes multiple runnables in sequence by looping through them and feeding each step’s output into the next. Finally, two smaller chains (generate a joke, then explain it) are connected into a larger chain without writing new wiring logic—showing how runnables make complex multi-step workflows flexible and reusable. The instructor closes by noting that LangChain’s real implementation follows the same structure, with concrete runnable classes inheriting from the runnable interface and implementing invoke.
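Building on the Runnable sketch above, here is a hedged reconstruction of that demonstration. The class names (FakeLLM, FakePromptTemplate, RunnableConnector) and canned responses are illustrative, not LangChain's actual classes.

```python
import random

class FakeLLM(Runnable):
    # Stands in for a real chat model: returns a canned response.
    def invoke(self, input: str) -> str:
        responses = [
            "Why did the batsman bring a ladder? To reach the high scores.",
            "The joke works because 'high scores' is taken literally.",
        ]
        return random.choice(responses)

class FakePromptTemplate(Runnable):
    def __init__(self, template: str, input_variable: str):
        self.template = template
        self.input_variable = input_variable

    def invoke(self, input) -> str:
        # Accept a dict of variables, or wrap a bare string from the previous step.
        if not isinstance(input, dict):
            input = {self.input_variable: input}
        return self.template.format(**input)

class RunnableConnector(Runnable):
    # Generic composition: each step's output becomes the next step's input.
    def __init__(self, steps):
        self.steps = steps

    def invoke(self, input):
        for step in self.steps:
            input = step.invoke(input)
        return input

llm = FakeLLM()
joke_chain = RunnableConnector(
    [FakePromptTemplate("Tell a joke about {topic}.", "topic"), llm])
explain_chain = RunnableConnector(
    [FakePromptTemplate("Explain this joke: {joke}", "joke"), llm])

# A composed chain is itself a runnable, so chains nest without new wiring code.
full_chain = RunnableConnector([joke_chain, explain_chain])
print(full_chain.invoke({"topic": "cricket"}))
```

Note that `RunnableConnector` never inspects what kind of component it is connecting; it only relies on `invoke`, which is the whole point of the shared contract.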
Cornell Notes
Runnables in LangChain are standardized “units of work” that let developers compose LLM pipelines without custom glue code. Each runnable performs one task and exposes a common interface—especially invoke for single inputs—so outputs can feed directly into the next step’s inputs. LangChain previously relied on many “chains” to automate common wiring patterns (like prompt formatting + LLM prediction, or retrieval + QA), but too many chains made the library harder to maintain and learn. The runnable abstraction fixes the root compatibility problem by forcing components (LLM calls, prompt templates, parsers, retrievers) into a shared method contract. Once components are runnables, chaining them becomes generic, and even multi-step workflows become runnables themselves.
- Why did LangChain move from components to chains, and then to runnables?
- What is the core runnable interface that enables composition?
- How does the runnable connector conceptually work?
- How do runnables simplify multi-step workflows like “generate a joke” then “explain the joke”?
- What problem did “too many chains” create?
- What does “standardizing components” mean in this context?
Review Questions
- How do invoke, batch, and stream contribute to runnable composability in LangChain workflows?
- Why did adding many chains become problematic, and how does runnable standardization address the root cause?
- In the runnable connector loop, what exactly is passed from one runnable to the next, and why does that matter for building long pipelines?
Key Points
1. LangChain runnables standardize LLM workflow steps into reusable “units of work” with a common interface, enabling generic composition.
2. Chains automated common wiring patterns (prompt formatting + LLM prediction, retrieval + QA), but proliferating chains increased maintenance and learning complexity.
3. The deeper integration issue was component incompatibility: different components used different calling conventions, forcing custom glue code.
4. Runnables fix this by enforcing a shared contract (notably invoke, plus batch and stream), so outputs can feed into subsequent steps automatically.
5. A composed workflow becomes a runnable itself, allowing nested and arbitrarily long pipelines without new wiring logic.
6. Runnable connectors can sequence multiple runnables by looping through them and passing each step’s output as the next step’s input.
7. Standardizing components is what makes multi-step tasks (e.g., generate → explain) flexible without writing bespoke integration for each new workflow.