
NEW LangChain Expression Language!!

Sam Witteveen · 5 min read

Based on Sam Witteveen's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

LangChain’s Expression Language makes chain construction more readable by using a declarative, pipe-based syntax that exposes how data flows between steps.

Briefing

LangChain’s new Expression Language is a more declarative way to build LLM “chains,” making the flow of data through prompts, models, tools, and retrievers much easier to read and debug. After months of criticism aimed at confusing documentation and an overly complex API, the update focuses on clarity: developers can now see how inputs move step-by-step under the hood, using a syntax built around piping components together.

At the core, LangChain treats applications as compositions of prompts and LLM calls, with optional steps that incorporate other capabilities—like vector search for context, function calling, streaming/batching/async execution, and tool usage. The Expression Language introduces a “new syntax” that defines chains as connected components. In the simplest form, a prompt is piped into a model, producing an output. That same chain can then be executed in different modes using explicit methods: invoke for standard runs, batch for batch processing, and stream for streaming outputs. The result is less guesswork about what each chain does and fewer surprises about execution behavior.
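The pipe-and-execute pattern described above can be illustrated with a minimal, stdlib-only sketch. This is not the actual LangChain `Runnable` implementation, just a conceptual stand-in showing how `|` composition and the three execution methods can share one chain definition:

```python
# Conceptual sketch (not the real LangChain classes): a minimal
# "runnable" supporting |-composition plus invoke/batch/stream.

class Runnable:
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # prompt | model composes left-to-right:
        # the output of self becomes the input of other.
        return Runnable(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):          # standard single run
        return self.fn(x)

    def batch(self, xs):          # batch processing
        return [self.fn(x) for x in xs]

    def stream(self, x):          # yield output piece by piece
        for chunk in self.fn(x).split():
            yield chunk

prompt = Runnable(lambda d: f"Tell me a fact about {d['topic']}")
model = Runnable(lambda text: f"echo: {text}")  # stand-in for an LLM call

chain = prompt | model
print(chain.invoke({"topic": "bears"}))  # → "echo: Tell me a fact about bears"
```

The key property mirrored here is that `chain.batch([...])` and `chain.stream(...)` work on the exact same `chain` object; only the execution method changes.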

The transcript walks through practical examples that show how the syntax changes day-to-day development. A basic chain can be defined as `prompt | model`, then optionally followed by an output parser to control the returned type—either an AI message object or a plain string. The same structure works across chat models and older text-style models, including Hugging Face models and OpenAI-style models like `text-davinci-003`, with the chain definition staying largely the same while the model changes.
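The message-versus-string distinction can be sketched without the library. The class and function names below are illustrative stand-ins, not the real LangChain API (which provides `StrOutputParser` for this role):

```python
# Sketch of the output-parser idea: the model step returns a message
# object, and an optional parser step reduces it to a plain string.

class AIMessage:
    def __init__(self, content):
        self.content = content

def fake_model(prompt_text):
    # Stand-in for a chat-model call; returns a message object.
    return AIMessage(f"Answer to: {prompt_text}")

def str_output_parser(message):
    # Unwrap the message object into a plain string, in the spirit
    # of appending a string parser as the last chain step.
    return message.content

raw = fake_model("Why is the sky blue?")
parsed = str_output_parser(raw)
print(type(raw).__name__, "->", parsed)
```

Without the parser the chain hands back the message object; with it, downstream code receives an ordinary string.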

Bindings add another layer of control. One example uses a stop binding to limit generation—such as stopping after the first line so a “three facts” prompt returns only one item. Another binding integrates OpenAI function calling by attaching a function schema to the model. In that setup, the model returns structured JSON-like arguments (e.g., a joke split into required fields like setup and punchline). Output parsers then let developers extract either the full structured result or specific keys from it.
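The effect of a stop binding can be sketched in plain Python. The function below is illustrative only; in LangChain the same effect comes from binding a stop sequence to the model step (roughly `model.bind(stop=["\n"])`):

```python
# Sketch of what a stop binding does: generation is truncated at the
# first occurrence of a stop sequence (here, a newline).

def apply_stop(text, stop=("\n",)):
    # Cut the output at the earliest stop sequence, if any is present.
    cuts = [text.index(s) for s in stop if s in text]
    return text[:min(cuts)] if cuts else text

three_facts = "1. Bears hibernate.\n2. Bears eat fish.\n3. Bears are mammals."
print(apply_stop(three_facts))  # only the first fact survives
```

This is why a “three facts” prompt bound with a newline stop returns a single item: the model may keep generating, but everything after the stop sequence is discarded.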

Retrievers demonstrate how the language handles data reuse inside a chain. When a question must be used both to fetch vector-store context and again to ask the model, the syntax uses pass-through mechanisms (like item getters) to route the same input to multiple steps without transformation. A Chroma vector store with OpenAI embeddings is used to retrieve context for questions like “who is James Bond,” then feed that context and the original question into a prompt. The example extends to multilingual answering by passing a language parameter and selecting it via dictionary-based getters.
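The routing idea can be shown with `operator.itemgetter` and plain dictionaries. The retriever below is a hypothetical stand-in for a Chroma vector-store lookup; what the sketch demonstrates is only the pass-through pattern, where several branches read from the same input dict:

```python
from operator import itemgetter

# Sketch of pass-through routing: each branch pulls what it needs
# from the same input dict without consuming or transforming it.

def fake_retriever(question):
    # Stand-in for a vector-store similarity search.
    return f"[context documents about: {question}]"

def route(inputs, branches):
    # Run every branch against the same input dict.
    return {name: fn(inputs) for name, fn in branches.items()}

inputs = {"question": "who is James Bond", "language": "German"}
prompt_vars = route(inputs, {
    "context": lambda d: fake_retriever(itemgetter("question")(d)),
    "question": itemgetter("question"),
    "language": itemgetter("language"),
})
print(prompt_vars["question"], "|", prompt_vars["context"])
```

In LCEL terms, `prompt_vars` corresponds to the dict of values handed to the final prompt: the question reached both the retriever branch and the prompt branch, and the language key was selected by a dictionary-based getter.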

Tools and runnable functions round out the picture. A DuckDuckGo example shows a non-agent workflow: a prompt rewrites a query for search, the search runs, and the results can be fed forward. Another example uses arbitrary runnable functions to compute values (like string length and arithmetic composition) before sending the final prompt to the model.
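The runnable-function idea can be sketched as ordinary Python functions composed before the prompt is built; in LCEL these would be wrapped so they participate in the pipe syntax, but the computation itself is the same:

```python
# Sketch of arbitrary functions as chain steps: intermediate steps
# compute a value (string lengths, then their sum) before the final
# prompt is constructed and would be sent to the model.

def add_lengths(d):
    # Arithmetic composition over two string lengths.
    return len(d["a"]) + len(d["b"])

def build_prompt(n):
    return f"What is special about the number {n}?"

inputs = {"a": "lang", "b": "chain"}
final_prompt = build_prompt(add_lengths(inputs))
print(final_prompt)  # → "What is special about the number 9?"
```

The point is that pre-model computation is just another step in the pipeline, not something bolted on outside the chain.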

Overall, the Expression Language replaces much of the older chain-building approach with a clearer, pipe-based syntax. Agents aren’t the focus here, but the update positions future agent work to reuse the same declarative style, while delivering immediate benefits for summarization, chat workflows, and tool-augmented pipelines.

Cornell Notes

LangChain’s Expression Language introduces a declarative, pipe-based syntax for building LLM chains. Instead of wrestling with complex APIs and unclear behavior, developers define how prompts, models, output parsers, bindings, retrievers, and tools connect—making the data flow easier to understand and debug. The same chain can run via `invoke`, `batch`, or `stream`, and output can be shaped into either AI message objects or plain strings. Bindings enable stop-token control and OpenAI function calling, including extracting structured JSON fields. Retrievers and pass-through getters let the same input (like a question) feed multiple steps, such as vector search for context and the final prompt.

How does the Expression Language represent a basic LLM workflow, and why does that matter for debugging?

A chain is built by piping components together—most simply, a prompt is piped into a model (conceptually `prompt | model`). That makes the transformation path explicit: the prompt’s output becomes the model’s input. Because each step is a named component in the chain, it’s easier to trace what data is being produced and consumed at each stage, instead of relying on implicit behavior buried in older chain abstractions.

What changes when developers want different execution modes like single run vs batch vs streaming?

The transcript highlights that execution is handled with dedicated methods on the chain: `invoke` for normal runs, `batch` for batch processing, and `stream` for streaming outputs. Crucially, the chain definition itself doesn’t need to be rewritten just to change execution mode—only the execution method changes.

How do bindings like stop tokens and OpenAI function calling work in this syntax?

Bindings attach extra control to the model step. A stop binding can halt generation after a specific token sequence (example: stopping after a newline so a “three facts” prompt returns only the first item). For OpenAI function calling, a function schema is bound to the model; the model then returns structured arguments in JSON-like form (e.g., a joke with required `setup` and `punchline` fields).
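The structured-arguments step can be sketched with the stdlib alone. The `raw_function_call` dict below imitates the shape of an OpenAI function-calling response (a name plus JSON-encoded arguments); in a real chain, an output parser would perform this extraction:

```python
import json

# Sketch of handling function-calling output: the model returns
# JSON-encoded arguments for the bound function schema, which can be
# decoded in full or reduced to a single key.

raw_function_call = {
    "name": "joke",
    "arguments": '{"setup": "Why did the chicken cross the road?", '
                 '"punchline": "To get to the other side."}',
}

full = json.loads(raw_function_call["arguments"])  # full structured result
setup_only = full["setup"]                         # extract one key

print(full["punchline"])
print(setup_only)
```

Returning `full` corresponds to getting the whole structured object back from the chain; returning `setup_only` corresponds to configuring the chain to extract a specific key.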

How does the syntax handle cases where the same input must be used in multiple chain steps?

Retrieval workflows often need the question twice: once to query the vector store for context, and again to ask the model. The transcript describes using pass-through mechanisms (like item getters) so the original question can be routed to multiple steps intact. That enables a chain where the retriever consumes the question to produce context, while the prompt later consumes the same question again.

What role do output parsers play when function calling returns structured data?

Output parsers control what the chain returns. After function calling, the model can produce a structured result; a parser can either return the full structured object (so fields like `setup` and `punchline` can be accessed) or extract a specific key (e.g., returning only the `setup` string). This reduces manual parsing work and keeps downstream steps consistent.

How do tools fit into the Expression Language workflow without turning everything into an agent?

Tools can be used as deterministic steps inside a chain. The transcript’s DuckDuckGo example rewrites a query via a prompt, runs the search, and returns search results—without using a ReAct-style agent loop. Those results can then be piped into another prompt/model step for a more grounded answer.

Review Questions

  1. When would you choose `invoke` versus `batch` versus `stream`, and how does the Expression Language keep the chain definition stable across those choices?
  2. In a retrieval-augmented chain, how does the syntax ensure the question is used both for vector search and again for the final prompt?
  3. What are two different uses of model bindings shown in the transcript, and how do they change the shape of the model output?

Key Points

  1. LangChain’s Expression Language makes chain construction more readable by using a declarative, pipe-based syntax that exposes how data flows between steps.

  2. Chains can be executed in different modes—`invoke`, `batch`, and `stream`—without rewriting the chain structure.

  3. Output can be controlled by adding an output parser, letting developers choose between AI message objects and plain strings.

  4. Model bindings provide practical control, including stop-token behavior and OpenAI function calling with structured JSON-like outputs.

  5. Function calling can be paired with output parsers to extract either the full structured result or specific keys like `setup`.

  6. Retrieval workflows rely on pass-through routing (e.g., item getters) so the same input (like a question) can feed both the retriever and the final prompt.

  7. Tools like DuckDuckGo can be integrated as straightforward chain steps rather than requiring agent-style orchestration.

Highlights

  • The pipe syntax (`prompt | model`) turns chain logic into an explicit data pipeline, making it easier to see what’s happening at each step.
  • Execution mode becomes a method choice (`invoke`, `batch`, `stream`) rather than a redesign of the chain itself.
  • Stop bindings can truncate generation—so a “three facts” prompt can be forced to return only the first item.
  • OpenAI function calling fits naturally into the chain via bindings, returning structured arguments that can be selectively extracted.
  • Pass-through getters enable retrieval chains where the question is reused for both context fetching and the final answer prompt.

Topics

  • LangChain Expression Language
  • Declarative Chains
  • OpenAI Function Calling
  • Retrievers and Vector Search
  • Tool-Augmented Pipelines

Mentioned

  • LLM