LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab

Sam Witteveen · 5 min read

Based on Sam Witteveen's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

LLMs generate text by predicting the next token from input tokens, but they are stateless and can’t directly access tools, databases, or APIs.

Briefing

Large language models are powerful at generating text, but they don’t plug cleanly into the rest of software—especially when apps need state, external data, or tool use. LangChain is presented as the framework that bridges that gap by managing prompts and connecting models to APIs, databases, and other parts of a traditional software stack, turning raw model calls into more complete applications like chatbots and search-enabled assistants.

The core problem starts with how LLMs work: each request produces the next-token probability distribution conditioned on the input tokens, which makes them excellent at conditional generation. Fine-tuning approaches such as reinforcement learning from human feedback (RLHF) and related prompt-focused add-ons have improved results, but the models still can’t directly access normal software capabilities. A key limitation is state: LLM calls are stateless, so maintaining a conversation requires resending prior context. That quickly runs into token limits—often on the order of a few thousand tokens (the transcript mentions ranges like 2048 and up to 10k+ in newer models), making long chats and large documents harder without summarization or other strategies.
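To make the statelessness concrete, here is a minimal sketch (not from the video) of a chat loop that resends the full transcript on every call; `call_llm` is a hypothetical stand-in for any completion endpoint:

```python
# Minimal sketch of managing conversation state outside the model.
# `call_llm` is a hypothetical placeholder for a real completion API call.
def call_llm(prompt: str) -> str:
    return "(model reply)"  # a real app would call OpenAI, HF, etc. here

history: list[str] = []

def chat(user_message: str) -> str:
    history.append(f"User: {user_message}")
    # The model keeps no state between calls, so the whole transcript
    # must be resent each turn. This is what eventually collides with
    # the finite token window.
    prompt = "\n".join(history) + "\nAssistant:"
    reply = call_llm(prompt)
    history.append(f"Assistant: {reply}")
    return reply
```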

LangChain is introduced as a framework designed to manage both the model and the prompt layer while integrating with external systems. Instead of building everything from scratch, developers can rely on a growing open-source ecosystem that has become a de facto standard for building LLM-powered apps. The emphasis is that everything begins with prompts: prompt engineering has evolved from simple question prompts (e.g., asking for a historical fact) to richer instructions and structured examples. The transcript points to the InstructGPT paper as a major milestone showing that reinforcement learning from human feedback can make models follow instructions more reliably.

Prompt templates are the mechanism that makes this practical inside LangChain. A prompt template defines the static instruction text plus variables that get injected at runtime, similar to formatting an f-string in Python. The tutorial uses a restaurant-naming example: a template instructs the model to act as a naming consultant, then inserts a user-provided restaurant description to generate a list of short, catchy names tied to the cuisine. The same template can be reused across different inputs (a Greek restaurant vs. a burger place themed with baseball memorabilia), demonstrating how injecting different variables changes the output without rewriting the prompt.
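In code, this looks roughly like the sketch below, using the pre-1.0 LangChain API from the era of the video; the exact template wording is paraphrased, not quoted from the tutorial:

```python
from langchain.prompts import PromptTemplate

# Static instruction text with a runtime variable, like an f-string
template = """You are a naming consultant for new restaurants.
Return a list of five short, catchy names.

What is a good name for a restaurant that serves {description}?"""

prompt = PromptTemplate(
    input_variables=["description"],
    template=template,
)

# The same template serves different inputs at runtime
print(prompt.format(description="authentic Greek food"))
print(prompt.format(description="burgers, with baseball memorabilia decor"))
```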

The tutorial then adds few-shot prompting via few-shot prompt templates. Instead of only giving instructions, it includes example input-output pairs inside the prompt so the model can infer the pattern. A concrete example asks for antonyms: the prompt includes demonstrations like “happy → sad” and “tall → short,” then requests the antonym for a new word. The transcript notes that in real apps, users often shouldn’t see the full prompt; templates let developers keep the examples internal while still benefiting from them.
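A sketch of how this assembles in LangChain's `FewShotPromptTemplate` (again the pre-1.0 API; the field names follow the library, while the exact strings are illustrative):

```python
from langchain.prompts import PromptTemplate, FewShotPromptTemplate

# Demonstration pairs the model should generalize from
examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]

# How each example pair is rendered inside the prompt
example_prompt = PromptTemplate(
    input_variables=["word", "antonym"],
    template="Word: {word}\nAntonym: {antonym}",
)

few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input.",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
    example_separator="\n\n",
)

# The end user only supplies "big"; the examples stay internal
print(few_shot_prompt.format(input="big"))
```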

Finally, the walkthrough moves into Colab code setup: installing packages for OpenAI and LangChain, plus Hugging Face Hub access, then running simple prompt calls against an OpenAI model (text-davinci-003) and a Hugging Face model (Google flan-t5-xl). It culminates in a basic “LLM chain” that ties together a chosen model and a prompt template, producing outputs from injected inputs—setting the stage for more advanced LangChain concepts like tools and chains in later videos.
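The setup reduces to a few Colab cells along these lines (a sketch assuming the pre-1.0 `langchain` package shown in the video; replace the placeholder keys with your own):

```python
# In Colab, first: !pip install -q openai langchain huggingface_hub
import os

os.environ["OPENAI_API_KEY"] = "sk-..."            # placeholder key
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_..."  # placeholder token

from langchain.llms import OpenAI, HuggingFaceHub

# OpenAI completion model used in the video
openai_llm = OpenAI(model_name="text-davinci-003", temperature=0.9)
print(openai_llm("Suggest a name for a Greek restaurant."))

# Hosted Hugging Face model used in the video
hf_llm = HuggingFaceHub(
    repo_id="google/flan-t5-xl",
    model_kwargs={"temperature": 0.9},
)
print(hf_llm("Suggest a name for a Greek restaurant."))
```

Running the same prompt through both models makes the point that model choice matters even when the prompt layer stays identical.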

Cornell Notes

LangChain is positioned as the framework that turns stateless LLM calls into usable applications by managing prompts and connecting models to external systems like APIs and databases. Because LLMs can’t reliably maintain state or access the traditional software stack on their own, apps need an orchestration layer—especially when token limits make long context difficult. The tutorial emphasizes that prompts are the foundation: prompt templates define reusable instruction text with variables injected at runtime, like generating restaurant names from a user’s description. Few-shot prompt templates extend this by embedding example input-output pairs (e.g., antonyms) so the model follows a pattern without changing the underlying template. A basic LLM chain then combines a selected model with a prompt template to produce outputs for new inputs.

Why can’t a large language model alone serve as a complete app backend?

LLMs generate text by predicting the next token from the input tokens, which makes them good at conditional generation. But they’re stateless: each call is independent, so maintaining a conversation requires resending prior context. That quickly collides with finite token windows (the transcript mentions limits like ~2048 tokens and larger but still finite contexts). Also, LLMs don’t inherently access external tools, databases, or APIs—so chatbots that need search or other data require an interface to the traditional software stack.
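One practical way to see the constraint, not shown in the video, is to count tokens before each call; this sketch assumes OpenAI's tiktoken tokenizer:

```python
import tiktoken

# Tokenizer matching text-davinci-003 (p50k_base encoding)
enc = tiktoken.encoding_for_model("text-davinci-003")

conversation = "User: Hi!\nAssistant: Hello! How can I help?\n"
n_tokens = len(enc.encode(conversation))

CONTEXT_LIMIT = 2048  # the ballpark window mentioned in the transcript
print(f"{n_tokens} of {CONTEXT_LIMIT} tokens used")
if n_tokens > CONTEXT_LIMIT:
    print("Over budget: truncate or summarize earlier turns")
```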

What does LangChain add to the workflow around LLMs?

LangChain is described as a framework for building fully featured apps that interact with the normal software stack. It manages how prompts are constructed and how models are invoked, and it supports integration with external components such as APIs, calculators, and databases. Instead of reinventing prompt handling and orchestration, developers can reuse LangChain’s abstractions to manage model + prompt + external integrations.

How do prompt templates work in LangChain?

A prompt template contains instruction text plus input variables that get injected at runtime—similar to formatting an f-string in Python. In the restaurant-naming example, the template tells the model to act as a naming consultant and to return short, catchy names related to the restaurant type. The user’s restaurant description becomes the injected variable, so the same template can generate different outputs for different cuisines or themes.

What is few-shot prompting, and how does a few-shot prompt template differ from a basic template?

Few-shot prompting adds example input-output pairs inside the prompt so the model can infer the desired pattern. In the antonym example, the prompt includes demonstrations like “happy → sad” and “tall → short,” then asks for the antonym of a new word. LangChain’s few-shot prompt templates assemble a prefix, the embedded examples, and a suffix so the model sees the pattern while the app can keep the full prompt hidden from end users.

What is an LLM chain in this tutorial’s terms?

The tutorial uses a simple “large language model chain” that combines (1) a chosen LLM (e.g., OpenAI text-davinci-003) with (2) a prompt template (basic or few-shot). The chain then runs the prompt template with injected inputs (like a restaurant description) and returns the model’s output. This is framed as the first step toward more complex LangChain compositions later.
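Putting the pieces together, a minimal chain might look like this (a sketch in the pre-1.0 API; assumes `OPENAI_API_KEY` is already set in the environment):

```python
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

llm = OpenAI(model_name="text-davinci-003", temperature=0.9)

prompt = PromptTemplate(
    input_variables=["description"],
    template="What is a good name for a restaurant that serves {description}?",
)

# The chain binds model + template into one callable unit
chain = LLMChain(llm=llm, prompt=prompt)

# run() injects the input into the template and calls the model
print(chain.run("burgers, in a diner themed with baseball memorabilia"))
```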

How are models accessed in the Colab setup shown?

The walkthrough installs packages for OpenAI and LangChain, plus Hugging Face Hub access. It then uses API keys for OpenAI and Hugging Face endpoints. It demonstrates raw prompting with OpenAI’s text-davinci-003 and with a Hugging Face model such as Google flan-t5-xl, showing that different models produce different responses even for similar prompts.

Review Questions

  1. How does LangChain’s prompt templating reduce the need to rewrite instructions for every new user input?
  2. What practical issues arise from LLM statelessness and token limits when building chat or document-heavy applications?
  3. In few-shot prompting, why do example pairs (input → expected output) often improve consistency compared with a single instruction?

Key Points

  1. LLMs generate text by predicting the next token from input tokens, but they are stateless and can’t directly access tools, databases, or APIs.
  2. Conversation state must be managed externally by resending or summarizing prior context, which is constrained by finite token windows.
  3. LangChain is positioned as an orchestration framework that manages prompts and model calls while integrating with the traditional software stack.
  4. Prompt templates make prompts reusable by defining instruction text plus runtime variables injected into the prompt.
  5. Few-shot prompt templates improve task consistency by embedding example input-output pairs inside the prompt.
  6. A basic LLM chain combines a selected model with a prompt template to produce outputs from injected inputs.
  7. The tutorial demonstrates model access via OpenAI (text-davinci-003) and Hugging Face (Google flan-t5-xl) using API keys in Colab.

Highlights

LangChain is framed as the missing layer between stateless LLM calls and real applications that need state, external data, and tool use.
Prompt templates turn prompt engineering into reusable components by injecting variables at runtime rather than rewriting instructions each time.
Few-shot prompting is implemented by embedding example input-output pairs (like antonyms) so the model follows a pattern for new inputs.
The restaurant-naming example shows how the same prompt template can generate cuisine-specific name lists simply by changing injected descriptions.
