LLM JSON Output - Get Valid JSON with Pydantic and LangChain Output Parsers
Based on Venelin Valkov's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Getting reliable JSON from large language models—especially ones that don’t natively support structured outputs—requires more than “please output JSON.” The core approach here is to pair a strict schema (via Pydantic) with output-parsing logic (via LangChain-style parsers), and to enforce “pure JSON only” formatting so downstream code can safely consume the result.
The walkthrough starts with Groq's API, which can be asked for a JSON object directly. Using Groq's client, it sets an API key and selects a model (defaulting to Llama 3 70B). A custom predict function builds a messages array (optionally prepending a system prompt), calls the chat completions endpoint, and, when JSON output is requested, passes a parameter that tells the API to return a JSON object rather than free-form text. For models that support this mode, the workflow is straightforward: include a system prompt that demands JSON, provide a sample JSON shape, and set the response format to JSON. The result is a response that can be printed as JSON and parsed without extra cleanup.
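A minimal sketch of this predict function, using Groq's OpenAI-compatible Python SDK. The model name, prompt wording, and helper names here are illustrative assumptions, not verbatim from the video:

```python
def build_messages(prompt, system_prompt=None):
    """Assemble the messages array, optionally prepending a system prompt."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    return messages


def predict(prompt, system_prompt=None, json_mode=False, model="llama3-70b-8192"):
    """Call Groq's chat completions endpoint; request a JSON object when asked."""
    from groq import Groq  # requires `pip install groq` and a GROQ_API_KEY env var

    client = Groq()
    kwargs = {"model": model, "messages": build_messages(prompt, system_prompt)}
    if json_mode:
        # Tells the API to return a JSON object instead of free-form text.
        kwargs["response_format"] = {"type": "json_object"}
    response = client.chat.completions.create(**kwargs)
    return response.choices[0].message.content
```

With `json_mode=True`, the returned string can be passed straight to `json.loads` without stripping wrappers.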
The more fragile case is when the model only returns text. For that, the method shifts to a two-step pipeline: (1) generate text that contains JSON in a predictable wrapper, then (2) extract and validate it against a Pydantic schema. The example defines a Pydantic BaseModel with two required fields—readability and conciseness—both scored from 0 to 10. It then leverages LangChain/Ragas-style schema prompting patterns: the prompt includes the schema, instructs the model to return only a pure JSON string (no preamble or explanation), and often specifies that the JSON should be surrounded by triple backticks. After receiving the model output, the code strips the backticks, parses the remaining string into a Python dictionary, and uses Pydantic’s parsing/validation to ensure the fields and types match.
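The fallback pipeline above can be sketched as follows. The field names match the transcript (readability and conciseness, both required, 0 to 10); the class and helper names are illustrative:

```python
import json
import re

from pydantic import BaseModel, Field


class TweetEvaluation(BaseModel):
    """Schema for the expected output; out-of-range or missing fields fail validation."""
    readability: int = Field(ge=0, le=10)
    conciseness: int = Field(ge=0, le=10)


def parse_evaluation(raw: str) -> TweetEvaluation:
    """Strip a triple-backtick wrapper if present, then parse and validate."""
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", raw, re.DOTALL)
    payload = match.group(1) if match else raw.strip()
    return TweetEvaluation(**json.loads(payload))


result = parse_evaluation('```json\n{"readability": 8, "conciseness": 7}\n```')
```

A response like `{"readability": 12, ...}` or one missing a field raises a Pydantic `ValidationError` instead of silently propagating bad data downstream.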
A key practical detail is the prompt engineering itself. The “text-only” prompt is long and prescriptive: it repeats the evaluation task (scoring tweet writing style for readability and conciseness), includes both a correctly formatted example and a negative example where the JSON object properties are not well formatted, and ends with explicit instructions to output JSON only. In the example, running this prompt against Llama 3 yields a response that can be cleaned (removing the backticks) and parsed into the Pydantic model.
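An illustrative reconstruction of that prescriptive prompt; the exact wording in the video differs, but the structure (task, positive example, negative example, JSON-only instruction) is the same:

```python
FENCE = "`" * 3  # triple backticks, built indirectly so they don't clash with this code block


def evaluation_prompt(tweet: str) -> str:
    """Build the prescriptive 'text-only' evaluation prompt for a given tweet."""
    return f"""Evaluate the writing style of the tweet below.
Score readability and conciseness, each from 0 to 10.

Return ONLY a pure JSON string surrounded by triple backticks.
Do not add a preamble, explanation, or any other text.

Example of a correctly formatted response:
{FENCE}json
{{"readability": 7, "conciseness": 9}}
{FENCE}

Example of an incorrectly formatted response (properties not well formatted):
{FENCE}json
{{readability: "7 out of 10", Conciseness: nine}}
{FENCE}

Tweet:
{tweet}
"""
```

The negative example gives the model a concrete anti-pattern to avoid, which tends to improve adherence more than the instruction alone.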
Finally, the workflow can be simplified further by using LangChain’s Pydantic output parser directly. When integrated into LangChain chains, that parser can also trigger a repair loop: if parsing fails due to invalid JSON, LangChain can call the model again with instructions to fix the output. The end result is a reusable pattern: use native JSON support when available; otherwise, enforce schema-driven prompting plus strict parsing/validation so applications can reliably consume structured LLM outputs.
Cornell Notes
The transcript presents a practical method for getting valid JSON from LLMs, even when they don’t support structured outputs natively. When Groq’s API supports JSON mode, a system prompt plus a JSON response format yields directly parseable JSON. For text-only models, the method switches to schema-driven prompting using Pydantic: define a model (e.g., readability and conciseness, both required), instruct the LLM to output pure JSON (often wrapped in triple backticks), then strip wrappers and parse/validate with Pydantic. LangChain’s Pydantic output parser can further improve reliability by re-asking the LLM to repair invalid JSON when parsing fails.
Why does JSON-only prompting often fail with smaller or older models, and what workaround is used here?
How does the Groq-based approach produce JSON without manual extraction?
What role does Pydantic play in the text-only fallback strategy?
Why include examples (including a negative example) inside the prompt?
How does LangChain’s Pydantic output parser improve reliability beyond basic parsing?
Review Questions
- What changes when moving from a model/API that supports JSON response format to one that only returns text?
- How do Pydantic schema requirements (field names and required-ness) influence the parsing and validation step?
- What specific prompt constraints (e.g., “pure JSON only,” backticks, examples) are used to reduce malformed outputs?
Key Points
1. Use native JSON response formatting when the API/model supports it; it reduces cleanup and parsing errors.
2. For text-only models, generate JSON inside a predictable wrapper (often triple backticks) and then strip the wrapper before parsing.
3. Define a strict Pydantic schema for the expected fields (e.g., readability and conciseness) so invalid outputs fail validation instead of silently propagating.
4. Use schema-driven prompting (include the expected structure and required fields) to guide the model toward correct JSON formatting.
5. Add explicit “JSON only” instructions and include both correct and incorrect formatting examples to improve adherence.
6. When using LangChain, rely on its Pydantic output parser and repair behavior to re-ask for corrected JSON if parsing fails.