
Python Pydantic Tutorial: Complete Data Validation Course (Used by FastAPI)

Corey Schafer · 5 min read

Based on Corey Schafer's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Pydantic replaces hand-written validation by using BaseModel classes with type hints that enforce expectations at runtime.

Briefing

Pydantic is positioned as a practical replacement for hand-written validation: define a model with Python type hints, and Pydantic enforces those expectations at runtime—collecting all validation errors in one pass instead of failing on the first problem. That shift matters for real applications because it improves user experience (multiple issues reported together), reduces brittle boilerplate, and scales to complex inputs like nested objects, lists, and datetimes.

The tutorial starts with a manual validation example for a “create user” function that checks types field-by-field. When invalid data arrives (e.g., email is None and age is a string), the manual approach reports only the first error encountered. After fixing that, the next run reveals the next issue—an inefficient loop for both developers and users. Pydantic’s BaseModel approach replaces that code with a class that declares fields using type annotations. When the same invalid input is provided, Pydantic returns multiple validation errors at once (e.g., email must be a valid string and age must be a valid integer), along with structured error details.
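The aggregated-error behavior described above can be sketched with a minimal model (the field names here are illustrative, not taken verbatim from the video):

```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    email: str
    age: int

# Invalid input: email is None and age is a non-numeric string.
try:
    User(name="alice", email=None, age="thirty")
except ValidationError as e:
    errors = e.errors()

# Both problems are reported in a single pass, not one at a time.
field_names = {err["loc"][0] for err in errors}
print(field_names)  # {'email', 'age'}
```

Each entry in `e.errors()` is a structured dict with the field location, an error type, and a human-readable message, which is what makes the output useful for APIs as well as debugging.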

From there, the focus moves to getting started with Pydantic v2, including installation via pip or uv and a key compatibility warning: method names changed between v1 and v2 (for example, model_dump and model_dump_json replace older conversion helpers). The tutorial then builds a “User” model from scratch, demonstrating how required fields work (no defaults), how optional fields behave (defaults like empty strings or booleans), and how truly optional values use Union syntax with None. It also shows core ergonomics: accessing fields via dot notation, converting models to dictionaries or JSON with model_dump and model_dump_json, and handling validation failures via ValidationError.

A major theme is how Pydantic validates and transforms data. Type coercion is enabled by default, so inputs like UID="123" may be converted to an integer automatically, while incompatible conversions (like UID="test") produce errors. The tutorial expands into supported types (primitives, collections, and datetime types), unions, and Literal constraints for fields that must match exact allowed values.
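Coercion and Literal constraints can be demonstrated together; the `uid`/`role` fields below are illustrative:

```python
from typing import Literal
from pydantic import BaseModel, ValidationError

class Account(BaseModel):
    uid: int
    role: Literal["admin", "user", "guest"]

# Default (lax) mode coerces compatible inputs: "123" -> 123.
a = Account(uid="123", role="admin")

# An incompatible string still fails to validate.
try:
    Account(uid="test", role="admin")
except ValidationError:
    coercion_failed = True

# Literal restricts a field to an exact set of allowed values.
try:
    Account(uid=1, role="superuser")
except ValidationError:
    literal_failed = True
```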

Next comes field constraints using Pydantic v2’s recommended Annotated pattern: Annotated types combined with metadata such as greater-than/less-than bounds, min/max string lengths, and regex patterns for formats like URL-friendly slugs. When invalid values violate those constraints, Pydantic reports all failures together with clear messages.
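A minimal sketch of the Annotated pattern using `Field` metadata (the constraint values are examples, not the video's exact numbers):

```python
from typing import Annotated
from pydantic import BaseModel, Field, ValidationError

class Post(BaseModel):
    uid: Annotated[int, Field(gt=0)]
    username: Annotated[str, Field(min_length=3, max_length=20)]
    slug: Annotated[str, Field(pattern=r"^[a-z0-9-]+$")]

# All three values below violate their constraints.
try:
    Post(uid=0, username="ab", slug="Bad Slug!")
except ValidationError as e:
    n_errors = len(e.errors())

print(n_errors)  # 3 constraint failures reported together
```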

The tutorial then highlights Pydantic’s built-in “special types” for common validation needs—email strings, HTTP URLs, secret strings for passwords, and UUID generation—often requiring optional extra dependencies (like email-validator). It also demonstrates custom validation and normalization using field_validator and model_validator, including the difference between running logic after type validation (default) versus before it (mode="before") when preprocessing is required.
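The validator machinery can be sketched with SecretStr (which needs no extra dependency, unlike EmailStr) plus a before-mode field validator and a cross-field model validator; the field names are illustrative:

```python
from pydantic import BaseModel, SecretStr, field_validator, model_validator

class Signup(BaseModel):
    username: str
    password: SecretStr
    confirm_password: SecretStr

    @field_validator("username", mode="before")
    @classmethod
    def strip_username(cls, v):
        # mode="before": preprocess the raw input before type validation.
        return v.strip() if isinstance(v, str) else v

    @model_validator(mode="after")
    def passwords_match(self):
        # Cross-field check, run after individual fields validate.
        if self.password.get_secret_value() != self.confirm_password.get_secret_value():
            raise ValueError("passwords do not match")
        return self

s = Signup(username="  corey  ", password="s3cret", confirm_password="s3cret")
print(s.username)  # 'corey'
print(s.password)  # masked as '**********' in output and logs
```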

Finally, it covers computed fields (derived values included in serialization), nested models (recursive validation for models inside models), and serialization controls through model configuration: aliases for external naming conventions (e.g., ID vs UID), include/exclude rules for outputs, loading from JSON with model_validate_json, and model config options like strict type checking, extra field handling, validate assignment, and frozen models for immutability. The end result is a workflow where type hints become both documentation and enforcement, with Pydantic acting as the validation backbone for ecosystems like FastAPI and SQLModel.
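The alias and configuration ideas can be sketched as follows (`uid`/`id` mirror the naming example above; the config choices are illustrative):

```python
from pydantic import BaseModel, ConfigDict, Field

class User(BaseModel):
    model_config = ConfigDict(populate_by_name=True, extra="forbid")

    uid: int = Field(alias="id")  # external name 'id', internal name 'uid'
    name: str

# Load external JSON that uses the alias.
u = User.model_validate_json('{"id": 1, "name": "corey"}')
print(u.uid)  # 1

# Serialize back out using the external naming convention.
print(u.model_dump(by_alias=True))  # {'id': 1, 'name': 'corey'}
```

`populate_by_name=True` lets callers use either `uid` or `id` when constructing the model, and `extra="forbid"` rejects unexpected keys instead of silently ignoring them.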

Cornell Notes

Pydantic turns Python type hints into runtime data validation. Instead of failing on the first bad field, it aggregates multiple validation errors, making inputs easier to debug and improving UX. The tutorial builds models from BaseModel, shows required vs optional fields (including Union[..., None]), and demonstrates serialization with model_dump and model_dump_json. It also covers constraints via Annotated types (bounds, lengths, regex), special types like EmailStr/HttpUrl/SecretStr/UUID4, and custom logic with field_validator and model_validator (including mode="before" preprocessing). Finally, it explains computed fields, nested models, aliases and include/exclude during serialization, and model configuration options like strict mode, extra handling, validate assignment, and frozen models.

Why does Pydantic feel different from manual validation, and what concrete improvement does it provide?

Manual validation often stops at the first failure, so fixing one issue only reveals the next on a rerun. In the example, invalid user input (email=None and age="string") triggers just one manual error at a time. With Pydantic, a BaseModel with typed fields validates automatically and returns multiple errors together—e.g., it reports both that email must be a valid string and age must be a valid integer in a single response.

How do required fields, optional defaults, and “optional but None allowed” differ in Pydantic models?

Required fields are those without default values (e.g., username: str and email: str). Optional fields with defaults become non-required at input time (e.g., bio: str = "" and is_active: bool = True). For values that may legitimately be None, the type must include None using Union syntax (e.g., full_name: str | None = None, and verified: datetime | None = None).

What is the practical effect of type coercion, and how can strict mode change it?

Type coercion is enabled by default, so some inputs are converted when it makes sense—e.g., UID="123" (string) can be coerced into an integer 123, avoiding an error. But UID="test" can’t be converted, so validation fails. Setting strict=True in model_config disables coercion, so a string like "39" is rejected when an integer is required.
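The strict-mode difference can be shown in a few lines; the model name is illustrative:

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class StrictUser(BaseModel):
    model_config = ConfigDict(strict=True)
    uid: int

# In strict mode the string "39" is NOT coerced to an int.
try:
    StrictUser(uid="39")
except ValidationError:
    rejected = True

ok = StrictUser(uid=39)  # an actual int still validates
```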

How do annotated constraints and regex patterns enforce rules beyond basic types?

Constraints are attached to types using Annotated. Examples include UID must be greater than zero (gt/ge), username length must be between 3 and 20 (min_length/max_length), and age must be within a range (ge/le). For format rules, a slug field can use a regex pattern that allows only lowercase letters, digits, and hyphens; inputs not matching the pattern fail validation.

When should custom validators run in “before” mode instead of the default “after” mode?

Use mode="before" when you must preprocess the raw input before Pydantic’s built-in validation runs. The tutorial’s website example shows this: the built-in HttpUrl validation rejects URLs missing an HTTP/HTTPS prefix. A before-mode validator can add the prefix first, so the subsequent URL validation succeeds.
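A sketch of the website example under these assumptions (prepending "https://" when no scheme is present):

```python
from pydantic import BaseModel, HttpUrl, field_validator

class Profile(BaseModel):
    website: HttpUrl

    @field_validator("website", mode="before")
    @classmethod
    def add_scheme(cls, v):
        # Runs on the raw input, before the built-in HttpUrl
        # validation, so a missing scheme can be added first.
        if isinstance(v, str) and not v.startswith(("http://", "https://")):
            return "https://" + v
        return v

p = Profile(website="example.com")
print(p.website)  # validated as an HTTPS URL
```

Without the before-mode validator, `Profile(website="example.com")` would fail, because HttpUrl requires a scheme.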

What roles do computed fields and nested models play in structuring complex data?

Computed fields derive values from other fields and are included during serialization. For example, display_name is computed from first_name and last_name (falling back to username), and is_influencer is computed from follower_count > 10,000. Nested models let one model contain other BaseModel instances (e.g., a blog post with an author user and a list of comment models), with recursive validation applied to all nested structures.
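Both ideas can be sketched together; the blog-post shape follows the description above, with illustrative field names:

```python
from pydantic import BaseModel, computed_field

class Author(BaseModel):
    first_name: str
    last_name: str

    @computed_field
    @property
    def display_name(self) -> str:
        # Derived value, included when the model is serialized.
        return f"{self.first_name} {self.last_name}"

class Comment(BaseModel):
    text: str

class BlogPost(BaseModel):
    title: str
    author: Author                  # nested model, validated recursively
    comments: list[Comment] = []    # list of nested models

post = BlogPost(
    title="Hello",
    author={"first_name": "Corey", "last_name": "Schafer"},  # dict coerced to Author
    comments=[{"text": "Nice post"}],
)
print(post.author.display_name)                      # 'Corey Schafer'
print(post.model_dump()["author"]["display_name"])   # computed field serialized too
```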

Review Questions

  1. When would you choose Union[T, None] over a default value like "" or True in a Pydantic model?
  2. How does Pydantic’s error aggregation change the debugging workflow compared with manual validation that stops at the first error?
  3. What is the difference between field_validator and model_validator, and why would you use model_validator for password/confirm-password matching?

Key Points

  1. Pydantic replaces hand-written validation by using BaseModel classes with type hints that enforce expectations at runtime.
  2. Pydantic aggregates validation errors so multiple field problems are reported together instead of one-at-a-time failures.
  3. Pydantic v2 uses updated conversion helpers like model_dump and model_dump_json, and method names differ from v1.
  4. Use Annotated types with metadata (gt/ge/lt/le, min/max length, regex patterns) to enforce constraints beyond basic type checking.
  5. Special types like EmailStr, HttpUrl, SecretStr, and UUID4 provide common validation and security behaviors with minimal custom code.
  6. Custom validation can preprocess inputs with field_validator(mode="before") or validate cross-field logic with model_validator after individual fields validate.
  7. Model configuration options (strict type checking, alias handling, extra field behavior, validate assignment, and frozen models) control how data is accepted and serialized.

Highlights

Pydantic reports multiple validation errors in one run, avoiding the repetitive “fix one field, rerun, discover the next” loop common in manual validation.
Type coercion is on by default (e.g., UID="123" can become 123), but strict=True disables coercion for tighter correctness.
Constraints in Pydantic v2 are most commonly expressed with annotated types, enabling bounds, length limits, and regex-based format checks.
SecretStr hides password values in logs/serialization while still allowing retrieval via get_secret_value when needed.
Aliases and serialization controls (populate_by_name, by_alias, include/exclude) let internal Python naming differ from external API naming conventions.

Topics

  • Pydantic Data Validation
  • Pydantic v2 Models
  • Field Constraints
  • Custom Validators
  • Serialization and Aliases
