Please Stop Using Booleans.

TL;DR

Treat booleans as logic results in application code, not as durable representations of business state in databases.


Briefing

Booleans belong in logic, not in durable data models, and the cost of treating them as “state” shows up as inconsistent records, brittle code, and security mistakes at scale. The core claim is blunt: in most real systems, a boolean that gets stored (especially in a database) should usually be replaced by something that preserves meaning and prevents contradictory states. The payoff is cleaner architecture, fewer “split brain” bugs, and easier evolution when product requirements change.

A common failure pattern is storing a boolean that represents an event, then later realizing the timestamp matters. Email confirmation is the example: teams often start with a boolean like is_confirmed, then later add confirmed_at. Once both fields exist, updates can drift: one code path sets the timestamp while another flips the boolean without it. That produces “split brain” data where the record says is_confirmed=true but confirmed_at is null (or vice versa). The fix is to store the higher-resolution fact (a nullable datetime such as confirmed_at) and derive the boolean from it, or otherwise avoid duplicating the same underlying truth in two coupled fields.
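A minimal sketch of the "store the timestamp, derive the boolean" idea, written as an in-memory Python model rather than a real schema. The User class, its email field, and the confirm() helper are illustrative assumptions; only the confirmed_at and is_confirmed names come from the example above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class User:
    email: str
    # The only stored fact: when the address was confirmed (None = not yet).
    confirmed_at: Optional[datetime] = None

    @property
    def is_confirmed(self) -> bool:
        # Derived on read, so it cannot disagree with confirmed_at.
        return self.confirmed_at is not None

    def confirm(self, now: Optional[datetime] = None) -> None:
        # Single write path: setting the timestamp *is* the confirmation.
        self.confirmed_at = now or datetime.now(timezone.utc)


user = User(email="ada@example.com")
before = user.is_confirmed   # no timestamp stored yet
user.confirm()
after = user.is_confirmed    # timestamp stored, boolean follows
```

Because there is no second field to update, no code path can produce the "confirmed but no timestamp" record described above.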

The same logic applies to status-like booleans. Instead of modeling a single job lifecycle, teams end up with multiple columns such as completed, failed, and in_progress. Those flags can contradict each other when partial updates happen: for example, a failure path sets failed=true but leaves in_progress=true, so the UI keeps showing a spinner on a failed job or client logic misreads the state. The recommended replacement is an enum-style status field (ideally enforced by constraints and type validation) so mutually exclusive states can’t overlap. When timestamps matter too, the model should capture event times alongside status, often as started_at and finished_at, so durations and failure timing can be inferred rather than hard-coded into additional booleans.
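One way to picture the enum-plus-timestamps model is the sketch below. The Job and JobStatus names, the extra QUEUED state, and the show_spinner property are assumptions made for illustration; the source only names the completed, failed, and in_progress states and the started_at and finished_at fields.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class JobStatus(Enum):
    QUEUED = "queued"            # assumed extra state, not from the source
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Job:
    # Exactly one status at a time: "failed and in progress" cannot be expressed.
    status: JobStatus = JobStatus.QUEUED
    started_at: Optional[datetime] = None
    finished_at: Optional[datetime] = None

    @property
    def duration(self) -> Optional[timedelta]:
        # Derived from the two stored event times.
        if self.started_at is None or self.finished_at is None:
            return None
        return self.finished_at - self.started_at

    @property
    def failed_at(self) -> Optional[datetime]:
        # No failed_at column needed: status plus finished_at says when it failed.
        return self.finished_at if self.status is JobStatus.FAILED else None

    @property
    def show_spinner(self) -> bool:
        return self.status is JobStatus.IN_PROGRESS


job = Job(
    status=JobStatus.FAILED,
    started_at=datetime(2024, 1, 1, 12, 0, 0),
    finished_at=datetime(2024, 1, 1, 12, 0, 30),
)
```

Here the spinner decision, the duration, and the failure time are all computed from three stored fields instead of being persisted as separate flags.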

Normalization and constraints are presented as the mechanism that keeps the database honest. With a well-shaped schema, the database effectively becomes a state machine: constraints enforce which combinations are valid, and the application derives what it needs from the stored facts. Even “created_at” and “updated_at” are treated as near-universal design primitives because they support auditing and debugging without requiring ad hoc logic.
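To make the "database as state machine" idea concrete, here is a hedged sketch using SQLite through Python's standard sqlite3 module. The jobs table, its exact columns, and the try_insert helper are assumptions for illustration, not a schema from the source; the point is only that CHECK constraints can reject invalid combinations before they are stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE jobs (
        id          INTEGER PRIMARY KEY,
        status      TEXT NOT NULL
                    CHECK (status IN ('in_progress', 'completed', 'failed')),
        started_at  TEXT NOT NULL,
        finished_at TEXT,
        created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        -- A job has a finish time exactly when it is no longer running.
        CHECK ((status = 'in_progress') = (finished_at IS NULL))
    )
    """
)


def try_insert(status, started_at, finished_at):
    """Return True if the database accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO jobs (status, started_at, finished_at) VALUES (?, ?, ?)",
                (status, started_at, finished_at),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok_running = try_insert("in_progress", "2024-01-01T12:00:00", None)
ok_failed = try_insert("failed", "2024-01-01T12:00:00", "2024-01-01T12:00:30")
bad_status = try_insert("kinda_done", "2024-01-01T12:00:00", None)
bad_combo = try_insert("in_progress", "2024-01-01T12:00:00", "2024-01-01T12:00:30")
```

The first two rows are accepted; the unknown status and the "still running but already finished" row are rejected by the database itself, regardless of which code path attempted the write.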

The transcript also draws a sharp line between permissions and data. Storing authorization rules as boolean fields or relying on database-level row permissions (as seen in systems like Firebase and Supabase) is framed as logic embedded in data, which makes it easier to misconfigure and harder to reason about. Permissions should be derived by application logic from stored data, not stored as the logic itself.
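A small sketch of "permissions derived by application logic from stored data." The User and Document shapes and the ownership-or-team rule are hypothetical; the source does not specify any particular authorization policy.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class User:
    id: int
    team_ids: Set[int] = field(default_factory=set)


@dataclass
class Document:
    owner_id: int
    team_id: int


def can_edit(user: User, doc: Document) -> bool:
    # The decision lives in code and is computed from stored facts
    # (ownership, team membership); no can_edit flag is persisted.
    is_owner = doc.owner_id == user.id
    is_teammate = doc.team_id in user.team_ids
    return is_owner or is_teammate


doc = Document(owner_id=1, team_id=10)
owner = User(id=1)
teammate = User(id=2, team_ids={10})
stranger = User(id=3, team_ids={99})
```

Changing the rule then means changing one function under version control, rather than migrating stored flags or editing rules configured outside the codebase.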

Finally, booleans still have a legitimate role as temporary variables: caching the result of a complex conditional for readability or performance, or naming intermediate checks before branching. But once the boolean becomes part of the persisted model, it risks coupling database structure to application logic. The practical takeaway: store the facts, enforce valid states, and derive booleans when needed—because the database should represent data, while code should represent decisions.
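The legitimate, temporary use of booleans might look like the sketch below: a large conditional split into named intermediates inside a function. The checkout scenario and every name in it are invented for illustration.

```python
from datetime import date


def can_user_checkout(
    cart_total: float, balance: float, card_expiry: date, today: date
) -> bool:
    # Each boolean names one intermediate check and exists only for this call.
    has_items = cart_total > 0
    can_afford = balance >= cart_total
    card_is_valid = card_expiry >= today
    return has_items and can_afford and card_is_valid


today = date(2024, 6, 1)
allowed = can_user_checkout(40.0, 100.0, date(2025, 1, 1), today)
expired = can_user_checkout(40.0, 100.0, date(2024, 1, 1), today)
```

None of these values is written back to storage; they are recomputed from the underlying facts each time the decision is needed.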

Cornell Notes

The transcript argues that booleans are often misused as persisted “state.” When teams store booleans in databases, they frequently duplicate meaning (e.g., is_confirmed plus confirmed_at) and create inconsistent records (“split brain”). Status is another common trap: multiple flags like completed/failed/in_progress can contradict each other after partial updates, leading to wrong UI behavior. The proposed alternative is to store a single enum-like status (enforced via constraints or runtime validation) plus the relevant event timestamps (such as started_at and finished_at), then derive other facts from those stored fields. Booleans still make sense as temporary variables in code for readability or intermediate results, but durable data models should store facts and prevent impossible combinations.

Why does storing both a boolean and a timestamp for the same event often backfire?

It duplicates the same underlying truth at two resolutions. The example is email confirmation: a database might store is_confirmed plus confirmed_at. If different code paths update these fields inconsistently, records can end up with is_confirmed=true while confirmed_at is null (or the reverse). That inconsistency is described as “split brain,” where coupled fields drift out of sync. The fix is to store the higher-resolution fact (e.g., a nullable confirmed_at datetime) and derive the boolean from it, avoiding the need to update two fields everywhere.

What goes wrong when job state is represented by multiple booleans like completed, failed, and in_progress?

Multiple flags allow contradictory combinations after partial updates. For instance, a failure path might set failed=true but forget to clear in_progress, leaving failed=true and in_progress=true at the same time, so UI logic that checks in_progress still shows a spinner. The transcript emphasizes that conflicting states should be structurally prevented: if completed and failed can’t both be true, the schema shouldn’t permit it. A single status field (enum-style) is presented as the cleaner model.

How do started_at and finished_at improve a status model?

They let the system infer derived information instead of storing extra booleans. With status plus started_at and finished_at, you can compute duration and determine when a job failed (e.g., when status=failed and finished_at is set). The transcript argues that storing only the key event times and status is more extensible and avoids a proliferation of redundant columns such as completed_at, failed_at, and in_progress_at.

What does “data normalization” mean in this context?

Normalization is treated as using database constraints to enforce which states are valid. When constraints prevent impossible combinations, the database behaves like a state machine. The transcript notes that even if mistakes can still happen (e.g., finished_at set while status is in_progress), constraints make those errors less likely than the chaos of many independent booleans that can drift independently.

Why are permissions framed as “logic” rather than “data”?

Permissions are decisions derived from user and resource data, not raw facts to store as booleans. The transcript criticizes approaches that embed authorization rules into stored structures or rely on database-level row permissions configured outside the application’s codebase (citing Firebase and Supabase as examples). The risk is misconfiguration: once permissions are logic stored as data, it becomes easy to gloss over what’s allowed and security issues can follow.

When is a boolean acceptable in code?

The transcript allows booleans as temporary variables—especially when caching the result of a complex conditional for readability or to avoid repeating a large expression. It also recommends naming intermediate checks (e.g., splitting a giant conditional into named const variables or extracting it into a function like canUserDoThis) so the branching logic stays clear. The caution is against persisting those booleans as part of the database model.

Review Questions

What specific inconsistency is described as “split brain,” and how does switching to a nullable datetime prevent it?
How does a single enum-like status field reduce contradictions compared with multiple independent status booleans?
What’s the difference between storing permissions as data versus deriving permissions as logic from underlying facts?

Key Points

1. Treat booleans as logic results in application code, not as durable representations of business state in databases.
2. Avoid duplicating the same meaning across multiple fields (e.g., is_confirmed plus confirmed_at); store the higher-resolution fact and derive the boolean.
3. Replace sets of overlapping status booleans (completed/failed/in_progress) with a single constrained status value to prevent impossible combinations.
4. When event timing matters, store event timestamps like started_at and finished_at and infer durations and failure timing from them instead of adding more booleans.
5. Use database constraints and normalization to enforce valid state transitions, effectively making the schema behave like a state machine.
6. Prefer deriving permissions from data in code rather than embedding authorization logic into stored structures or database-level permission rules.
7. Use booleans in code only as temporary named intermediates (or extracted functions) when they improve readability or avoid repeated complex conditionals.

Highlights

Storing both a confirmation boolean and a confirmation timestamp invites drift: one update path can set the boolean without the timestamp, creating “split brain” records.

Multiple status booleans (completed/failed/in_progress) can contradict each other after partial updates, producing wrong UI behavior like spinners showing on failed jobs.

A single status field plus started_at/finished_at lets systems infer durations and failure timing, reducing redundant columns.

Permissions are framed as logic derived from data; embedding them as stored rules or database-level row permissions increases the chance of misconfiguration.

Booleans are acceptable as temporary variables for readability—especially when naming intermediate results or extracting checks into functions.

Topics

Boolean Misuse
Database Schema
Enums and Status
Data Normalization
Permissions Logic

Mentioned

Nicole