Bad Code vs Good Code

ThePrimeTime · 5 min read

Based on ThePrimeTime's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.

TL;DR

Code quality is best judged by whether software fulfills its purpose and whether it remains easy to modify when requirements change.

Briefing

“Bad code” isn’t a single technical category so much as a mismatch between what software must do and what it costs to change when reality shifts. Across the discussion, the most consistent through-line is that code quality should be judged by two practical outcomes: whether it fulfills its purpose and whether it’s easy to modify when new requirements arrive. When those outcomes fail—because the problem is inherently hard, because the system’s assumptions no longer hold, or because the design makes small changes cascade into big rewrites—people experience the code as “bad,” even if it was reasonable at the time.

A key disagreement centers on the common claim that “hard to work with” automatically means “bad.” The counterpoint is that difficulty often comes from the domain and constraints, not just the code. A concrete example comes from building accessibility focus behavior for Netflix on TV. Web platforms provide focus and focus order; TV platforms don’t, so teams had to create their own focus ring and focus system. That work is inherently complicated, and labeling the resulting code as “bad” confuses the complexity of the topic with the quality of the implementation.
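
To make that concrete, here is a minimal sketch of what a hand-rolled focus system can involve. The names (FocusManager, Focusable) and the nearest-neighbor heuristic are assumptions for illustration, not Netflix's actual implementation: with no platform focus order, the app itself must track which element is focused and decide where each D-pad press lands.

```typescript
// Hypothetical sketch of a hand-rolled TV focus system.
// Web browsers give you focus and tab order for free; a TV platform does not,
// so the app must track focus itself and route every D-pad press.

interface Focusable {
  id: string;
  x: number;      // left edge, in screen pixels
  y: number;      // top edge
  width: number;
  height: number;
}

type Direction = "up" | "down" | "left" | "right";

class FocusManager {
  private items = new Map<string, Focusable>();
  private focusedId: string | null = null;

  register(item: Focusable): void {
    this.items.set(item.id, item);
    if (this.focusedId === null) this.focusedId = item.id; // first element gets focus
  }

  get focused(): string | null {
    return this.focusedId;
  }

  // Move focus to the nearest focusable element in the pressed direction.
  move(dir: Direction): void {
    const current = this.focusedId ? this.items.get(this.focusedId) : null;
    if (!current) return;

    const cx = current.x + current.width / 2;
    const cy = current.y + current.height / 2;

    let best: Focusable | null = null;
    let bestDist = Infinity;
    for (const item of this.items.values()) {
      if (item.id === current.id) continue;
      const ix = item.x + item.width / 2;
      const iy = item.y + item.height / 2;
      const inDirection =
        (dir === "left" && ix < cx) ||
        (dir === "right" && ix > cx) ||
        (dir === "up" && iy < cy) ||
        (dir === "down" && iy > cy);
      if (!inDirection) continue;
      const dist = (ix - cx) ** 2 + (iy - cy) ** 2;
      if (dist < bestDist) {
        bestDist = dist;
        best = item;
      }
    }
    if (best) this.focusedId = best.id; // this is also where a focus ring would be redrawn
  }
}

// Usage: two tiles side by side; pressing "right" moves focus between them.
const fm = new FocusManager();
fm.register({ id: "tile-a", x: 0, y: 0, width: 100, height: 100 });
fm.register({ id: "tile-b", x: 120, y: 0, width: 100, height: 100 });
fm.move("right");
console.log(fm.focused); // "tile-b"
```

Even this toy version has to answer questions the web platform normally answers for you (what counts as "nearest," what happens at screen edges), which is the point: the complexity lives in the domain before a single line is badly written.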

The discussion then pivots to a data-structure story about movie recommendations and badging. When two lists point to the same underlying movie object, changing the “reason you liked it” badge for one list would unintentionally change it for the other. The fix required restructuring the graph—splitting fields so the badge reason could vary per list rather than per shared object. That meant adding a new node type and updating the data model across the system. The takeaway: if a change is difficult because the underlying assumptions of the model don’t support the new requirement, the “badness” may belong to the problem framing and constraints, not to the programmer’s skill.
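
A small sketch (hypothetical types, not the real data model) shows both the aliasing bug and the shape of the fix: moving the badge reason off the shared movie object and onto a per-list entry node.

```typescript
// Hypothetical sketch of the aliasing problem: two lists share one movie object.

interface Movie {
  id: number;
  title: string;
  badgeReason: string; // lives on the shared object — this is the bug
}

const interstellar: Movie = { id: 1, title: "Interstellar", badgeReason: "" };

const becauseYouWatched = [interstellar];
const trendingNow = [interstellar]; // same reference, not a copy

becauseYouWatched[0].badgeReason = "Because you watched Inception";
console.log(trendingNow[0].badgeReason); // also "Because you watched Inception" — unintended

// The fix described in the discussion: restructure the graph so the badge
// reason hangs off a per-list node that *points to* the shared movie.
interface ListEntry {
  movie: Movie;        // shared, canonical movie data
  badgeReason: string; // per-list, so each list can differ
}

const byw: ListEntry[] = [
  { movie: interstellar, badgeReason: "Because you watched Inception" },
];
const trending: ListEntry[] = [
  { movie: interstellar, badgeReason: "Trending in your country" },
];
console.log(byw[0].badgeReason, "|", trending[0].badgeReason); // now independent
```

The original sharing was a reasonable deduplication choice; it only became a problem when a new requirement (per-list badges) invalidated the model's assumption that badge data belongs to the movie.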

From there, the conversation broadens into a pragmatic definition of quality. Lists of “bad code” symptoms—poor readability, convoluted logic, performance problems, resource leaks, security vulnerabilities, and higher debugging costs—are treated as reminders rather than true definitions. The more foundational test remains: does the code do what it’s meant to do, and can it be changed without excessive friction? Even security and reliability are framed as context-dependent: a system that isn’t resilient may be “bad” for a business where uptime matters more than scalability.

The transcript also challenges the idea that “good code” is always about elegance or minimalism. Over-engineering can happen when teams chase abstractions, patterns, and testability scaffolding that don’t pay off. At the same time, under-engineering can create technical debt that slows future work. The practical middle ground is iterative design, modularity, and tests that support refactoring—without turning mocks and abstractions into a maze that makes debugging harder than the bug itself.

Ultimately, the debate lands on subjectivity: different engineers experience “easy to change” differently based on familiarity, domain knowledge, and the system’s age. But the shared standard is concrete: code quality should be measured by outcomes—correct behavior and manageable change—rather than by aesthetic preferences or whether the code looks “nice” on first inspection.

Cornell Notes

“Bad code” is treated less like a fixed label and more like a symptom of two failures: the software doesn’t meet its purpose, or it becomes painful to change when requirements evolve. Complexity can come from the domain itself (e.g., TV accessibility focus behavior) rather than from poor implementation. A data-model example shows how shared objects can block per-list customization, forcing a structural redesign—making the change hard without necessarily proving the code was “bad.” The discussion also critiques simplistic checklists (readability, bugs, performance) and argues they’re context-dependent reminders. The most useful yardstick remains: does the code fulfill its purpose and stay easy to modify, given real constraints and assumptions.

Why does “hard to work on” not automatically mean “bad code” in this discussion?

Difficulty often reflects the problem’s inherent constraints. The Netflix TV accessibility example highlights this: web focus behavior exists as a platform feature, but TV platforms don’t provide focus order, so teams had to build their own focus ring and focus system. That work is complicated because the domain lacks built-in support, not because the implementation is inherently sloppy.

How does the movie-badging story illustrate the “badness” of code vs. the hardness of the problem?

When two lists point to the same movie object, a badge reason meant for one list would incorrectly apply to the other. The fix required changing the data structure so the badge reason could vary per list (e.g., splitting fields like “0.video.title” vs. a shared “0.title” concept). The change was difficult because the original model’s assumptions didn’t support the new requirement, not necessarily because the original code was incompetent.

What two criteria does the transcript repeatedly return to for defining quality?

Code quality is anchored to (1) whether the code does what it should and (2) whether it’s easy to change when needed. The transcript treats other concerns—readability, bugs, performance, security, maintainability—as important but secondary, because they often serve those two core outcomes depending on context.

What’s the critique of “lists of bad code” symptoms like readability, bugs, and security?

Those lists are described as reminders rather than true definitions. For example, a security vulnerability is hard to attribute cleanly: it may signal "bad code," or the system may simply have been a hard target. Likewise, debugging cost depends on familiarity and system structure. The transcript argues that checklists which omit the foundational criteria (purpose + changeability) end up misleading.

Why does the transcript warn against excessive abstraction and mocking?

Abstractions can create a debugging maze where the “real” logic is buried under layers, and mocks can diverge from reality—especially around HTTP semantics and error handling (redirect codes, retry behavior, status-specific flows). The concern is that test scaffolding and polymorphic layers can add complexity that makes understanding and fixing issues harder than necessary.
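
A short sketch (hypothetical API and helper names) shows how that divergence plays out: the production client has redirect and retry branches, but a happy-path mock never exercises them, so tests stay green while the real flows go untested.

```typescript
// Hypothetical sketch of a mock drifting from HTTP reality. The real API
// answers with redirects and 429 rate limits; the mock flattens all of that
// into a happy-path 200, so redirect and retry bugs never surface in tests.

interface SimpleResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

type FetchLike = (url: string) => Promise<SimpleResponse>;

// Production client: must follow redirects and retry on 429.
async function getResource(url: string, fetchImpl: FetchLike): Promise<string> {
  for (let attempt = 0; attempt < 3; attempt++) {
    const res = await fetchImpl(url);
    if (res.status >= 300 && res.status < 400) {
      url = res.headers["location"]; // follow the redirect
      continue;
    }
    if (res.status === 429) continue; // retry (a real client would back off first)
    if (res.status === 200) return res.body;
    throw new Error(`unexpected status ${res.status}`);
  }
  throw new Error("too many attempts");
}

// Test mock: always 200. The redirect/retry branches above are never executed,
// so a bug like reading the wrong redirect header would pass every test.
const happyMock: FetchLike = async () => ({ status: 200, headers: {}, body: "ok" });

getResource("https://api.example.com/movie/1", happyMock).then(console.log); // "ok"
```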

How does engineer familiarity affect whether code feels “easy to change”?

“Easy to change” is partly proportional to experience. A new engineer may take much longer to understand the same codebase, so the perceived change difficulty isn’t purely objective. The transcript treats this as a reason “good vs. bad” can feel subjective, even when the underlying criteria are concrete.

Review Questions

  1. What are the two core outcomes used to judge code quality, and how do the focus and movie-badging examples support that framework?
  2. In what ways can domain complexity or outdated assumptions make a required change difficult without proving the original code was poorly written?
  3. How do abstractions and mocks risk turning testability into additional complexity, and what does that imply for how to design for maintainability?

Key Points

  1. Code quality is best judged by whether software fulfills its purpose and whether it remains easy to modify when requirements change.
  2. Domain constraints can create complexity; labeling code "bad" just because it's hard to work with confuses problem difficulty with implementation quality.
  3. Shared data objects can block per-context customization, forcing data-model redesigns that make changes difficult even when the original design was reasonable.
  4. Checklists of "bad code" symptoms (readability, bugs, performance, security) are context-dependent reminders, not definitive definitions.
  5. Over-abstraction can make debugging harder by hiding the real logic behind layers of indirection.
  6. Mocks can diverge from real behavior (e.g., HTTP status/redirect semantics and retry flows), potentially increasing complexity rather than reducing risk.
  7. Perceived change difficulty varies with engineer familiarity and experience, making "good vs. bad" partly subjective even under objective criteria.

Highlights

  • "Bad code" is treated as a mismatch: it either fails its purpose or becomes too costly to change when reality shifts.
  • Netflix TV focus behavior is used to show that platform constraints can force complicated code that shouldn't be dismissed as "bad."
  • A movie-badging example demonstrates how shared-object data models can make new features require structural redesigns.
  • The transcript argues that abstractions and mocks can improve testability but also create debugging nightmares if they hide the path from input to behavior.

Topics

Mentioned

  • TDD
  • DORA