Don't Clean Code w/ Creator of HTMX

The PrimeTime · 6 min read

Based on The PrimeTime's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Carson Gross argues that clean-code rules—especially “always keep methods small”—are often treated as universal laws without strong empirical support.

Briefing

Carson Gross’s “Coding Dirty” pitch challenges the software industry’s default worship of “clean code” rules—especially the idea that small functions, heavy unit testing, and aggressive abstraction are always superior. Gross argues that many clean-code prescriptions are treated like universal laws without solid empirical support, and that real-world maintainability often improves when developers allow larger functions, test at the right level, and avoid abstraction that adds cognitive cost without delivering concrete benefits.

Gross’s first major target is the “keep methods tiny” doctrine. He points to research summarized in “Clean Code” and in later studies suggesting that longer methods can correlate with higher quality metrics—such as fewer bugs per line of code—and that the commonly cited “short is better” guidance often lacks rigorous empirical justification. He also offers practical reasons: large functions can be easier to read top-to-bottom, easier to debug because the full context stays in one place, and sometimes even safer to change because there’s less risk of behavior “leaking” through many small helper calls. In his view, splitting logic into dozens of micro-functions can create an “inversion of control” maze—strategy patterns and indirection that force developers to hop across implementations, losing the original context and making step debugging feel like whack-a-mole.
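
To make the indirection cost concrete, here is a minimal Python sketch of the pattern Gross criticizes (all names here are hypothetical, not taken from the episode): a strategy-pattern discount calculation that hides behavior behind a runtime lookup, next to a direct version that reads top to bottom.

```python
# Hypothetical example of strategy-pattern indirection: to learn what
# happens to a total, a reader must hop from the call site, to a dict
# lookup, to one of several classes.
class PercentDiscount:
    def apply(self, total: float) -> float:
        return total * 0.9

class FlatDiscount:
    def apply(self, total: float) -> float:
        return max(total - 5.0, 0.0)

STRATEGIES = {"percent": PercentDiscount(), "flat": FlatDiscount()}

def checkout(total: float, kind: str) -> float:
    # Which .apply() runs? The call site alone can't tell you.
    return STRATEGIES[kind].apply(total)

def checkout_direct(total: float, kind: str) -> float:
    # The same logic kept in one place, readable top to bottom.
    if kind == "percent":
        return total * 0.9
    if kind == "flat":
        return max(total - 5.0, 0.0)
    raise ValueError(f"unknown discount kind: {kind}")

assert checkout(100.0, "percent") == checkout_direct(100.0, "percent") == 90.0
```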

That doesn’t mean decomposition is useless. Gross draws a line between extracting code to enable reuse or manage complexity and fragmenting it for its own sake. Names and abstractions help humans manage complexity, but they don’t magically preserve truth: a function name inevitably compresses details—edge cases, omitted behavior, and hidden assumptions—so developers can end up with a false sense of what the code actually does. The remedy isn’t to ban small functions or classes; it’s to treat them as tools with costs, not as free correctness.
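
A hypothetical example of that compression problem: the function below has a perfectly reasonable name, yet the name cannot encode two decisions the body quietly makes.

```python
def normalize_username(raw: str) -> str:
    # The name promises simple cleanup, but the body also makes
    # decisions the name can't encode.
    cleaned = raw.strip().lower()
    if not cleaned:
        return "anonymous"  # hidden assumption: empty input means anonymous
    return cleaned[:20]     # hidden assumption: a 20-character limit

# A caller trusting the name alone would predict neither result:
assert normalize_username("   ") == "anonymous"
assert len(normalize_username("X" * 50)) == 20
```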

Testing becomes the second battleground. Gross says he’s not anti-testing, but skeptical of unit tests as a development-driving ideology in the original TDD sense. His preference is situational: during early feature work, developers often don’t know the right internal structure yet, so exhaustive testing should focus on the API-level “cut point” and integration behavior rather than micro-testing every internal function. Unit tests can still make sense when extending an existing system that already has a test harness. He also warns against test-suite bloat: too many end-to-end tests become non-deterministic, get ignored, and can slow refactors. The goal is a cost-benefit balance—tests are only valuable if they reduce production bugs, prevent catastrophic failures, or speed development more than they consume engineering time.
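
As a sketch of what testing at the “cut point” can look like (the function and its contract here are hypothetical, not from the episode), the test below exhaustively covers corner cases of a public API’s observable behavior while asserting nothing about internal helpers that may not survive a refactor.

```python
# Hypothetical public API under early development; internals may churn.
def parse_duration(text: str) -> int:
    """Parse strings like '5m' or '2h' into seconds."""
    text = text.strip()
    units = {"s": 1, "m": 60, "h": 3600}
    if not text or text[-1] not in units or not text[:-1].isdigit():
        raise ValueError(f"bad duration: {text!r}")
    return int(text[:-1]) * units[text[-1]]

# Exhaustive tests at the cut point: corner cases of the contract,
# with no assertions about how the parsing is implemented internally.
def test_parse_duration():
    assert parse_duration("5m") == 300
    assert parse_duration("2h") == 7200
    assert parse_duration(" 1s ") == 1
    for bad in ("", "m", "5x", "-5m", "5"):
        try:
            parse_duration(bad)
            assert False, f"expected ValueError for {bad!r}"
        except ValueError:
            pass

test_parse_duration()
```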

The conversation closes by reinforcing the theme: coding guidance should be applied with judgment, not ideology. Gross frames “Coding Dirty” as a counterweight to clean-code dogma—arguing for “beautiful at any size,” for abstraction when it earns its keep, and for testing that tracks real business outcomes rather than brittle internal state. Alongside the technical debate, the episode also turns into a meme-and-standup detour, including a discussion of why a carefully crafted meme about “getting lit on a Monday night” failed to land—an echo of the episode’s broader message that rules and assumptions don’t always survive contact with reality.

Cornell Notes

Carson Gross’s “Coding Dirty” challenges clean-code commandments by arguing that many rules (like “always use small functions” and “drive development with unit tests”) are applied too ideologically. He cites research suggesting longer methods can be higher quality and explains why big functions can be easier to read, debug, and change—especially when small-function decomposition creates indirection and context loss. On testing, he favors exhaustive checks at the right level: API/integration tests during early design, and targeted unit tests when extending well-covered systems. The throughline is cost-benefit thinking: abstraction and tests help only when they reduce real risk or speed change, not when they add cognitive load or lock teams into brittle expectations.

Why does Gross defend larger functions when “clean code” often demands short methods?

Gross argues that the anti-long-method guidance lacks strong empirical backing and that some studies summarized in “Clean Code” (and later work) suggest longer methods can correlate with fewer bugs per line of code. He also gives practical engineering reasons: large functions can be read top-to-bottom without repeatedly jumping to other helpers, debugging can be simpler because the full context stays together, and changes can be safer because there’s less chance of behavior “leaking” through many small call sites. He adds a philosophical point: if a function is truly important, its size can reflect that importance—splitting it into many tiny pieces dilutes what matters unless the extraction is justified by reuse.
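
A small hypothetical illustration of the “full context in one place” argument: the registration flow below is a single function whose commented sections read in order, with every assumption visible in one stack frame.

```python
# Hypothetical registration flow kept as one function: each step is a
# commented section, read top to bottom, with nothing hidden in helpers.
def register_user(email: str, password: str, existing: set) -> dict:
    # 1. Validate input.
    email = email.strip().lower()
    if "@" not in email:
        raise ValueError("invalid email")
    if len(password) < 8:
        raise ValueError("password too short")

    # 2. Enforce uniqueness.
    if email in existing:
        raise ValueError("email already registered")

    # 3. Build the account record.
    return {"email": email, "active": True}

assert register_user("A@example.com ", "hunter22!", {"b@example.com"}) == {
    "email": "a@example.com",
    "active": True,
}
```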

What’s the risk of small functions and heavy decomposition, according to Gross?

Gross says small functions can increase cognitive cost. When logic is broken into many helpers—especially with patterns like strategy—developers must navigate across multiple implementations, losing the original context. He describes a “drill-down” effect: function A calls B, which calls C, and after a few hops it’s easy to forget where you started, making step debugging harder. He also argues that names and abstractions compress information: a short function name can’t capture every edge case or omitted behavior, so developers may misunderstand what the code actually does.

How does Gross distinguish when unit tests are useful versus when they’re counterproductive?

Gross isn’t anti-testing; he’s skeptical of unit tests as a universal development-driving method. In early feature work, developers often don’t know the internal structure yet, so he prefers exhaustive testing at a higher “cut point” (API/integration behavior) to capture corner cases without overfitting to internal implementation. In mature codebases, adding a small feature can leverage existing unit tests; if none exist, he suggests adding higher-level integration coverage before building. He also warns that too many tests can become a burden: end-to-end suites can turn non-deterministic, and excessive unit tests can make refactors feel impossible.

What does Gross mean by “abstraction isn’t cost-free”?

Gross argues that abstraction adds real overhead: more names to remember, more concepts to track, and more uncertainty about dynamic dispatch (e.g., which implementation runs at runtime). He doesn’t claim abstraction is always bad—he says it can be justified—but he rejects the idea that it’s automatically beneficial. In his view, developers shouldn’t be intimidated into replacing straightforward control flow (loops/ifs) with complex indirection just because code looks “ugly” at first glance.
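
One way to picture the naming overhead (a hypothetical sketch, not code from the episode): both functions below compute the same total, but the abstracted version introduces two extra names a reader must resolve before the logic makes sense.

```python
# Plain control flow: one loop, no new names to learn.
def total_paid(orders: list) -> float:
    total = 0.0
    for order in orders:
        if order["status"] == "paid":
            total += order["amount"]
    return total

# The abstracted version adds concepts (is_paid, amount_of) that the
# reader must track before the one-liner below makes sense.
def is_paid(order: dict) -> bool:
    return order["status"] == "paid"

def amount_of(order: dict) -> float:
    return order["amount"]

def total_paid_abstract(orders: list) -> float:
    return sum(map(amount_of, filter(is_paid, orders)))

orders = [{"status": "paid", "amount": 10.0}, {"status": "open", "amount": 4.0}]
assert total_paid(orders) == total_paid_abstract(orders) == 10.0
```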

Why does Gross say test assertions can become brittle when they focus on internal state?

Gross criticizes tests that assert many internal details because they tightly couple tests to implementation. If internal structure changes while behavior stays correct, those tests fail anyway—removing the confidence that tests are validating the system’s real behavior. He prefers tests tied to business outcomes or observable behavior, so requirements can change without turning the test suite into constant maintenance.
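
A minimal sketch of the difference (the Cart class is hypothetical): the first test couples itself to the internal list layout and would break under a harmless refactor, while the second pins only observable behavior.

```python
class Cart:
    def __init__(self):
        self._items = []  # internal representation: a list of tuples

    def add(self, name: str, price: float) -> None:
        self._items.append((name, price))

    def total(self) -> float:
        return sum(price for _, price in self._items)

# Brittle: asserts the internal layout. Switching _items to a dict
# keyed by name would fail this test even though behavior is unchanged.
def test_cart_internals():
    cart = Cart()
    cart.add("book", 12.0)
    assert cart._items == [("book", 12.0)]

# Behavioral: asserts the observable outcome, so it survives refactors.
def test_cart_behavior():
    cart = Cart()
    cart.add("book", 12.0)
    cart.add("pen", 3.0)
    assert cart.total() == 15.0

test_cart_internals()
test_cart_behavior()
```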

Review Questions

  1. Which quality metrics and study findings does Gross cite to challenge the “short methods are always better” rule?
  2. How does Gross’s argument about function names relate to his critique of abstraction and decomposition?
  3. What criteria should determine whether to add unit tests, integration tests, or end-to-end tests in a refactor-heavy project?

Key Points

  1. Carson Gross argues that clean-code rules—especially “always keep methods small”—are often treated as universal laws without strong empirical support.

  2. Research cited in “Coding Dirty” and related work suggests longer methods can correlate with higher quality metrics, including fewer bugs per line of code.

  3. Large functions can improve readability and debugging by preserving full context, and they may reduce change risk by limiting indirection and “leakage” across many call sites.

  4. Function and class names compress information; even well-chosen names can mislead because they can’t encode every edge case and omitted behavior.

  5. Gross favors situational testing: exhaustively test API/integration behavior early, use unit tests when extending well-instrumented systems, and avoid test-suite bloat.

  6. End-to-end tests must be maintained and focused; too many become non-deterministic, get ignored, and slow refactors.

  7. The guiding principle is cost-benefit: abstraction and tests are only “good engineering” when they reduce real risk or speed change more than they consume time.

Highlights

Gross’s core counterclaim: “small functions” isn’t automatically better—longer methods can be higher quality, and they can be easier to read and debug when context stays together.

He argues that abstraction isn’t free: dynamic dispatch and indirection add cognitive load, and names can create a false sense of what code truly does.

On testing, he pushes for level-appropriate coverage—API/integration tests during early design, targeted unit tests later, and restraint to prevent brittle, non-deterministic end-to-end suites.
