How BAD Is Test Driven Development? - The Standup #6
Based on The PrimeTime's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
TDD can distort design when the interface optimized for tests doesn’t match the interface needed for real usage.
Briefing
Test-driven development (TDD) drew heavy skepticism in a standup-style debate, with the core complaint landing on a simple trade-off: forcing development to revolve around writing tests can distort design toward testability rather than real-world use. Multiple participants said they like testing, but dislike “test-first” as a primary workflow, arguing that it can produce interfaces and APIs that work under the test harness yet feel awkward or even unusable in practice.
One developer described a recent TDD attempt while building a card-draw mechanic for a game (“The Towers of Mordoria”). The test-driven approach felt great in isolation: the API was designed in the test, tests were set up to fail, and then the implementation was made to pass. The trouble came when integrating into the actual game. The interface that was ideal for testing turned out to be a poor fit for real usage, requiring a redesign. That experience became a broader pattern for him: TDD works best when the “testing interface” and the “usage interface” naturally align, but it often pushes developers toward inversion-of-control abstractions and overly modular interfaces that don’t match how the system is meant to be used.
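The red-green loop described above can be sketched in a few lines. Everything here is illustrative: the `Deck`/`draw` API and the injected RNG are hypothetical stand-ins for the card-draw mechanic, not code from the video. Note how the test-friendly seam (passing the RNG in) is exactly the inversion-of-control abstraction the discussion warns about.

```python
# Hypothetical sketch of a red-green TDD loop for a card-draw mechanic.
# The Deck API is invented for illustration, not taken from the talk.
import random

class Deck:
    """A deck whose interface was shaped by the test that came first."""

    def __init__(self, cards, rng=None):
        # Injecting the RNG makes draws deterministic under test --
        # the kind of test-driven seam real game code may never need.
        self._cards = list(cards)
        self._rng = rng or random.Random()

    def draw(self):
        """Remove and return a random card."""
        if not self._cards:
            raise IndexError("deck is empty")
        i = self._rng.randrange(len(self._cards))
        return self._cards.pop(i)

def test_draw_removes_card():
    # Written first (red); the implementation above then makes it pass (green).
    deck = Deck(["ace", "king"], rng=random.Random(0))
    assert deck.draw() in ("ace", "king")
    assert len(deck._cards) == 1

test_draw_removes_card()
```

The test passes in isolation, which is the point of the anecdote: green tests say nothing about whether `Deck`'s interface fits the game loop it must eventually live in.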
Another participant echoed the concern, framing it as a zero-sum problem. Time spent constructing tests and shaping architecture around them is time not spent building for the primary use case. He argued that the emphasis on tests can lead to the same failure mode seen in other “optimize for a metric” cultures: lots of effort goes into satisfying the testing ritual, while the resulting system still needs fixing. He also pointed to real-world examples of testing that didn’t prevent user-facing defects, using YouTube’s play/pause indicator inconsistency as a cautionary tale.
Still, the discussion wasn’t anti-testing. Several people described pragmatic testing strategies that deliver confidence without the rigid TDD loop. One approach favored granular tests for stable components (like reusable parsers) and targeted multi-step tests when the system behavior is hard to validate otherwise. Another leaned into snapshot or “golden” testing—asserting that a printed representation (diff-friendly) matches a stored expected output—arguing it’s powerful when the representation is stable and changes are meaningful. But snapshot testing drew its own caveats: if requirements churn or the underlying structure changes frequently, snapshots become noisy, hard to trust, and expensive to update.
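A minimal snapshot ("golden") test can look like the sketch below. The names (`render_report`, the golden-file path) are assumptions for illustration; the mechanism is the one described: render a stable, diff-friendly text representation, store it once, and assert later runs match it.

```python
# Minimal snapshot/golden test sketch. render_report and the golden
# filename are hypothetical; only the technique comes from the discussion.
from pathlib import Path

GOLDEN = Path("report.golden.txt")

def render_report(data):
    # Stable printed representation: sorted keys, fixed formatting,
    # so any diff against the golden file is meaningful.
    return "\n".join(f"{k}: {v}" for k, v in sorted(data.items()))

def check_snapshot(data, update=False):
    """Record the golden file on first run; compare against it afterward."""
    actual = render_report(data)
    if update or not GOLDEN.exists():
        GOLDEN.write_text(actual)  # record the expected output once
        return True
    # A mismatch yields a reviewable textual diff rather than a
    # brittle field-by-field assertion.
    return actual == GOLDEN.read_text()
```

The caveat from the debate shows up directly in this sketch: if `render_report`'s structure churns with every requirements change, the golden file must be regenerated constantly, and the test stops carrying signal.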
The most constructive “TDD” defense came from reframing it as a tool for specific situations. One participant suggested that driving development with tests can be valuable when a task lacks feedback—especially low-level components with no direct visual output—because tests can provide a measurable target (e.g., performance counters like cache misses or cycles). Others argued that mature teams often add tests after discovering bugs (“test-driven debugging”), turning recurring failure modes into regression nets rather than treating tests as the starting point.
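The "test-driven debugging" pattern can be sketched as follows. The `parse_duration` function and its whitespace bug are hypothetical examples (not from the video); the point is the workflow: reproduce the production failure as a test, fix the code, and keep the test as a permanent regression net.

```python
# Sketch of test-driven debugging: a bug found in the wild is captured
# as a failing reproduction, then kept as a regression test after the fix.
# parse_duration and the bug scenario are invented for illustration.
import re

def parse_duration(text):
    """Parse strings like '90s' or '2m' into seconds."""
    # The fix: strip() handles the padded input that crashed in production.
    m = re.fullmatch(r"(\d+)([sm])", text.strip())
    if not m:
        raise ValueError(f"bad duration: {text!r}")
    value, unit = int(m.group(1)), m.group(2)
    return value * 60 if unit == "m" else value

def test_regression_padded_input():
    # Bug report: ' 2m ' raised ValueError in production. This test pins
    # the fix so the failure mode cannot silently return.
    assert parse_duration(" 2m ") == 120

test_regression_padded_input()
```

Unlike a red-green loop, the test here arrives after the bug, so it encodes a failure mode the system actually exhibited rather than one the developer guessed at up front.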
By the end, the group converged on a shared message: testing is essential, but the workflow should match the problem. Use tests to catch hard-to-find or catastrophic failures, prefer stable assertions (including snapshots when appropriate), and rely on incremental releases and real-world feedback when the system is still in flux. The debate also included a cautionary anecdote about production incidents and the limits of relying on unit-level confidence alone—reinforcing the idea that testing strategy, not test volume, determines whether software actually improves.
Cornell Notes
The discussion draws a sharp line between “testing” and “test-driven development.” Participants generally agree that tests are valuable, but forcing development to be driven by tests can distort architecture—especially when the interface optimized for tests doesn’t match real usage. Several people prefer targeted testing: granular unit tests for stable components, end-to-end tests for integration confidence, and snapshot/golden tests when outputs can be represented stably and diffed. Others argue that TDD can be useful in narrow cases, such as low-level work with no natural feedback or performance tuning where tests provide measurable targets. The takeaway is to choose the right testing strategy for the system’s stability and risk profile rather than treating TDD as a universal rule.
- Why did one developer end up redoing a TDD-designed interface after it “worked” in tests?
- What’s the central critique of TDD that treats it as a zero-sum trade-off?
- How do snapshot/golden tests fit into the debate, and when do they work best?
- What alternative to TDD did participants describe for building confidence?
- When did someone argue that test-driven development can still be justified?
- What did the group imply about relying on unit tests alone during refactors?
Review Questions
- What specific failure mode did the game developer experience when TDD produced a testing-optimized interface that didn’t translate to real usage?
- Under what conditions do snapshot/golden tests become high-confidence tools versus maintenance burdens?
- Why did participants argue that “testing for refactoring confidence” can be overstated when integration edge cases are the real risk?
Key Points
1. TDD can distort design when the interface optimized for tests doesn’t match the interface needed for real usage.
2. Testing is broadly valued, but participants object to treating “test-first” as a universal development workflow.
3. Time spent building tests and test-friendly abstractions is time not spent designing for the primary use case (a zero-sum framing).
4. Snapshot/golden testing can be extremely effective when the asserted representation is stable and diffable, but it becomes noisy when requirements or internal structures churn.
5. A pragmatic strategy often combines granular unit tests for stable components, end-to-end tests for integration confidence, and targeted regression tests for high-risk areas.
6. Test-driven debugging (adding tests after discovering bugs) can produce better regression nets than strict red-green TDD loops.
7. Driving development with tests can be useful for low-feedback tasks and performance tuning when tests provide measurable targets.