C Must Die
Based on The PrimeTime's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing.
C’s portability depends on staying within defined behavior; undefined behavior lets compilers change semantics across builds and targets.
Briefing
C’s rise is inseparable from Unix’s early need to move across hardware, but its modern “portability” bargain comes with a darker catch: undefined behavior gives compilers permission to break code in ways that vary by optimization level, compiler version, architecture, and even the order of internal passes. That combination—portable syntax paired with non-portable semantics—turns C into a language where “it works on my machine” can become a security risk, not just a debugging headache.
The story begins at Bell Labs, where Ken Thompson and Dennis Ritchie built early Unix and its tooling for PDP systems in assembly, and rewriting that assembly for every new machine proved too labor-intensive. C emerged as a replacement that let Unix kernels and utilities be rewritten with low-level control while avoiding per-architecture rewrites. Porting Unix to new hardware then became easier, since C source could be recompiled rather than manually reassembled for each instruction set.
But as Unix spread, performance and correctness problems appeared. C programs often ran slower than expected when the target hardware diverged from the original PDP environment. Compiler developers responded with increasingly aggressive optimizations, which gradually moved C further from “transparent” low-level behavior. The language standardization effort—culminating in the 1989 C standard (often referred to as C89)—tried to formalize portability using an “abstract machine.” That abstract machine made it possible to write programs that behave consistently across platforms, while still letting compilers optimize.
The mechanism that enables both portability and optimization is undefined behavior. When a program uses constructs the standard doesn’t define—such as shifting by too many bits, dereferencing invalid pointers, relying on signed overflow, or violating strict aliasing rules—the standard imposes no requirements. Compilers can treat such code as impossible, ignore it, reorder logic, or generate different results across builds. A concrete example shows a left shift that seems like it should yield a predictable value; instead, the outcome depends on how the compiler reasons about the undefined case. Another example demonstrates null-pointer checks being optimized away because the compiler can assume the “impossible” dereference never occurs, changing control flow.
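A minimal sketch of both failure modes (illustrative code, not reproduced from the video; the function names are invented for this example):

```c
#include <stddef.h>

/* Undefined: shifting a 32-bit unsigned int by 32 or more bits. The standard
   imposes no requirement here, so the result can differ by compiler,
   optimization level, and target (x86 masks the shift count; other ISAs
   may not). */
unsigned int shift_too_far(unsigned int x) {
    return x << 32;            /* UB when unsigned int is 32 bits wide */
}

/* Undefined: dereferencing p before checking it. Because the dereference
   would be UB if p were NULL, the optimizer may assume p is non-null and
   delete the later check as dead code. */
int read_field(int *p) {
    int value = *p;            /* dereference happens first */
    if (p == NULL)             /* this branch may be removed entirely */
        return -1;
    return value;
}
```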
The transcript then broadens the consequences: undefined behavior can linger silently until a change in compiler, flags, or architecture triggers a failure. It can also undermine security practices. Clearing sensitive data with memset may be removed if the compiler decides the memory is no longer used, leaving passwords or secrets in registers or stack memory. Strict aliasing further complicates low-level programming by allowing compilers to assume differently typed pointers don’t refer to the same memory—an assumption that breaks common “type punning” tricks.
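A hedged sketch of both hazards, assuming a typical optimizing compiler (the function names here are hypothetical, not taken from the transcript):

```c
#include <stdint.h>
#include <string.h>

/* The final memset writes to memory that is never read again, so a compiler
   doing dead-store elimination may drop it, leaving the password bytes on
   the stack. Common workarounds: C11's optional memset_s (Annex K) or a
   wipe loop through a volatile-qualified pointer. */
void handle_login(const char *input) {
    char password[64];
    strncpy(password, input, sizeof(password) - 1);
    password[sizeof(password) - 1] = '\0';
    /* ... authenticate using password ... */
    memset(password, 0, sizeof(password));   /* may be optimized away */
}

/* Strict aliasing violation: reading a float's bytes through a uint32_t
   pointer. The compiler may assume accesses through the int pointer cannot
   alias the float and reorder or cache values accordingly. */
uint32_t float_bits(float f) {
    return *(uint32_t *)&f;                  /* undefined under strict aliasing */
}

/* memcpy is the defined way to type-pun the same bytes. */
uint32_t float_bits_defined(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```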
A major flashpoint is signed integer overflow. In the C standard, signed overflow is undefined behavior, so compilers may delete overflow checks entirely. The transcript recounts a long-running GCC controversy where a developer’s overflow assertion disappeared under certain conditions, prompting debate over whether compilers should prioritize standards-based optimization or preserve safety checks for existing code. The broader conclusion is blunt: C’s combination of low-level power and undefined behavior makes it unreliable as a general development tool, pushing programmers toward languages like Rust or Zig that aim to reduce or eliminate these “time-bomb” failure modes.
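A hypothetical reconstruction of the pattern at issue (not the exact code from the GCC discussion): a post-hoc overflow check written in signed arithmetic, which the optimizer may fold away, alongside a version that stays within defined behavior.

```c
#include <limits.h>
#include <stdio.h>

/* Because signed overflow is undefined, the compiler may assume a + 100
   never wraps, fold "a + 100 < a" to false, and remove the whole check at
   higher optimization levels. */
int add_with_check(int a) {
    if (a + 100 < a) {            /* intended overflow check; may vanish */
        fprintf(stderr, "overflow detected\n");
        return INT_MAX;
    }
    return a + 100;
}

/* A check that never triggers undefined behavior: test before adding. */
int add_checked(int a) {
    if (a > INT_MAX - 100) {
        fprintf(stderr, "overflow avoided\n");
        return INT_MAX;
    }
    return a + 100;
}
```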
Cornell Notes
C’s portability promise is undercut by undefined behavior: when code hits cases the C standard doesn’t define, compilers are free to optimize in ways that can change results across architectures, compiler versions, and optimization flags. Standardization (C89) introduced an abstract machine to define “normal” behavior, but it also created room for compilers to treat undefined constructs as impossible. The transcript highlights how easy it is to trip into undefined behavior: shifts past the operand’s bit width, null-pointer dereferences, dead-code elimination around checks, strict aliasing violations, and signed overflow. The practical impact is security and reliability failures, including optimizations that remove attempts to clear secrets from memory. The takeaway: writing portable, correct C requires avoiding undefined behavior entirely, because the compiler may turn latent bugs into unpredictable outcomes at the worst possible time.
Why did C become central to Unix, and what problem did it solve compared with assembly?
What is the “abstract machine” introduced in the C standard trying to achieve?
How does undefined behavior turn into real-world unpredictability?
Why can memset-based “secret wiping” fail even when the code looks correct?
What is strict aliasing, and why does it matter for low-level code?
Why did GCC remove an overflow check, and what does that reveal about signed overflow in C?
Review Questions
- Which categories of operations in C are treated as undefined behavior in the transcript, and how do compilers typically exploit that freedom?
- Explain how optimization can remove a null-pointer check in the transcript’s example—what assumption makes the check “dead”?
- What security failure can occur when clearing memory with memset, and why might the compiler decide the clearing is unnecessary?
Key Points
1. C’s portability depends on staying within defined behavior; undefined behavior lets compilers change semantics across builds and targets.
2. C89’s abstract machine formalized portability, but undefined behavior was intentionally left unspecified to enable optimization.
3. Shifts that exceed the operand’s bit width, null-pointer dereferences, and other invalid constructs can produce different results depending on compilation target and optimization.
4. Compilers can reorder or eliminate logic around undefined behavior, such as removing null checks when an earlier dereference makes the check redundant under optimization.
5. Attempts to wipe secrets with memset can be optimized away if the cleared memory is not observed later, leaving passwords in stack memory or registers.
6. Strict aliasing lets compilers assume differently typed pointers don’t refer to the same memory, breaking common low-level type-punning tricks.
7. Signed integer overflow is undefined behavior in C, enabling compilers to remove overflow checks, which has driven long-running GCC debates about safety versus optimization.