25 crazy software bugs explained
Based on Fireship's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
A single line of bad logic can turn everyday software into real-world catastrophe—whether that means freezing a music player, wiping out millions in trading losses, or killing people in flight. Across 25 infamous bugs, the through-line is clear: small mistakes in assumptions, data handling, timing, or unit conversions can cascade into failures that are expensive, dangerous, or both.
The tour begins with “feature” bugs that start as harmless quirks in games and consumer devices. In Sid Meier’s Civilization, Gandhi’s aggression value is treated as an unsigned integer; when diplomacy reduces it, underflow wraps the value around to a maximum, flipping a pacifist into a “diabolical thermonuclear enthusiast.” Players embraced the chaos enough that it effectively became lore. Real systems were less forgiving. Microsoft’s Zune, its iPod competitor, froze on New Year’s Eve 2008 because its leap-year day-counting loop never reached an exit condition, trapping the device in an infinite loop until someone removed the battery. On Intel Pentium chips, the infamous FDIV bug produced incorrect floating-point division results due to a flawed SRT division implementation with missing lookup-table entries: rare in occurrence, but serious enough to trigger major PR fallout.
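The Gandhi wrap-around fits in a few lines. A minimal sketch, assuming the commonly told 8-bit unsigned version of the story (the exact field width and modifier values are folklore, not verified game internals):

```python
def adjust_aggression(aggression: int, delta: int, bits: int = 8) -> int:
    """Apply a diplomacy modifier to an aggression score stored as an
    unsigned integer: dropping below zero wraps to the maximum value."""
    return (aggression + delta) % (1 << bits)

# Gandhi starts nearly pacifist at 1; adopting democracy subtracts 2 more.
adjust_aggression(1, -2)   # wraps to 255, the most aggressive value possible
```

The same modulo arithmetic is what any fixed-width unsigned subtraction does in hardware; the "bug" is only that no one checked for the boundary before subtracting.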
As the list moves into higher-stakes failures, the consequences scale quickly. A FaceTime group-call bug on iPhone let a caller hear the recipient’s microphone before the call was answered and, in some cases, even see the recipient’s camera feed; it was discovered by a 14-year-old and patched only after it went viral. In finance, a 2024 Chase ATM glitch let people withdraw large sums immediately after depositing fake checks, leading to lawsuits and potential criminal charges. Other failures weren’t about fraud but about brittle systems: the 1990 AT&T long-distance collapse cascaded from a single switch whose crash-and-recovery messages triggered the same fault in neighboring switches, blocking 50 million calls. At airports, Heathrow Terminal 5’s baggage system broke down when multiple software systems failed to coordinate, causing 500+ canceled flights, 42,000 lost bags, and a $16 million fix.
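The AT&T-style collapse can be modeled as fault propagation over a graph of switches. This is a toy illustration only (the real 4ESS bug was a misplaced `break` in a C `switch` statement, not this logic), but it shows why one crash can take down an entire network when every node shares the same defect:

```python
from collections import deque

def cascade(neighbors: dict[str, list[str]], first_crash: str) -> set[str]:
    """Simulate a cascading failure: a recovering switch broadcasts
    status messages, and a latent bug crashes any peer that handles one."""
    crashed = {first_crash}
    queue = deque([first_crash])
    while queue:
        switch = queue.popleft()
        for peer in neighbors.get(switch, []):
            if peer not in crashed:   # buggy handler: the peer crashes too
                crashed.add(peer)
                queue.append(peer)    # ...and its own recovery spreads it
    return crashed

# A ring of four switches: one fault reaches every node.
ring = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}
cascade(ring, "A")   # all four switches end up crashed
```

The defense in the real fix was the usual one for cascades: make nodes tolerate malformed messages instead of assuming peers are always well-behaved.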
The most dramatic disasters come from unit mistakes, timing errors, and security oversights. NASA’s Mars Climate Orbiter burned up after one team supplied thruster data in imperial units while the navigation software expected metric, corrupting the mission’s trajectory calculations. An Ariane 5 rocket exploded in 1996 after a conversion from a 64-bit floating-point value to a 16-bit signed integer overflowed, sending the vehicle 90° off course and forcing self-destruct seconds after launch. Heartbleed in 2014 exposed servers running vulnerable OpenSSL implementations through a missing bounds check in the TLS heartbeat extension, letting attackers repeatedly request chunks of server memory; roughly two-thirds of internet servers were at risk.
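The Ariane 5 conversion failure is easy to reproduce in any language with fixed-width integers. A sketch using Python's `struct` module to stand in for the 16-bit signed target (the function and variable names are illustrative, not the actual flight code, which was Ada):

```python
import struct

def to_horizontal_bias_int16(value: float) -> bytes:
    """Pack a 64-bit float into a 16-bit signed integer.
    Anything outside -32768..32767 cannot be represented."""
    return struct.pack(">h", int(value))

to_horizontal_bias_int16(1234.0)        # fine on Ariane 4's gentler trajectory
try:
    to_horizontal_bias_int16(40000.0)   # Ariane 5's higher horizontal velocity
except struct.error:
    # On the real rocket the equivalent operand error went unhandled,
    # shutting down the inertial reference system mid-flight.
    pass
```

The code had flown safely on Ariane 4, whose trajectory never produced values outside the 16-bit range; reuse without revalidating the input range was the fatal assumption.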
The final stretch turns deadly. Toyota’s electronic throttle control and braking logic issues led to recalls, injuries, and deaths. In aviation and defense, software misbehavior contributed to crashes and combat tragedies: a Patriot missile battery’s accumulated clock error let a Scud strike kill 28 U.S. soldiers; an Aegis system’s confusing display and timing lag contributed to a civilian airliner being shot down; and the Boeing 737 MAX disasters traced back to flawed sensor logic in the Maneuvering Characteristics Augmentation System. Even medical devices weren’t safe: the Therac-25 radiation machine delivered lethal doses due to race conditions and the removal of hardware interlocks.
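The Patriot clock error is simple to reproduce: 0.1 has no exact binary representation, so a fixed-point uptime counter loses a tiny amount every tick. A sketch assuming 23 fractional bits, which roughly matches the per-tick error figure in published post-incident analyses:

```python
FRAC_BITS = 23                     # assumed fixed-point format
tick = 0.1                         # clock advances in tenths of a second
stored = int(tick * 2**FRAC_BITS) / 2**FRAC_BITS  # truncated representation
error_per_tick = tick - stored     # ~1e-7 seconds lost on every tick

ticks = 100 * 3600 * 10            # the battery had been up ~100 hours
drift = ticks * error_per_tick     # ~0.34 seconds of accumulated drift
# At Scud closing speeds, a drift of a third of a second shifts the
# predicted tracking window by hundreds of meters: the target is missed.
```

The error per tick is invisibly small; only long uninterrupted uptime, a condition the system was never tested under, let it accumulate into a lethal miss.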
The common lesson is less about “bugs happen” and more about how assumptions fail under real conditions—leap years, edge-case inputs, overflow, concurrency, inconsistent sensors, and mismatched units. The stakes rise when software controls money, infrastructure, aircraft, or human bodies, making testing, validation, and defensive design non-negotiable.
Cornell Notes
Software failures in the real world often start as small logic errors—like underflow, missing bounds checks, or unit mismatches—but they can cascade into massive financial losses, infrastructure outages, and even deaths. Examples include Gandhi’s unsigned-integer underflow in Civilization, Microsoft’s Zune freezing due to leap-year day handling, and the Pentium FDIV bug caused by missing lookup-table entries. Higher-stakes incidents show how timing and conversions can break missions (Mars Climate Orbiter’s imperial vs metric mismatch; Ariane 5’s floating-point to integer conversion error). Security bugs like Heartbleed demonstrate how a single missing bounds check in OpenSSL’s TLS heartbeat can expose sensitive memory across a large portion of the internet.
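The Heartbleed bounds-check failure also fits in a few lines. A miniature model (real OpenSSL copies from a heap buffer with `memcpy`, but the missing length check is the same idea; the payload and secret values below are invented for illustration):

```python
def heartbeat_vulnerable(connection_memory: bytes, claimed_len: int) -> bytes:
    """Echo claimed_len bytes starting at the payload, never checking
    the payload's real size: adjacent memory leaks back to the sender."""
    return connection_memory[:claimed_len]

def heartbeat_patched(payload: bytes, claimed_len: int) -> bytes:
    """The fix: discard heartbeats whose claimed length exceeds the
    bytes actually received."""
    if claimed_len > len(payload):
        return b""
    return payload[:claimed_len]

payload = b"bird"                    # attacker sends 4 bytes...
secret = b"PRIVATE_KEY=hunter2"      # ...stored next to server secrets
memory = payload + secret
heartbeat_vulnerable(memory, claimed_len=64)   # response includes the secret
heartbeat_patched(payload, claimed_len=64)     # returns nothing
```

Because each request could leak up to 64 KB and could be repeated indefinitely, attackers could sweep server memory for private keys and session data without leaving a trace in logs.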
How can a “harmless” arithmetic mistake become a dramatic behavioral change, even in a game?
What made the Pentium fdiv bug so damaging despite being rare?
Why did the FaceTime group-call exploit lead to both audio glitches and camera activation?
How do unit and conversion errors translate into mission-ending failures?
What exactly enabled Heartbleed, and why did it scale so broadly?
Which patterns show up repeatedly in deadly systems failures?
Review Questions
- Pick one incident involving numeric representation (unsigned underflow, rounding/truncation, overflow, or unit conversion). Explain the specific representation mistake and the cascade effect it caused.
- Compare Heartbleed and the Ariane 5 failure: both involve data handling errors. What kind of data handling failed in each case (bounds checking vs numeric conversion), and what was the resulting impact?
- Choose a deadly-systems example (Therac-25, Patriot, 737 MAX, or Toyota). What safety mechanism failed—validation, redundancy, interlocks, or timing—and how did that failure translate into harm?
Key Points
1. Unsigned integer underflow can wrap values and invert intended behavior, turning “logic” into radically different outcomes.
2. Leap-year and calendar handling bugs can freeze systems when loops never reach an exit condition.
3. Rare hardware-level arithmetic errors (like missing lookup-table entries) can still trigger major real-world consequences when they affect correctness.
4. Security vulnerabilities often come from missing bounds checks or state validation, enabling attackers to read memory or trigger unintended device behavior.
5. Cascading failures frequently start with one component misbehaving (a switch reboot, a UI workflow confusion, or a misrouted update) and then propagate through dependencies.
6. Unit mismatches and numeric conversion mistakes can destroy missions because downstream calculations assume consistent measurement systems and data types.
7. In safety-critical software, lack of redundancy, interlocks, or robust handling of edge conditions can turn software defects into physical harm.
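The leap-year freeze described above traces to a real loop in the Zune's clock driver. A close Python port (the hang is capped with `max_steps` here so the sketch terminates; the real device simply spun forever):

```python
def is_leap(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def year_from_days(days: int, max_steps: int = 10_000) -> int:
    """Convert a day count (day 1 = Jan 1, 1980) to a year, mirroring
    the buggy Zune loop that froze devices on Dec 31, 2008."""
    year = 1980
    steps = 0
    while days > 365:
        steps += 1
        if steps > max_steps:             # stand-in for the real hang
            raise RuntimeError("frozen: loop makes no progress")
        if is_leap(year):
            if days > 366:
                days -= 366
                year += 1
            # BUG: days == 366 (Dec 31 of a leap year) changes nothing,
            # so the loop condition never becomes false.
        else:
            days -= 365
            year += 1
    return year

year_from_days(10592)   # Dec 30, 2008: terminates normally
# year_from_days(10593) — Dec 31, 2008: never makes progress
```

The fix is one line: handle `days == 366` explicitly (return the current year) instead of letting it fall through both branches untouched.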