
10 weird algorithms every developer should know

Fireship·
6 min read

Based on Fireship's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Marching cubes converts 3D scalar fields into polygon meshes by mapping 8-bit neighborhood patterns (256 cases) to precomputed polygon configurations.

Briefing

A handful of “weird” algorithms end up doing serious work across medicine, graphics, machine learning, security, and distributed computing—often by turning counterintuitive ideas into practical speedups or robust guarantees. The through-line is that clever constraints and precomputed structure can transform an otherwise brute-force problem into something workable at scale.

One of the clearest examples is marching cubes, created in the late 1980s at General Electric. It takes a 3D scalar field—like the intensity values produced by CT or MRI scans—and converts it into a polygonal mesh. For each cube-shaped cell, the method samples the values at the cell's eight corners, treats their inside/outside states as an 8-bit pattern (256 possibilities), and maps each pattern to a precomputed set of polygons. Marching through the volume yields a surface that can be rendered in 3D software, making medical slices usable as tangible geometry.
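The bit-packing step at the heart of this can be sketched in a few lines of Python. The corner values and isovalue below are made up for illustration, and the 256-entry polygon lookup table that the real algorithm consults afterward is omitted.

```python
# Sketch: computing the marching-cubes case index for one cube cell.
# The field values and isovalue are hypothetical; the real algorithm
# follows this index into a precomputed 256-row triangle table.

def cube_case_index(corner_values, iso):
    """Pack the inside/outside state of 8 cube corners into one byte."""
    index = 0
    for bit, value in enumerate(corner_values):
        if value >= iso:           # corner is "inside" the isosurface
            index |= 1 << bit      # set the corresponding bit
    return index                   # 0..255 -> row in the polygon table

# Example: corners 1 and 4 are above the isovalue of 0.5
corners = [0.1, 0.9, 0.2, 0.2, 0.7, 0.3, 0.1, 0.0]
print(cube_case_index(corners, 0.5))  # bits 1 and 4 -> 0b10010 = 18
```

Because only the pattern of inside/outside states matters, every possible cell reduces to one of 256 precomputed polygon configurations—which is what makes the per-cell work cheap.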

Several other algorithms lean on “structure first, randomness later.” Wave function collapse starts from a set of tiles that represent possible states in a procedural world. Instead of generating a map by brute force, it treats the initial layout as a superposition of possibilities and then “collapses” it into a single consistent outcome when observed—choosing random tiles but enforcing rules like connected roads. Diffusion models—the technique behind image generators such as DALL·E and Stable Diffusion—work in the opposite direction: they add noise step-by-step to an image until it becomes random, then learn to reverse that process. After training on millions of labeled images, the learned weights can generate new images by starting from noise and iteratively refining it into coherent structure.

Optimization appears in simulated annealing, which tackles landscapes full of local maxima and minima. It begins with a high “temperature” that allows wide exploration, then gradually cools so the algorithm becomes less willing to accept worse solutions—balancing exploration against exploitation. Sorting gets a darkly comic detour with sleep sort and bogosort: one delegates ordering to the CPU scheduler by sleeping for a duration proportional to each value; the other repeatedly shuffles at random until the array happens to be sorted. The transcript pushes the idea further with a hypothetical “quantum bogosort,” in which some parallel universe would contain a sorted version.
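Bogosort is short enough to write out in full. A sketch in Python (sleep sort's thread-based cousin is omitted here because its output depends on scheduler timing):

```python
import random

def bogosort(xs):
    """Shuffle until sorted -- expected work grows like n * n!."""
    xs = list(xs)
    # Keep shuffling while any adjacent pair is out of order.
    while any(a > b for a, b in zip(xs, xs[1:])):
        random.shuffle(xs)
    return xs

print(bogosort([3, 1, 2]))  # eventually terminates with [1, 2, 3]
```

For three elements this finishes quickly; for even modest array sizes the expected number of shuffles explodes, which is exactly the joke.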

Security and reliability bring the stakes back up. RSA is presented as a public-key system whose security depends on the difficulty of factoring large numbers. Quantum computing threatens that foundation via Shor’s algorithm, which can factor integers exponentially faster than classical methods by using qubits, superposition, and entanglement—though the transcript notes practical limits and cites a recent Chinese factoring result using a different approach. For distributed systems, the Byzantine generals problem motivates PBFT (Practical Byzantine Fault Tolerance), designed to keep a network consistent even when fewer than one-third of nodes behave maliciously or fail.

Finally, the transcript highlights algorithms that mirror nature or exploit text structure. Boids (from 1986) uses three simple flocking rules—avoid crowding, align headings, and move toward the local center of mass—to produce emergent bird-like patterns. Boyer–Moore string search (with its “bad character” and “good suffix” skipping rules) explains why tools like grep can be fast: it scans from right to left and uses preprocessed tables to skip large chunks of text, with larger skips—and better performance—as the search pattern grows longer.
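The three boids rules can be sketched directly. The rule weights and the crowding radius below are illustrative choices (not from the video), and every boid sees the whole flock rather than using spatial partitioning.

```python
# Toy 2D boids step. Weights (0.01, 0.1) and the crowding radius of 1.0
# are illustrative guesses; real implementations tune these and limit
# each boid's neighborhood.

def boids_step(pos, vel, dt=0.1):
    n = len(pos)
    new_vel = []
    for i in range(n):
        px, py = pos[i]
        # Rule 1: cohesion -- steer toward the others' center of mass.
        cx = sum(p[0] for j, p in enumerate(pos) if j != i) / (n - 1)
        cy = sum(p[1] for j, p in enumerate(pos) if j != i) / (n - 1)
        coh = ((cx - px) * 0.01, (cy - py) * 0.01)
        # Rule 2: separation -- push away from boids that crowd too close.
        sep = [0.0, 0.0]
        for j, (qx, qy) in enumerate(pos):
            if j != i and (qx - px) ** 2 + (qy - py) ** 2 < 1.0:
                sep[0] += px - qx
                sep[1] += py - qy
        # Rule 3: alignment -- nudge heading toward the flock average.
        ax = sum(v[0] for j, v in enumerate(vel) if j != i) / (n - 1)
        ay = sum(v[1] for j, v in enumerate(vel) if j != i) / (n - 1)
        ali = ((ax - vel[i][0]) * 0.1, (ay - vel[i][1]) * 0.1)
        new_vel.append((vel[i][0] + coh[0] + sep[0] + ali[0],
                        vel[i][1] + coh[1] + sep[1] + ali[1]))
    new_pos = [(p[0] + v[0] * dt, p[1] + v[1] * dt)
               for p, v in zip(pos, new_vel)]
    return new_pos, new_vel

pos = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
vel = [(0.1, 0.0), (0.0, 0.1), (-0.1, 0.0)]
for _ in range(3):
    pos, vel = boids_step(pos, vel)
```

No boid is told to "flock"; the group behavior emerges purely from iterating these three local rules.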

Cornell Notes

The transcript strings together 10 algorithms that look odd on the surface but solve real problems by exploiting structure, precomputation, or clever constraints. Marching cubes converts 3D scalar fields (e.g., CT/MRI data) into polygon meshes by mapping 8-bit neighborhood patterns to precomputed polygon configurations. Wave function collapse and diffusion both generate consistent outputs from uncertainty—one by collapsing tile superpositions under rules, the other by learning to reverse a noise-adding diffusion process. Simulated annealing improves optimization by balancing exploration and exploitation as “temperature” cools. PBFT addresses distributed consensus under Byzantine failures, while Boyer–Moore speeds string search by skipping based on preprocessed tables.

How does marching cubes turn medical scan data into a 3D surface?

It starts with a 3D scalar field where each spatial point has a single value (e.g., MRI/CT intensity). For each cube cell, it samples the cell's eight corner points and treats their values as an 8-bit pattern (256 possible configurations). A precomputed lookup table maps each configuration to a set of polygons. By “marching” through all cube cells and stitching these polygon pieces together, the algorithm produces a polygonal mesh representing an isosurface that can be rendered in 3D software.

What’s the practical difference between wave function collapse and diffusion as generation methods?

Wave function collapse begins with a set of possible tiles and treats the initial state like a superposition of possibilities. When the system “observes” a location, it collapses to a specific tile choice while enforcing constraints (e.g., roads must remain connected), producing a coherent map without relying on modern generative AI. Diffusion instead starts from random noise (high entropy), then—after training on millions of labeled images—uses learned weights to reverse a noise-adding process step-by-step until a structured image emerges (lower entropy).
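The forward (noise-adding) half of diffusion is easy to sketch on a single scalar "pixel". The beta schedule and step count below are illustrative guesses, not taken from any real model; the hard part—the learned reverse process—is a trained neural network and is only described in comments.

```python
import math
import random

# Toy forward diffusion on one scalar value. Each step shrinks the signal
# and mixes in Gaussian noise under a variance-preserving update.

def forward_diffuse(x0, betas, seed=0):
    rng = random.Random(seed)
    x = x0
    trajectory = [x]
    for beta in betas:
        noise = rng.gauss(0.0, 1.0)
        # Variance-preserving step: sqrt(1-beta)*signal + sqrt(beta)*noise
        x = math.sqrt(1.0 - beta) * x + math.sqrt(beta) * noise
        trajectory.append(x)
    return trajectory

betas = [0.02 * (t + 1) for t in range(20)]   # noise level grows each step
traj = forward_diffuse(1.0, betas)
# By the final steps the value is dominated by noise. Generation runs the
# learned reverse of this process: start from pure noise, denoise stepwise.
```

The key asymmetry is that the forward process is trivial and fixed, while the reverse process must be learned from data—which is where the millions of training images come in.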

Why does simulated annealing help with optimization problems that have many local optima?

A hill-climb approach can get stuck on local peaks. Simulated annealing starts with a high temperature so it explores widely, even sometimes accepting worse solutions. As time passes, temperature decreases, reducing the probability of accepting worse moves. This cooling schedule helps the search escape local maxima early and converge toward better solutions later—balancing exploration versus exploitation.
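A minimal sketch of that acceptance rule, maximizing a bumpy 1-D function. The objective, step size, and cooling schedule are illustrative choices, not from the video.

```python
import math
import random

# Minimal simulated annealing (maximization). Parameters are illustrative.

def anneal(f, x0, t0=5.0, cooling=0.995, steps=5000, seed=42):
    rng = random.Random(seed)
    x, best = x0, x0
    t = t0
    for _ in range(steps):
        candidate = x + rng.uniform(-1.0, 1.0)
        delta = f(candidate) - f(x)
        # Always accept improvements; accept worse moves with probability
        # exp(delta / t), which shrinks as the temperature cools.
        if delta > 0 or rng.random() < math.exp(delta / t):
            x = candidate
            if f(x) > f(best):
                best = x
        t *= cooling
    return best

# A function with several local maxima plus a gentle global bowl.
bumpy = lambda x: math.sin(x) + math.sin(3 * x) / 3 - 0.01 * x * x
best = anneal(bumpy, x0=-8.0)
```

Early on, the high temperature makes `exp(delta / t)` close to 1 even for bad moves (exploration); late in the run it is nearly 0, so the search settles into a peak (exploitation).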

What makes PBFT relevant to real distributed systems?

PBFT is designed for the Byzantine generals problem: nodes may fail or act maliciously (e.g., a “drunk” general). The protocol proceeds through pre-prepare, prepare, and commit phases, so that once enough nodes agree (a quorum threshold), the system reaches consensus and applies the same state change everywhere. The transcript links this to blockchain and distributed cloud databases, where consistency must hold despite faulty or compromised nodes.
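The sizing arithmetic behind "fewer than one-third" is compact enough to show directly, assuming the classic PBFT bound n ≥ 3f + 1:

```python
# Back-of-the-envelope PBFT sizing: with n replicas the protocol tolerates
# f Byzantine nodes (n >= 3f + 1) and commits once a quorum of 2f + 1
# matching prepare/commit messages arrives.

def pbft_tolerance(n):
    f = (n - 1) // 3          # maximum faulty/malicious nodes tolerated
    quorum = 2 * f + 1        # matching messages needed to commit
    return f, quorum

for n in (4, 7, 10):
    f, q = pbft_tolerance(n)
    print(f"n={n}: tolerates f={f}, quorum={q}")
```

So the smallest useful PBFT cluster has 4 replicas: it survives 1 traitor, and any 3 matching votes are enough to commit.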

How does Boyer–Moore string search achieve speedups over naive scanning?

It scans from right to left and uses two preprocessed tables to decide how far to jump when mismatches occur. If a character doesn’t appear in the pattern, the “bad character” rule lets the search skip past it entirely. If a partial match fails, a separate “good suffix” table determines the largest safe shift. Because it can skip many characters per comparison, the algorithm tends to get faster as the pattern grows longer and a larger proportion of the text can be skipped.
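A sketch of the bad-character rule alone (the good-suffix table is omitted for brevity) is enough to show the right-to-left scan and the skipping:

```python
# Boyer-Moore with only the bad-character rule. A mismatch shifts the
# pattern so the offending text character lines up with its rightmost
# occurrence in the pattern -- or jumps past it entirely if absent.

def bad_char_search(text, pattern):
    last = {c: i for i, c in enumerate(pattern)}  # rightmost index per char
    m, n = len(pattern), len(text)
    i = 0
    matches = []
    while i <= n - m:
        j = m - 1
        while j >= 0 and pattern[j] == text[i + j]:
            j -= 1                                # compare right to left
        if j < 0:
            matches.append(i)                     # full match at offset i
            i += 1
        else:
            # Safe shift: at least 1, often much more for long patterns.
            i += max(1, j - last.get(text[i + j], -1))
    return matches

print(bad_char_search("here is a simple example", "example"))  # [17]
```

When the text character under the mismatch never occurs in the pattern, the whole pattern leaps past it in one step—which is why longer patterns tend to mean faster searches.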

What threat does Shor’s algorithm pose to RSA, and what limits are noted?

RSA security relies on the difficulty of factoring large numbers into primes. Shor’s algorithm can solve integer factorization exponentially faster than classical algorithms by leveraging qubits, superposition, and entanglement to perform massively parallel computation. The transcript notes that practical quantum systems are still limited: a large-number factoring attempt failed on a state-of-the-art IBM Q system, while a Chinese quantum computer recently factored a big number using a different algorithm that doesn’t scale as well for large numbers.
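A toy RSA round-trip makes the dependence on factoring concrete. The tiny primes below are purely illustrative; real keys use primes hundreds of digits long.

```python
# Toy RSA with small primes to show why factoring n breaks the scheme.
# These values are illustrative only -- never use key sizes like this.

p, q = 61, 53
n = p * q                      # 3233, the public modulus
phi = (p - 1) * (q - 1)        # 3120 -- computable ONLY if you can factor n
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent = modular inverse of e

message = 65
cipher = pow(message, e, n)    # encrypt with the public key (e, n)
plain = pow(cipher, d, n)      # decrypt with the private key (d, n)
print(plain == message)        # True

# An attacker who factors n = 61 * 53 recovers phi, then d. Making that
# factoring step fast is exactly what Shor's algorithm would do.
```

The three-argument `pow` handles both modular exponentiation and (since Python 3.8) the modular inverse, so the whole scheme fits in a dozen lines.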

Review Questions

  1. Which lookup-table assumption makes marching cubes feasible, and how many neighborhood configurations does it consider per cube cell?
  2. In what way do wave function collapse and diffusion both start from uncertainty, yet end with a single coherent output?
  3. What specific message phases does PBFT use to reach consensus under Byzantine failures?

Key Points

  1. Marching cubes converts 3D scalar fields into polygon meshes by mapping 8-bit neighborhood patterns (256 cases) to precomputed polygon configurations.

  2. Wave function collapse generates procedural maps by collapsing a superposition of tile possibilities into a consistent arrangement while enforcing constraints like connected roads.

  3. Diffusion-based generation adds noise to images in a forward process and learns to reverse it, producing coherent outputs from random noise after training on large labeled datasets.

  4. Simulated annealing improves optimization by starting with high-temperature exploration and gradually cooling to reduce acceptance of worse solutions.

  5. Sleep sort and bogosort are intentionally impractical sorting strategies that illustrate how delegating work to randomness or the scheduler can fail despite being conceptually simple.

  6. RSA’s security depends on hard integer factorization, and Shor’s algorithm threatens that foundation by enabling exponentially faster factoring on a sufficiently capable quantum computer.

  7. PBFT provides a consensus mechanism for distributed systems under Byzantine failures using pre-prepare, prepare, and commit phases.

Highlights

Marching cubes treats each cube’s eight sampled values as an 8-bit pattern and uses a 256-entry lookup table to build an isosurface mesh.
Diffusion generation works by learning to reverse a noise process: start with random noise, then iteratively refine into a structured image.
PBFT keeps distributed systems consistent even when fewer than one-third of nodes behave maliciously or fail, using a staged message protocol.
Boyer–Moore string search speeds up matching by scanning right-to-left and skipping multiple characters using preprocessed tables.
