
Programming Terms: Memoization

Corey Schafer · 4 min read

Based on Corey Schafer's video on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.

TL;DR

Memoization speeds up programs by caching results of expensive function calls and reusing them for identical inputs.

Briefing

Memoization is an optimization technique that speeds up programs by caching the results of expensive function calls and reusing them when the same inputs show up again. Instead of recomputing a costly answer every time, the program stores the first result in a cache and returns that stored value on subsequent calls with identical arguments. The payoff matters most when repeated inputs trigger repeated work—exactly the scenario memoization targets.

The example uses Python to make the idea concrete. An “expensive” function takes a number, prints “Computing …,” then waits one second (a stand-in for real heavy computation) and returns the square of the input. The program calls this function four times: with inputs 10, 4, 10, and 4. Without memoization, each call pays the one-second delay, so the total runtime lands around four seconds, and the “Computing” message appears for every call.
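The un-memoized version described above can be sketched as follows (function and variable names are illustrative, not the video's exact code):

```python
import time

def expensive_func(num):
    # Simulate heavy computation: announce the work, wait one second,
    # then return the square of the input.
    print(f"Computing {num}...")
    time.sleep(1)
    return num * num

# Every call pays the one-second delay, even for repeated inputs,
# so this loop takes about four seconds total.
for n in [10, 4, 10, 4]:
    print(expensive_func(n))
```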

Memoization changes the flow by introducing a cache—implemented as a dictionary keyed by the input number. At the start of the function, the code checks whether the argument already exists in the cache. If it does, the function immediately returns the cached result and skips the expensive work entirely. If it does not, the function performs the computation, stores the result in the cache, and then returns it.
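A minimal sketch of that memoized flow, using a module-level dictionary as the cache (names again illustrative):

```python
import time

cache = {}  # maps each input number to its previously computed result

def expensive_func(num):
    # Cache hit: return the stored answer and skip the expensive work.
    if num in cache:
        return cache[num]
    # Cache miss: do the expensive computation, then store the result.
    print(f"Computing {num}...")
    time.sleep(1)  # stand-in for real heavy work
    result = num * num
    cache[num] = result
    return result

# Only the first occurrence of each unique input pays the delay,
# so this loop takes about two seconds instead of four.
for n in [10, 4, 10, 4]:
    print(expensive_func(n))
```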

When the memoized version runs, the first time the function sees 4 or 10 it still computes and caches the answer. But on the second occurrence of each input, the cache lookup succeeds right away. The program returns 16 for input 4 and 100 for input 10 immediately, without printing the “Computing” message again and without the one-second sleep. In this small example, memoization cuts runtime from about four seconds to about two seconds.

The key takeaway is that memoization trades memory for time: it uses storage to avoid repeated computation. While the demonstration is simple, the same pattern can produce bigger gains in real systems where expensive operations—like database queries, complex calculations, or network calls—repeat for the same inputs. The technique can also be applied more automatically in some languages or frameworks, but the core concept remains the same: cache results for identical inputs and reuse them on future calls.
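As an example of the more automatic approach, Python's standard library provides `functools.lru_cache`, a decorator that adds this caching behavior without any hand-written dictionary (the decorated function below mirrors the demo, not the video's exact code):

```python
import functools
import time

@functools.lru_cache(maxsize=None)  # unbounded cache keyed by the arguments
def expensive_func(num):
    print(f"Computing {num}...")
    time.sleep(1)  # stand-in for real heavy work
    return num * num

# Same call pattern as the demo: two unique inputs, each repeated once.
for n in [10, 4, 10, 4]:
    expensive_func(n)

# cache_info() reports how many calls were served from the cache.
print(expensive_func.cache_info())
```

With this call pattern, `cache_info()` shows two misses (the first 10 and the first 4) and two hits (their repeats), matching the hand-rolled dictionary version.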

Cornell Notes

Memoization speeds up programs by caching the results of expensive function calls and reusing them when the same input occurs again. In the example, an “expensive” Python function sleeps for one second and returns the square of its input. Without memoization, repeated inputs (4 and 10) cause the function to run four times, taking about four seconds. With memoization, a dictionary cache stores computed results keyed by the input number; subsequent calls with the same input return instantly from the cache. The runtime drops to about two seconds because each unique input is computed only once.

What makes a function call “expensive,” and why does that matter for memoization?

In the example, the function is expensive because it simulates heavy work: it prints a message, then sleeps for 1 second before returning the square of the input. Memoization pays off when that kind of costly step would otherwise repeat for the same inputs. If the computation is cheap or inputs never repeat, caching provides little benefit.

How does the memoized function decide whether to compute or reuse a cached result?

It checks a cache (a dictionary) at the start of the function. If the input number is already a key in the cache, the function immediately returns the cached value. If the input is missing, it performs the computation, stores the result in the cache under that input key, and then returns the result.

What inputs are used in the demonstration, and how does that affect runtime?

The function is called four times with inputs 10, 4, 10, and 4. Without memoization, each call triggers the 1-second delay, so total time is about 4 seconds. With memoization, each unique input (4 and 10) is computed once, and the second time each appears the cached answer is returned, reducing runtime to about 2 seconds.

What exactly gets stored in the cache?

The cache stores the computed output for each input. For example, when the input is 4, the function computes 4*4 and caches the result 16. When the input is 10, it computes 10*10 and caches 100. Later calls with the same input return those stored values directly.

Why does memoization reduce repeated computation in the second run?

On the first encounter with an input, the cache lookup fails, so the function performs the expensive computation and then saves the result. On later encounters, the cache lookup succeeds immediately, so the expensive computation block (including the sleep) is skipped and the cached result is returned.

Review Questions

  1. In the memoization pattern, what condition triggers a cache hit, and what action follows a cache hit?
  2. How does memoization change the number of times the expensive computation runs when inputs repeat?
  3. What tradeoff does memoization introduce by using a cache, and when would that tradeoff be most worthwhile?

Key Points

  1. Memoization speeds up programs by caching results of expensive function calls and reusing them for identical inputs.
  2. A cache lookup at the start of the function determines whether to compute or return a stored value.
  3. In the example, repeated inputs (4 and 10) cause repeated work without memoization, but only first-time work with memoization.
  4. The cache is implemented as a dictionary keyed by the function argument, storing the computed output as the value.
  5. Memoization reduces runtime by avoiding repeated expensive steps, trading extra memory usage for time savings.
  6. The benefit grows with more complex computations and with higher likelihood of repeated inputs.

Highlights

Memoization turns repeated expensive calls into quick cache lookups by storing results keyed by input.
In the demo, runtime drops from about 4 seconds to about 2 seconds because each unique input is computed only once.
On a cache hit, the function returns immediately—skipping both the simulated delay and the computation logic.
The cache stores concrete outputs (e.g., 4→16 and 10→100), enabling instant reuse on later calls.