Olympiad level counting (Generating functions)
Based on 3Blue1Brown's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
A counting problem about subsets whose element-sums are divisible by 5 turns into a clean formula once the subsets are encoded as coefficients of a polynomial and then “filtered” using the five complex fifth roots of unity. The payoff is an exact answer for the number of subsets of {1,2,…,2000} whose sum is a multiple of 5—namely
(1/5)·(2^{2000} + 4·2^{400}).
The core challenge is precision. A quick heuristic suggests roughly one-fifth of all subsets should work, but the exact count depends on a subtle imbalance in how subset sums distribute modulo 5. Brute force is hopeless: there are 2^{2000} subsets, far too many to enumerate.
The solution begins by translating subset-sum counting into algebra. For a smaller warm-up set {1,2,3,4,5}, the product (1+x)(1+x^2)(1+x^3)(1+x^4)(1+x^5) expands so that each subset corresponds to one term, and the exponent of x equals the subset’s sum. After collecting like powers, the coefficient of x^n becomes the number of subsets whose elements add to n. For the full problem, the generating function becomes f(x)=∏_{k=1}^{2000}(1+x^k), so the desired quantity is the sum of coefficients c_n for which n≡0 (mod 5).
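The warm-up expansion can be checked mechanically. The following is a minimal sketch, not part of the original material: the helper name `subset_sum_coefficients` is made up for illustration. It multiplies out the factors (1+x^k) one at a time and compares the resulting coefficients with a brute-force tally over all 32 subsets of {1,2,3,4,5}.

```python
from itertools import combinations


def subset_sum_coefficients(elements):
    """Expand prod(1 + x^k) over the given elements.

    Returns a list coeffs where coeffs[n] is the coefficient of x^n.
    (Hypothetical helper name, chosen for this illustration.)
    """
    coeffs = [1]  # start from the constant polynomial 1
    for k in elements:
        padded = coeffs + [0] * k    # p(x), padded to the new length
        shifted = [0] * k + coeffs   # x^k * p(x)
        coeffs = [a + b for a, b in zip(padded, shifted)]
    return coeffs


warm_up = [1, 2, 3, 4, 5]
coeffs = subset_sum_coefficients(warm_up)

# Brute force: tally every subset by the sum of its elements.
brute = [0] * (sum(warm_up) + 1)
for size in range(len(warm_up) + 1):
    for subset in combinations(warm_up, size):
        brute[sum(subset)] += 1

print(coeffs)
print(coeffs == brute)
```

Multiplying by (1+x^k) either leaves a term alone (k is not in the subset) or shifts its exponent by k (k is in the subset), which is why the two lists agree.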
Evaluating f at special inputs reveals those coefficients without expanding the polynomial. Plugging in x=1 adds all coefficients, giving the total number of subsets, 2^{2000}. Plugging in x=−1 flips the sign of every odd-power term, so f(−1) is the number of even-sum subsets minus the number of odd-sum subsets; this shows how sign changes can isolate parity information. To isolate multiples of 5, the method generalizes this idea using complex numbers: choose ζ=e^{2πi/5}, a primitive fifth root of unity. Powers of ζ cycle every five steps, and the key identity is ζ^5=1.
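To make the x=1 and x=−1 evaluations concrete, here is a small sketch on the warm-up set {1,…,5} (the function name `warm_up_f` is an assumption made for this example). It evaluates the product directly and compares the results with subset counts obtained by enumeration.

```python
from itertools import combinations


def warm_up_f(x):
    """Evaluate (1+x)(1+x^2)(1+x^3)(1+x^4)(1+x^5) without expanding it."""
    result = 1
    for k in range(1, 6):
        result *= 1 + x ** k
    return result


# Enumerate all subsets of {1,...,5} and split them by parity of their sum.
subsets = [s for size in range(6) for s in combinations(range(1, 6), size)]
even = sum(1 for s in subsets if sum(s) % 2 == 0)
odd = len(subsets) - even

total = warm_up_f(1)    # adds every coefficient: the number of subsets
signed = warm_up_f(-1)  # even-sum count minus odd-sum count

print(total, signed, even, odd)
```

Adding the two evaluations and halving recovers the even-sum count, (f(1)+f(−1))/2, which is the two-root analogue of the five-root filter used for divisibility by 5.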
Consider the sum f(1)+f(ζ)+f(ζ^2)+f(ζ^3)+f(ζ^4). When this is expanded in terms of the coefficients c_n, each c_n gets multiplied by 1+ζ^n+ζ^{2n}+ζ^{3n}+ζ^{4n}. That factor equals 5 exactly when 5 divides n and equals 0 otherwise, so the sum keeps precisely the coefficients for multiples of 5, each counted five times. Therefore, the answer is (1/5) times that five-term complex evaluation.
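The filter can be tried numerically on the warm-up set. This is an illustrative sketch using floating-point complex arithmetic, so the results are only exact up to rounding; `selector` and `warm_up_f` are names invented for this example.

```python
import cmath
from itertools import combinations

zeta = cmath.exp(2j * cmath.pi / 5)  # primitive fifth root of unity


def selector(n):
    """Return 1 + zeta^n + zeta^(2n) + zeta^(3n) + zeta^(4n)."""
    return sum(zeta ** (k * n) for k in range(5))


def warm_up_f(x):
    """Evaluate (1+x)(1+x^2)(1+x^3)(1+x^4)(1+x^5) without expanding it."""
    result = 1
    for k in range(1, 6):
        result *= 1 + x ** k
    return result


# The selector is 5 on multiples of 5 and 0 elsewhere, up to rounding error.
selector_values = [round(selector(n).real) for n in range(11)]

# Root-of-unity filter applied to the warm-up polynomial.
filtered = sum(warm_up_f(zeta ** j) for j in range(5)) / 5

# Brute-force count of subsets of {1,...,5} whose sum is divisible by 5.
brute = sum(
    1
    for size in range(6)
    for subset in combinations(range(1, 6), size)
    if sum(subset) % 5 == 0
)

print(selector_values)
print(round(filtered.real), brute)
```

For the warm-up set the same reasoning as in the full problem gives (1/5)·(2^5 + 4·2^1), and the brute-force count agrees with the filtered value.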
The remaining work is computing f(ζ^j) efficiently. Because the powers of ζ^j repeat with period 5, the product over k=1 to 2000 breaks into 400 identical blocks of five factors. A final algebraic trick uses the factorization of z^5−1 into the linear terms (z−ζ^k), k=0,…,4, and evaluates it at z=−1 to show that (1+ζ^0)(1+ζ)(1+ζ^2)(1+ζ^3)(1+ζ^4)=2. For j=1,2,3,4 the exponents j, 2j, …, 5j run through every residue modulo 5 exactly once, so each block (1+ζ^j)(1+ζ^{2j})…(1+ζ^{5j}) is the same product in a different order and also equals 2. This yields f(ζ^j)=2^{400} for j=1,2,3,4. The remaining case is f(1)=2^{2000}. Putting everything together gives
#(subsets with sum divisible by 5) = (1/5)·(2^{2000}+4·2^{400}).
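The closed form can be cross-checked exactly without complex numbers. The sketch below is a different technique from the one in the text: a residue-tracking dynamic program (the function name `count_by_residue` is an assumption) that processes the elements 1,…,2000 one at a time and relies on Python's arbitrary-precision integers.

```python
def count_by_residue(n_max, modulus=5):
    """Count subsets of {1,...,n_max} by the residue of their sum.

    Returns a list counts where counts[r] is the number of subsets
    whose element-sum is congruent to r modulo `modulus`.
    """
    counts = [0] * modulus
    counts[0] = 1  # the empty subset has sum 0
    for k in range(1, n_max + 1):
        updated = counts[:]  # subsets that do not contain k
        for r in range(modulus):
            # subsets that do contain k: residue shifts by k
            updated[(r + k) % modulus] += counts[r]
        counts = updated
    return counts


counts = count_by_residue(2000)
formula = (2**2000 + 4 * 2**400) // 5

print(counts[0] == formula)
print(sum(counts) == 2**2000)
```

The division by 5 is exact because both 2^{2000} and 2^{400} leave remainder 1 modulo 5, so the numerator is a multiple of 5.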
Beyond the arithmetic, the broader message is methodological: discrete information can be extracted by extending to complex inputs and using roots of unity to detect “frequency” patterns. The same philosophy echoes in the study of primes via zeta functions and in algorithms like Shor’s, where roots of unity reveal periodic structure.
Cornell Notes
Subset-sum counting becomes polynomial coefficient extraction. The generating function f(x)=∏_{k=1}^{2000}(1+x^k) expands so that the coefficient of x^n equals the number of subsets of {1,…,2000} whose elements sum to n. To count only sums divisible by 5, evaluate f at the five fifth roots of unity 1, ζ, ζ^2, ζ^3, ζ^4 and add the results. The combination f(1)+f(ζ)+f(ζ^2)+f(ζ^3)+f(ζ^4) multiplies each coefficient c_n by 1+ζ^n+ζ^{2n}+ζ^{3n}+ζ^{4n}, which is 5 when 5|n and 0 otherwise. Efficient product evaluation then gives f(ζ^j)=2^{400} for j=1..4 and f(1)=2^{2000}, yielding (1/5)(2^{2000}+4·2^{400}).
Why does multiplying (1+x)(1+x^2)…(1+x^2000) encode subset sums?
How does evaluating a generating function at x=−1 help isolate parity information?
What makes roots of unity the right tool for divisibility by 5?
How can f(ζ^j) be computed without expanding a 2000-degree polynomial?
What is the final algebraic trick that turns the five-term product into exactly 2?
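For reference, the last question can be answered with a short derivation. This is a sketch written out in LaTeX; the index k is introduced here to keep it separate from the j in f(ζ^j).

```latex
% Factor z^5 - 1 over the five fifth roots of unity.
z^5 - 1 = \prod_{k=0}^{4} \left(z - \zeta^{k}\right)

% Substitute z = -1.
-2 = \prod_{k=0}^{4} \left(-1 - \zeta^{k}\right)
   = (-1)^{5} \prod_{k=0}^{4} \left(1 + \zeta^{k}\right)
\quad\Longrightarrow\quad
\prod_{k=0}^{4} \left(1 + \zeta^{k}\right) = 2

% For j = 1, 2, 3, 4 the exponents j, 2j, ..., 5j hit each residue mod 5 once.
\prod_{k=1}^{5} \left(1 + \zeta^{jk}\right) = 2
\quad\Longrightarrow\quad
f\!\left(\zeta^{j}\right) = 2^{400}
```

The middle step uses that five factors of −1 multiply to −1, which cancels the sign on the left-hand side.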
Review Questions
- How does the coefficient of x^n in f(x)=∏_{k=1}^{2000}(1+x^k) relate to subsets of {1,…,2000}?
- Why does the sum f(1)+f(ζ)+f(ζ^2)+f(ζ^3)+f(ζ^4) eliminate all coefficients c_n with n not divisible by 5?
- What role does the identity ζ^5=1 play in both the filtering step and the efficient evaluation of f(ζ^j)?
Key Points
1. Encode subset-sum counts using a generating function: ∏_{k=1}^{2000}(1+x^k)=∑ c_n x^n, where c_n counts subsets summing to n.
2. The desired count is the sum of coefficients c_n for n≡0 (mod 5), not the total number of subsets.
3. Filtering modulo 5 is achieved by evaluating the generating function at the five fifth roots of unity and summing: f(1)+f(ζ)+f(ζ^2)+f(ζ^3)+f(ζ^4).
4. The identity 1+ζ^n+ζ^{2n}+ζ^{3n}+ζ^{4n}=5 if 5|n and 0 otherwise makes the five-term sum a perfect mod-5 selector.
5. Because the powers of ζ repeat with period 5, f(ζ^j) factors into 400 identical blocks, avoiding any full expansion.
6. A factorization of z^5−1 and the substitution z=−1 yield the exact five-term product needed, giving f(ζ^j)=2^{400} for j=1..4.
7. Combining f(1)=2^{2000} with the four equal complex evaluations produces the final exact answer: (1/5)(2^{2000}+4·2^{400}).
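The block structure in the key points above can also be checked numerically. The following is a rough floating-point sketch (function names `block_product` and `f_at_root` are assumptions): it reduces every exponent modulo 5, which is valid because ζ^5=1, and then multiplies all 2000 factors directly.

```python
import cmath

zeta = cmath.exp(2j * cmath.pi / 5)         # primitive fifth root of unity
zeta_powers = [zeta ** r for r in range(5)]  # zeta^0 .. zeta^4


def block_product(j):
    """Return (1+zeta^j)(1+zeta^(2j))...(1+zeta^(5j)) for one block."""
    product = 1
    for k in range(1, 6):
        product *= 1 + zeta_powers[(j * k) % 5]
    return product


def f_at_root(j, n_max=2000):
    """Evaluate prod_{k=1}^{n_max} (1 + zeta^(jk)) factor by factor."""
    product = 1
    for k in range(1, n_max + 1):
        product *= 1 + zeta_powers[(j * k) % 5]
    return product


for j in range(1, 5):
    ratio = f_at_root(j) / 2.0**400  # should be very close to 1
    print(j, abs(block_product(j) - 2), abs(ratio - 1))
```

Both printed deviations should be tiny for every j, consistent with each block contributing a factor of 2 and the 400 blocks contributing 2^{400} in total.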