The Problem With IQ Tests
Based on Veritasium's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
IQ tests are widely treated as a clean, objective measure of “intelligence,” but the underlying science is messier: IQ is strongly linked to real-world outcomes, yet it is also shaped by culture, motivation, test-taking strategy, and historical test design choices. That combination helps explain why IQ scores can predict school performance, job success, and even longevity—while also fueling misuse, controversy, and claims that don’t hold up.
The modern IQ concept traces to early 1900s work on correlations between school subjects. In 1904, psychologist Charles Spearman found that students who did well in one subject tended to do well in others, with a correlation of 0.64 between math and English. He proposed a general intelligence factor, “g,” plus smaller subject-specific influences. Around the same time, Alfred Binet and Theodore Simon built the Binet-Simon test to identify children who needed extra help. Their approach used “mental age” relative to actual age, producing the original idea of an intelligence quotient. In the U.S., Lewis Terman standardized and modified the test into the Stanford-Binet, and later IQ batteries expanded to multiple abilities—memory, verbal, spatial, and numerical—then normalized scores so the population mean sits at 100 with a standard deviation of 15.
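The two scoring schemes described above can be sketched in a few lines of Python. This is an illustrative sketch, not the scoring procedure of any real test: the function names and the tiny norming sample are hypothetical, and the factor of 100 is the conventional scaling of the mental-age ratio.

```python
from statistics import mean, pstdev


def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Early 'quotient' score: mental age relative to actual age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100.0 * mental_age / chronological_age


def deviation_iq(raw_score: float, norm_scores: list[float]) -> float:
    """Modern-style score: rescale so the norming sample has mean 100, SD 15."""
    mu = mean(norm_scores)
    sigma = pstdev(norm_scores)  # population SD of the norming sample
    if sigma == 0:
        raise ValueError("norming sample has no spread")
    z = (raw_score - mu) / sigma  # distance from the mean in SD units
    return 100.0 + 15.0 * z


# Hypothetical raw scores from a norming sample (assumption, illustration only).
norm_sample = [40, 60, 40, 60]

print(ratio_iq(mental_age=10, chronological_age=8))  # 125.0
print(deviation_iq(60, norm_sample))                 # 115.0
```

The ratio score depends on age, so it behaves oddly for adults, whose "mental age" stops rising while their actual age keeps growing. The deviation score only says how far someone sits from the norming group's average, which is why the choice of norming group matters.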
Those scores do relate to life outcomes. Higher IQ correlates with larger brain size (a 2005 meta-analysis reported 0.33), and with school achievement. A major study of 13,000 Scottish children measured IQ at age 11 and found correlations around 0.8 with later GCSE marks, suggesting that a large share of the variation in exam performance can be predicted from earlier IQ. IQ also tracks educational attainment, and IQ scores correlate at around 0.8 with standardized tests such as the SAT, ACT, and GRE. Outside school, IQ shows moderate links to occupational success (correlations often between 0.2 and 0.6), with the strongest effects in complex roles; the U.S. military historically used IQ thresholds and found that lowering them increased failure rates and remedial training needs.
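To put the correlations quoted above on a common footing, one rough convention is to square them: under a simple linear relationship, r squared is the share of variance in one measure that the other accounts for. The sketch below applies that to the figures from the text. The helper name and the labels are hypothetical, and the linear reading is an assumption rather than something the studies themselves are claimed to state.

```python
def variance_explained(r: float) -> float:
    """Share of variance accounted for under a simple linear model (r squared)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# Correlations as quoted in the text above; labels are shorthand for illustration.
quoted = {
    "IQ at 11 vs. later exam marks": 0.8,
    "IQ vs. brain size": 0.33,
    "IQ vs. job success (low end)": 0.2,
    "IQ vs. job success (high end)": 0.6,
}

for label, r in quoted.items():
    print(f"{label}: r = {r}, about {variance_explained(r):.0%} of variance")
```

Read this way, a correlation of 0.8 corresponds to roughly 64% of the variance, while 0.2 corresponds to only about 4%, which is why the school results count as strong and the job results as moderate.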
Yet IQ is not a pure readout of fixed ability. The video highlights the “Flynn Effect,” in which average IQ scores rise over decades even though genetics don’t change that quickly. James Flynn’s work suggests that, without periodic re-norming of the tests, average scores would show a roughly 30-point increase over the last century, likely driven by improved nutrition and health, better education, and shifts toward more abstract work. Performance on a given test also moves with incentives: paying test-takers can raise scores, sometimes by up to 20 points, and coaching can boost results by several points. Time pressure, anxiety, and motivation all matter.
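The arithmetic behind the roughly 30-point figure can be sketched as follows. The code assumes the rise is spread evenly over the century (about 0.3 points per year), which is a simplification introduced here for illustration; the constant and function names are hypothetical.

```python
# Rough figure from the text: about 30 points over the last century.
POINTS_PER_CENTURY = 30.0
# Assumption: the gain is linear in time, which real data need not follow.
POINTS_PER_YEAR = POINTS_PER_CENTURY / 100.0


def score_on_old_norms(score_on_current_norms: float, norm_age_years: float) -> float:
    """Estimate the score the same performance would earn on outdated norms.

    Older norms come from a lower-scoring population, so the same raw
    performance looks better against them.
    """
    if norm_age_years < 0:
        raise ValueError("norm_age_years must be non-negative")
    return score_on_current_norms + POINTS_PER_YEAR * norm_age_years


# An average performer today, scored against norms of different ages.
for years in (0, 20, 50, 100):
    print(years, round(score_on_old_norms(100.0, years), 1))
```

Under this linear assumption, someone who scores 100 against today's norms would score about 130 against norms from a century ago, which is the sense in which re-norming hides the rise.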
The controversy is also historical and political. The video links IQ’s U.S. adoption to eugenics: Henry Goddard’s view of intelligence as inherited and unchangeable helped justify forced sterilization laws, which the Supreme Court upheld in 1927. Tens of thousands of people were sterilized, and the laws were later claimed as an influence by Nazi Germany. Modern researchers emphasize that IQ is partly heritable but also environment-dependent, with twin studies often putting the split at around 50/50.
In the end, IQ is best treated as a useful predictor, not a verdict on worth. It can help identify strengths and support decisions (including in education and clinical settings), but a score reflects more than “g,” and it can be distorted by culture and incentives. The video’s central message is that IQ scores can be informative while still being incomplete, and that how society uses them matters as much as what they measure.
Cornell Notes
IQ tests correlate with meaningful outcomes—school achievement, job performance, and even longevity—but they do not function as a perfectly objective, fixed measure of intelligence. The concept of IQ grew from Spearman’s “g” factor and Binet’s mental-age quotient approach, later standardized into modern scoring (mean 100, SD 15). Research highlights the Flynn Effect (average IQ rising over decades) and shows that motivation, coaching, and test-taking strategy can shift scores, meaning IQ reflects more than raw ability. Given IQ’s predictive power alongside its susceptibility to cultural and situational influences—and its misuse in eugenics—scores should be treated as a probabilistic tool, not a measure of human worth.
How did IQ testing emerge from early research on correlations between school subjects?
What does an IQ score actually represent in modern testing?
Why do IQ scores predict outcomes like school performance and job success?
What evidence suggests IQ is not fixed and not purely genetic?
How did IQ testing become entangled with eugenics and why does that still shape public attitudes?
What does “culture fair” mean in IQ testing, and why is it difficult to achieve?
Review Questions
- What are Spearman’s “g” and “s-factors,” and how do they justify the structure of IQ batteries?
- How does the Flynn Effect challenge the idea that IQ is fixed, and what explanations are offered for the rise in average scores?
- List at least three non-ability factors mentioned that can change IQ test performance (e.g., motivation, coaching, time pressure) and describe how each affects scores.
Key Points
1. IQ tests are designed to estimate general intelligence (“g”) by combining multiple mental tasks and normalizing scores to a population mean of 100 and SD of 15.
2. IQ correlates with real-world outcomes such as school achievement, job performance, and longevity, with some studies reporting very strong predictive relationships for education.
3. Average IQ scores have risen over time in ways captured by the Flynn Effect, suggesting environment and culture can shift test results even if genetics change slowly.
4. Motivation, incentives, coaching, and test-taking strategy can measurably raise IQ scores, meaning performance is not purely a fixed trait.
5. Historical misuse, especially eugenics-era interpretations and forced sterilization laws, has contributed to lasting public distrust of IQ testing.
6. IQ testing is not fully culture-free; cultural differences influence how people interpret categories and what kinds of knowledge matter for success.
7. The most defensible use of IQ is as a probabilistic tool for identifying strengths and risks, not as a measure of personal worth or destiny.