LESSON 14 - THREATS TO INTERNAL AND EXTERNAL VALIDITY OF EXPERIMENTAL DESIGNS
Based on RESEARCH METHODS CLASS WITH PROF. LYDIAH WAMBUGU's video on YouTube.
Internal validity is the confidence that the treatment variable, not other factors, caused the observed change in the dependent variable.
Briefing
Experimental designs are valued because they can support causal claims (“X causes Y”), but only when threats to validity are controlled. Those threats fall into two groups. Internal validity concerns whether the observed change in the dependent variable was truly caused by the treatment variable and not by some other factor. External validity concerns whether those causal findings can be generalized beyond the specific people, setting, and time period used in the study.
Internal validity is essentially the confidence that the treatment was the sole cause of the outcome. If a behavior change program improves youths’ behavior, internal validity asks whether the improvement came from the program itself or from other events happening at the same time. The lesson lists common internal threats and practical ways to reduce them. “History” refers to outside events occurring during the experiment—like counseling, exposure to related programs, or family discussions—that could also shift behavior. The fix is to ensure the experimental and control groups experience the same external events and to minimize extraneous variables.
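The logic of the fix for history can be shown with simple arithmetic. The sketch below is illustrative only: the function name and the scores are hypothetical, and it assumes both groups were exposed to the same outside event, so the control group's change estimates how much of the shift came from that event.

```python
def treatment_effect(exp_pre, exp_post, ctl_pre, ctl_post):
    """Change in the experimental group minus change in the control group.

    Assumes both groups experienced the same outside events, so the
    control group's change reflects those events alone.
    """
    return (exp_post - exp_pre) - (ctl_post - ctl_pre)


# Hypothetical mean behavior scores (made-up numbers for illustration).
# The control group improved by 3 points without the program, which
# suggests an outside event (history) accounts for about 3 points.
effect = treatment_effect(exp_pre=50, exp_post=61, ctl_pre=50, ctl_post=53)
print(effect)  # 8
```

Without the control group, the full 11-point gain would be credited to the program. With it, only 8 points remain attributable to the treatment under the stated assumption.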
“Testing” is another threat: repeated pre-testing can make participants more familiar with the measure, leading to better post-test performance. Researchers can use longer time gaps between pre- and post-tests, make measures as equivalent as possible, or drop pre-tests entirely via post-test-only designs. “Maturation” captures natural physical or psychological development during the study; it can be addressed by selecting participants who mature at similar rates and shortening the time between measurements. “Instrumentation” covers changes in measurement that alter scores; it’s handled by using the same instrument for pre- and post-tests.
The lesson also highlights design threats tied to who ends up in each group and who stays. “Selection bias” happens when groups differ at baseline (e.g., brighter students in the experimental group). Random assignment and making groups as equivalent as possible reduce this risk. “Experimental mortality” occurs when participants drop out, creating imbalance; recruiting larger samples helps absorb dropout effects. “Statistical regression” is triggered when participants are selected for their extreme scores, which tend to move toward the average on retesting; avoiding extreme entry characteristics helps. Communication between groups threatens causal inference through “diffusion of treatment” or interaction among variables, so groups should be kept separate and blinded to group status.
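Statistical regression is easier to see in a small simulation. The sketch below is not from the lesson: it assumes each participant has a stable true score and that each test adds independent random noise, and every name and number in it is hypothetical. No treatment is applied at any point.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n=2000, top_fraction=0.1, seed=0):
    """Return (first-test mean, retest mean) for the top first-test scorers."""
    rng = random.Random(seed)
    # Assumed model: stable true score plus independent noise on each test.
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    test1 = [t + rng.gauss(0, 10) for t in true_scores]
    test2 = [t + rng.gauss(0, 10) for t in true_scores]

    # Select participants with the most extreme (highest) first-test scores.
    k = int(n * top_fraction)
    top = sorted(range(n), key=lambda i: test1[i], reverse=True)[:k]

    return mean(test1[i] for i in top), mean(test2[i] for i in top)


first, second = simulate_regression_to_mean()
print(f"first test: {first:.1f}, retest: {second:.1f}")
```

In this setup the selected group's retest mean comes out lower than its first-test mean even though nothing was done to the participants. A study that recruits only extreme scorers could mistake that drift for a treatment effect.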
Motivation and fairness threats, “resentful demoralization” and “compensatory rivalry,” arise when only the experimental group receives benefits. Providing the control group with a placebo-like alternative or offering the real treatment after the experiment can reduce resentment. “Double blinding” (ensuring neither participants nor researchers know group assignment) supports both internal and external validity.
External validity asks whether results hold in other settings, for other people, and at other times. Threats include selection-related interaction effects, setting-by-treatment interactions, and history-by-treatment interactions that make results time-bound. Reactive effects of testing can also limit generalization if pre-testing alerts participants to the treatment. Researcher expectations can distort outcomes through “experimenter effects” (the experimenter expectancy effect, also known as the Pygmalion effect), and “reactive effects of experimental arrangement” (the Hawthorne effect) occur when participants behave artificially after realizing their group status.
To minimize external threats, the lesson emphasizes randomization, double blindness, researcher neutrality, spacing out multiple treatments, and using procedural control groups that receive something comparable to the experimental group so neither group feels disadvantaged. The takeaway is straightforward: causal confidence requires internal validity, and meaningful usefulness requires external validity—both must be protected for experimental findings to be credible and transferable.
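Randomization, the first strategy listed above, can be sketched in a few lines. This is a minimal illustration, not the lesson's own procedure: the function name, the participant IDs, and the even two-way split are all assumptions.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into experimental and control groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant IDs P01 to P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
experimental, control = randomly_assign(participants, seed=42)
print(len(experimental), len(control))  # 10 10
```

Because chance alone decides group membership, characteristics such as ability or motivation are expected to balance out across groups on average, which is what guards against selection bias. The fixed seed is there only to make the example reproducible.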
Cornell Notes
Internal validity measures whether a study can credibly claim that the treatment variable caused the observed change in the outcome variable. It is threatened by factors like history, testing effects, maturation, instrumentation differences, selection bias, dropout (experimental mortality), regression to the mean, and contamination or communication between groups. External validity measures whether findings generalize to other people, settings, and time periods; it is threatened by selection-by-treatment and setting-by-treatment interactions, history-by-treatment time limits, reactive testing effects, researcher expectation effects, and the Hawthorne effect. Minimizing these threats relies on randomization, double blindness, neutral administration, appropriate timing, and giving control groups comparable experiences (placebo/procedural controls).
What is internal validity, and how does it connect to causal claims in experiments?
How do “history” and “testing” threaten internal validity, and what are the corresponding fixes?
Why do maturation and instrumentation matter for internal validity?
What threats to internal validity come from group composition and participant dropout?
What makes external validity harder than internal validity, and which threats limit generalization?
Which strategies reduce both internal and external validity threats?
Review Questions
- How would you distinguish internal validity from external validity in an experiment claiming that a treatment causes an outcome?
- List at least four internal validity threats and match each with a practical mitigation strategy.
- What external validity threats would you check before claiming results apply to a different population or setting?
Key Points
1. Internal validity is the confidence that the treatment variable, not other factors, caused the observed change in the dependent variable.
2. History threatens internal validity when outside events occur during the experiment; matching experiences across experimental and control groups helps reduce it.
3. Testing effects can inflate post-test performance; post-test-only designs or longer intervals between tests can mitigate this risk.
4. Selection bias, dropout (experimental mortality), and regression to the mean can all distort group comparability; random selection/assignment and appropriate sampling strategies help.
5. Contamination between groups (communication or shared experiences) undermines causal inference; keeping groups separate and using blinding reduces this threat.
6. External validity depends on generalizability across people, settings, and time; interaction effects (selection-by-treatment, setting-by-treatment, history-by-treatment) limit transferability.
7. Double blindness, neutrality, procedural control groups, and careful timing are key tools for minimizing both internal and external validity threats.