10Min Research - 34. Understanding and Performing Stratified Random Sampling in Social Sciences
Based on Research With Fawad's video on YouTube.
Stratified random sampling splits a population into strata so each subgroup receives adequate representation in the final sample.
Briefing
Stratified random sampling is designed to prevent a common failure of simple random sampling: when a population contains distinct subgroups, random draws can over-represent large groups and under-represent—or entirely miss—small ones. In social science research, that matters because each subgroup may contribute different perspectives that affect conclusions. The core idea is to split the population into strata (singular: stratum)—for example, Bachelor, Master, and PhD students—and then run simple random sampling within each stratum so every group appears in the final sample.
A higher-education example makes the risk concrete. Suppose a university has 1,550 students total: 1,000 Bachelor students, 500 Master students, and 50 PhD students. If the study needs a sample of 300 people and sampling is done purely at random from the full list, the sample will be dominated by Bachelor and Master students. Only about 10 PhD students are expected (300 × 50/1,550 ≈ 9.7), and nothing guarantees even that many, so a particular draw can contain very few PhD students or, in principle, none. That would be methodologically wrong for studies where PhD input is important for understanding higher-education practices and procedures.
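How likely such an outcome is can be checked directly. The sketch below treats the number of PhD students in a simple random sample as a hypergeometric count, using the figures from the example (1,550 students, 50 of them PhD, sample of 300). The variable and function names are illustrative and do not come from the video.

```python
from math import comb

N_TOTAL = 1550   # all students in the example university
N_PHD = 50       # PhD students among them
SAMPLE = 300     # size of the simple random sample

def prob_phd_count(k: int) -> float:
    """Hypergeometric probability that exactly k PhD students are drawn."""
    return comb(N_PHD, k) * comb(N_TOTAL - N_PHD, SAMPLE - k) / comb(N_TOTAL, SAMPLE)

# Expected number of PhD students under simple random sampling.
expected_phd = SAMPLE * N_PHD / N_TOTAL

# Chance of missing the PhD group entirely, and of getting 5 or fewer.
p_none = prob_phd_count(0)
p_at_most_5 = sum(prob_phd_count(k) for k in range(6))

print(f"expected PhD students in sample: {expected_phd:.2f}")
print(f"P(no PhD students):             {p_none:.6f}")
print(f"P(5 or fewer PhD students):     {p_at_most_5:.4f}")
```

With these numbers the expected PhD count is just under 10 and the chance of drawing none at all is very small. The more practical concern is that the PhD count is left to chance and may be too small to analyse on its own, which is what stratification removes.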
To fix this, the population is divided into three strata, and sampling is performed separately inside each one. The first approach is proportionate stratified random sampling, where each stratum’s share of the sample matches its share of the population. With a sample size of 300 from a population of 1,550, the overall sampling fraction is 300/1,550 ≈ 19.35%. Each stratum’s target is its population size multiplied by that fraction: about 193.5 for Bachelor students (1,000 × 19.35%), about 96.8 for Master students (500 × 19.35%), and about 9.7 for PhD students (50 × 19.35%). Rounded to whole people, that is roughly 194, 97, and 10, which totals 301 because every figure happens to round up.
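The per-stratum arithmetic for proportionate allocation can be sketched as follows. The stratum sizes and sample size are the ones from the example. The dictionary layout and variable names are assumptions made for illustration.

```python
# Stratum sizes from the university example.
strata = {"Bachelor": 1000, "Master": 500, "PhD": 50}
sample_size = 300

population = sum(strata.values())      # 1,550 students in total
fraction = sample_size / population    # overall sampling fraction, about 0.1935

# Proportionate allocation: stratum size times the overall sampling fraction.
exact = {name: size * fraction for name, size in strata.items()}
rounded = {name: round(value) for name, value in exact.items()}

for name in strata:
    print(f"{name:8s} exact {exact[name]:7.2f}  rounded {rounded[name]}")
# rounded targets: Bachelor 194, Master 97, PhD 10

print("total after rounding:", sum(rounded.values()))  # 301, one more than 300
```

Because each exact target is rounded up here, the rounded total is 301 rather than 300. A researcher would either accept the extra case or trim one from the largest stratum.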
Because exact targets can be awkward and because researchers want to ensure minimum representation, the method can be adjusted by increasing the number of people contacted in each stratum. The transcript illustrates this by moving from the strict proportionate targets to a larger set of contacted counts (e.g., contacting more Bachelor and Master students and a higher number of PhD students) to help guarantee that the final number of responses meets the minimum required.
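One common way to decide how many people to contact is to divide each target by an expected response rate and round up. The sketch below illustrates that idea only. The response rates are hypothetical placeholders, not figures from the video, and the targets are the rounded proportionate counts (194, 97, 10).

```python
from math import ceil

stratum_sizes = {"Bachelor": 1000, "Master": 500, "PhD": 50}
targets = {"Bachelor": 194, "Master": 97, "PhD": 10}

# ASSUMPTION: hypothetical response rates, chosen only to illustrate the arithmetic.
assumed_response_rate = {"Bachelor": 0.60, "Master": 0.60, "PhD": 0.50}

# Contact enough people that the expected number of responses reaches the target,
# but never more people than the stratum actually contains.
contacts = {
    name: min(stratum_sizes[name], ceil(targets[name] / assumed_response_rate[name]))
    for name in targets
}

for name, n in contacts.items():
    print(f"{name:8s} target {targets[name]:3d}  contact {n}")
```

Under these assumed rates the contact lists grow to 324, 162, and 20. Different response rates would give different counts, so the rates should come from pilot data or prior surveys of the same population where available.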
A second approach is disproportionate stratified random sampling, where the allocation across strata is intentionally changed to secure stronger representation of smaller or more important groups. Instead of using the same proportional allocation, the researcher increases the PhD stratum share (and adjusts others downward) to avoid the “too few PhD responses” problem.
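The contrast between the two designs shows up in the per-stratum sampling fractions. In the sketch below the disproportionate allocation (150, 100, 50) is a hypothetical choice made for illustration, not one taken from the video. The weight computed at the end (stratum size divided by stratum sample size) is the standard design weight used when unequal fractions have to be corrected for in population-wide estimates.

```python
strata = {"Bachelor": 1000, "Master": 500, "PhD": 50}

# Rounded proportionate targets from the 300/1,550 sampling fraction.
proportionate = {"Bachelor": 194, "Master": 97, "PhD": 10}

# ASSUMPTION: a hypothetical disproportionate allocation that still totals 300
# but samples every PhD student.
disproportionate = {"Bachelor": 150, "Master": 100, "PhD": 50}

def sampling_fractions(allocation):
    """Share of each stratum that ends up in the sample."""
    return {name: allocation[name] / strata[name] for name in strata}

def design_weights(allocation):
    """How many population members each sampled person stands for."""
    return {name: strata[name] / allocation[name] for name in strata}

print("proportionate fractions:   ", sampling_fractions(proportionate))
print("disproportionate fractions:", sampling_fractions(disproportionate))
print("disproportionate weights:  ", design_weights(disproportionate))
```

Under proportionate allocation the fractions are nearly equal across strata. Under the hypothetical disproportionate allocation a PhD student is certain to be selected while a Bachelor student has a 15% chance, which is why pooled estimates from such a design are usually weighted.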
Operationally, each stratum becomes its own sampling frame. With a list of all Bachelor students (say 1,000 entries in an Excel sheet), the researcher uses a random number generator to select the required number of individuals (e.g., 300) by drawing random indices within the allowed range (minimum 1, maximum 1,000) and redrawing any index that repeats, so that nobody is selected twice. The same process is repeated for the Master and PhD lists, with the maximum set to the length of each list. The method depends on having access to the full population frame for each stratum; without the ability to identify and select every element, neither simple random sampling nor stratified random sampling can be carried out properly.
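The same selection step can be done in code rather than with a spreadsheet and a separate random number generator. The sketch below is a minimal illustration using Python's standard `random` module. The student IDs are made up, the per-stratum counts are placeholders (the rounded proportionate targets), and `draw_from_stratum` is a hypothetical helper name.

```python
import random

def draw_from_stratum(frame, n, rng):
    """Simple random sample of n entries from one stratum's frame, without repeats."""
    if n > len(frame):
        raise ValueError("cannot draw more people than the frame contains")
    return rng.sample(frame, n)

# ASSUMPTION: made-up ID lists standing in for the real per-stratum student lists.
frames = {
    "Bachelor": [f"B{i:04d}" for i in range(1, 1001)],
    "Master":   [f"M{i:04d}" for i in range(1, 501)],
    "PhD":      [f"P{i:04d}" for i in range(1, 51)],
}

# Placeholder counts; substitute whatever allocation the study design calls for.
allocation = {"Bachelor": 194, "Master": 97, "PhD": 10}

rng = random.Random(2024)  # fixed seed so the draw can be reproduced and audited
selected = {
    name: draw_from_stratum(frames[name], allocation[name], rng)
    for name in frames
}

for name, people in selected.items():
    print(name, len(people), "selected, first three:", people[:3])
```

`random.sample` draws without replacement, so no student can be selected twice, and running the draw separately per stratum guarantees that each group contributes exactly its allocated number.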
Cornell Notes
Stratified random sampling addresses a weakness of simple random sampling: by chance, large subgroups can crowd out small ones, which may be under-represented or missed. The population is split into strata (e.g., Bachelor, Master, PhD), and simple random sampling is performed within each stratum so every group is represented in the final sample. Proportionate stratified sampling allocates sample sizes according to each stratum’s share of the population: each stratum’s size is multiplied by the overall sampling fraction (sample size divided by total population). Disproportionate stratified sampling deliberately departs from those allocations to boost representation of smaller or higher-priority groups. In either design, researchers may contact more people than the target to allow for nonresponse and still reach minimum response counts. The method requires a complete sampling frame (a list of all elements) for each stratum so random selection can be done reliably.
Why can simple random sampling fail when a population has subgroups?
How does proportionate stratified random sampling determine how many people to sample from each stratum?
What’s the difference between proportionate and disproportionate stratified sampling?
Why might researchers increase the number of people contacted beyond the target sample size?
How is random selection carried out inside each stratum in practice?
What requirement must be met for stratified random sampling to work?
Review Questions
- If a study needs 300 responses from a population of 1,550, what sampling fraction is used for proportionate stratified allocation?
- When would disproportionate stratified sampling be preferable to proportionate stratified sampling?
- What information must exist before using a random number generator to select participants within a stratum?
Key Points
1. Stratified random sampling splits a population into strata so each subgroup receives adequate representation in the final sample.
2. Simple random sampling can miss small subgroups entirely, especially when subgroup sizes vary widely.
3. Proportionate stratified sampling allocates sample sizes using the overall sampling fraction (sample size ÷ total population).
4. Disproportionate stratified sampling intentionally changes stratum allocations to boost representation of smaller or more important groups.
5. Researchers often contact more people than the minimum required to account for nonresponse and still meet response targets.
6. Random selection within each stratum requires a complete sampling frame (a full list of elements) for that subgroup.