Statistics for #Research - L2 - The Concept of Descriptive Statistics
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Central tendency summarizes what’s typical in a dataset, while dispersion summarizes how spread out the values are.
Briefing
Descriptive statistics boil down to two jobs: summarizing what counts as “typical” in a dataset and quantifying how widely the values spread. Central tendency answers the typicality question (for example, what height is typical for people in a group), while dispersion (variability) answers whether the values cluster tightly or scatter across the range. Together, they turn a list of measurements into interpretable research information.
Central tendency is introduced through three measures, each matched to a measurement scale. The mean is the arithmetic average: add all height measurements and divide by the number of observations. It’s presented as the go-to measure for interval and ratio scale variables. For ordinal scale variables, where values can be ranked but the gaps between ranks are not necessarily equal, the median is the appropriate measure: sort the data from lowest to highest and take the middle value. For nominal scale variables, where categories have no inherent order, the mode is used: the value that appears most frequently. An example with five people (male coded as 1 and female coded as 2) shows how the most repeated code becomes the mode.
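As a rough illustration of the three measures, here is a minimal Python sketch using the standard-library statistics module. The numbers are hypothetical and were made up for this example; the video's actual values are not reproduced here. The gender codes follow the 1 = male, 2 = female coding mentioned above.

```python
import statistics

# Hypothetical data, invented for illustration only.
heights_cm = [160, 165, 170, 175, 180]   # ratio scale: mean is meaningful
satisfaction = [1, 2, 2, 3, 5]           # ordinal ranks: use the median
gender_codes = [1, 2, 2, 1, 2]           # nominal codes (1 = male, 2 = female): use the mode

# Mean: sum of all observations divided by the number of observations.
mean_height = statistics.mean(heights_cm)

# Median: middle value once the data are sorted from lowest to highest.
median_rank = statistics.median(satisfaction)

# Mode: the value that appears most often.
mode_gender = statistics.mode(gender_codes)

print("mean height:", mean_height)
print("median rank:", median_rank)
print("mode of gender codes:", mode_gender)
```

Note that averaging the gender codes would produce a number with no interpretation, which is why the mode is the measure paired with nominal variables.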
Dispersion then shifts from “typical” to “spread.” The range is defined as the difference between the minimum and maximum values, giving a quick sense of how far apart the extremes are. But the transcript emphasizes that dispersion is broader than just extremes: variability describes how individual responses are distributed across the entire range. For a more reliable picture of spread, standard deviation is highlighted as the quickest way to gauge how much observations generally differ from the mean. Low standard deviation means most values sit close to the mean, while high standard deviation signals greater scattering.
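The contrast between low and high spread can be sketched in Python as follows. The two samples are hypothetical and share the same mean. The sketch uses the sample standard deviation (statistics.stdev, which divides by n - 1); that choice is an assumption, since the source does not say whether the sample or population formula is intended.

```python
import statistics

# Hypothetical height samples (cm); both have a mean of 170.
tight = [168, 169, 170, 171, 172]
scattered = [150, 160, 170, 180, 190]

def value_range(values):
    """Range: difference between the maximum and minimum values."""
    return max(values) - min(values)

# Sample standard deviation (n - 1 in the denominator); an assumption here.
sd_tight = statistics.stdev(tight)
sd_scattered = statistics.stdev(scattered)

for name, data, sd in [("tight", tight, sd_tight),
                       ("scattered", scattered, sd_scattered)]:
    print(name, "range:", value_range(data), "sd:", round(sd, 2))
```

The range looks only at the two extremes, while the standard deviation uses every observation's distance from the mean, so it reflects how the whole sample is distributed.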
Finally, the accuracy of the mean is linked to standard error. Standard error is described as standard deviation divided by the square root of the total number of responses. This connects sample size to precision: for a given standard deviation, as the number of observations grows, the denominator increases, standard error falls, and the sample mean becomes a more precise estimate of the true population mean. The session frames these descriptive statistics (mean, median, and mode for central tendency; range, standard deviation, and standard error for dispersion) as the core summary information typically reported in research theses and papers.
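The standard error formula described above (standard deviation divided by the square root of the number of responses) can be sketched in Python. The function names, the standard deviation of 10, and the sample sizes are all hypothetical, chosen only to show how the result shrinks as the sample grows.

```python
import math
import statistics

def standard_error(sd, n):
    """Standard error of the mean: standard deviation / sqrt(number of responses)."""
    return sd / math.sqrt(n)

def standard_error_from_sample(values):
    """Same formula, computing the sample standard deviation from raw data."""
    return standard_error(statistics.stdev(values), len(values))

# Hold the standard deviation fixed at a hypothetical 10 and vary the sample size.
for n in (25, 100, 400):
    print("n =", n, "standard error =", standard_error(10, n))

# Hypothetical raw data (heights in cm).
print(round(standard_error_from_sample([150, 160, 170, 180, 190]), 3))
```

Because the sample size sits under a square root, quadrupling the number of responses only halves the standard error, so a sample with a high standard deviation can still yield a small standard error if it is large enough.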
Cornell Notes
Descriptive statistics focus on two essentials: central tendency (what’s typical) and dispersion (how spread out values are). Central tendency is measured with the mean for interval/ratio data, the median for ordinal data, and the mode for nominal data. Dispersion is assessed using range for a quick min–max spread, while standard deviation measures how far observations generally vary from the mean. Standard error refines this by estimating how accurately a sample mean reflects the true population mean, calculated as standard deviation divided by the square root of the number of responses. These tools help researchers summarize datasets in a way that supports interpretation and reporting.
How do mean, median, and mode differ, and when should each be used?
What does dispersion measure, and how is it related to variability?
Why is standard deviation more informative than range for understanding spread?
What is standard error, and how does sample size affect it?
How do central tendency and dispersion work together in research reporting?
Review Questions
- If a dataset is ordinal, which central tendency measure is appropriate and why?
- A sample has a high standard deviation but a large sample size; how would you expect standard error to behave?
- How would you interpret a small range versus a small standard deviation in terms of data spread?
Key Points
1. Central tendency summarizes what’s typical in a dataset, while dispersion summarizes how spread out the values are.
2. Mean is the arithmetic average and is used for interval and ratio scale variables.
3. Median is the middle value after sorting and is used for ordinal scale variables.
4. Mode is the most frequent value and is used for nominal scale variables.
5. Range measures spread using the difference between the minimum and maximum values.
6. Standard deviation measures how much observations generally vary from the mean; lower values indicate tighter clustering.
7. Standard error equals standard deviation divided by the square root of the number of responses, linking sample size to the accuracy of the sample mean.