02. SPSS Classroom | Basic Statistical Concepts (P2) | Hypotheses, Errors (Type 1/Type 2), P-Value
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Hypothesis testing decides whether a population claim is supported by sample evidence by rejecting H0 or failing to reject H0.
Briefing
Hypothesis testing is a structured way to decide, based on sample data, whether a real-world claim about a population is supported. A hypothesis is treated as an educated guess or assumption about a population characteristic, such as an electric bulb company claiming its bulbs last an average of at least 1,000 hours. Testing that claim means collecting data from a sample (e.g., testing 100–200 bulbs by running them for 1,000–2,000 hours) and comparing the sample mean to the population mean implied by the claim.
The decision hinges on setting up two mutually exclusive and exhaustive alternatives: the null hypothesis (H0) and the alternative hypothesis (H1). H0 represents the “status quo” and is presumed correct unless strong evidence appears against it. H1 is the negation of H0. Importantly, the decision is always stated in terms of H0: the test either rejects H0 or fails to reject H0. It does not “accept” H1 directly, and failing to reject H0 does not prove that H0 is true. In the bulb example, H0 could be “average life is ≥ 1,000 hours,” while H1 would be “average life is < 1,000 hours.” If H0 is rejected, it signals the claim is likely false and corrective action may be needed to improve bulb life. If H0 is not rejected, no corrective action is typically required because the sample does not provide enough evidence against the status quo claim.
This framework also formalizes two kinds of mistakes. A Type I error occurs when H0 is rejected even though it is actually true. A Type II error occurs when H0 is not rejected even though it is actually false. For a fixed sample size, reducing the risk of one error generally increases the risk of the other, so the practical lever is increasing the sample size: larger samples make the test more reliable.
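The tradeoff between the two errors can be sketched numerically. The example below is a minimal illustration, not the video's method: it assumes a one-sided z-test of the bulb claim (H0: mean ≥ 1,000 hours) with a known population standard deviation, and the function name `type_ii_error` and all numbers (a true mean of 980 hours, a standard deviation of 100 hours) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def type_ii_error(true_mean, null_mean, sigma, n, alpha):
    """Probability of failing to reject H0: mean >= null_mean
    when the population mean is really true_mean (below null_mean)."""
    standard_error = sigma / sqrt(n)
    # H0 is rejected when the sample mean falls below this cutoff.
    # alpha is the Type I error rate the cutoff is built to allow.
    cutoff = null_mean + STD_NORMAL.inv_cdf(alpha) * standard_error
    # Type II error: the sample mean lands at or above the cutoff anyway.
    return 1 - NormalDist(true_mean, standard_error).cdf(cutoff)


# Hypothetical numbers: the bulbs really average 980 hours, sigma = 100 hours.
for alpha in (0.05, 0.01):
    for n in (25, 100, 400):
        beta = type_ii_error(980, 1000, 100, n, alpha)
        print(f"alpha={alpha:.2f}  n={n:>3}  beta={beta:.3f}")
```

Under these assumptions, tightening alpha from 0.05 to 0.01 at the same sample size raises beta (the Type II error rate), while increasing n lowers beta without loosening alpha.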
To make the reject/fail-to-reject decision, hypothesis testing compares the p-value (P value) with a significance level chosen in advance. The p-value is the probability of obtaining the observed (or more extreme) sample results under the assumption that H0 is true, i.e., that there is actually no true difference. For instance, in a t-test comparing two samples, a p-value of 0.05 means there is only a 5% chance of getting a t-statistic at least as extreme as the calculated one if the two samples truly come from populations with equal means. In social sciences, 0.05 is often used as the standard significance level: if the obtained p-value is less than 0.05, H0 is rejected; otherwise, H0 is not rejected.
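As a rough sketch of this decision rule applied to the bulb example, the code below computes a lower-tail p-value and compares it with 0.05. Several things here are assumptions rather than content from the source: the lifetimes are simulated (made-up) data, the helper `lower_tail_p_value` is a hypothetical name, and the normal approximation is used in place of the t-distribution, which is reasonable only for samples of roughly 100 or more. In SPSS the equivalent analysis would be a one-sample t-test.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def lower_tail_p_value(sample, null_mean):
    """Return (z, p) for H0: mean >= null_mean against H1: mean < null_mean.

    Uses the normal approximation to the t-statistic, which is
    reasonable for samples of roughly 100 or more observations.
    """
    standard_error = stdev(sample) / sqrt(len(sample))
    z = (mean(sample) - null_mean) / standard_error
    # Lower tail: probability of a statistic this small or smaller
    # when the population mean sits at the H0 boundary (null_mean).
    return z, NormalDist().cdf(z)


# Simulated lifetimes in hours (made-up data, not from the source).
rng = random.Random(42)
lifetimes = [rng.gauss(985, 60) for _ in range(100)]

ALPHA = 0.05  # significance level named in the text
z, p = lower_tail_p_value(lifetimes, null_mean=1000)
decision = "reject H0" if p < ALPHA else "fail to reject H0"
print(f"z = {z:.2f}, p = {p:.4f} -> {decision}")
```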
Direction matters for how hypotheses are tested. Directional hypotheses specify the direction of an effect (e.g., male job satisfaction is higher than female job satisfaction), so they use a one-tailed test. Non-directional hypotheses only ask whether a difference exists without specifying direction (e.g., job satisfaction differs between males and females), so they use a two-tailed test. Across business, finance, marketing, human resources, quality control, and research, the goal is typically to find statistically significant evidence, often operationalized as p-values below 0.05, to support H1 while controlling the risk of incorrect conclusions.
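The difference between one-tailed and two-tailed tests shows up directly in the p-value. The sketch below assumes a test statistic that follows a standard normal distribution; the function name `tail_p_values` and the statistic 1.80 are hypothetical, chosen only to illustrate the point.

```python
from statistics import NormalDist


def tail_p_values(z):
    """One-tailed and two-tailed p-values for a standard normal test statistic."""
    std_normal = NormalDist()
    upper = 1 - std_normal.cdf(z)       # directional H1: first group is higher
    lower = std_normal.cdf(z)           # directional H1: first group is lower
    two_tailed = 2 * min(upper, lower)  # non-directional H1: groups differ
    return {"upper": upper, "lower": lower, "two_tailed": two_tailed}


ALPHA = 0.05
result = tail_p_values(1.80)  # hypothetical statistic, chosen for illustration
for name, p in result.items():
    verdict = "reject H0" if p < ALPHA else "fail to reject H0"
    print(f"{name:>10}: p = {p:.4f} -> {verdict}")
```

With a statistic of 1.80, the upper-tail p-value is about 0.036 and the two-tailed p-value is about 0.072, so the same data reject H0 under the directional hypothesis but not under the non-directional one. This is why the direction should be fixed before looking at the data.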
Cornell Notes
Hypothesis testing turns real-world claims about populations into decisions based on sample data. Each claim is paired with a null hypothesis (H0) representing the status quo and an alternative hypothesis (H1) as its negation; the process results in either rejecting H0 or failing to reject H0. Two errors are possible: Type I (rejecting H0 when it’s true) and Type II (not rejecting H0 when it’s false). At a fixed sample size, lowering one raises the other, and increasing the sample size helps reduce both. The p-value measures how likely results at least as extreme as those observed are if there is truly no difference; in social sciences, p < 0.05 typically leads to rejecting H0. Directional questions use one-tailed tests, while non-directional questions use two-tailed tests.
Why are null and alternative hypotheses designed to be mutually exclusive and exhaustive?
What does it mean that hypothesis testing never directly “accepts” the alternative hypothesis?
How do Type I and Type II errors differ, and why does sample size matter?
What exactly does a p-value represent in hypothesis testing?
When should a one-tailed test be used instead of a two-tailed test?
Review Questions
- In the bulb-life example, what would H0 and H1 be, and what decision rule would you apply using the p-value?
- Explain how Type I and Type II errors would look in a marketing campaign evaluation where the goal is to detect an impact on product awareness.
- Why does a directional hypothesis require a one-tailed test, and how does that change the interpretation of the p-value?
Key Points
1. Hypothesis testing decides whether a population claim is supported by sample evidence by rejecting H0 or failing to reject H0.
2. Null hypothesis (H0) represents the status quo and is presumed correct unless evidence strongly contradicts it.
3. Type I error is rejecting a true H0; Type II error is failing to reject a false H0.
4. Reducing one type of error often increases the other, so increasing sample size is the main way to improve reliability.
5. The p-value is the probability of getting the observed (or more extreme) results when there is no true difference.
6. In social sciences, p < 0.05 is commonly used to reject H0 and treat the result as statistically significant.
7. Directional hypotheses use one-tailed tests, while non-directional hypotheses use two-tailed tests.