Chi-Square Test
Based on Research With Fawad's video on YouTube.
Chi-square test of association (independence) is designed for two nominal categorical variables to test whether their category distributions are related.
Briefing
Chi-square test of association (also called chi-square test of independence or Pearson’s chi-square test) is used to check whether two categorical variables measured on a nominal scale are related. The key idea is simple: when categories like “introvert/extrovert” or “red/yellow/green/blue” have no inherent order, chi-square can test whether the distribution of one variable differs across the categories of the other. Put another way, it asks whether the proportion of respondents in each category of one variable is the same across the groups defined by the other variable.
The transcript lays out when this test fits real research questions. Examples include whether gender is associated with preferred learning method (textbook reading vs class discussion), whether personality type (introvert/extrovert) is associated with color preference, whether car make is associated with gender, and whether a watch brand is associated with gender. In each case, both variables are categorical and nominal—respondents are simply classified into groups.
A worked example tests association between personality and color preference. Personality has two categories: introvert (coded as 1) and extrovert (coded as 2). Color preference has four categories: red, yellow, green, and blue. The analysis is performed using cross-tabs: personality is placed in rows and preference in columns, then chi-square is selected under statistics. The output reports 150 respondents with no missing values.
The cross-tab counts show how preferences distribute within each personality group. Among introverts, 13 preferred red, 15 yellow, 29 green, and 13 blue (70 introverts total). Among extroverts, 9 preferred red, 29 yellow, 29 green, and 13 blue (80 extroverts total). To determine whether these differences reflect a real association rather than random variation, the chi-square test uses the chi-square statistic and its p-value.
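The cross-tab described above can be reproduced as a small sketch in plain Python. The counts are the ones quoted from the example; the variable names (`observed`, `row_totals`, `col_totals`, `row_pct`) are illustrative choices, not names from the video. Row percentages are added because they make the two personality groups easier to compare than raw counts.

```python
# Observed counts from the worked example
# (rows: personality group, columns: preferred color).
colors = ["red", "yellow", "green", "blue"]
observed = {
    "introvert": [13, 15, 29, 13],
    "extrovert": [9, 29, 29, 13],
}

# Marginal totals of the cross-tab.
row_totals = {group: sum(counts) for group, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())

# Row percentages: share of each color within a personality group.
row_pct = {
    group: [round(100 * c / row_totals[group], 1) for c in counts]
    for group, counts in observed.items()
}

for group, counts in observed.items():
    print(group, dict(zip(colors, counts)), "total:", row_totals[group])
print("column totals:", dict(zip(colors, col_totals)), "N:", n)
print("row percentages:", row_pct)
```

The totals confirm the figures in the text: 70 introverts, 80 extroverts, and 150 respondents overall.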
The results show a chi-square value of 4.53 with 3 degrees of freedom, which follows from (rows − 1) × (columns − 1) = (2 − 1) × (4 − 1). The p-value is 0.29, which is greater than the 0.05 significance threshold, so there is no significant association between personality and color preference at the 5% level. The transcript also checks an important assumption: expected cell counts should not be too small. Here, 0% of cells have expected counts less than five, and the minimum expected count is 10.27, comfortably above the usual cutoff of 5, so the chi-square approximation is considered acceptable.
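The statistic, degrees of freedom, and expected-count check can be recomputed from the cross-tab counts. The sketch below uses only the Python standard library. The function names `chi_square_independence` and `chi2_sf` are hypothetical helpers written for this illustration, and the p-value comes from a simple series for the incomplete gamma function, which is adequate for small tables like this one but is not a substitute for a statistics package.

```python
import math

# Observed counts from the example (columns: red, yellow, green, blue).
observed = [
    [13, 15, 29, 13],  # introvert
    [9, 29, 29, 13],   # extrovert
]

def chi_square_independence(table):
    """Return (chi2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

def chi2_sf(x, df):
    """Upper-tail chi-square probability via a series for the lower incomplete gamma."""
    a, z = df / 2, x / 2
    if z <= 0:
        return 1.0
    term = total = 1.0 / a
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

chi2, df, expected = chi_square_independence(observed)
p_value = chi2_sf(chi2, df)

# Assumption check: smallest expected count and share of cells below 5.
cells = [e for row in expected for e in row]
min_expected = min(cells)
share_small = sum(e < 5 for e in cells) / len(cells)

print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.3f}")
print(f"min expected = {min_expected:.2f}, cells below 5 = {share_small:.0%}")
```

Run on these counts, the sketch reproduces the reported chi-square value, degrees of freedom, and minimum expected count. The recomputed p-value comes out at roughly 0.21 rather than the 0.29 quoted from the transcript, which may reflect a transcription slip in the summary; the decision is unchanged because both values are above 0.05.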
For reporting, the transcript provides a template-style sentence: chi-square statistics were used to examine the association between the categorical variables, and because the relationship here is not significant, the sentence states that outcome directly rather than leaving both options in the template. The final conclusion is that H1 is not supported: personality and color preference are not statistically associated in this dataset at the 5% significance level.
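A reporting sentence of this kind can be generated from the test values. The sketch below is one possible wording, not the template from the video: the function name `report_chi_square`, its parameters, and the exact phrasing are assumptions made for illustration. The values passed in are the ones reported in the transcript.

```python
def report_chi_square(chi2, df, n, p, alpha=0.05):
    """Format a one-sentence summary of a chi-square test of association.

    The wording is a hypothetical template, not the one from the source video.
    """
    verdict = "a significant" if p < alpha else "no significant"
    return (
        f"A chi-square test of association found {verdict} association "
        f"between the variables, χ²({df}, N = {n}) = {chi2:.2f}, p = {p:.2f}."
    )

# Values as reported in the transcript for personality vs color preference.
sentence = report_chi_square(chi2=4.53, df=3, n=150, p=0.29)
print(sentence)
```

Keeping the verdict tied to the comparison `p < alpha` means the sentence cannot claim significance that the numbers do not support.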
Cornell Notes
Chi-square test of association (chi-square test of independence) checks whether two nominal categorical variables are related. It’s appropriate when categories have no order—such as introvert/extrovert versus red/yellow/green/blue. In the example, cross-tabs are built with personality in rows and color preference in columns, then chi-square is computed. The output gives χ² = 4.53, df = 3, and p = 0.29, which is above 0.05, so the association is not significant. The analysis also verifies assumptions: 0% of cells have expected counts below 5, with a minimum expected count of 10.27, supporting the validity of the chi-square test.
When should a chi-square test of association be used instead of other tests?
How does the example set up the chi-square test for personality and color preference?
What do the cross-tab counts reveal, and why aren’t they enough on their own?
How are the decision criteria applied in the example?
What assumption about expected counts is checked, and what were the results here?
What is a clear way to report the findings from the chi-square test?
Review Questions
- What makes a variable “nominal” and why does that matter for choosing the chi-square test of association?
- In the example, why does a p-value of 0.29 lead to concluding no significant association at the 5% level?
- What expected-count check is performed for chi-square validity, and how would you interpret a case where many cells have expected counts below 5?
Key Points
1. Chi-square test of association (independence) is designed for two nominal categorical variables to test whether their category distributions are related.
2. It’s appropriate for questions like gender vs learning method, personality vs color preference, and brand vs gender when both variables are categorical.
3. Run the test using cross-tabs: place one categorical variable in rows and the other in columns, then select chi-square under statistics.
4. Use the chi-square statistic with degrees of freedom and the p-value to decide significance against a chosen threshold (commonly 0.05).
5. Check expected cell counts: ensure expected counts are not too small (the example reports 0% below 5 and a minimum expected count of 10.27).
6. Report results clearly with χ², df, and p, and state whether H1 is supported based on whether p is below the significance level.