06. SPSS Classroom | Chi Square test of Independence - Analyze, Interpret, and Report Chi Square
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Chi-square test of independence is designed for nominal or ordinal categorical variables using contingency tables and frequency-based comparisons.
Briefing
Chi-square test of independence is the go-to method for checking whether two categorical variables are associated, or whether the pattern in the data is consistent with random chance. It’s especially useful when the data are nominal or ordinal, where means and similar descriptive statistics are not meaningful summaries. Instead, researchers rely on contingency tables and compare observed counts in each cell against expected counts calculated under the assumption of independence.
The core logic is straightforward: observed cell frequencies are the actual numbers collected for each category combination, while expected cell frequencies represent what the counts would look like if there were no association between the variables. The test statistic (chi-square) measures how far observed frequencies deviate from expected frequencies. A small chi-square value supports the null hypothesis of independence; a larger chi-square value signals that the variables are associated. Degrees of freedom are determined from the contingency table dimensions using (rows − 1) × (columns − 1), and the resulting chi-square value is evaluated against significance levels to decide whether to reject the null.
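The logic above can be sketched in a few lines of plain Python. This is a minimal illustration, not the SPSS implementation: the function names and the 2 × 3 table are hypothetical and invented only to show the arithmetic.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell.

    Assumes no row or column total is zero, otherwise an expected count
    would be zero and the division would fail.
    """
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """(rows - 1) * (columns - 1) for a contingency table."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Hypothetical 2 x 3 table of counts, invented purely for illustration.
example_table = [
    [20, 30, 50],
    [30, 30, 40],
]
statistic = chi_square_statistic(example_table)
df = degrees_of_freedom(example_table)
print(f"chi-square = {statistic:.3f}, df = {df}")
```

A table whose rows are exactly proportional to each other, such as [[10, 20], [20, 40]], gives a statistic of zero, because every observed count equals its expected count.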
The transcript walks through practical examples of where this approach fits: whether business performance categories (loss, break-even, profit) depend on a country’s income group; whether employee satisfaction levels (e.g., 1 to 3) depend on job placement (local vs international); and whether personality type (introvert vs extrovert) is associated with color preference (red, yellow, green, blue). In each case, the variables are categorical, and the analysis hinges on contingency tables rather than means.
A worked example uses a table of introverts and extroverts against four color preferences. For instance, 13 introverts chose red, 15 chose yellow, 29 chose green, and 13 chose blue, with totals of 70 introverts, 80 extroverts, and 150 respondents overall. Expected counts are computed as (row total × column total) / grand total. For the red–introvert cell, the row total is 70 (all introverts) and the column total is 22 (all respondents who chose red), so the expected count is 70 × 22 / 150 ≈ 10.3, compared with an observed count of 13. This expected-vs-observed comparison is repeated across all cells to drive the chi-square statistic.
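The red–introvert calculation can be checked directly. The sketch below uses only the totals quoted in the example; the variable names are illustrative, and the last step (this cell's contribution to the chi-square statistic) applies the usual (observed − expected)² / expected term to that one cell.

```python
# Totals quoted in the worked example.
introvert_total = 70         # row total (all introverts)
red_total = 22               # column total (all respondents who chose red)
grand_total = 150            # all respondents
observed_red_introvert = 13  # introverts who chose red

# Expected count under independence: row total * column total / grand total.
expected_red_introvert = introvert_total * red_total / grand_total

# Contribution of this single cell to the chi-square statistic.
cell_contribution = (
    (observed_red_introvert - expected_red_introvert) ** 2 / expected_red_introvert
)

print(f"expected count    = {expected_red_introvert:.1f}")
print(f"cell contribution = {cell_contribution:.3f}")
```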
The workflow in SPSS is then laid out: use Analyze → Descriptive Statistics → Crosstabs, place the categorical variables into rows and columns (with optional layering for multi-group comparisons), and select Statistics → Chi-square. The output includes the chi-square statistic, degrees of freedom, and a p-value. In the example, the p-value is greater than 0.05, leading to the conclusion that there is no significant association between personality and color preference at the 5% level.
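SPSS reports the p-value for you, but it can help to see how a chi-square statistic and its degrees of freedom translate into one. The following is a rough sketch using only Python's standard library; the function names are hypothetical, and the statistic passed in at the end is an invented value, not the one from the video.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for a chi-square distribution.

    Starts from the closed forms for df = 1 (erfc) and df = 2 (exp) and
    applies the recurrence
        Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    until the requested integer df is reached.
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if statistic <= 0:
        return 1.0
    half = statistic / 2.0
    if df % 2 == 1:
        p, k = math.erfc(math.sqrt(half)), 1
    else:
        p, k = math.exp(-half), 2
    while k < df:
        # Term computed in log space to avoid overflow for large df.
        p += math.exp((k / 2) * math.log(half) - half - math.lgamma(k / 2 + 1))
        k += 2
    return p


def decide(statistic, df, alpha=0.05):
    """Return the p-value and the decision at significance level alpha."""
    p = chi_square_p_value(statistic, df)
    decision = "reject independence" if p < alpha else "fail to reject independence"
    return p, decision


# Hypothetical statistic for a 2 x 4 table (df = 3); not the value from the video.
p_value, decision = decide(4.2, 3)
print(f"p = {p_value:.3f} -> {decision}")
```

As a sanity check, the familiar 5% critical values (3.841 for df = 1, 5.991 for df = 2, 7.815 for df = 3) should each give a p-value very close to 0.05.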
Finally, the transcript emphasizes an important assumption check: chi-square isn’t suitable when any cell has fewer than five cases. In practice the rule is usually applied to expected counts, and SPSS notes beneath the Chi-Square Tests table how many cells have an expected count less than 5. If that happens, an alternative like Fisher’s exact test is recommended. In the example, Fisher’s exact test also yields a p-value above 0.05, reinforcing the same conclusion. Reporting guidance follows: state the variables, the hypotheses (null and H1), and the chi-square (or Fisher’s exact) results including degrees of freedom and p-value, concluding whether H1 is supported.
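The cell-count check can be sketched as a small helper. This is only an illustration: the function name and the table are hypothetical, and flagging a cell when either its observed or its expected count is below the threshold is an assumption about how strictly to apply the rule.

```python
def low_count_cells(observed, threshold=5):
    """Return (row, column, observed, expected) for each cell whose observed
    or expected count falls below the threshold."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    flagged = []
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if count < threshold or expected < threshold:
                flagged.append((i, j, count, expected))
    return flagged


# Hypothetical 2 x 2 table, invented purely for illustration.
small_table = [
    [3, 12],
    [9, 26],
]
flagged = low_count_cells(small_table)
if flagged:
    print("Consider Fisher's exact test; low-count cells:", flagged)
else:
    print("All cell counts are at least 5.")
```

An empty result means no cell triggered the rule and the chi-square result can be read as usual.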
Cornell Notes
Chi-square test of independence checks whether two categorical variables are independent or associated. It compares observed cell frequencies from the data with expected cell frequencies computed as (row total × column total) / grand total, using a chi-square statistic and degrees of freedom (rows − 1) × (columns − 1). A p-value above the chosen significance level (commonly 0.05) means the data do not provide sufficient evidence of an association, so the null hypothesis of independence is not rejected. The method works for nominal or ordinal categorical variables, but it requires adequate cell counts: if any cell count (in practice, usually the expected count) is below 5, Fisher’s exact test is preferred. In the SPSS example, both chi-square and Fisher’s exact tests produce p-values above 0.05, so independence is not rejected.
Why can’t researchers rely on means for nominal or ordinal categorical data?
What exactly distinguishes observed cell frequencies from expected cell frequencies in a chi-square test?
How does the chi-square statistic connect to the null hypothesis of independence?
How are degrees of freedom determined for a contingency table?
What cell-count rule determines whether chi-square is appropriate or whether Fisher’s exact test is needed?
What does it mean to report “no significant association” in this context?
Review Questions
- In a contingency table with r rows and c columns, what is the formula for degrees of freedom used in the chi-square test of independence?
- How do you compute an expected cell frequency from row and column totals, and how does that expected value relate to the observed count?
- Which assumption is at risk when any cell count falls below 5, and which test in SPSS should be used instead?
Key Points
1. Chi-square test of independence is designed for nominal or ordinal categorical variables using contingency tables and frequency-based comparisons.
2. Observed cell frequencies come directly from the collected data; expected cell frequencies are computed as (row total × column total) / grand total.
3. A small chi-square statistic is consistent with the null hypothesis of independence; a large chi-square statistic suggests association between the variables.
4. Degrees of freedom are calculated as (rows − 1) × (columns − 1), and SPSS provides chi-square, degrees of freedom, and p-values for significance testing.
5. Chi-square assumptions require adequate cell counts; if any cell has fewer than five cases (usually judged on expected counts), Fisher’s exact test should be used.
6. A report of the SPSS results should include the chi-square statistic, degrees of freedom, and p-value (or Fisher’s exact p-value), followed by a clear conclusion about whether H1 is supported.
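As one possible way to assemble the reporting elements listed above, the sketch below builds a single sentence from the values SPSS prints. The function name and the sentence layout are assumptions (the video may phrase its report differently), and the statistic and p-value passed in are hypothetical.

```python
def format_chi_square_report(var_a, var_b, statistic, df, n, p_value, alpha=0.05):
    """Build a one-sentence report from the chi-square output values."""
    supported = p_value < alpha
    association = "a significant" if supported else "no significant"
    conclusion = "H1 is supported" if supported else "H1 is not supported"
    return (
        f"A chi-square test of independence found {association} association "
        f"between {var_a} and {var_b}, chi-square({df}, N = {n}) = {statistic:.2f}, "
        f"p = {p_value:.3f}; {conclusion}."
    )


# Hypothetical statistic and p-value; only N = 150 and df = 3 follow the example table.
report = format_chi_square_report(
    "personality type", "color preference",
    statistic=4.20, df=3, n=150, p_value=0.241,
)
print(report)
```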