Crosstab Report and Chi Square Test using SPSS
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Cross tab reports in SPSS summarize two categorical variables in a row-by-column frequency table, with a count at each intersection.
Briefing
Cross tab reports in SPSS are used to summarize how two categorical variables relate by turning their joint frequencies into a two-way (or multi-way) table. Data are grouped at the intersection of row and column categories, producing counts (and optionally percentages) for each combination. When more than two variables need to be displayed, the “layer” option acts like a control variable: it splits the same cross-tab output into separate panels for each layer value, making it easier to see whether relationships differ across groups.
A common workflow starts with choosing which variable goes on rows and which goes on columns. In the example, “personality” (introvert vs. extrovert) is placed on rows and “preference” (red, yellow, green, blue) on columns. The resulting table shows how many respondents fall into each intersection: for example, the count of introverts choosing each color and the count of extroverts choosing each color. The output also includes row and column totals and can display N and valid percentages, which helps confirm the sample size and how missing data were handled.
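The row-by-column counting described above can be sketched outside SPSS. The following is a minimal Python illustration, not SPSS syntax and not the video's data: the `responses` list, the `crosstab` helper, and all counts are made up for demonstration.

```python
from collections import Counter

# Hypothetical (personality, preference) responses; NOT the video's data.
responses = [
    ("introvert", "red"), ("introvert", "blue"), ("introvert", "blue"),
    ("introvert", "green"), ("extrovert", "red"), ("extrovert", "red"),
    ("extrovert", "yellow"), ("extrovert", "green"),
]

ROWS = ["introvert", "extrovert"]          # row variable categories
COLS = ["red", "yellow", "green", "blue"]  # column variable categories


def crosstab(pairs, rows, cols):
    """Return {row: {col: count}} holding the joint frequencies."""
    counts = Counter(pairs)
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}


table = crosstab(responses, ROWS, COLS)

# Marginal totals, as shown in the "Total" row and column of a cross tab.
row_totals = {r: sum(table[r].values()) for r in ROWS}
col_totals = {c: sum(table[r][c] for r in ROWS) for c in COLS}
grand_total = sum(row_totals.values())

# Print the table with totals.
print(f"{'':<10}" + "".join(f"{c:>8}" for c in COLS) + f"{'Total':>8}")
for r in ROWS:
    cells = "".join(f"{table[r][c]:>8}" for c in COLS)
    print(f"{r:<10}{cells}{row_totals[r]:>8}")
totals = "".join(f"{col_totals[c]:>8}" for c in COLS)
print(f"{'Total':<10}{totals}{grand_total:>8}")
```

Each cell is simply the number of respondents who share that row category and that column category; the totals are sums over a row, a column, or the whole table.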
To demonstrate layering, the example adds a third variable as the layer factor (“city”). With city layered, the cross-tab table is broken into separate panels for each city (such as Islamabad and Lahore). This allows a direct comparison of the personality–preference relationship within each city, rather than mixing all cities together. The counts in each panel reflect the joint distribution of personality and color preference for that specific city.
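As a rough illustration of what the layer does, the sketch below splits one set of records into a separate panel of counts per city. It is Python rather than SPSS, and the records and the `layered_crosstab` helper are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (city, personality, preference) records; NOT the video's data.
records = [
    ("Islamabad", "introvert", "red"),
    ("Islamabad", "introvert", "red"),
    ("Islamabad", "extrovert", "blue"),
    ("Lahore", "extrovert", "green"),
    ("Lahore", "extrovert", "green"),
    ("Lahore", "introvert", "yellow"),
]


def layered_crosstab(rows):
    """Return {layer: Counter({(row, col): count})}, one panel per layer value."""
    panels = defaultdict(Counter)
    for layer, row, col in rows:
        panels[layer][(row, col)] += 1
    return panels


panels = layered_crosstab(records)

# One panel per city, each holding only that city's joint counts.
for city in sorted(panels):
    print(city)
    for (personality, preference), n in sorted(panels[city].items()):
        print(f"  {personality:<10} {preference:<7} {n}")
```

The row and column variables are unchanged; the layer only decides which records feed each panel.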
The key inferential step is the Chi-square test of association (also described as a Chi-square test of independence). This test is appropriate when both variables are nominal (no inherent order) and the goal is to check whether their categories are associated. It can equivalently be read as a test of difference: it asks whether the distribution of one categorical variable differs across the categories of the other.
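For readers who want to see what the statistic measures, here is a minimal Python sketch of the Pearson Chi-square computation: each cell's expected count under independence is (row total × column total) / N, and the statistic sums (observed − expected)² / expected over all cells. The `observed` counts and the function name are hypothetical, chosen so the arithmetic is easy to follow; they are not the video's data.

```python
# Hypothetical observed counts; NOT the video's data.
# Rows: introvert, extrovert. Columns: red, yellow, green, blue.
observed = [
    [20, 30, 25, 25],
    [30, 20, 25, 25],
]


def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom) for a two-way count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


chi2, df = pearson_chi_square(observed)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

With two personality categories and four colors, the degrees of freedom are (2 − 1) × (4 − 1) = 3, which matches the df reported in the example.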
In the example, the Chi-square test checks whether personality and color preference are associated. The reported results include a Chi-square value of 4.53, degrees of freedom of 3, and a p-value of 0.209. Because the p-value is greater than 0.05, the result is not statistically significant: the data provide no evidence of an association between personality and color preference at the 5% significance level. The same logic is then applied within each city using the layer variable. The p-values for Islamabad, for Lahore, and for the combined sample all remain above 0.05, again indicating no significant association.
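The reported p-value can be sanity-checked from the statistic and its degrees of freedom alone. The sketch below uses only Python's standard library; `chi2_sf` is a hypothetical helper that evaluates the upper-tail probability of the Chi-square distribution via the standard recurrence for the regularized incomplete gamma function with integer degrees of freedom. It is an illustration, not how SPSS computes the value.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-z) * sum_{i < df/2} z^i / i!
        term = math.exp(-z)
        total = term
        for i in range(1, df // 2):
            term *= z / i
            total += term
        return total
    # Odd df: start from df = 1 (erfc) and step up by 2 via
    # Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1).
    total = math.erfc(math.sqrt(z))
    term = math.sqrt(z) * math.exp(-z) / math.gamma(1.5)
    for i in range(df // 2):
        total += term
        term *= z / (1.5 + i)
    return total


ALPHA = 0.05                    # 5% significance level used in the example
chi2_value, dof = 4.53, 3       # values reported in the example

p_value = chi2_sf(chi2_value, dof)
significant = p_value < ALPHA

print(f"p = {p_value:.3f}, significant at 5%: {significant}")
```

The result should land close to the reported 0.209; a small difference in the third decimal is expected because the statistic is rounded to two decimals. Either way the p-value is well above 0.05, so the decision is the same.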
Finally, the video shows how to enrich cross-tab tables with percentages. By adjusting the SPSS “cells” settings to display row and column percentages, the table can report not only counts (e.g., how many introverts chose red) but also the share those counts represent within each row category or within each column category. This makes the output easier to interpret when reporting results and comparing groups.
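The difference between row and column percentages is easy to mix up, so a small Python sketch may help. A row percentage divides a cell by its row total; a column percentage divides the same cell by its column total. The counts and the `percentages` helper below are hypothetical, not taken from the video.

```python
# Hypothetical observed counts; NOT the video's data.
ROWS = ["introvert", "extrovert"]
COLS = ["red", "yellow", "green", "blue"]
observed = [
    [20, 30, 25, 25],
    [30, 20, 25, 25],
]


def percentages(table):
    """Return (row_pct, col_pct) tables with the same shape as the input."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    row_pct = [[100.0 * cell / row_totals[i] for cell in row]
               for i, row in enumerate(table)]
    col_pct = [[100.0 * cell / col_totals[j] for j, cell in enumerate(row)]
               for row in table]
    return row_pct, col_pct


row_pct, col_pct = percentages(observed)

for i, r in enumerate(ROWS):
    for j, c in enumerate(COLS):
        print(f"{r:<10} {c:<7} count={observed[i][j]:>3} "
              f"row%={row_pct[i][j]:5.1f} col%={col_pct[i][j]:5.1f}")
```

In this made-up table, the introvert/red cell holds 20 of 100 introverts (row percentage) but 20 of 50 respondents who chose red (column percentage), so the two figures answer different questions.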
Cornell Notes
Cross tab reports in SPSS summarize the joint frequencies of two nominal categorical variables by placing one variable on rows and the other on columns, then reporting counts at each intersection. A “layer” factor can split the same cross-tab into separate panels (e.g., by city), letting users compare relationships within each subgroup. For inference, the Chi-square test of association/independence is used to test whether two nominal variables are associated. In the example, personality (introvert/extrovert) and color preference (red/yellow/green/blue) produce a Chi-square value of 4.53 with df = 3 and p = 0.209, which is greater than 0.05, so no significant association is found. Layering the test by city also yields p-values above 0.05, reinforcing the conclusion.
What exactly does a cross tab report produce in SPSS, and how is the table structured?
When should the “layer” variable be used, and what does it change in the output?
Why is the Chi-square test of association/independence appropriate for the example variables?
How are the Chi-square results interpreted in the example?
How does layering affect the hypothesis testing with Chi-square?
What does it mean to display percentages in cross tabs, and how can that help reporting?
Review Questions
- In a cross tab table, what do the row and column categories represent, and what appears at each intersection?
- Under what conditions is the Chi-square test of association/independence used, and what does a p-value above 0.05 imply in this context?
- How does adding a layer factor (like city) change both the cross-tab output and the interpretation of Chi-square results?
Key Points
1. Cross tab reports in SPSS summarize two categorical variables in a row-by-column frequency table, with a count at each intersection.
2. A two-way cross tab uses two variables; multi-way output can be created by adding a layer factor that splits results into separate panels.
3. The layer factor acts like a control variable, enabling subgroup comparisons (e.g., personality–preference patterns within each city).
4. The Chi-square test of association/independence is designed for nominal categorical variables and tests whether the two variables are associated.
5. In the example, Chi-square = 4.53 (df = 3) with p = 0.209 leads to a conclusion of no significant association at the 5% level.
6. Layering the Chi-square test by city keeps p-values above 0.05, reinforcing the “no association” conclusion within each subgroup.
7. Cross tab “cells” settings can display percentages (in rows and/or columns) to complement counts and improve interpretability for reporting.