Mann Whitney U Test in SPSS - Concept, Interpretation, and Reporting

Q: When should researchers choose Mann–Whitney U test instead of an independent-samples t test?

Use Mann–Whitney U when the assumptions of the independent-samples t test don’t hold, specifically when the normality assumption is not met. It fits ordinal data (e.g., excellent/moderate/poor) and continuous data that are non-normal (e.g., compensation that doesn’t follow a normal distribution).

Q: What does Mann–Whitney U compare: means or medians?

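It compares the two distributions via ranks. Scores from both groups are pooled and converted to ranks, and the test asks whether one group tends to hold higher ranks than the other. This is often read as a comparison of medians, although that reading strictly holds only when the two distributions have a similar shape. It does not compare means directly, as the independent-samples t test does.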
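To make the rank conversion concrete, here is a minimal sketch in plain Python. The ratings are hypothetical (coded 1 = poor, 2 = moderate, 3 = excellent), the function names are illustrative, and U is taken as the smaller of the two group U values, which is a common convention rather than something the session states.

```python
def midranks(values):
    """Rank values 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend 'end' across the run of values tied with the one at 'pos'
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + 1 + end + 1) / 2  # mean of 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (mean rank of A, mean rank of B, U) from pooled midranks."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    rank_sum_b = sum(ranks[n_a:])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = rank_sum_b - n_b * (n_b + 1) / 2
    return rank_sum_a / n_a, rank_sum_b / n_b, min(u_a, u_b)


# Hypothetical data, not the session's data set
suppliers = [1, 2, 2, 1, 2]
customers = [2, 3, 3, 2, 3]
mean_rank_s, mean_rank_c, u = mann_whitney_u(suppliers, customers)
print(mean_rank_s, mean_rank_c, u)
```

In this toy data the suppliers end up with the lower mean rank, which mirrors the direction described for the SPSS output, though the numbers themselves are invented.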

Q: How does SPSS output support interpretation of group differences?

The key significance value (p-value) indicates whether the rank difference between groups is statistically significant. In the example, the significance value is less than 0.05, so perceived service quality differs between suppliers and customers. The output also provides sample sizes and mean ranks (suppliers lower, customers higher), which reflect the rank-based shift.

Q: Why compute effect size manually, and how is it calculated here?

SPSS doesn’t provide the effect size in the demonstrated workflow, so it is computed by hand as r = z / √N, where z comes from the Mann–Whitney output and N is the total sample size. With z = 2.6 and N = 304, √N ≈ 17.44, so r ≈ 2.6 / 17.44 ≈ 0.149 (reported as 0.14 in the example), which is interpreted as a small effect.
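The same arithmetic can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and the small/medium/large cut-offs (0.1, 0.3, 0.5) are Cohen's conventional benchmarks, assumed here because the session does not state which thresholds it uses.

```python
import math


def effect_size_r(z, n_total):
    """Effect size r = |z| / sqrt(N) for a Mann-Whitney z statistic."""
    return abs(z) / math.sqrt(n_total)


def label_effect(r):
    """Label r using Cohen's conventional benchmarks (an assumption here)."""
    if r < 0.1:
        return "negligible"
    if r < 0.3:
        return "small"
    if r < 0.5:
        return "medium"
    return "large"


# Values taken from the worked example: z = 2.6, N = 304
r = effect_size_r(2.6, 304)
print(round(r, 3), label_effect(r))
```

Taking the absolute value of z keeps r non-negative regardless of which group SPSS treats as the reference.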

Q: How should results be reported, including medians and test statistics?

A clear report states that the Mann–Whitney U test found a significant difference between groups (p < 0.05), gives the test statistics (U = 971, z = 2.6), notes the effect size (small, based on r ≈ 0.14), and includes medians for context. The example retrieves the medians in SPSS (Analyze → Compare Means → Means, selecting Median), which shows that both groups have a median of 2.

TL;DR

Mann–Whitney U test is the independent-samples t test alternative when normality isn’t reasonable or when data are ordinal.

Briefing

Mann–Whitney U test is presented as the go-to non-parametric alternative to the independent-samples t test when group data are ordinal, or continuous but not normally distributed, so that comparing means isn’t appropriate. Instead of testing whether two groups differ in their means, the Mann–Whitney U test evaluates whether the ranks of the two groups differ, which the session treats as effectively a comparison of medians. Because the method converts scores into ranks, the original score distribution doesn’t need to be normal, and the data do not need to be continuous.

The session lays out practical situations where this test fits. One example involves a quality manager comparing service quality ratings received from suppliers versus customers. Service quality is measured with three ordered categories—excellent, moderate, and poor—so the data are ordinal. Another example uses a market researcher comparing attention to social media between men and women, again measured with a three-option ordinal statement. A third scenario is an HR manager comparing compensation across two departments (finance and HR): compensation is continuous but described as non-normal, which also makes Mann–Whitney U appropriate.

A key interpretation point is that statistical significance and effect size are not the same. After running the test in SPSS, the results include a p-value indicating whether the rank-based difference between groups is statistically significant, but SPSS does not directly provide an effect size in this workflow. The session therefore computes effect size using a z-based approach: effect size r is calculated as z divided by the square root of the total sample size (r = z / √N). In the worked example, there are 158 suppliers and 146 customers (N = 304). The output shows a significance value below 0.05, indicating a significant difference in perceived service quality between suppliers and customers. The z statistic is reported as 2.6, and dividing by √304 (about 17.44) yields an effect size of about 0.149 (reported as 0.14 in the session), which is interpreted as a small effect: the difference exists but is not large.
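For readers wondering where the z statistic comes from, the sketch below uses the standard large-sample normal approximation for U. It deliberately omits the tie and continuity corrections, so it will not necessarily reproduce the z that SPSS prints, and the input values are hypothetical rather than taken from the session.

```python
import math


def z_from_u(u, n1, n2):
    """Normal approximation for U (no tie or continuity correction)."""
    mean_u = n1 * n2 / 2                              # expected U under H0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)    # standard deviation of U under H0
    return (u - mean_u) / sd_u


# Hypothetical inputs for illustration only
z = z_from_u(u=30, n1=12, n2=10)
print(round(z, 3))
```

The sign of z depends on which group's U is used; only its magnitude matters for r = z / √N.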

Reporting guidance ties the statistical output back to the original question and hypothesis. The session demonstrates writing a results statement that the Mann–Whitney U test found significant differences in service quality perception between suppliers and customers (p < 0.05), while also noting that the effect size is small. To make the report more concrete, it also retrieves median values for each group in SPSS: both suppliers and customers have a median service-quality rating of 2. The U statistic is shown as 971, and the z statistic and effect size are used to support the conclusion that the groups differ significantly in ranks, even though the median values are the same—consistent with a small effect.
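Identical medians alongside a significant rank difference can look contradictory, so a small illustration may help. The ratings below are hypothetical (1 = poor, 2 = moderate, 3 = excellent) and are not the session's data. Both groups have a median of 2, yet the spread of ratings around that median clearly differs.

```python
from collections import Counter
from statistics import median

# Hypothetical ordinal ratings, constructed so both medians equal 2
suppliers = [1] * 4 + [2] * 5 + [3] * 1
customers = [1] * 1 + [2] * 5 + [3] * 4

median_s = median(suppliers)
median_c = median(customers)

# Frequency of each rating per group shows where the groups actually differ
counts_s = Counter(suppliers)
counts_c = Counter(customers)

print("medians:", median_s, median_c)
print("suppliers:", dict(counts_s))
print("customers:", dict(counts_c))
```

Because the test works on ranks of all observations, a shift in the tails like this one can produce different mean ranks even when the middle value is unchanged, which is why reporting the medians alone would understate the result.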

Overall, the workflow emphasizes choosing Mann–Whitney U when normality assumptions fail or when data are ordinal, interpreting rank-based differences via p-values, and quantifying practical impact through an effect size calculation.

Cornell Notes

Mann–Whitney U test serves as the non-parametric alternative to the independent-samples t test when group data are ordinal or continuous but non-normal. Rather than comparing means, it compares the ranks of observations across two independent groups, which aligns with a median-based interpretation. In SPSS, the test is run under Analyze → Nonparametric Tests → Legacy Dialogs → Two Independent Samples, selecting the grouping variable (e.g., supplier vs. customer) and the test variable (e.g., service quality). A p-value below 0.05 indicates a statistically significant difference in ranks between groups. Practical importance is assessed by computing effect size r = z / √N; in the example, r ≈ 0.14 indicates a small effect even when the result is significant.

How should results be reported, including medians and test statistics?