
Wilcoxon Signed Rank Test: Concept, Interpretation, Reporting

Research With Fawad · 4 min read

Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

The Wilcoxon signed-rank test is a non-parametric alternative to the paired-samples t test for related before–after measurements.

Briefing

The Wilcoxon signed-rank test is presented as a non-parametric alternative to the paired-samples t test for detecting whether an intervention changes outcomes measured on the same people before and after. Instead of relying on normality assumptions, it ranks the sizes of the within-subject differences and then tracks whether those differences trend positive or negative. That makes it useful in practical “before–after” scenarios such as testing whether a new math-focused TV program boosts children’s interest, whether sleep duration changes after clinical treatment, or whether seminar-based learning shifts preferences.

The walkthrough centers on a market-research question: whether preference for watching social media ads changes after attending a seminar on learning through social media. The setup uses paired observations—each respondent provides a “pre-seminar” preference score and a “post-seminar” preference score. The analysis is framed with an alternative hypothesis that expects a significant change in preference after the seminar. The data are entered as two related samples (social media preference 1 for before, social media preference 2 for after), and the Wilcoxon signed-rank test is run under the “two related samples” option.
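
The "two related samples" run described above can be sketched in Python with `scipy.stats.wilcoxon`, the standard implementation of this test. The scores below are hypothetical, chosen only to mirror the transcript's pattern of 11 increases and 4 ties:

```python
from scipy.stats import wilcoxon

# Hypothetical 1-5 preference scores for 15 respondents (not the transcript's data):
# 11 respondents increase after the seminar, 4 are unchanged.
pre  = [2, 3, 2, 1, 3, 2, 2, 3, 1, 2, 3, 3, 4, 3, 2]
post = [4, 4, 3, 3, 4, 3, 4, 4, 2, 3, 4, 3, 4, 3, 2]

# "Two related samples": the test ranks the within-subject differences.
# zero_method="wilcox" drops the zero differences (ties), as SPSS-style output does.
stat, p = wilcoxon(pre, post, zero_method="wilcox")
print(f"W = {stat}, p = {p:.4f}")
```

Because every nonzero difference points the same way here, the smaller rank sum is 0 and the p-value comes out well below 0.05 — the all-positive-ranks situation the summary goes on to describe.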

Results are interpreted through the sign and magnitude of the ranks. A negative rank would indicate a post-seminar score lower than the pre-seminar score, but the output shows zero negative ranks, so no such decrease occurs. Instead, all observed changes are in the positive direction: 11 respondents show an increase in preference, while 4 report no change. This directionality matters because it aligns the statistical conclusion with the substantive story: preferences rise rather than fall.

Statistical significance is reported using the test’s p-value. The significance value is given as 0.03, indicating a statistically significant change in preference after the seminar. The test statistic is summarized with a Z value of -3.17 (the negative sign reflects the direction coding used by the software; the interpretation focuses on the positive change shown by the rank counts). The p-value of 0.03 supports rejecting the null of no change.

Finally, the effect size is calculated to judge practical importance, not just statistical significance. The formula used is r = Z / √n, with n taken as 30 because the 15 paired respondents contribute 15 pre and 15 post observations. With Z = -3.17 the transcript reports r = 0.55 (the formula itself gives |−3.17|/√30 ≈ 0.58). The magnitude is interpreted as a large effect size, leading to the conclusion that the seminar produced a significantly large positive shift in how people perceive or prefer watching social media advertisements.
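
The effect-size arithmetic can be checked directly; this is a one-liner under the transcript's n = 30 convention:

```python
import math

# Effect size r = Z / sqrt(n), using the transcript's convention of
# n = total observations (15 pre + 15 post = 30). Z is taken in absolute value.
z = -3.17
n = 30
r = abs(z) / math.sqrt(n)
print(round(r, 2))  # prints 0.58; the transcript quotes this as 0.55
```

By Cohen's common benchmarks for r (0.1 small, 0.3 medium, 0.5 large), either value lands in "large" territory, so the substantive conclusion is unchanged.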

Overall, the transcript ties together concept, direction of change, statistical significance, and effect size—showing how the Wilcoxon signed-rank test can evaluate intervention impact in paired before–after designs when parametric assumptions are not appropriate.

Cornell Notes

The Wilcoxon signed-rank test is a non-parametric method for paired before–after data, used to determine whether an intervention produces a significant change in the same participants’ scores. In the example, preference for watching social media ads is measured before and after a seminar on learning through social media. The results show zero negative ranks, meaning no participants decreased their preference; 11 increased preference and 4 had no change. The test reports a Z value of -3.17 with a p-value of 0.03, indicating a statistically significant positive change. Effect size is computed as r = Z/√n with n = 30, giving r = 0.55, which is interpreted as a large effect.

When should a researcher choose the Wilcoxon signed-rank test instead of a paired-samples t test?

It’s used for paired before–after (related samples) comparisons when a non-parametric approach is preferred—most notably when normality assumptions for the paired t test are questionable. The method ranks within-subject differences and evaluates whether the median shift is different from zero without requiring the differences to follow a normal distribution.
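
The ranking of within-subject differences can be made concrete with a small hand computation in plain Python (the difference scores below are hypothetical):

```python
# Signed-rank mechanics for hypothetical post-minus-pre differences.
diffs = [3, -1, 2, 2, -1, 4, 1, 0, 2]
nonzero = [d for d in diffs if d != 0]   # zero differences are dropped first

# Rank differences by absolute size, averaging rank positions across ties.
by_abs = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
ranks = [0.0] * len(nonzero)
i = 0
while i < len(by_abs):
    j = i
    while j < len(by_abs) and abs(nonzero[by_abs[j]]) == abs(nonzero[by_abs[i]]):
        j += 1
    avg = ((i + 1) + j) / 2              # average of rank positions i+1 .. j
    for k in by_abs[i:j]:
        ranks[k] = avg
    i = j

# Sum ranks separately for positive and negative differences; the test
# statistic W is the smaller of the two sums.
w_pos = sum(r for d, r in zip(nonzero, ranks) if d > 0)
w_neg = sum(r for d, r in zip(nonzero, ranks) if d < 0)
print(w_pos, w_neg, min(w_pos, w_neg))   # prints 32.0 4.0 4.0
```

If every change were an increase, `w_neg` would be 0 — exactly the zero-negative-ranks pattern in the seminar example.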

How does the sign of the ranks (negative vs positive) affect interpretation?

Negative ranks correspond to cases where post-intervention scores are lower than pre-intervention scores. Positive ranks correspond to post scores being higher than pre scores. In the example, negative ranks are zero, so no respondents show a decrease; the substantive conclusion becomes that the intervention increased preference.

What do the counts of respondents (increase vs no change) tell you beyond the p-value?

They describe the direction and prevalence of change. Here, 11 respondents show an increase in preference after the seminar, while 4 show no change. This supports the direction implied by the rank results and helps interpret what “significant change” means in real terms.

How are statistical significance and the Z value used together in reporting?

The Z value summarizes the standardized test statistic, while the p-value (reported as significance value) determines whether the change is statistically significant. The example reports Z = -3.17 and p = 0.03, so the change is significant at conventional thresholds (e.g., 0.05) and the direction is interpreted using the rank pattern.
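
For reference, the Z that software reports is derived from the rank sum W by a normal approximation; a minimal sketch, with illustrative inputs rather than the transcript's values:

```python
import math

def wilcoxon_z(w: float, n: int) -> float:
    """Normal-approximation Z for a Wilcoxon signed-rank statistic W,
    given n nonzero paired differences (no continuity or tie correction)."""
    mean = n * (n + 1) / 4                           # E[W] under the null
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)   # SD of W under the null
    return (w - mean) / sd

print(round(wilcoxon_z(4, 8), 2))  # prints -1.96
```

The sign of Z simply reflects which rank sum the software standardized, which is why the direction of change is read from the rank counts rather than from the sign of Z.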

Why compute an effect size after finding significance, and how is it calculated here?

Significance alone doesn’t indicate how large or meaningful the change is. The transcript computes effect size using r = Z/√n, with n taken as 30 because there are 15 pre and 15 post observations used in the calculation. With Z = -3.17, r = 0.55, which is interpreted as a large effect.

Review Questions

  1. In a paired before–after study, what would zero negative ranks imply about the direction of change?
  2. How would you report the results of a Wilcoxon signed-rank test using Z, p-value, and effect size?
  3. If the p-value were not significant, what additional information (like effect size) might still help interpret the intervention’s impact?

Key Points

  1. The Wilcoxon signed-rank test is a non-parametric alternative to the paired-samples t test for related before–after measurements.
  2. The test ranks within-subject differences and uses the pattern of positive vs negative ranks to determine direction of change.
  3. Zero negative ranks indicate no participants decreased their outcome after the intervention.
  4. Statistical significance is assessed using the p-value; the example reports p = 0.03 with Z = -3.17.
  5. Effect size is computed as r = Z/√n to quantify practical importance beyond statistical significance.
  6. In the example, r = 0.55 is interpreted as a large effect, supporting that the seminar produced a meaningful positive shift in preference.
  7. Reporting should include direction (increase/decrease/no change), Z value, p-value, and effect size.

Highlights

Zero negative ranks mean the intervention did not produce any decreases—every observed change was either an increase or no change.
A p-value of 0.03 (with Z = -3.17) supports a statistically significant positive change in preference after the seminar.
Effect size r = 0.55 is treated as large, indicating the change is not only statistically significant but also practically substantial.

Topics

  • Wilcoxon Signed Rank Test
  • Paired Samples
  • Non-Parametric Testing
  • Effect Size
  • Before-After Intervention