20. SPSS AMOS | Assessing Normal Distribution of Data
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Run AMOS’s “test for normality and outliers” before moving to the structural model so skewness, kurtosis, and outliers are assessed for each variable.
Briefing
Normality checks in AMOS matter because structural equation modeling (SEM) typically relies on maximum likelihood estimation (MLE), which performs best when variables are reasonably consistent with a normal distribution. The core workflow is to run AMOS's "test for normality and outliers," interpret the skewness and kurtosis statistics to decide whether the data are normal enough to proceed, and only then move on to the structural model.
Normal distribution is described as a symmetric probability pattern centered on the mean, where most observations cluster near the center and relatively few fall in the tails. To diagnose departures from that pattern, AMOS focuses on skewness (the tilt of the distribution) and kurtosis (peakedness and tail heaviness or lightness). Skewness can be positive (right skew) or negative (left skew), depending on which side the distribution stretches toward. Kurtosis indicates whether the distribution is flatter or more peaked than a normal curve; heavier or lighter tails can signal non-normality.
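AMOS computes these statistics internally from the data. As an illustrative sketch (not AMOS itself), the same moments can be computed in plain Python; the `skewness` and `excess_kurtosis` helpers and the simulated samples below are assumptions for demonstration only:

```python
import math
import random

def skewness(xs):
    """Sample skewness: the third standardized moment (0 for symmetric data)."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sum(((x - mean) / sd) ** 3 for x in xs) / n

def excess_kurtosis(xs):
    """Fourth standardized moment minus 3, so a normal curve scores near 0."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sum(((x - mean) / sd) ** 4 for x in xs) / n - 3

random.seed(0)
normal_like = [random.gauss(0, 1) for _ in range(5000)]      # symmetric
right_skewed = [random.expovariate(1.0) for _ in range(5000)]  # long right tail

print(round(skewness(normal_like), 2))   # near 0: consistent with normality
print(round(skewness(right_skewed), 2))  # clearly positive: right skew
print(round(excess_kurtosis(right_skewed), 2))  # positive: heavy right tail
```

A positive skewness paired with large positive excess kurtosis, as in the exponential sample here, is exactly the pattern the AMOS table is meant to surface.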
In AMOS, the normality assessment table is interpreted using the absolute value of skewness for each item. An absolute skewness of 1.0 or lower indicates normality, and SEM with MLE is described as fairly robust even when skewness exceeds 1 in absolute value. For larger samples, the critical ratio for skewness should not exceed 8.0; if that threshold holds, the data are treated as normal and analysis can proceed. The transcript also notes a practical rule tied to sample size: with MLE, samples larger than 200 are considered large enough that slightly non-normal distributions may still be acceptable, with absolute skewness up to about 2 (and some experts suggesting up to 3).
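The decision rules above can be sketched as a small helper. This is an assumed codification of the transcript's thresholds (|skewness| ≤ 1 normal; up to about 2 acceptable under MLE with n > 200), not anything AMOS outputs:

```python
def skewness_acceptable(abs_skew, n):
    """Sketch of the skewness decision rules described in the notes.

    abs_skew: absolute value of the item's skewness statistic.
    n: sample size. Thresholds are the transcript's rules of thumb.
    """
    if abs_skew <= 1.0:
        return "normal"
    if n > 200 and abs_skew <= 2.0:
        return "acceptable under MLE (large sample)"
    return "review: consider outlier removal or bootstrapping"

print(skewness_acceptable(0.8, 150))  # → normal
print(skewness_acceptable(1.6, 350))  # → acceptable under MLE (large sample)
print(skewness_acceptable(2.5, 350))  # → review: consider outlier removal or bootstrapping
```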
Kurtosis is assessed next. With MLE, AMOS is described as robust to kurtosis violations of multivariate normality when sample size is large. The kurtosis statistic is treated as normally distributed within a range of -10 to +10, citing Kline (2020). In the example results, kurtosis values fall within that range, and the critical ratio is just over 3.5, supporting the conclusion that normality is acceptable.
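The critical ratio AMOS reports is essentially the statistic divided by its standard error. A common large-sample approximation (an assumption here, since AMOS's exact formulas may differ slightly) uses SE ≈ √(6/n) for skewness and √(24/n) for excess kurtosis:

```python
import math

def skew_critical_ratio(skew, n):
    """Critical ratio for skewness using the large-sample SE sqrt(6/n)."""
    return skew / math.sqrt(6 / n)

def kurt_critical_ratio(kurt, n):
    """Critical ratio for excess kurtosis using the large-sample SE sqrt(24/n)."""
    return kurt / math.sqrt(24 / n)

# Illustrative values, not from the video's dataset:
n = 300
print(round(skew_critical_ratio(0.4, n), 2))  # → 2.83
print(round(kurt_critical_ratio(0.9, n), 2))  # → 3.18
```

Under these rules of thumb, a skewness critical ratio well under 8.0 and a kurtosis critical ratio in the low single digits (the example in the notes is just over 3.5) both support proceeding.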
Even with acceptable skewness and kurtosis, AMOS output includes Mahalanobis distance, a multivariate measure used to flag potential outliers. AMOS computes each observation's distance from the centroid (the multivariate center, analogous to the mean) and reports two probability columns (P1 and P2). The guidance given is that observations with small P1 and P2 values are flagged as outliers; a more conservative rule is to remove cases where P1 is less than 0.01, or where either P1 or P2 falls below 0.1, depending on the stated threshold in the output. The transcript's example suggests that removing the flagged cases would likely not materially change the results, but the procedure is clear: identify the observation numbers, delete them in SPSS, respecify the measurement model on the cleaned dataset, and rerun.
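Mahalanobis distance is the covariance-adjusted distance of each observation from the centroid, D² = (x − μ)ᵀ S⁻¹ (x − μ). The sketch below (an illustration, not AMOS code) computes D² for two-variable data using the explicit 2×2 inverse of the sample covariance matrix; AMOS additionally converts these distances into the P1 and P2 probabilities, which this sketch omits:

```python
def mahalanobis_sq(data):
    """Squared Mahalanobis distance from the centroid for (x, y) observations."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    # Sample covariance matrix entries (denominator n - 1):
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    # Inverse covariance is [[syy, -sxy], [-sxy, sxx]] / det, so
    # D^2 = (syy*dx^2 - 2*sxy*dx*dy + sxx*dy^2) / det for each observation.
    d2 = []
    for x, y in data:
        dx, dy = x - mx, y - my
        d2.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return d2

data = [(1, 2), (2, 3), (3, 4), (4, 5), (10, -3)]  # last point breaks the trend
d2 = mahalanobis_sq(data)
flagged = max(range(len(d2)), key=lambda i: d2[i])
print(flagged)  # → 4: the observation farthest from the centroid
```

Ranking observations by D² (largest first) mirrors how the AMOS output table orders candidate outliers before P1 and P2 are consulted.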
If normality still fails, bootstrapping is presented as the remedy. Bootstrapping resamples the dataset with replacement (e.g., 1,000 samples), recomputes parameter estimates for each resample, and produces confidence intervals and significance tests. The final step is to compare original estimates with bootstrap results; if they align, the analysis is considered acceptable despite non-normality.
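The resample-and-compare loop can be sketched as a percentile bootstrap. The mean is used here as a hypothetical stand-in for AMOS's parameter estimates, and the simulated non-normal sample is an assumption for demonstration:

```python
import random

def bootstrap_ci(xs, stat, n_boot=1000, alpha=0.05, seed=42):
    """Percentile bootstrap: resample with replacement, recompute the statistic,
    and take the empirical alpha/2 and 1 - alpha/2 quantiles as the CI."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = sorted(
        stat([rng.choice(xs) for _ in range(n)]) for _ in range(n_boot)
    )
    lo = estimates[int(n_boot * alpha / 2)]
    hi = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(200)]  # clearly non-normal data
mean = sum(sample) / len(sample)
lo, hi = bootstrap_ci(sample, lambda s: sum(s) / len(s))
print(round(mean, 2), round(lo, 2), round(hi, 2))
# If the original estimate sits comfortably inside the bootstrap interval,
# the analysis is treated as acceptable despite the non-normality.
```

This mirrors the final step in the notes: compare the original estimates with the bootstrap results, and proceed if they align.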
Cornell Notes
AMOS normality assessment is used to decide whether SEM with maximum likelihood estimation (MLE) can proceed reliably. The process starts with skewness and kurtosis: absolute skewness ≤ 1.0 suggests normality, and MLE is described as robust when skewness critical ratios stay below 8.0, especially with large samples (often >200). Kurtosis is treated as acceptable within -10 to +10 (citing Kline, 2020) and when critical ratios remain modest (the example is just over 3.5). Mahalanobis distance then flags multivariate outliers using P1 and P2 probabilities; flagged cases can be removed and the model rerun. If normality remains problematic, bootstrapping (e.g., 1,000 resamples) generates confidence intervals and significance tests to stabilize inference.
How does AMOS define and diagnose normality using skewness and kurtosis?
What skewness thresholds are used to decide whether data are normal enough for MLE-based SEM?
How is kurtosis interpreted in AMOS, and what range is considered acceptable?
What is Mahalanobis distance in AMOS, and how do P1 and P2 guide outlier removal?
What should be done if normality assumptions still fail after skewness, kurtosis, and outlier checks?
Review Questions
- What skewness and kurtosis criteria would justify proceeding with MLE-based SEM without major changes to the model?
- How do you use Mahalanobis distance output (P1 and P2) to decide whether to delete observations, and what is the rerun procedure afterward?
- When would bootstrapping be preferred over deleting outliers, and what does it produce that helps with inference?
Key Points
1. Run AMOS's "test for normality and outliers" before moving to the structural model so skewness, kurtosis, and outliers are assessed for each variable.
2. Interpret skewness using absolute values: ≤ 1.0 suggests normality, and MLE is described as robust when skewness critical ratios stay below 8.0, especially with large samples.
3. Treat kurtosis as acceptable within -10 to +10 (Kline, 2020) and look for modest critical ratios to support a normality conclusion.
4. Use Mahalanobis distance to flag multivariate outliers; rely on P1 and P2 probabilities to decide whether deletion is warranted.
5. If outliers are removed, delete the flagged observation numbers in SPSS, respecify the measurement model in AMOS, and rerun the analysis.
6. If normality remains problematic, use bootstrapping (e.g., 1,000 resamples) to generate confidence intervals and significance tests that stabilize inference under non-normality.