6. SEM | SPSS AMOS - Factor Loadings, Model Fit, and Modification Indices - Research Coach
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Structural equation modeling hinges on two linked tasks: judging whether the measurement model represents latent constructs well, and checking whether the overall model reproduces the observed covariance structure. Factor loadings in confirmatory factor analysis (CFA) quantify how strongly each unobservable construct (latent variable) drives its observed indicators. Standardized factor loadings are typically reported because they put indicator weights on a comparable scale (usually between 0 and 1 in well-behaved models); squaring a standardized loading yields the proportion of explained variance in an indicator. As a practical rule, standardized loadings above 0.70 suggest an indicator is doing meaningful work (explaining roughly half the variance or more, since 0.70² ≈ 0.49), while loadings below that threshold imply the indicator contributes little and may be considered for deletion, though that decision should rest on additional conditions.
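The squaring rule and the 0.70 retention check can be made concrete with a short Python sketch. The indicator names (q1–q3) and loading values below are invented for illustration, not taken from any real analysis:

```python
# Hypothetical standardized loadings for three indicators of one construct.
loadings = {"q1": 0.85, "q2": 0.72, "q3": 0.55}

for name, lam in loadings.items():
    r2 = lam ** 2        # proportion of indicator variance explained by the construct
    retain = lam > 0.70  # retention rule described in the briefing
    print(f"{name}: loading={lam:.2f}, r2={r2:.2f}, retain={retain}")
```

With these values, q1 and q2 pass the rule while q3 (r² ≈ 0.30) would be a candidate for deletion, pending the additional conditions noted above.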
Once factor loadings are set, measurement error for each indicator can be derived as 1 − r², meaning lower explained variance corresponds to higher error. CFA also requires “metric setting” in the structural equation model: each latent variable must be assigned a scale by constraining one of its factor loadings to 1 (the reference, or marker, indicator). That constraint acts as an anchor so the remaining loadings can be freely estimated; without it, covariance-based SEM software will stop with an identification error (the model is unidentified). When comparing multiple groups (for example, male versus female), the same indicator should be constrained to 1 in each group to maintain measurement comparability.
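The 1 − r² relationship can be expressed directly. This is a minimal sketch with illustrative loading values (metric setting itself is done in the AMOS diagram, not in code):

```python
def measurement_error(std_loading):
    """Error variance implied by a standardized loading: 1 - r^2."""
    return 1 - std_loading ** 2

# A 0.70 loading leaves about half the variance as error;
# a 0.90 loading leaves much less.
for lam in (0.70, 0.90):
    print(f"loading {lam:.2f} -> error variance {measurement_error(lam):.2f}")
```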
Model fit then addresses a different question: does the specified model reproduce the observed covariance matrix closely enough to be considered plausible? A good fit means the estimated covariance structure closely matches the data; a bad fit signals systematic mismatch. Importantly, good overall fit does not guarantee every part of the model is correct. Fit is also influenced by model complexity: models with fewer indicators per factor often show higher apparent fit than models with more indicators, so parsimony matters.
Several fit statistics are used, each with different sensitivities. The chi-square goodness-of-fit test (often described as a “badness of fit” measure) should be non-significant for a good fit, but it is highly sensitive to sample size. To reduce that dependence, relative chi-square (chi-square divided by degrees of freedom) is used; commonly cited guidelines place it below 3 for good fit, with values up to 5 still considered acceptable. Comparative fit indices such as CFI (above 0.90), IFI (above 0.90), and TLI (above 0.90) are recommended because they are less affected by sample size. RMSEA is treated as a badness-of-fit measure where values near zero are best; thresholds commonly cited are below 0.05 for close fit and below 0.08 for acceptable fit. SRMR similarly flags poor fit when it rises above about 0.09.
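As a sketch of how these thresholds combine, the helper below compares a set of fit statistics against the cutoffs cited above. The function name and the numeric inputs are made up for illustration; in practice AMOS reports these values in its output tables:

```python
def check_fit(chi2, df, cfi, tli, rmsea, srmr):
    """Compare fit statistics against commonly cited thresholds."""
    return {
        "relative_chi2_ok": (chi2 / df) <= 3,  # some sources accept up to 5
        "cfi_ok": cfi > 0.90,
        "tli_ok": tli > 0.90,
        "rmsea_ok": rmsea < 0.08,              # below 0.05 indicates close fit
        "srmr_ok": srmr < 0.09,
    }

# Illustrative values: every criterion passes here.
print(check_fit(chi2=210.4, df=84, cfi=0.95, tli=0.94, rmsea=0.06, srmr=0.05))
```

Returning a dictionary of individual verdicts, rather than a single pass/fail flag, mirrors the advice to weigh multiple indices instead of trusting one cutoff.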
Because “good fit” thresholds are debated, the guidance emphasizes using multiple indices rather than relying on a single cutoff. Even a well-fitting model can still be misspecified in terms of how relationships are represented. When fit is poor, modification indices offer a route to improvement by suggesting additional covariances, typically between error terms within the same construct. These suggestions must be applied sparingly and justified on theoretical grounds. In Amos output, modifications are often filtered by a threshold (values around 3.84, the chi-square critical value for one degree of freedom at α = .05, are highlighted as meaningful), and certain changes are explicitly disallowed, such as adding a covariance between error terms from different constructs or between an error term and a latent construct.
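The filtering rules can be sketched as a small Python example. AMOS applies its threshold and restrictions in the GUI; the error-term names, construct assignments, and MI values below are entirely hypothetical:

```python
# Which construct each error term belongs to (hypothetical assignments).
construct_of = {"e1": "F1", "e2": "F1", "e3": "F1", "e4": "F2"}

# Hypothetical modification indices for candidate error covariances.
mod_indices = {("e1", "e2"): 12.3, ("e1", "e3"): 2.1, ("e2", "e4"): 15.0}

THRESHOLD = 3.84  # chi-square critical value at df = 1, alpha = .05

# Keep only suggestions that exceed the threshold AND stay within one construct.
admissible = {
    pair: mi
    for pair, mi in mod_indices.items()
    if mi > THRESHOLD and construct_of[pair[0]] == construct_of[pair[1]]
}
print(admissible)  # only the within-construct pair above the threshold survives
```

Note that (e2, e4) is rejected despite its large MI because it crosses constructs, exactly the kind of change the guidance disallows.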
Cornell Notes
Factor loadings in CFA quantify how latent constructs affect observed indicators, and standardized loadings (0–1 scale) make indicator contributions comparable. Squaring a standardized loading gives the proportion of variance in an indicator explained by the latent construct; values above 0.70 are treated as strong (at least 50% explained variance). Measurement error follows 1 − r², and each latent variable needs a metric set by constraining one factor loading to 1; this is required for identification and must be consistent across groups. Model fit evaluates whether the estimated covariance structure matches the observed covariance matrix using indices like chi-square/relative chi-square, CFI/IFI/TLI, RMSEA, and SRMR. Poor fit can sometimes be addressed with modification indices, but only by adding justified covariances between error terms within the same construct.
How do standardized factor loadings translate into explained variance, and what cutoff is used to judge indicator quality?
Why must one factor loading per latent variable be constrained to 1 in SEM/CFA, and what happens if that step is skipped?
What does model fit test in SEM actually assess, and why doesn’t good fit guarantee the model is fully correct?
Which fit indices are emphasized, and how do their thresholds relate to sample size sensitivity?
When model fit is poor, what do modification indices recommend, and what restrictions apply?
Review Questions
- If a standardized factor loading is 0.75, how much variance in the indicator does the latent construct explain, and would it meet the stated retention rule?
- Why is relative chi-square preferred over chi-square in large samples, and what range is cited as indicating good fit?
- What kinds of covariance changes are permitted when using modification indices, and which specific changes are explicitly disallowed?
Key Points
1. Standardized factor loadings (0–1 scale) are preferred because they allow direct comparison of indicator strength across CFA models.
2. Squaring a standardized factor loading gives the explained variance in an indicator; loadings above 0.70 are treated as strong (≥50% explained variance).
3. Indicator measurement error can be computed as 1 − r², so lower explained variance implies higher measurement error.
4. Each latent variable must have its metric set by constraining one factor loading to 1; skipping this leads to identification errors in covariance-based SEM.
5. When comparing groups, the same indicator must be constrained to 1 in each group to preserve measurement scale consistency.
6. Model fit should be evaluated with multiple indices (CFI/IFI/TLI, RMSEA, SRMR, and chi-square/relative chi-square) because no single cutoff universally settles fit quality.
7. Modification indices can guide improvements, but only justified covariances between error terms within the same construct are allowed, with a practical MI threshold around 3.8–4.