
Differences between CBSEM, PLSSEM, and GSCA

Research With Fawad · 5 min read

Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

CBSEM minimizes discrepancies between observed and model-implied covariance matrices and is most suitable for confirmatory, theory-driven testing with global fit evaluation. PLS-SEM maximizes explained variance for prediction and exploratory work, tolerating small samples, non-normal data, and formative constructs. GSCA minimizes residuals via least squares and, unlike PLS-SEM, provides fit indices while supporting reflective, formative, and hierarchical models.

Briefing

Generalized Structured Component Analysis (GSCA) is the newest addition discussed here, and it stands out for how it fits data: it minimizes residuals using least squares while still supporting both reflective and formative indicators, including hierarchical models. That combination—residual minimization plus flexible measurement handling—makes GSCA a practical option when researchers need fit indices and are working with complex component-based structures, especially in settings where sample size and normality are concerns.

The comparison starts with covariance-based SEM (CBSEM), which is built for confirmatory, theory-driven testing. CBSEM targets the covariance structure by minimizing discrepancies between observed covariance matrices and model-implied covariance matrices. It requires a predefined model and leans on maximum likelihood estimation. The tradeoffs are substantial: CBSEM depends on multivariate normality and large samples, and it can be highly sensitive to model misspecification. With complex models—especially those with many indicators and parameters—achieving acceptable global model fit can become difficult. CBSEM is most appropriate when testing well-established theories and when the data meet normality and sample-size requirements, with an emphasis on global fit evaluation.
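The covariance-matching objective can be made concrete. The sketch below computes the standard maximum-likelihood discrepancy function that CBSEM estimators minimize; the one-factor loadings and residual variances are hypothetical numbers chosen for illustration, not from any particular study or package:

```python
import numpy as np

def ml_discrepancy(S, Sigma):
    """Maximum-likelihood discrepancy between an observed covariance
    matrix S and a model-implied covariance matrix Sigma:
        F_ML = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p
    F_ML equals zero exactly when Sigma reproduces S."""
    p = S.shape[0]
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_S = np.linalg.slogdet(S)
    return logdet_Sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_S - p

# Hypothetical one-factor model with three reflective indicators:
# Sigma(theta) = lambda lambda' + diag(psi)
lam = np.array([0.8, 0.7, 0.6])      # assumed factor loadings
psi = np.array([0.36, 0.51, 0.64])   # assumed residual variances
Sigma = np.outer(lam, lam) + np.diag(psi)

S = Sigma.copy()                     # an "observed" matrix the model fits perfectly
print(ml_discrepancy(S, Sigma))      # -> 0.0 (zero discrepancy = perfect fit)
```

Estimation in CBSEM amounts to searching over the free parameters (here `lam` and `psi`) to drive this discrepancy as close to zero as the model allows; global fit indices are then functions of the minimized value.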

Variance-based SEM (PLS-SEM) shifts the goal from reproducing covariances to explaining variance in dependent variables. Instead of focusing on global fit, PLS-SEM emphasizes predictive power—how much change in outcomes is explained by predictors. It is exploratory and component-based, using an iterative partial least squares algorithm. A key advantage is fewer distributional constraints: PLS-SEM works well with small samples and non-normal data. It also handles formative constructs effectively and fits early-stage theory development. The main limitations are the lack (or reduced rigor) of global fit indices and weaker support for causal inference compared with covariance-based approaches. PLS-SEM is commonly used in marketing, management, and information systems, particularly for predictive analysis, exploratory research, higher-order models, formative models, and non-normal data.
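The "iterative partial least squares algorithm" mentioned above can be sketched in a few lines. This is a minimal illustration of one common variant (mode A outer estimation with a centroid inner scheme) on simulated data for two reflectively measured constructs; the sample size, loadings, and path strength are all assumed for the example, and real PLS-SEM software adds further weighting schemes, convergence options, and bootstrapping:

```python
import numpy as np

rng = np.random.default_rng(42)

# --- simulate toy data: construct 1 -> construct 2, 3 indicators each ---
n = 500
t = rng.standard_normal(n)                      # latent score, construct 1
u = 0.8 * t + 0.6 * rng.standard_normal(n)      # latent score, construct 2
X1 = np.column_stack([0.8 * t + 0.6 * rng.standard_normal(n) for _ in range(3)])
X2 = np.column_stack([0.8 * u + 0.6 * rng.standard_normal(n) for _ in range(3)])

def standardize(A):
    return (A - A.mean(axis=0)) / A.std(axis=0)

X1, X2 = standardize(X1), standardize(X2)

# --- iterative PLS estimation (mode A, centroid inner weighting) ---
w1 = np.ones(X1.shape[1])
w2 = np.ones(X2.shape[1])
for _ in range(100):
    # outer step: construct scores from current outer weights
    y1 = standardize(X1 @ w1)
    y2 = standardize(X2 @ w2)
    # inner step: each construct's proxy is its neighbour's score,
    # signed by the correlation between the two scores
    s = np.sign(y1 @ y2)
    z1, z2 = s * y2, s * y1
    # outer weight update (mode A: indicator-proxy covariances)
    w1_new = X1.T @ z1 / n
    w2_new = X2.T @ z2 / n
    w1_new /= np.linalg.norm(w1_new)
    w2_new /= np.linalg.norm(w2_new)
    converged = max(np.abs(w1_new - w1).max(), np.abs(w2_new - w2).max()) < 1e-7
    w1, w2 = w1_new, w2_new
    if converged:
        break

# structural path coefficient: regression of y2 on the standardized y1
y1 = standardize(X1 @ w1)
y2 = standardize(X2 @ w2)
path = (y1 @ y2) / n
print(f"estimated path coefficient: {path:.3f}")
```

Note what the loop never computes: a model-implied covariance matrix. The algorithm only builds weighted composites and regressions that maximize explained variance, which is exactly why global fit indices in the CBSEM sense are not a natural by-product.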

GSCA, added recently to Smart PLS, is positioned as a component-based alternative that minimizes residuals via least squares—an approach described as aligning model predictions closely with observed data. Residuals are treated as unexplained variance: higher unexplained variance implies poorer model fit, so reducing residuals is central to improving fit. Like PLS-SEM, GSCA is tolerant of small samples and non-normal data, and it can handle both reflective and formative indicators as well as hierarchical models. Its differentiator versus PLS-SEM is the availability of fit indices, which helps researchers evaluate model fit more directly. The tradeoff is practical rather than methodological: GSCA is less widespread and has fewer software options, though Smart PLS support is expanding.
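The residual-as-unexplained-variance idea maps onto GSCA's overall FIT index, defined as one minus the ratio of residual sum of squares to the total sum of squares of the data. The sketch below illustrates that target using a rank-1 least-squares reconstruction as a stand-in for the full GSCA alternating-least-squares estimation; the data are simulated with assumed loadings and noise:

```python
import numpy as np

def fit_index(Z, Z_hat):
    """GSCA-style FIT: the proportion of the total variance of the data
    NOT left in the residuals. FIT = 1 means zero residuals; lower
    values mean more unexplained variance, hence poorer fit."""
    resid = Z - Z_hat
    return 1.0 - np.sum(resid**2) / np.sum(Z**2)

# Hypothetical data driven by a single component (assumed loadings/noise)
rng = np.random.default_rng(0)
scores = rng.standard_normal((200, 1))
loadings = np.array([[0.9, 0.8, 0.7, 0.6]])
Z = scores @ loadings + 0.3 * rng.standard_normal((200, 4))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)

# Best rank-1 least-squares reconstruction via SVD (not the full GSCA
# estimator, only an illustration of its residual-minimization target)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
Z_hat = s[0] * np.outer(U[:, 0], Vt[0])

print(f"FIT = {fit_index(Z, Z_hat):.3f}")
```

Minimizing the residual sum of squares and maximizing FIT are the same optimization seen from two sides, which is why GSCA can report a global fit measure while remaining component-based like PLS-SEM.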

In short, CBSEM prioritizes theory confirmation and global fit under stricter statistical assumptions; PLS-SEM prioritizes explained variance and prediction with flexibility for non-normality and formative measurement; GSCA aims to reduce residuals while offering fit indices and supporting complex component structures. The choice depends on whether the research question is confirmatory versus exploratory, whether global fit matters, and whether the data and model complexity fit the assumptions of covariance-based methods.

Cornell Notes

Covariance-based SEM (CBSEM) tests predefined, theory-driven models by minimizing discrepancies between observed and model-implied covariance matrices. It uses maximum likelihood estimation and requires multivariate normality and large samples, making it sensitive to misspecification—especially in complex models.

Variance-based SEM (PLS-SEM) focuses on explained variance in dependent variables and is better suited for exploratory, predictive work. It uses an iterative partial least squares algorithm, tolerates small samples and non-normal data, and handles formative constructs well, but offers limited global fit assessment and weaker causal rigor.

Generalized Structured Component Analysis (GSCA) is a component-based method that minimizes residuals via least squares, aiming to align predicted values with observed data. It supports reflective and formative indicators and hierarchical models, is tolerant of small samples and non-normality, and—unlike PLS-SEM—provides fit indices. It is available in Smart PLS, though it remains less widely supported elsewhere.

What is the core modeling target in CBSEM, and what does that imply for its assumptions?

CBSEM targets the covariance structure by minimizing discrepancies between the observed covariance matrix and the model-implied covariance matrix. Because it relies on maximum likelihood estimation for this covariance reproduction task, it assumes multivariate normality and typically needs a large sample size. Those requirements also make CBSEM sensitive to model misspecification—especially when models are complex with many indicators and parameters.

Why do researchers often choose PLS-SEM over CBSEM when the goal is prediction or early-stage theory?

PLS-SEM is designed to maximize explained variance in dependent variables, using an iterative partial least squares algorithm. It is exploratory and component-based, so it fits early-stage theory development and predictive analysis. It avoids distributional requirements like multivariate normality, works with small samples and non-normal data, and handles formative constructs effectively—advantages that are harder to realize with CBSEM.

What tradeoff comes with PLS-SEM’s flexibility?

PLS-SEM’s flexibility comes with weaker global model fit assessment: it lacks (or provides less rigorous) global fit indices compared with covariance-based approaches. It also offers less rigorous support for causal inference than CBSEM, so it’s often better aligned with prediction and exploratory modeling than with strict causal testing.

How does GSCA’s residual-minimization approach differ from CBSEM and PLS-SEM?

GSCA minimizes residuals via least squares, aiming to reduce the differences between observed data points and model-predicted values. Residuals are treated as unexplained variance—so higher unexplained variance implies poorer fit. That makes GSCA more similar in spirit to traditional regression’s residual focus, while still operating within structural equation modeling and supporting complex component structures.

What makes GSCA especially attractive relative to PLS-SEM?

GSCA provides fit indices, which is a key differentiator from PLS-SEM. Both can handle complex component models and tolerate small samples and non-normal data, but GSCA’s fit indices make it easier to evaluate model fit directly. The practical downside noted is that GSCA is less widespread and has fewer software options, though Smart PLS includes it and support is expanding.

Review Questions

  1. If a study needs global model fit evaluation under a predefined, theory-driven model, which SEM family is the best match and why?
  2. A model includes formative constructs and the data are non-normal with a small sample. Which method is most appropriate and what limitation should be expected?
  3. How do residuals and unexplained variance relate to GSCA’s least-squares objective, and how does that connect to model fit?

Key Points

  1. CBSEM minimizes discrepancies between observed and model-implied covariance matrices and is most suitable for confirmatory, theory-driven testing with global fit evaluation.
  2. CBSEM typically requires multivariate normality and large samples and can struggle with complex models due to sensitivity to misspecification.
  3. PLS-SEM maximizes explained variance for prediction and exploratory research, using an iterative partial least squares algorithm.
  4. PLS-SEM tolerates small samples and non-normal data and handles formative constructs well, but it provides limited global fit indices and weaker causal rigor.
  5. GSCA minimizes residuals via least squares, treating residuals as unexplained variance, and aims to align predicted values with observed data.
  6. GSCA supports reflective and formative indicators and hierarchical models, is tolerant of small samples and non-normality, and offers fit indices, making it a notable alternative to PLS-SEM.
  7. GSCA is available in Smart PLS, but it remains less widely supported than CBSEM and PLS-SEM in general software ecosystems.

Highlights

GSCA’s defining move is residual minimization via least squares, framed as reducing unexplained variance to improve model fit.
CBSEM’s covariance-matching approach depends on multivariate normality and large samples, and it is sensitive to misspecification in complex models.
PLS-SEM prioritizes explained variance and prediction, working well with small samples and formative constructs, but it offers limited global fit assessment.
GSCA’s main advantage over PLS-SEM is fit indices, while both methods can handle complex component structures and non-normal data.
Smart PLS is highlighted as a key software option for GSCA, with broader development expected.
