CB-SEM using #SmartPLS4 - 3 - Understanding Basic Concepts in Structural Equation Modeling (SEM)

TL;DR

Structural equation modeling tests theory-driven relationships among latent constructs measured by multiple questionnaire items, explicitly incorporating error.

Briefing

Structural equation modeling (SEM) is built to test relationships among theoretical constructs by modeling both the constructs and the measurement items that represent them—along with error. Instead of treating variables as directly observed scores only, SEM uses a schematic framework where constructs are linked by hypothesized paths, and each construct is measured through a set of questionnaire items. That combination matters because it lets researchers evaluate whether the items reliably and validly represent the underlying concepts before assessing how those concepts influence one another.

A key clarification is that SEM, especially covariance-based SEM (CB-SEM), does not establish causation by itself. CB-SEM takes a covariance matrix as input, so the estimates are driven by how the observed variables covary. A high correlation does not mean that one construct causes another; demonstrating causality typically requires an experimental design. SEM nonetheless remains a strong tool for assessing hypothesized relationships among constructs in observational data, and it can also be applied to experimental datasets.
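To make the covariance input concrete, here is a minimal sketch in plain Python. It assumes a small made-up data set of three questionnaire items; the scores and the function names are illustrative and are not taken from SmartPLS. It builds the sample covariance matrix and its standardized form, the correlation matrix.

```python
from math import sqrt

def covariance(x, y):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def covariance_matrix(columns):
    """Covariance of every pair of columns (one column per item)."""
    return [[covariance(ci, cj) for cj in columns] for ci in columns]

def correlation_matrix(cov):
    """Standardize a covariance matrix: divide by both standard deviations."""
    sd = [sqrt(cov[i][i]) for i in range(len(cov))]
    k = len(cov)
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(k)] for i in range(k)]

# Hypothetical answers of six respondents to three 1-5 Likert items.
items = [
    [4, 5, 3, 2, 4, 5],
    [4, 4, 3, 2, 5, 5],
    [3, 5, 2, 1, 4, 4],
]

S = covariance_matrix(items)   # the kind of matrix CB-SEM takes as input
R = correlation_matrix(S)      # same information on a standardized scale

for row in S:
    print([round(value, 3) for value in row])
```

Both matrices are symmetric: the entry for (item 1, item 2) equals the entry for (item 2, item 1). They record how strongly two items move together but carry no information about direction, which is one way to see why covariances alone cannot settle a causal question.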

SEM distinguishes between variables and constructs. Variables are directly measured quantities such as age, exam score, height, or income. Constructs are indirectly measured, often hypothetical concepts like job satisfaction, perceived usefulness, loyalty, servant leadership, or organizational culture—typically captured through multiple questionnaire items. These constructs are often latent, meaning they are not observed directly; instead, they are inferred from patterns in responses to item sets. For example, “organizational learning” might be treated as a latent construct measured by eight questionnaire items.

Because latent constructs are not observed, ordinary least squares (OLS) cannot model them directly; the usual workaround of averaging the items and running a basic regression ignores measurement error. SEM instead analyzes measurement and structural relationships simultaneously. A model has at least two components: the measurement model and the structural model. The measurement model focuses on validity and reliability: whether the indicators (items) actually reflect the underlying latent constructs and how well the measurement side of the model fits. Once that foundation is supported, the structural model evaluates the strength and significance of the relationships between constructs, expressed as hypothesized directional links (e.g., construct A influencing construct B).
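The cost of averaging items can be illustrated with a small simulation. This is a sketch under stated assumptions, not an SEM estimation: the true slope of 0.6, the three items with error standard deviation 1.0, and all names are hypothetical choices made for illustration. For simplicity only the predictor is measured with error. The regression on the averaged items recovers a visibly smaller slope than the regression on the (normally unobservable) latent scores.

```python
import random

random.seed(42)          # fixed seed so the run is reproducible
N = 5000                 # simulated respondents
TRUE_SLOPE = 0.6         # assumed effect of latent predictor on outcome

def ols_slope(x, y):
    """Slope of the simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def make_items(latent, n_items=3, error_sd=1.0):
    """Each item = latent score + independent measurement error."""
    return [[v + random.gauss(0, error_sd) for v in latent]
            for _ in range(n_items)]

# Latent predictor and an outcome that depends on it.
xi = [random.gauss(0, 1) for _ in range(N)]
eta = [TRUE_SLOPE * v + random.gauss(0, 0.8) for v in xi]

# Observed data: three noisy items, averaged into one composite score.
x_items = make_items(xi)
x_mean = [sum(scores) / len(scores) for scores in zip(*x_items)]

slope_latent = ols_slope(xi, eta)         # what we would like to know
slope_composite = ols_slope(x_mean, eta)  # what averaging + OLS delivers

print(f"slope using latent scores:   {slope_latent:.3f}")
print(f"slope using averaged items:  {slope_composite:.3f}")
```

Under these assumptions the composite has a reliability of about 0.75, so the slope from averaged items is expected to land near 0.6 × 0.75 = 0.45 instead of 0.6. Modeling the error terms explicitly, as SEM does, is what avoids this attenuation.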

A “full structural model” includes both measurement and structural relationships, tying together items, latent constructs, and their interconnections. SEM also estimates parameters that describe the size and nature of relationships. Parameters can be fixed or freely estimated from data, and they appear not only in the links between constructs but also in how latent constructs map onto their indicators, including error terms.
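As a rough illustration of where parameters sit in a measurement model, the sketch below computes the covariance matrix implied by a single latent construct with three indicators. All numeric values and names are hypothetical. The first loading is fixed to 1.0 to give the construct a scale (a common convention), while the remaining loadings, the construct variance, and the error variances stand in for freely estimated parameters.

```python
def implied_covariance(loadings, factor_variance, error_variances):
    """Covariance matrix implied by a one-construct measurement model.

    Off-diagonal: loading_i * loading_j * factor_variance
    Diagonal:     loading_i ** 2 * factor_variance + error_variance_i
    """
    k = len(loadings)
    return [
        [
            loadings[i] * loadings[j] * factor_variance
            + (error_variances[i] if i == j else 0.0)
            for j in range(k)
        ]
        for i in range(k)
    ]

# Hypothetical parameter values for a three-item construct.
loadings = [1.0, 0.8, 0.7]         # first loading fixed, the others free
factor_variance = 0.9              # variance of the latent construct (free)
error_variances = [0.4, 0.5, 0.6]  # one error term per indicator (free)

sigma = implied_covariance(loadings, factor_variance, error_variances)

for row in sigma:
    print([round(value, 3) for value in row])
```

Estimation in CB-SEM then amounts to choosing the free parameters so that this implied matrix comes as close as possible to the sample covariance matrix of the items.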

Sample size guidance for SEM is debated, but one commonly cited rule of thumb (Hair et al., 2010) ties the minimum sample size to the number of latent constructs and the number of items per construct: 100 cases for five or fewer constructs, each with more than three measuring items; 150 cases for seven or fewer constructs, each with more than three items; 300 cases when some constructs have fewer than three items; and 500 cases when there are more than seven constructs and some of them have fewer than three items.
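The rule of thumb can be written as a small lookup. The function below is a hypothetical helper that encodes only the four thresholds as summarized in these notes; it is not a substitute for the original source, and it returns None for situations the summary does not cover (for example, a construct with exactly three items).

```python
def hair_min_sample_size(items_per_construct):
    """Minimum sample size under the rule of thumb summarized above.

    items_per_construct: list with the number of items for each construct.
    Returns an int, or None when the summarized rule does not cover the case.
    """
    if not items_per_construct:
        raise ValueError("at least one construct is required")

    n_constructs = len(items_per_construct)
    fewest_items = min(items_per_construct)

    if fewest_items > 3:            # every construct has more than three items
        if n_constructs <= 5:
            return 100
        if n_constructs <= 7:
            return 150
        return None                 # many well-measured constructs: not covered
    if fewest_items < 3:            # some construct has fewer than three items
        return 300 if n_constructs <= 7 else 500
    return None                     # exactly three items: not covered

# Hypothetical model: four constructs with 4, 5, 4 and 6 items.
print(hair_min_sample_size([4, 5, 4, 6]))
```

Returning None instead of guessing keeps the gaps in the summarized guidance visible, which matters because such thresholds are debated in the first place.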

Overall, SEM, and CB-SEM in particular, offers a structured way to test theory-driven relationships among latent constructs while explicitly accounting for measurement quality and error. Because it relies on covariance patterns rather than experimental control, it cannot by itself establish causality.

Cornell Notes

Structural equation modeling (SEM) tests theory-driven relationships among latent constructs measured by questionnaire items, while accounting for measurement error. Covariance-based SEM (CB-SEM) uses a covariance matrix as input, so it captures associations rather than proving causation; causal claims generally require experimental design. SEM separates the measurement model (validity and reliability of indicators) from the structural model (influence and significance between constructs), and can combine both in a full structural model. Latent constructs are inferred from item sets (e.g., “organizational learning” measured by eight items), which is why SEM is preferred over simple OLS approaches that cannot directly model latent variables. Sample-size rules of thumb often depend on the number of constructs and items per construct, with larger samples needed when constructs have fewer indicators or when there are many constructs.

Why does CB-SEM rely on a covariance matrix, and what does that imply for causation claims?

CB-SEM takes a covariance matrix as its input, so the model is estimated from how the observed variables covary. That means the analysis can assess how strongly constructs are associated, but it does not, by itself, determine causation. Even strong correlations do not prove that one construct causes another; establishing causality typically requires an experimental design.

What is the difference between a variable and a construct in SEM?

A variable is directly measured (e.g., age, exam score, height, income). A construct is an indirectly measured, often hypothetical concept (e.g., job satisfaction, perceived usefulness, loyalty, servant leadership, culture). Constructs are treated as latent when they are inferred from responses to multiple questionnaire items rather than observed directly.

Why can’t latent constructs be handled with ordinary least squares (OLS) in the same way?

OLS works only with observed numeric variables, so applying it to a construct means first collapsing the items into a single score (for example, by averaging them) and then running a regression on that score. Latent constructs instead require modeling the relationship between the unobserved construct and its indicators, including measurement error. SEM performs this joint estimation, combining measurement and structural components rather than collapsing items into a single observed score.

How do the measurement model and structural model differ in SEM?

The measurement model checks whether indicators (items) validly and reliably represent each latent construct and assesses measurement fit. The structural model then evaluates the influence and significance between constructs—capturing hypothesized directional links (e.g., A → B). A full structural model includes both measurement and structural relationships together.

What do SEM parameters represent, and where do they appear in the model?

Parameters describe the size and nature of relationships in the model. They can be fixed or freely estimated from data. In SEM, parameters appear both in the measurement part (latent constructs mapping to indicators, including error terms) and in the structural part (relationships between latent constructs).

How do sample-size guidelines for SEM typically depend on model complexity?

Hair et al. (2010) offer rule-of-thumb minimums based on the number of latent constructs and how many measuring items each construct has. Examples include: 100 cases for five or fewer constructs with more than three items each; 150 cases for seven or fewer constructs with more than three items each; 300 cases when some constructs have fewer than three items; and 500 cases when there are more than seven constructs with some constructs having fewer than three items.

Review Questions

How does using a covariance matrix in CB-SEM shape what kinds of conclusions can be drawn about relationships among constructs?
What specific tasks belong to the measurement model versus the structural model in SEM?
Why do latent constructs require SEM rather than a simple OLS regression approach based on averaged item scores?

Key Points

1. Structural equation modeling tests theory-driven relationships among latent constructs measured by multiple questionnaire items, explicitly incorporating error.
2. Covariance-based SEM uses a covariance matrix, making it well-suited for assessing associations but not for proving causation without experimental design.
3. Variables are directly measured quantities, while constructs are indirectly measured concepts inferred from item sets.
4. SEM separates the measurement model (validity/reliability of indicators) from the structural model (influence/significance between constructs), and can combine both in a full model.
5. Latent constructs are inferred from patterns in responses; SEM estimates parameters for both construct-to-indicator links and construct-to-construct paths.
6. Sample-size guidance often scales with the number of latent constructs and the number of indicators per construct, with larger samples needed for more complex or weakly measured models.

Highlights

CB-SEM estimates relationships from a covariance matrix, so it supports association testing rather than causal proof.

SEM’s measurement model must establish indicator validity and reliability before interpreting structural links between constructs.

Latent constructs are not observed directly; they’re inferred from multiple questionnaire items, with error modeled explicitly.

Rule-of-thumb sample sizes (Hair et al., 2010) increase sharply when constructs have fewer indicators or when the number of constructs grows.

Topics

Structural Equation Modeling Basics
Covariance-Based SEM
Latent Constructs
Measurement vs Structural Models
Sample Size Guidelines

Mentioned

SmartPLS
SmartPLS4
OT
Hair
SEM
CB-SEM
SCM
OLS
ANOVA