How to Calculate Sample Size in Structural Equation Modelling (SEM)

TL;DR

Sample size in PLS-SEM must be planned around population characteristics and statistical power, not just software feasibility.


Briefing

Partial least squares structural equation modelling (PLS-SEM) can produce solutions with relatively small samples, but the sample size still has to match the population and the analysis requirements. A common misconception, that “any small sample will still be accurate”, fails once researchers consider how heterogeneous the target population is and how much multivariate techniques depend on adequate statistical power. When the sample is too small, real effects can go undetected (a Type II error). Worse, results may not generalize: a model estimated on one sample can yield different conclusions than the same model estimated on a larger sample from the same population.
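The link between sample size and Type II error can be illustrated with a rough power approximation. This is a minimal sketch, not a procedure from the transcript: it assumes a one-tailed test and approximates the standard error of a standardized path coefficient by 1/sqrt(n); the function name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(beta: float, n: int, alpha: float = 0.05) -> float:
    """Approximate one-tailed power for a standardized path coefficient.

    Assumption (not from the transcript): the standard error of the
    coefficient is roughly 1 / sqrt(n).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # critical value of the test
    return std_normal.cdf(abs(beta) * sqrt(n) - z_crit)


# Power (and Type II risk = 1 - power) for a true effect of 0.20
for n in (50, 100, 155, 300):
    power = approx_power(0.20, n)
    print(f"n={n:>3}  power~{power:.2f}  Type II risk~{1 - power:.2f}")
```

Under these assumptions, a true path coefficient of 0.20 is missed more often than it is found at n = 50, while power reaches roughly 80% near n = 155.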

The transcript highlights that minimum sample size guidelines matter because they help ensure robustness and generalizability for methods such as PLS-SEM. It also criticizes the widely cited “10-times rule,” which recommends setting sample size to ten times the number of independent variables in the most complex regression (including both measurement and structural parts). That shortcut can mislead researchers because it ties sample size to model complexity in a way that ignores other determinants—especially the expected effect size and the statistical testing conditions.

To address these issues, Kock and Hadaya (2018) propose the inverse square root method for calculating minimum sample size. The method considers the probability that the ratio of a path coefficient to its standard error exceeds the critical value of the test statistic at a chosen significance level. In practice, the minimum sample size depends on a single path coefficient (the strength of the smallest hypothesized effect) and the selected significance level, not on the size of the most complex formative model or the overall model size.
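The condition described above can be written out as a short derivation. This is a sketch under stated assumptions: a one-tailed test, a target power expressed through its standard-normal quantile, and the standard error approximated by the inverse square root of n. The symbols are notation chosen here, not taken from the transcript.

```latex
\begin{align}
\frac{|\beta_{\min}|}{\hat{\sigma}} &> z_{1-\alpha} + z_{\text{power}},
  \qquad \hat{\sigma} \approx \frac{1}{\sqrt{n}} \\
|\beta_{\min}|\,\sqrt{n} &> z_{1-\alpha} + z_{\text{power}} \\
n_{\min} &> \left( \frac{z_{1-\alpha} + z_{\text{power}}}{|\beta_{\min}|} \right)^{2}
\end{align}
```

For a 5% significance level and 80% power, the numerator is approximately 1.645 + 0.842 ≈ 2.486, so the required sample size grows with the inverse square of the smallest path coefficient of interest.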

The transcript describes how to compute n_min using separate equations for common significance levels: 1%, 5%, and 10%. It then walks through an example at a 5% significance level: with a minimum path coefficient of 0.20, the minimum sample size comes out to about 154.2, which is rounded up to the next integer (155). The method is described as conservative because it tends to slightly overestimate the sample size needed to achieve the desired power (assumed at 80%) for detecting an effect.
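The calculation can be sketched in a few lines. The transcript does not state the constants in the three equations, so the values below (3.168, 2.486, and 2.123 for 1%, 5%, and 10% at 80% power) are an assumption based on how the method is commonly reported; the function names are hypothetical.

```python
from math import ceil

# Assumed constants: one-tailed critical z plus the z for 80% power.
# These values are not given in the transcript.
Z_SUM = {0.01: 3.168, 0.05: 2.486, 0.10: 2.123}


def n_min_raw(beta_min: float, alpha: float = 0.05) -> float:
    """Unrounded minimum sample size for the smallest expected path coefficient."""
    if beta_min == 0:
        raise ValueError("beta_min must be non-zero")
    if alpha not in Z_SUM:
        raise ValueError(f"alpha must be one of {sorted(Z_SUM)}")
    return (Z_SUM[alpha] / abs(beta_min)) ** 2


def n_min(beta_min: float, alpha: float = 0.05) -> int:
    """Minimum sample size, rounded up to the next integer."""
    return ceil(n_min_raw(beta_min, alpha))


for alpha in (0.01, 0.05, 0.10):
    print(f"alpha={alpha:.2f}  n_min={n_min(0.20, alpha)}")
```

With these assumed constants, the unrounded value for a path coefficient of 0.20 at 5% is about 154.5 rather than the figure quoted above; it rounds up to the same 155.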

Because researchers often lack precise expectations for effect sizes, even when a pilot study exists, the transcript recommends using ranges of plausible path coefficients rather than a single value. It suggests applying the inverse square root method to the upper boundary of the effect-size range. This yields the smallest sample size for that range, which is treated as acceptable because the method already tends to overestimate the sample size required. For instance, if the expected minimum path coefficient is somewhere between 0.11 and 0.20 at a 5% significance level, the calculation uses the upper value (0.20), leading to roughly 155 observations. Similar lookups are provided for other effect ranges (e.g., 0.21–0.30 and 0.31–0.40), producing corresponding minimum sample sizes.
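A lookup of this kind can be reproduced with a short script. The sketch below covers only the three ranges named above and assumes the constants 3.168, 2.486, and 2.123 (1%, 5%, 10% at 80% power), which the transcript does not state; `range_lookup` is a hypothetical helper name.

```python
from math import ceil

# Assumed constants (not given in the transcript): significance level -> z sum
Z_SUM = {0.01: 3.168, 0.05: 2.486, 0.10: 2.123}

# Effect-size ranges mentioned in the text, as (lower, upper) bounds
RANGES = [(0.11, 0.20), (0.21, 0.30), (0.31, 0.40)]


def range_lookup(ranges=RANGES):
    """Map each range to its minimum sample size per significance level.

    The upper bound of each range is used, as the text recommends.
    """
    table = {}
    for low, high in ranges:
        table[(low, high)] = {
            alpha: ceil((z_sum / high) ** 2) for alpha, z_sum in Z_SUM.items()
        }
    return table


for (low, high), row in range_lookup().items():
    cells = "  ".join(f"{int(alpha * 100)}%: {n}" for alpha, n in row.items())
    print(f"{low:.2f}-{high:.2f}  {cells}")
```

Because the upper bound is used, each row reports the smallest sample size for its range; a true effect near the lower bound would need more observations.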

Overall, the core takeaway is that defensible sample size planning in PLS-SEM should be tied to statistical power, significance level, and the expected magnitude of the path effect—rather than relying on simplified rules that ignore these drivers of Type II error and generalizability.

Cornell Notes

PLS-SEM can run with smaller samples, but accuracy and generalizability depend on whether the study has enough statistical power to detect real effects. Underpowered samples increase the risk of Type II error and can produce results that don’t replicate with another sample from the same population. The inverse square root method (Kock and Hadaya, 2018) estimates a minimum sample size based on the chosen significance level and the expected path coefficient, using the probability that the path coefficient’s critical ratio exceeds the test’s critical value. With power set to 80%, the method provides separate calculations for common significance levels (1%, 5%, 10%). Because effect sizes are often uncertain, it’s recommended to use effect-size ranges and apply the method to the upper boundary of the range, since the method already errs on the conservative side.

Why is small-sample use in PLS-SEM not automatically “accurate,” even if the software produces results?

Small samples can still yield estimates, but they may lack statistical power. If the sample is too small, real effects in the population may not be detected, creating a Type II error. That underpowered situation also threatens generalizability: the same model can produce different results when estimated on a larger sample from the same population.

What problem does the “10-times rule” create for sample size planning in PLS-SEM?

The “10-times rule” sets sample size to ten times the number of independent variables in the most complex regression (including both measurement and structural model parts). The transcript frames this as insufficient because it doesn’t account for key drivers of detectability—especially the expected effect size and the statistical testing conditions—so it can recommend sample sizes that are too small for the desired power and significance.
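The gap between the two approaches is easy to see numerically. The following sketch uses a hypothetical model whose most complex regression has four predictors, and assumes the constant 2.486 for a 5% significance level at 80% power (not stated in the transcript); both function names are illustrative.

```python
from math import ceil


def ten_times_rule(max_predictors: int) -> int:
    """Sample size suggested by the 10-times rule."""
    return 10 * max_predictors


def inverse_sqrt_n_min(beta_min: float, z_sum: float = 2.486) -> int:
    """Inverse square root minimum sample size (assumed 5% level, 80% power)."""
    return ceil((z_sum / abs(beta_min)) ** 2)


# Hypothetical model: the most complex regression has 4 predictors.
rule_n = ten_times_rule(4)
for beta in (0.10, 0.20, 0.30):
    print(f"beta_min={beta:.2f}  10-times rule: {rule_n}  "
          f"inverse square root: {inverse_sqrt_n_min(beta)}")
```

The 10-times figure stays fixed regardless of the effect being tested, while the inverse square root figure rises sharply as the expected path coefficient shrinks.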

How does the inverse square root method determine minimum sample size?

It calculates n_min using the probability that the ratio of a path coefficient to its standard error exceeds the critical value of the test statistic for a chosen significance level. The transcript emphasizes that the resulting minimum sample size depends on the path coefficient and the significance level (and assumes a common power level of 80%), rather than on the size of the most complex formative model or the overall model size.
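The same relationship can be read in reverse: for a sample that has already been collected, it gives the smallest path coefficient the study could be expected to detect. This rearrangement is not part of the transcript; it is a sketch assuming the constant 2.486 (5% significance, 80% power), and `min_detectable_beta` is a hypothetical name.

```python
from math import sqrt


def min_detectable_beta(n: int, z_sum: float = 2.486) -> float:
    """Smallest absolute path coefficient detectable with sample size n.

    Rearranges n > (z_sum / |beta|)^2 into |beta| > z_sum / sqrt(n).
    The default z_sum is an assumed value for 5% significance and 80% power.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return z_sum / sqrt(n)


for n in (50, 100, 155, 400):
    print(f"n={n:>3}  smallest detectable |beta| ~ {min_detectable_beta(n):.3f}")
```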

What does an example look like at a 5% significance level?

With power at 80% and significance at 5%, the transcript uses a minimum path coefficient of 0.20. The computed minimum sample size is about 154.2, which is rounded up to the next integer: 155 observations.

Why use effect-size ranges instead of a single expected path coefficient?

Researchers often don’t know the exact effect size in advance, even after pilot work. The transcript recommends using plausible ranges of path coefficients and then using the upper boundary of the range when applying the inverse square root method. This matches the method’s conservative tendency to slightly overestimate the sample size needed to detect an effect.

Review Questions

In what ways can an insufficient sample size distort PLS-SEM conclusions beyond simply failing to find significance?
How does the inverse square root method differ from the “10-times rule” in what it uses to compute n_min?
If effect size expectations are given as a range, what value should be used for a conservative inverse square root calculation and why?

Key Points

1. Sample size in PLS-SEM must be planned around population characteristics and statistical power, not just software feasibility.
2. Underpowered studies increase the risk of Type II error and can reduce the generalizability of PLS-SEM results.
3. The “10-times rule” can mislead because it ignores effect size and significance-level testing conditions.
4. The inverse square root method estimates minimum sample size using the path coefficient, its standard error, and the critical test value at a chosen significance level.
5. With assumed 80% power, separate n_min calculations apply for common significance levels (1%, 5%, 10%).
6. Minimum sample size calculations should be rounded up to the next integer to ensure adequate power.
7. When effect sizes are uncertain, use effect-size ranges and apply the upper boundary for a conservative estimate.

Highlights

A key misconception is that PLS-SEM accuracy is guaranteed with small samples; insufficient sample size can hide real effects (Type II error) and harm generalizability.

The inverse square root method ties n_min to the probability that the path coefficient’s critical ratio exceeds the test’s critical value at the selected significance level.

At 5% significance with a minimum path coefficient of 0.20, the minimum sample size is about 154.2—rounded up to 155.

The method is intentionally conservative and works best when researchers use plausible effect-size ranges rather than single guessed values.

Topics

PLS-SEM Sample Size
Inverse Square Root Method
Statistical Power
Type II Error
Effect Size Ranges

Mentioned

Kock
Hadaya
PLS
SEM
PLS-SEM
Type II error
