Robustness Checks using #SmartPLS4 - Linearity - Endogeneity - Heterogeneity

TL;DR

Linearity in SmartPLS4 can be tested by adding quadratic effects for all paths and running bootstrapping; insignificant quadratic effects support the linearity assumption.

Briefing

Structural equation modeling in SmartPLS rests on assumptions that can quietly distort results when violated, so robustness checks matter before interpreting paths. One key assumption is linearity: relationships between constructs should be linear rather than curved. In SmartPLS4, linearity is tested by adding a quadratic effect for each path and running bootstrapping (5,000–10,000 subsamples recommended, two-tailed and bias-corrected). When the p-values in the path coefficients output show that every quadratic effect is insignificant, the model’s linearity assumption is treated as satisfied.
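The logic of the quadratic-effect check can be sketched outside SmartPLS. The Python below is only an illustration under stated assumptions, not SmartPLS's implementation: it works on plain observed variables with ordinary least squares instead of PLS path modeling, it uses a simple percentile bootstrap rather than the bias-corrected interval, and the function names (`slope_intercept`, `quadratic_coefficient`, `bootstrap_ci`) and the simulated data are hypothetical.

```python
import random
import statistics


def slope_intercept(x, y):
    """Ordinary least squares slope and intercept for y ~ 1 + x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def quadratic_coefficient(x, y):
    """Coefficient of x^2 in the regression y ~ 1 + x + x^2.

    Uses partialling out: x^2 is first residualized on (1, x); regressing y
    on that residual gives the same coefficient as the full regression.
    """
    sq = [a * a for a in x]
    slope, intercept = slope_intercept(x, sq)
    resid = [s - (intercept + slope * a) for s, a in zip(sq, x)]
    return sum(r * b for r, b in zip(resid, y)) / sum(r * r for r in resid)


def bootstrap_ci(x, y, n_boot=5000, alpha=0.05, seed=1):
    """Two-tailed percentile bootstrap interval for the quadratic coefficient."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(
            quadratic_coefficient([x[i] for i in idx], [y[i] for i in idx])
        )
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated, purely illustrative data with a linear relationship.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(150)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]
low, high = bootstrap_ci(x, y, n_boot=1000)
print(f"quadratic effect interval: [{low:.3f}, {high:.3f}]")
```

If the interval contains zero, the quadratic effect is insignificant at the chosen level, which is the pattern the linearity check looks for on every path.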

The next robustness step targets endogeneity, a problem where predictors correlate with the error term, distorting causal interpretation. The workflow uses the Gaussian copula approach, available as a Gaussian copula option in the interface. The procedure starts with single relationships: each path is tested for endogeneity on its own, then the results are copied out (e.g., into Excel) for tracking. A crucial pattern emerges: some individual paths show significant endogeneity while others do not. In the example, paths such as “commitment to leadership” and “leadership to reliability” are among those being combined; one relationship produces significant endogeneity when tested alone, and when different combinations are tested, some remain problematic while others show no endogeneity. The method then scales up to combinations of two, three, and four relationships, checking whether endogeneity appears only in isolated paths or persists across sets.
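The bookkeeping for this step (singles first, then pairs, triples, and so on) can be laid out before any model is run. The sketch below only enumerates the sets to test; it does not estimate anything. The first two path labels come from the example, the other two are placeholders, and the function name `copula_test_sets` is hypothetical.

```python
from itertools import combinations

# Path labels: the first two are named in the example, the rest are placeholders.
paths = [
    "commitment -> leadership",
    "leadership -> reliability",
    "path 3 (placeholder)",
    "path 4 (placeholder)",
]


def copula_test_sets(paths):
    """Return every non-empty set of paths, ordered by set size."""
    sets = []
    for size in range(1, len(paths) + 1):
        sets.extend(combinations(paths, size))
    return sets


for tested in copula_test_sets(paths):
    print(len(tested), "path(s):", "; ".join(tested))
```

With four candidate paths this gives 4 single tests, 6 pairs, 4 triples, and 1 set of four, so 15 runs to record in the tracking spreadsheet.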

After endogeneity checks, the robustness process moves to heterogeneity, specifically unobserved heterogeneity, where distinct subgroups in the data have meaningfully different parameter estimates. If subgroup effects cancel out when the model is estimated on the full sample, overall results can become misleading. SmartPLS addresses this with the FIMIX (finite mixture segmentation) approach. The key practical question is how many segments to test. The transcript uses a rule-of-thumb minimum of 85 observations per segment, based on a medium effect size (.15) and 80% power. With a total sample of about 341, 341 / 85 ≈ 4, so solutions with one to four segments are tested.
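The segment-count arithmetic is simple enough to write down directly. This is a small sketch of the rule of thumb as described here; the function name `max_segments` is hypothetical, and the minimum of 85 per segment is taken from the transcript rather than recomputed from a power analysis.

```python
def max_segments(sample_size, min_per_segment):
    """Largest segment count that still allows min_per_segment cases per segment."""
    return sample_size // min_per_segment


n_total = 341  # approximate sample size in the example
n_min = 85     # rule-of-thumb minimum observations per segment
upper = max_segments(n_total, n_min)
candidate_segments = list(range(1, upper + 1))
print(candidate_segments)  # [1, 2, 3, 4]
```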

For each candidate solution (one, two, three, and four segments), model selection criteria are collected (AIC3, AIC4, BIC, CAIC, and MDL5) along with entropy. The example ends with an ambiguous picture: AIC3 favors four segments, CAIC, AIC4, and BIC favor two, and MDL5 favors one. Because the criteria do not converge on a single segmentation solution, the analysis avoids further segmentation and treats the full dataset as a single group. That decision is used to conclude there is no clear evidence of unobserved heterogeneity affecting the results.
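The decision rule described above can be written as a short sketch. It assumes that lower values are better for each information criterion and that entropy is inspected separately; the function names are hypothetical, and the numbers passed to `best_segment_count` are made up purely to show the call. Only the `winners` mapping reflects the example.

```python
def best_segment_count(values_by_segments):
    """Segment count with the lowest criterion value (lower is assumed better)."""
    return min(values_by_segments, key=values_by_segments.get)


def decide(winners):
    """Return the agreed segment count, or 1 if the criteria disagree."""
    counts = set(winners.values())
    return counts.pop() if len(counts) == 1 else 1


# Made-up criterion values for one index, only to illustrate the lookup.
print(best_segment_count({1: 10.0, 2: 8.5, 3: 9.1}))  # 2

# Preferred segment counts as reported in the example.
winners = {"AIC3": 4, "CAIC": 2, "AIC4": 2, "BIC": 2, "MDL5": 1}
print(decide(winners))  # 1
```

This strict "all criteria must agree" rule mirrors the transcript's choice; other decision rules are possible and would need their own justification.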

Cornell Notes

The robustness workflow in SmartPLS4 checks three assumptions before trusting structural model results: linearity, endogeneity, and unobserved heterogeneity. Linearity is tested by adding quadratic effects to each path and bootstrapping; insignificant quadratic terms indicate relationships are adequately linear. Endogeneity is assessed with a Gaussian copula procedure, first for single paths and then for combinations of paths (two, three, four), with significant p-values signaling endogeneity issues that may require remediation. Unobserved heterogeneity is examined using FIMIX finite mixture segmentation, testing one to four segments based on a minimum sample size per segment (85) tied to a medium effect size (.15) and 80% power. When fit indices disagree on the segment count, the approach defaults to analyzing the full dataset without segmentation.

How does SmartPLS4 test whether relationships are linear rather than curved?

It adds quadratic effects for the paths. The user selects “quadratic effect,” adds it for all relationships, and runs bootstrapping (5,000–10,000 subsamples recommended; two-tailed, bias-corrected). The transcript notes that a quadratic effect corresponds to multiplying a variable by itself, which captures curvature. Linearity is supported when the quadratic effects’ p-values are not significant across all paths.

What does the Gaussian copula endogeneity check do, and why test combinations of paths?

The Gaussian copula procedure checks endogeneity at the level of individual relationships first, then repeats the test for combinations of relationships (two, then three, then four). Testing combinations helps determine whether endogeneity is isolated to one path or emerges when multiple predictors are considered together. In the example, one relationship shows significant endogeneity when tested alone, while other combinations show different significance patterns, indicating that endogeneity can depend on the set of paths included.
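For intuition, the Gaussian copula approach as commonly described adds a control term for each suspect predictor: the inverse standard normal CDF applied to the predictor's empirical CDF. If that term is significant in the model, endogeneity is indicated. The sketch below computes only the term itself. It is not SmartPLS's code; the function name and the scores are hypothetical, and dividing by n + 1 is an assumption made here to keep probabilities away from 0 and 1.

```python
from bisect import bisect_right
from statistics import NormalDist


def gaussian_copula_term(values):
    """Map each value to the inverse standard normal of its empirical CDF."""
    n = len(values)
    ordered = sorted(values)
    std_normal = NormalDist()
    # Dividing by n + 1 (an assumption of this sketch) keeps probabilities
    # strictly inside (0, 1), where the inverse normal CDF is finite.
    return [
        std_normal.inv_cdf(bisect_right(ordered, v) / (n + 1)) for v in values
    ]


scores = [3.0, 1.0, 2.0]  # hypothetical predictor scores
print(gaussian_copula_term(scores))
```

The resulting column would enter the model alongside the original predictor, and its bootstrapped p-value is what the single-path and combination tests report.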

How are endogeneity results interpreted using p-values?

The transcript treats p-values greater than 0.05 as indicating no endogeneity issue. When the p-value falls below that threshold (a significant result), the corresponding relationship or combination is flagged as having an endogeneity problem. Results are copied into Excel to build a final table of which paths and combinations are problematic.

Why does unobserved heterogeneity threaten the validity of structural model estimates?

Unobserved heterogeneity means the data contain subgroups with substantially different model estimates. Estimating on the full dataset can produce misleading averages—positive and negative effects across groups may cancel out—so the overall model may look weaker or even directionally wrong compared with subgroup-specific relationships.
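A toy calculation shows how opposite subgroup effects can cancel in the pooled sample. The data are constructed by hand for illustration (two hypothetical groups with slopes of +0.5 and -0.5 and no noise), and `ols_slope` is a hypothetical helper, not a SmartPLS function.

```python
import statistics


def ols_slope(x, y):
    """Ordinary least squares slope for y ~ 1 + x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


x = [-2.0, -1.0, 0.0, 1.0, 2.0]
group_a = [0.5 * v for v in x]   # positive effect in subgroup A
group_b = [-0.5 * v for v in x]  # equally strong negative effect in subgroup B

print(ols_slope(x, group_a))                # 0.5
print(ols_slope(x, group_b))                # -0.5
print(ols_slope(x + x, group_a + group_b))  # 0.0
```

Each subgroup has a clear effect, yet the pooled estimate is zero, which is the kind of misleading average that segmentation is meant to detect.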

How does the FIMIX segmentation procedure decide how many segments to test?

It uses a sample-size rule: at least 85 observations per segment, derived from a medium effect size (.15) and 80% power. With a total sample of about 341, dividing by 85 yields roughly four, so the procedure tests segment counts from 1 to 4 using SmartPLS’s finite mixture segmentation (FIMIX).

What happens when AIC3, AIC4, BIC, CAIC, and MDL5 disagree on the number of segments?

The transcript treats the situation as ambiguous and avoids further segmentation. Even though some criteria point to two segments and others point to one or four, the lack of a single consistent solution leads to analyzing the full dataset together, concluding there is no clear evidence of unobserved heterogeneity driving the results.

Review Questions

In SmartPLS4, what specific output should be checked to confirm the linearity assumption, and what does insignificance of quadratic effects imply?
During the Gaussian copula endogeneity assessment, how does testing single relationships differ from testing combinations of two, three, or four relationships?
What decision rule is used when FIMIX fit indices (AIC3/CAIC/AIC4/BIC/MDL5 and entropy) do not converge on a single segment solution?

Key Points

1. Linearity in SmartPLS4 can be tested by adding quadratic effects for all paths and running bootstrapping; insignificant quadratic effects support the linearity assumption.
2. Endogeneity is assessed using the Gaussian copula approach, starting with single paths and then expanding to combinations of paths to see where endogeneity persists.
3. Endogeneity results are tracked by copying path-coefficient outputs (including the copula-related row) into Excel to build a clear map of problematic relationships.
4. Unobserved heterogeneity is treated as a subgroup problem where different parameter estimates can cancel out when fitting one model to all data.
5. FIMIX finite mixture segmentation uses a minimum sample size per segment of 85 (medium effect size .15, 80% power), guiding how many segments to test.
6. When model selection criteria disagree on the number of segments, the transcript’s approach is to avoid further segmentation and analyze the full dataset as one group.

Highlights

Quadratic effects are used as a direct diagnostic for linearity: if all quadratic terms are insignificant after bootstrapping, the model’s linearity assumption is considered met.

Endogeneity checks aren’t limited to single paths—testing combinations of relationships can reveal whether endogeneity is path-specific or combination-dependent.

FIMIX segmentation can fail to produce a clear segment count when AIC3, CAIC, AIC4, BIC, and MDL5 point in different directions, leading to a default decision to keep the full dataset unsegmented.

Topics

SmartPLS4 Robustness Checks
Linearity via Quadratic Effects
Endogeneity via Gaussian Copula
Unobserved Heterogeneity via FIMIX
Model Selection Criteria

Mentioned

SmartPLS4
SmartPLS
SEM
PLS
FIMIX
AIC
CAIC
BIC
MDL
Gaussian copula
