
08. SEMinR Lecture Series. Bootstrapping the PLS Model and generating Summary Results

Research With Fawad·
4 min read

Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Bootstrapping in SEMinR is used to generate standard errors and confidence intervals for PLS-SEM path coefficients, enabling significance testing.

Briefing

Bootstrapping in SEMinR is used to test whether PLS-SEM path coefficients are statistically significant by generating standard errors and confidence intervals from resampled data. After estimating a PLS model, the workflow shifts from "what are the path estimates?" to "how reliable are those paths?" This step is non-parametric, which matters because it avoids relying on strict distributional assumptions.

The session begins with a quick recap of the SEMinR setup: load the SEMinR library, import the dataset into an R object, inspect it for issues, then define the measurement model using the `constructs` function (composites) and the structural model using `relationships` (the directional paths among constructs). The model is then estimated with a PLS algorithm call that takes the data, measurement model, structural model, and missing-value handling options. Results are stored as an estimated PLS model object, and `summary` is used to produce a report. That summary object can be queried for specific details—such as path coefficients, reliability, and other metrics—via `$` access to sub-objects like `summary_model`.
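Assembled in code, the recap above might look like the following sketch. This is illustrative only: the dataset name, construct names, and indicator prefixes are placeholders, not the video's actual model.

```r
library(seminr)

# Placeholder data import; inspect the resulting data frame for issues
corp_data <- read.csv("corp_data.csv")

# Measurement model: composites defined with the constructs() function
measurement_model <- constructs(
  composite("Image",   multi_items("IMAG", 1:5)),
  composite("Loyalty", multi_items("CUSL", 1:3))
)

# Structural model: directional paths among constructs
structural_model <- relationships(
  paths(from = "Image", to = "Loyalty")
)

# Estimate the PLS model (estimate_pls also accepts missing-value
# handling options) and store the result as an object
pls_model <- estimate_pls(
  data              = corp_data,
  measurement_model = measurement_model,
  structural_model  = structural_model
)

# Summarize, then query sub-objects with $ for specific details
model_summary <- summary(pls_model)
model_summary$paths  # path coefficients
```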

Bootstrapping comes next. Because bootstrapping is non-parametric, it estimates standard errors and confidence intervals by repeatedly resampling the data and re-estimating the model. SEMinR’s `bootstrap_model` function performs this task on a previously estimated SEMinR PLS model object. Key inputs include the model to bootstrap, the number of bootstrap subsamples, the number of CPU cores for parallel processing, and a seed for reproducibility. In the example, 1,000 bootstrap subsamples are used for the run, while the recommended practice is to use 10,000 for final reporting; parallelization can be controlled through the `cores` argument.
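A sketch of the bootstrap call, assuming the `pls_model` object from the estimation step; the argument names follow SEMinR's `bootstrap_model` interface, and the specific values are illustrative:

```r
# Resample the estimated model non-parametrically
boot_model <- bootstrap_model(
  seminr_model = pls_model,  # previously estimated PLS model object
  nboot        = 1000,       # demo run; 10,000 recommended for final reporting
  cores        = 2,          # CPU cores for parallel processing
  seed         = 123         # fixes the resampling for reproducibility
)
```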

The transcript also highlights a practical debugging point: errors during bootstrapping can stem from invisible formatting issues—like extra spaces introduced when copying code from other software. Removing problematic spaces and rerunning resolves the issue, after which the bootstrap completes successfully.

To interpret results, the workflow again uses `summary`, this time on the bootstrapped model object. The resulting summary object includes bootstrapped means, standard deviations, t statistics, and confidence intervals for paths (plus additional outputs such as loadings and HTMT results, which are deferred to later sessions). For significance testing, the confidence interval is central: if the interval does not cross zero, the path is treated as significant. The example notes that under a one-tailed interpretation at the 5% level, significance corresponds to a t statistic above the 1.645 critical value, and it demonstrates that the confidence intervals for the listed paths exclude zero, so the effects are considered significant.
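As a sketch, interpreting the bootstrapped results could look like this, assuming the `boot_model` object from the previous step (`bootstrapped_paths` is the sub-object SEMinR uses for path inference):

```r
# Summarize the bootstrapped model object
boot_summary <- summary(boot_model)

# Bootstrapped means, standard deviations, t statistics,
# and confidence interval bounds for each structural path
boot_summary$bootstrapped_paths
```

A path whose interval bounds are both on the same side of zero is treated as significant under the decision rule described above.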

Finally, the session shows how to narrow output to only the desired section of the summary object (e.g., only bootstrap paths), and it closes with a step-by-step checklist: load library, read and inspect data, define measurement and structural models, estimate the PLS model, bootstrap with `bootstrap_model`, and summarize bootstrapped results with `summary` for targeted inference.

Cornell Notes

The session lays out a complete SEMinR workflow for bootstrapping a PLS-SEM model to assess path significance. After estimating the PLS model and storing it in an object, `bootstrap_model` resamples the data (non-parametrically) to generate standard errors and confidence intervals for path coefficients. The example uses 1,000 bootstrap subsamples for the run, while recommending 10,000 for final results, and it includes options for CPU parallelism and a seed for reproducibility. Results are then summarized with `summary` on the bootstrapped object, where path significance is judged primarily by whether confidence intervals cross zero. The session also flags a common coding issue: invisible spaces from copy/paste can trigger errors and must be removed.

Why is bootstrapping needed after estimating a PLS-SEM model in SEMinR?

Bootstrapping provides inferential statistics (standard errors and confidence intervals) for path coefficients without assuming a specific parametric distribution. That is what turns raw PLS estimates into significance tests for relationships among constructs.

What does `bootstrap_model` require, and what do the main arguments control?

`bootstrap_model` takes (1) the previously estimated SEMinR PLS model object to resample, (2) the number of bootstrap subsamples (e.g., 1,000 in the example), (3) CPU cores for parallel processing (with a default that can use maximum available cores), and (4) a seed to make the random resampling reproducible.

What’s the practical difference between using 1,000 subsamples and the recommended 10,000?

The example runs with 1,000 subsamples to get results, but it notes that final computation should draw on 10,000 subsamples for more stable confidence intervals and more reliable inference.

How do confidence intervals determine whether a path is significant?

The transcript uses the rule that if a path’s confidence interval does not include zero, the effect is treated as significant. It also mentions a one-tailed interpretation where significance can be associated with a t-statistic threshold such as 1.645, but the confidence-interval crossing of zero is the key visual/decision criterion shown.

What kind of error can appear during bootstrapping, and how is it fixed here?

Errors can arise from invisible formatting problems—specifically extra spaces introduced when copying code from another environment. Removing those spaces and rerunning the bootstrapping call resolves the issue.

How can output be restricted to only the information needed for path significance?

After running `summary` on the bootstrapped model object, the summary contains multiple sub-objects (e.g., bootstrap paths, loadings, HTMT). The transcript demonstrates using `$` access to pull only the bootstrap paths section, rather than reviewing everything at once.
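For instance, a minimal extraction (assuming a bootstrapped model object named `boot_model`) would be:

```r
# Pull only the path-significance table from the summary object
summary(boot_model)$bootstrapped_paths
```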

Review Questions

  1. In SEMinR, what is the role of `bootstrap_model` relative to the earlier PLS estimation step?
  2. Which criterion in the bootstrapped path results indicates significance, and what does it mean if the confidence interval includes zero?
  3. Why might code that works in one environment fail during bootstrapping after copy/paste, and what specific fix is suggested?

Key Points

  1. Bootstrapping in SEMinR is used to generate standard errors and confidence intervals for PLS-SEM path coefficients, enabling significance testing.
  2. The bootstrapping step runs on a previously estimated PLS model object using `bootstrap_model` and produces a bootstrapped model object.
  3. The number of bootstrap subsamples should be increased for final reporting (the session uses 1,000 for demonstration but recommends 10,000).
  4. Parallel processing can be controlled via the `cores` argument, and reproducibility is supported through a `seed`.
  5. Confidence intervals for paths are the primary decision tool: if they do not cross zero, the path effect is treated as significant.
  6. Copy/paste formatting issues, especially invisible spaces, can cause bootstrapping errors; removing the spaces resolves them.
  7. `summary` on the bootstrapped object yields multiple result types, and `$` sub-objects let users extract only what they need (e.g., bootstrap paths).

Highlights

Bootstrapping turns PLS estimates into inferential results by producing standard errors and confidence intervals for path coefficients.
A path is treated as significant when its bootstrapped confidence interval does not include zero.
The session recommends 10,000 bootstrap subsamples for final results, even though it demonstrates with 1,000.
Invisible spaces introduced during copy/paste can break SEMinR code; deleting them can fix bootstrapping errors.

Topics

Mentioned

  • PLS
  • SEM
  • HTMT
  • Bootstrapping