32. SEMinR Lecture Series - Multi-group Analysis (PLS-MGA)

TL;DR

SEMinR PLS-MGA tests whether structural path coefficients differ across subgroups by estimating the same PLS-SEM model for each group and comparing betas with p-values.

Briefing

Multi-group analysis in SEMinR (PLS-MGA) lets researchers test whether key path relationships in a PLS-SEM model hold the same across subgroups—here, gender—by estimating the model separately for each group and then checking whether path coefficients differ significantly.

The workflow starts with the usual SEMinR setup: load the dataset into an R object (transcribed as "data s"), verify that the data loaded correctly, then define a measurement model with multiple-item constructs. In this example, three constructs are specified, each measured with six indicators: Collaborative Culture (CC1–CC6), Organizational Commitment (OC1–OC6), and Organizational Performance (OP1–OP6). After the measurement model is set, a structural model is built by specifying directional paths: Collaborative Culture → Organizational Commitment and Organizational Commitment → Organizational Performance. The model is then estimated and the results are stored in an object (transcribed as "data sore MGA"). Missing values, coded as -99, are handled by mean replacement.
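The setup described above might look like the following in R. This is a minimal sketch rather than the lecture's script: the object names (`datas`, `pls_model`), the short construct names, the helper `make_items`, and the simulated data frame are all assumptions for illustration. The real dataset would be read from a file instead.

```r
library(seminr)

# Simulated stand-in for the lecture dataset (assumption, not the real data).
set.seed(42)
n  <- 341
cc <- rnorm(n)
oc <- 0.6 * cc + rnorm(n, sd = 0.8)
op <- 0.5 * oc + rnorm(n, sd = 0.8)

# Hypothetical helper: six noisy indicators per latent score, named e.g. CC1..CC6.
make_items <- function(latent, prefix) {
  items <- sapply(1:6, function(i) latent + rnorm(length(latent), sd = 0.5))
  colnames(items) <- paste0(prefix, 1:6)
  items
}
datas <- data.frame(make_items(cc, "CC"), make_items(oc, "OC"), make_items(op, "OP"))
datas$CC1[c(5, 20)] <- -99  # a few missing values coded as -99, as in the lecture

# Measurement model: three constructs with six indicators each.
measurement_model <- constructs(
  composite("CC", multi_items("CC", 1:6)),
  composite("OC", multi_items("OC", 1:6)),
  composite("OP", multi_items("OP", 1:6))
)

# Structural model: CC -> OC and OC -> OP.
structural_model <- relationships(
  paths(from = "CC", to = "OC"),
  paths(from = "OC", to = "OP")
)

# Estimate the model; -99 is treated as missing and replaced by the indicator mean.
pls_model <- estimate_pls(
  data = datas,
  measurement_model = measurement_model,
  structural_model = structural_model,
  missing = mean_replacement,
  missing_value = "-99"
)

summary(pls_model)$paths
```

Because the data are simulated, the path estimates will not match the values reported in the lecture.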

Once the baseline model runs and produces valid observations and plots, the transcript moves to multi-group analysis. Gender is used as the grouping variable, coded so that 1 represents male and 2 represents female. The dataset is split implicitly by a logical condition: rows where gender == 1 form one group and the remaining rows (gender == 2) form the other, yielding 289 male respondents and 52 female respondents. The multi-group estimation uses the SEMinR function estimate_pls_mga, with the model argument set to simple_model (the previously estimated PLS model). Bootstrapping is used during estimation, which can take some time.
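A hedged sketch of the multi-group call follows. SEMinR's `estimate_pls_mga()` takes an estimated model, a logical condition that defines group 1 (all other rows form group 2), and the number of bootstrap resamples. The data here are simulated, and `datas`, `simple_model`, and `make_items` are illustrative names, so the numbers will differ from the lecture's output.

```r
library(seminr)

# Simulated data with the group sizes reported in the lecture (assumption, not the real data).
set.seed(7)
n  <- 341
cc <- rnorm(n)
oc <- 0.6 * cc + rnorm(n, sd = 0.8)
op <- 0.5 * oc + rnorm(n, sd = 0.8)
make_items <- function(latent, prefix) {  # hypothetical helper
  items <- sapply(1:6, function(i) latent + rnorm(length(latent), sd = 0.5))
  colnames(items) <- paste0(prefix, 1:6)
  items
}
datas <- data.frame(make_items(cc, "CC"), make_items(oc, "OC"), make_items(op, "OP"))
datas$gender <- rep(c(1, 2), times = c(289, 52))  # 1 = male, 2 = female

simple_model <- estimate_pls(
  data = datas,
  measurement_model = constructs(
    composite("CC", multi_items("CC", 1:6)),
    composite("OC", multi_items("OC", 1:6)),
    composite("OP", multi_items("OP", 1:6))
  ),
  structural_model = relationships(
    paths(from = "CC", to = "OC"),
    paths(from = "OC", to = "OP")
  )
)

# Group 1 = rows where the condition is TRUE (males); group 2 = all remaining rows.
# nboot is reduced from the default of 2000 only to keep this sketch quick.
pls_mga <- estimate_pls_mga(simple_model, datas$gender == 1, nboot = 500)

pls_mga  # one row per path: group betas and p-values for the group difference
```

For a real analysis, a larger number of bootstrap resamples is the more usual choice, at the cost of a longer run time.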

The output provides beta coefficients for each group and p-values that indicate whether the differences between groups are statistically significant. For the male group (group 1), the transcript reports Collaborative Culture → Organizational Commitment (β = 0.612) and Organizational Commitment → Organizational Performance (β = 0.4). For the female group (group 2), the corresponding betas are β = 0.650 and β = 0.660. The transcript also reports a direct path from Collaborative Culture to Organizational Performance (β = 0.306 for males and β = 0.228 for females), which suggests the estimated model included that path in addition to the two listed above. The key takeaway is the same for every path: none of the relationships differs significantly between male and female respondents. In other words, the analysis finds no evidence that the effect of Collaborative Culture on Organizational Performance, or the intermediate role of Organizational Commitment, differs by gender.
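To make the comparison concrete, the betas reported in the transcript can be tabulated and differenced in base R. This is only arithmetic on the reported values; the p-values themselves are not given in this summary and are not reproduced here. The object name `reported` is an assumption.

```r
# Group-specific betas as reported in the transcript.
reported <- data.frame(
  path   = c("CC -> OC", "OC -> OP", "CC -> OP"),
  male   = c(0.612, 0.4, 0.306),
  female = c(0.650, 0.660, 0.228)
)

# Raw difference in path coefficients (female minus male).
reported$diff <- reported$female - reported$male
reported
```

The differences are 0.038, 0.260, and -0.078. A raw difference alone does not establish significance; that judgment comes from the bootstrap-based p-values in the MGA output.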

The session also flags a practical limitation: SEMinR multi-group comparisons in this setup are restricted to comparing two groups at a time. When more than two groups exist (the transcript gives an example of three ranks), the approach requires separating the data and running separate analyses to compare one subgroup against the others combined. That constraint shapes how multi-group hypotheses must be operationalized in SEMinR.
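One way to operationalize the two-group restriction is to turn a multi-category variable into a series of "one group versus the rest" logical conditions. The sketch below uses base R only; the `rank` vector and its values are hypothetical and stand in for a three-level grouping column.

```r
# Hypothetical three-level grouping variable (e.g. ranks 1, 2, 3).
rank <- c(1, 2, 3, 1, 2, 3, 3, 1)

# Build one logical condition per level: TRUE = that rank, FALSE = the other ranks combined.
levels_found <- sort(unique(rank))
conditions <- lapply(levels_found, function(r) rank == r)
names(conditions) <- paste0("rank", levels_found, "_vs_rest")

# Group sizes for the TRUE side of each comparison.
sapply(conditions, sum)
```

Each element of `conditions` could then serve as the grouping condition for a separate two-group run. Comparing only two specific ranks against each other would instead require subsetting the data to those ranks and re-estimating the model first.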

Cornell Notes

The session demonstrates how to run PLS-MGA (multi-group analysis) in SEMinR to test whether structural path relationships differ across groups. After defining a measurement model (CC, OC, OP with six indicators each) and a structural model (CC → OC and OC → OP), the model is estimated with missing values (coded as -99) handled via mean replacement. Multi-group analysis then uses estimate_pls_mga with gender as the grouping variable (1 = male, 2 = female) and bootstrapping to produce group-specific beta coefficients and p-values. In the example, none of the tested relationships show significant differences between male and female respondents, suggesting the model's effects are stable across genders. The transcript also notes a limitation: comparisons are limited to two groups at a time, so scenarios with more than two categories require separate runs.

What are the core steps to set up a PLS-SEM model in SEMinR before multi-group testing?

First, load the dataset into an R object (transcribed as "data s") and confirm it loaded correctly. Next, define the measurement model by specifying constructs and their indicators (here: CC1–CC6 for Collaborative Culture, OC1–OC6 for Organizational Commitment, OP1–OP6 for Organizational Performance). Then build the structural model by specifying the directional paths, Collaborative Culture → Organizational Commitment and Organizational Commitment → Organizational Performance, taking care to balance parentheses to avoid syntax errors. Finally, estimate the model and store the result in an object (transcribed as "data sore MGA"), applying mean replacement for missing values coded as -99.

How does the transcript implement multi-group analysis using gender as the grouping variable?

Gender is used as the grouping variable, with gender == 1 for males and gender == 2 for females. The transcript reports 289 male respondents and 52 female respondents. The multi-group estimation is run with estimate_pls_mga, using simple_model as the model to compare. The function is called with the estimated model and a condition on the dataset (data s) that compares group 1 (gender == 1) against group 2 (gender == 2). Bootstrapping runs during estimation and can take time.
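Before running the comparison, it can help to confirm the group coding and sizes. The base R sketch below rebuilds a gender vector with the counts reported in the transcript; the vector itself and the name `is_male` are illustrative assumptions.

```r
# Illustrative gender column with the reported group sizes (1 = male, 2 = female).
gender <- rep(c(1, 2), times = c(289, 52))

# Check that only the expected codes occur before splitting.
stopifnot(all(gender %in% c(1, 2)))
table(gender)

# Logical condition: TRUE rows form group 1, FALSE rows form group 2.
is_male <- gender == 1
c(group1 = sum(is_male), group2 = sum(!is_male))
```

The check on the codes matters because any row that fails the condition, including an unexpected code, would land in the second group.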

What does the multi-group output tell you, and how is it interpreted here?

The output provides beta coefficients for each group for each path relationship, along with p-values that indicate whether the difference between groups is statistically significant. The transcript lists group-specific betas for the relationships (e.g., Collaborative Culture → Organizational Commitment and Organizational Commitment → Organizational Performance). It then concludes that for each relationship tested, the p-values indicate no significant differences between male and female respondents. That is, the analysis finds no evidence that the effects differ by gender, which is weaker than showing the effects are identical.

Why might a researcher need to run separate analyses when there are more than two groups?

The transcript notes a limitation: SEMinR’s multi-group comparison in this setup can compare only two groups at a time. For three groups (example: ranks), the workaround is to separate the data and run separate multi-group analyses—for instance, comparing rank 1 against the other two ranks combined—rather than comparing all three categories in a single run.

How are missing values handled in the example, and why does that matter for multi-group results?

Missing values are denoted by -99 and handled using mean replacement during estimation. This matters because multi-group analysis relies on the estimated model coefficients; inconsistent or unhandled missingness could distort group-specific betas and the resulting p-values for differences between groups.
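The idea behind mean replacement can be illustrated in base R. This is a sketch of the concept on a toy data frame, not SEMinR's internal code; the `items` values and the helper `replace_with_mean` are made up for illustration.

```r
# Toy indicator data with missing values coded as -99.
items <- data.frame(
  CC1 = c(4, -99, 5, 3),
  CC2 = c(2, 4, -99, 4)
)

# Step 1: recode the missing-value marker to NA.
items[items == -99] <- NA

# Step 2: replace each NA with the mean of the observed values in that column.
replace_with_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}
imputed <- as.data.frame(lapply(items, replace_with_mean))
imputed
```

If the -99 codes were left in place, they would be treated as real scores and pull the indicator values, and hence the group-specific betas, in misleading directions.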

Review Questions

In the example, which constructs and indicator sets define the measurement model, and what two structural paths are tested in the structural model?
How does estimate_pls_mga use group definitions (gender == 1 vs gender == 2) to produce group-specific beta coefficients and p-values?
What limitation of SEMinR multi-group analysis is highlighted, and what practical strategy is suggested for handling more than two groups?

Key Points

1. SEMinR PLS-MGA tests whether structural path coefficients differ across subgroups by estimating the same PLS-SEM model for each group and comparing betas with p-values.
2. A complete setup requires defining both a measurement model (constructs with multiple indicators) and a structural model (directional paths) before running multi-group analysis.
3. Missing values coded as -99 can be handled via mean replacement during model estimation to keep the dataset usable for group comparisons.
4. Multi-group analysis in the transcript uses gender as the grouping variable (1 = male, 2 = female) and bootstrapping to support significance testing.
5. In the gender comparison shown, none of the tested relationships produce significant differences between male and female respondents, indicating stable effects across groups.
6. The workflow for more than two categories requires separate two-group comparisons because SEMinR multi-group comparisons are limited to two groups at a time.

Highlights

The gender-based PLS-MGA run produced group-specific beta coefficients and p-values, and every tested relationship showed no statistically significant difference between males and females.

The structural model tested two main links: Collaborative Culture → Organizational Commitment and Organizational Commitment → Organizational Performance, with group comparisons applied to these paths.

Multi-group analysis in this SEMinR setup relies on bootstrapping and a two-group comparison design, forcing separate runs when more than two groups exist.

Topics

PLS-MGA
PLS
SEM
MGA