How to Add Control Variables in SmartPLS3? (See Description)

Research With Fawad · 5 min read

Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Add control variables in SmartPLS3 as latent variables, then run bootstrapping to test their significance.

Briefing

Adding control variables in SmartPLS3 can change whether demographic predictors look statistically significant—and that shift matters because it reveals potential confounding. In this walkthrough, gender and age are introduced as control variables to test whether they distort the relationships among the model’s endogenous constructs, specifically OC and CC (collaborative culture). Gender is treated as a categorical variable with two categories (male, female), so it’s entered directly as a latent variable without creating dummy variables. Age is treated as a continuous variable, also added as a latent variable, then the model is bootstrapped to assess significance.
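As a rough illustration of the data preparation this implies, the sketch below codes a two-category gender column as a single 0/1 indicator and keeps age numeric, so each can serve as a single-indicator construct. The column names, the 0/1 assignment, and the sample rows are hypothetical and not taken from the video.

```python
# Hypothetical respondent records; names and values are illustrative only.
respondents = [
    {"id": 1, "gender": "male", "age": 34},
    {"id": 2, "gender": "female", "age": 41},
    {"id": 3, "gender": "female", "age": 29},
]

# Assumed coding: which category gets 1 is arbitrary, but it fixes the
# sign of the resulting path coefficient.
GENDER_CODES = {"male": 0, "female": 1}


def code_controls(row):
    """Return a numeric copy of one record for use as single-item controls."""
    return {
        "id": row["id"],
        "gender": GENDER_CODES[row["gender"].strip().lower()],
        "age": float(row["age"]),  # continuous control: keep as a number
    }


coded = [code_controls(r) for r in respondents]
for row in coded:
    print(row)
```

A two-category variable needs only this one column because the category coded 0 already acts as the reference group.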

With gender and age included, the results show no effect from gender, while age produces a significant effect. To understand whether age is acting as a confounder, the analysis is repeated without any control variables. The comparison uses R-square values and the significance of the paths. When controls are removed, the R-square for OC drops slightly (from 0.495 with controls to 0.482 without), indicating that the controls, chiefly age, contribute a small amount of explanatory power. More importantly, the walkthrough treats age’s influence as persistent rather than as an artifact of the specification: age significantly affects the endogenous outcome, so it is not merely a nuisance variable and carries confounding influence.
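One common way to put a number on that R-square change is Cohen's f² effect size. It is not mentioned in the walkthrough and is offered here only as a hedged illustration. The sketch uses the two R-square values reported above; because gender and age were removed together, the result describes the controls jointly rather than age alone.

```python
def f_squared(r2_included, r2_excluded):
    """Cohen's f^2 for the predictors omitted from the reduced model."""
    return (r2_included - r2_excluded) / (1.0 - r2_included)


r2_with_controls = 0.495     # OC, controls included (from the walkthrough)
r2_without_controls = 0.482  # OC, controls removed (from the walkthrough)

delta_r2 = r2_with_controls - r2_without_controls
effect = f_squared(r2_with_controls, r2_without_controls)

print(f"Delta R-square: {delta_r2:.3f}")  # 0.013
print(f"f-squared:      {effect:.3f}")    # 0.026
```

By Cohen's conventional benchmarks (0.02 small, 0.15 medium, 0.35 large), a value near 0.026 would count as a small effect.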

The walkthrough then compares how the model’s key statistics behave when controls are added versus removed. Even when the beta coefficients and overall relationships remain broadly similar, the p-values can change. The takeaway is nuanced: significance levels may shift with or without controls, but the substantive effect of age on the endogenous construct remains. In other words, adding age as a control variable changes the p-values more than it changes the size or direction of the relationships.
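To see why a p-value can move while the coefficient barely does, recall that bootstrapped significance depends on the coefficient divided by its bootstrap standard error. The sketch below uses made-up numbers (not results from the video) and a normal approximation to the two-tailed p-value.

```python
from statistics import NormalDist


def two_tailed_p(beta, bootstrap_se):
    """Two-tailed p-value for beta / SE under a normal approximation."""
    t_value = beta / bootstrap_se
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_value)))


# Hypothetical path: the coefficient is identical in both runs, but the
# bootstrap standard error differs between model specifications.
p_with_controls = two_tailed_p(beta=0.12, bootstrap_se=0.05)
p_without_controls = two_tailed_p(beta=0.12, bootstrap_se=0.07)

print(f"with controls:    p = {p_with_controls:.3f}")
print(f"without controls: p = {p_without_controls:.3f}")
```

With these assumed values the same coefficient falls below the 0.05 threshold in one run and above it in the other, which is why the comparison should not rest on p-values alone.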

Finally, the tutorial addresses a common complication: control variables with more than two categories. Using job rank as the example, it explains that SmartPLS requires dummy-variable coding for multi-category categorical predictors. For job rank with three levels—Junior, Middle, and Senior—three dummy indicators are conceptually created, but only two are actually added to the model so one category serves as the reference group (Junior). Middle is coded as 1 for Middle employees and 0 otherwise; Senior is coded as 1 for Senior employees and 0 otherwise. After bootstrapping with these dummy controls included, the results show no significant impact from Middle or Senior job rank on the endogenous variables. That means job rank does not produce a confounding effect, so there’s no strong statistical reason to keep it as a control.
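The coding scheme described above can be sketched as follows. The helper name `dummy_code` and the six sample employees are hypothetical; only the Junior/Middle/Senior levels and the choice of Junior as reference come from the walkthrough.

```python
def dummy_code(values, categories, reference):
    """Return k-1 indicator columns, omitting the reference category."""
    if reference not in categories:
        raise ValueError("reference must be one of the categories")
    columns = {}
    for category in categories:
        if category == reference:
            continue  # the reference group is represented by all zeros
        columns[category] = [1 if v == category else 0 for v in values]
    return columns


# Hypothetical job-rank column for six employees.
job_rank = ["Junior", "Middle", "Senior", "Middle", "Junior", "Senior"]

dummies = dummy_code(job_rank, ["Junior", "Middle", "Senior"], reference="Junior")
print(dummies["Middle"])  # [0, 1, 0, 1, 0, 0]
print(dummies["Senior"])  # [0, 0, 1, 0, 0, 1]
```

Adding all three indicators would make them perfectly collinear, since they always sum to 1, which is why one category is left out and interpreted as the baseline.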

Overall, the process is practical: add candidate controls in SmartPLS3, bootstrap to test significance, rerun the model without controls, and compare R-square and path significance to judge whether the control variable changes the relationships in a meaningful way. The tutorial’s examples show both a case where age behaves like a confounder and a case where job rank does not.
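The bootstrapping step itself happens inside SmartPLS, but the underlying idea can be sketched in a few lines: resample respondents with replacement, re-estimate the coefficient each time, and divide the original estimate by the standard deviation of the resampled estimates. The example below is a simplified stand-in that uses an ordinary least-squares slope on synthetic data rather than the PLS algorithm; the data, seeds, and function names are assumptions.

```python
import random
from statistics import mean, stdev


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def bootstrap_t(xs, ys, n_resamples=500, seed=0):
    """Original estimate divided by the bootstrap standard error."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        estimates.append(slope([xs[i] for i in idx], [ys[i] for i in idx]))
    original = slope(xs, ys)
    return original, original / stdev(estimates)


# Synthetic data: a control (age) with a real effect on the outcome.
rng = random.Random(42)
age = [rng.gauss(40, 10) for _ in range(200)]
outcome = [0.05 * a + rng.gauss(0, 1) for a in age]

estimate, t_value = bootstrap_t(age, outcome)
print(f"estimate = {estimate:.3f}, t = {t_value:.2f}")
```

An absolute t-value above roughly 1.96 corresponds to significance at the 5% level in a two-tailed test.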

Cornell Notes

SmartPLS3 control variables can be added as latent variables and then tested for confounding by comparing results with and without the controls. In the example, gender (male/female) is entered without dummy coding and shows no significant effect, while age (continuous) is significant when included. Removing controls slightly lowers OC’s R-square (0.495 to 0.482) and keeps age’s influence, indicating age affects the endogenous construct and acts as a confounder. For multi-category controls like job rank (Junior/Middle/Senior), dummy variables are created and only two are added, using Junior as the reference category. Bootstrapping then shows no significant effect from Middle or Senior, so job rank is not a confounding factor.

How does the walkthrough decide whether gender and age should be treated differently when adding them as control variables in SmartPLS3?

Gender is categorical with two options (male and female), so it’s added as a latent variable directly without creating dummy variables. Age is continuous, so it’s also added as a latent variable without dummy coding. After adding them, bootstrapping is used to check whether each control variable has significant effects on the endogenous constructs (OC and CC).

What comparison is used to judge whether age is a confounder?

The model is bootstrapped with controls (gender and age) and then bootstrapped again without any control variables. The walkthrough compares R-square values and the significance of paths. OC’s R-square drops from 0.495 (with controls) to 0.482 (without), and age remains significant, indicating age influences the endogenous variable rather than merely changing statistical noise—so it behaves like a confounding factor.

Why does the walkthrough emphasize that p-values can change even if beta values and relationships look similar?

When controls are added or removed, significance levels may shift due to changes in model estimation and explained variance (R-square). The walkthrough notes that while significance values can change, the substantive direction/impact (beta behavior) remains broadly similar. The practical conclusion is based on whether the control variable meaningfully affects the endogenous relationships, not only on whether p-values move slightly.

How are dummy variables created for a control variable with more than two categories (job rank)?

For job rank with three categories—Junior, Middle, Senior—the walkthrough creates dummy indicators for each level but adds only two to the model. Junior is the reference category. Middle is coded as 1 for Middle employees and 0 otherwise; Senior is coded as 1 for Senior employees and 0 otherwise. These two dummy variables are then added as control variables in SmartPLS and the model is bootstrapped.

What does it mean in this context when job rank dummies show no significant effects after bootstrapping?

If the paths from the job rank dummy variables (Middle and Senior, with Junior as reference) are insignificant for the endogenous constructs, job rank does not produce a confounding effect. The walkthrough concludes there’s no need to keep job rank as a control because the relationships and weights do not change significantly.

Review Questions

  1. When comparing models with and without control variables in SmartPLS3, which metrics does the walkthrough use to assess confounding (and why)?
  2. How would you code a categorical control variable with three categories so that one category becomes the reference group in SmartPLS3?
  3. In the age example, what evidence suggests age is more than a statistical artifact when controls are removed?

Key Points

  1. Add control variables in SmartPLS3 as latent variables, then run bootstrapping to test their significance.
  2. Treat two-category categorical controls (e.g., gender) without dummy variables when entering them as latent variables.
  3. Treat continuous controls (e.g., age) as continuous latent variables and compare results with and without them.
  4. Assess confounding by rerunning the model without controls and comparing R-square and path significance.
  5. Expect p-values to change when controls are added/removed, but judge confounding by whether substantive relationships meaningfully change.
  6. For multi-category categorical controls (e.g., job rank), create dummy variables and add only k−1 dummies, using one category as the reference group.
  7. If dummy-coded controls (excluding the reference) are insignificant, there’s little evidence they confound the endogenous relationships.

Highlights

Age shows a significant effect when included as a control, and it remains influential even when controls are removed—evidence of confounding influence.
OC’s R-square declines slightly when controls are removed (0.495 to 0.482), signaling that age contributes explanatory power.
For job rank (Junior/Middle/Senior), only two dummy variables are added, with Junior as the reference category.
Middle and Senior job rank dummies come out insignificant after bootstrapping, indicating no confounding effect from job rank.

Topics

  • Control Variables
  • SmartPLS3 Bootstrapping
  • Dummy Variables
  • Confounding
  • Categorical Controls

Mentioned

  • OC
  • CC