
Webinar Day 2: SmartPLS3 for Data Analysis - Basic and Advance Analysis (See Description)

Research With Fawad · 6 min read

Based on Research With Fawad's video on YouTube.

TL;DR

Map each questionnaire item to a SmartPLS indicator column (e.g., ER1–ER8) and keep row identifiers so data-entry errors can be traced back to the original questionnaire.

Briefing

SmartPLS3 is presented as a practical workflow for analyzing survey data with structural equation modeling—starting from how to enter and code questionnaire responses, then moving through measurement-model quality checks (reliability and validity), and finally testing structural relationships with bootstrapping. The core message is that credible SmartPLS results depend less on clicking “run” and more on systematically validating constructs (outer model) before interpreting paths (inner model), including careful handling of discriminant validity and higher-order constructs.

The session begins with the basics of survey data preparation: after distributing a questionnaire to hundreds of people and receiving completed responses, the data must be entered into analysis software. Excel and SPSS are offered as common options for data entry and anomaly checking, but SmartPLS3 is positioned as the main tool for the modeling stage. Variables are defined as columns, with each questionnaire item mapped to an indicator (e.g., ethical responsibilities coded as ER1–ER8). A practical recommendation is to label rows with questionnaire numbers so any data-entry problems can be traced back to the original respondent’s choices.
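The coding scheme described above can be sketched in a few lines (illustrative Python with pandas; SmartPLS itself just imports a plain CSV, and the values and questionnaire numbers here are made up):

```python
import pandas as pd

# Hypothetical coded responses: one row per returned questionnaire,
# one column per indicator (ER1-ER8 = "ethical responsibilities" items),
# with Likert answers entered as integers (e.g., 1-5).
df = pd.DataFrame({
    "questionnaire_no": [101, 102, 103],
    **{f"ER{i}": [4, 5, 3] for i in range(1, 9)},
})

# Keeping the questionnaire number lets you trace an impossible value
# (say, a 7 on a 1-5 scale) back to the paper form that produced it.
out_of_range = df[(df.filter(like="ER") > 5).any(axis=1)]

df.to_csv("responses.csv", index=False)  # SmartPLS reads plain CSV
```

The identifier column is excluded from the model in SmartPLS; it exists purely for auditing data entry.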

SmartPLS3 is then framed as software for structural equation modeling (SEM), combining ideas from factor analysis and regression. The instructor contrasts covariance-based SEM (e.g., Amos) with PLS-SEM, noting that PLS-SEM is often used when normality assumptions are not reliable and that it can produce stronger loadings in practice. SEM’s logic is split into two parts: the measurement model (outer model) evaluates how well indicators represent latent constructs, while the structural model (inner model) tests hypothesized relationships among latent constructs.

For the measurement model, the workflow emphasizes construct reliability and validity. Reliability is assessed using Cronbach’s alpha and composite reliability, with a commonly used threshold of 0.70. Factor loadings are treated as a key diagnostic: low-loading indicators may be removed, but not blindly—content validity must be protected, and the session warns against “deleting sprees” that chase statistical thresholds at the expense of construct meaning. Convergent validity is checked via Average Variance Extracted (AVE), with AVE expected to exceed 0.50. Discriminant validity is evaluated using multiple criteria: the Fornell–Larcker approach (square root of AVE should exceed inter-construct correlations), cross-loadings (items should load highest on their own construct), and HTMT (heterotrait–monotrait ratio), with conservative guidance such as HTMT < 0.85 (or ≤ 0.90).
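The reliability and convergent-validity statistics above are simple enough to compute by hand, which makes the thresholds less mysterious. A minimal sketch (illustrative Python with NumPy; the item scores and loadings are invented, not from the session):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x indicators matrix of raw scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def composite_reliability(loadings: np.ndarray) -> float:
    """CR from standardized outer loadings of one construct."""
    num = loadings.sum() ** 2
    return num / (num + (1 - loadings ** 2).sum())

def ave(loadings: np.ndarray) -> float:
    """Average Variance Extracted: mean squared loading."""
    return (loadings ** 2).mean()

items = np.array([[4, 5, 4, 5],
                  [3, 3, 4, 3],
                  [5, 5, 5, 4],
                  [2, 3, 2, 3],
                  [4, 4, 5, 4]])
loadings = np.array([0.82, 0.78, 0.75, 0.80])

alpha = cronbach_alpha(items)         # want >= 0.70
cr = composite_reliability(loadings)  # want >= 0.70
av = ave(loadings)                    # want > 0.50
```

Note how AVE is driven entirely by squared loadings, which is why deleting a weak item mechanically raises AVE; that is exactly the temptation the session warns against.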

After the outer model passes quality checks, the structural model is tested using bootstrapping. Path coefficients, t-statistics, and p-values determine whether relationships are significant, and R² indicates how much variance in dependent constructs is explained. Mediation is handled through bootstrapped indirect effects, then classified as no mediation, partial mediation, or complete mediation depending on whether direct and indirect effects remain significant. Moderation is tested by adding an interaction term (e.g., role ambiguity moderating the CC → organizational performance link) and interpreting the sign of the moderation coefficient, supported by slope analysis.
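The bootstrap logic behind those t-statistics can be sketched outside SmartPLS (illustrative Python with NumPy on simulated standardized scores; the data, effect size, and resample count are assumptions, not values from the session):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy latent-variable scores: x predicts y with a true path of 0.4
n = 200
x = rng.standard_normal(n)
y = 0.4 * x + rng.standard_normal(n) * 0.9

def path_coef(x, y):
    # With one standardized predictor, the path equals the correlation
    return np.corrcoef(x, y)[0, 1]

# Bootstrap: resample respondents with replacement, re-estimate the path
boots = np.array([
    path_coef(x[idx], y[idx])
    for idx in (rng.integers(0, n, n) for _ in range(2000))
])

estimate = path_coef(x, y)
t_stat = estimate / boots.std(ddof=1)  # t = coefficient / bootstrap SE
```

A |t| above roughly 1.96 corresponds to p < 0.05, which is the significance rule applied to SmartPLS bootstrapping output.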

Finally, the session addresses higher-order constructs and the reflective vs formative distinction. It explains why higher-order constructs often require special treatment (e.g., two-stage approaches) and why discriminant validity statistics may not apply directly to the higher-order layer. For reflective–reflective higher-order models, the session uses a two-stage approach by exporting latent variable scores and re-importing them as indicators. For reflective–formative cases, it highlights problems like near-zero explained variance under standard modeling and recommends disjoint two-stage approaches, then validates higher-order constructs using collinearity (VIF) and significance of outer weights, while again cautioning that indicator deletion can harm content validity.

Cornell Notes

SmartPLS3 analysis is built around a disciplined sequence: prepare and code survey data, validate the measurement model (outer model), then test the structural model (inner model). Construct reliability is checked with Cronbach’s alpha and composite reliability (typically ≥ 0.70), convergent validity with AVE (typically > 0.50), and discriminant validity using Fornell–Larcker, cross-loadings, and HTMT (often < 0.85, with ≤ 0.90 as a conservative threshold). After the outer model quality is established, bootstrapping provides path significance via t-statistics and p-values, while R² quantifies explained variance. Mediation is classified by comparing direct and indirect effects; moderation is tested by adding an interaction term and interpreting its sign with slope analysis. Higher-order constructs require extra care, often using two-stage or disjoint two-stage approaches depending on whether second-order constructs are reflective or formative.

Why does SmartPLS3 require a measurement-model (outer model) assessment before interpreting structural paths?

Because the outer model determines whether indicators actually measure the intended latent constructs. The session treats reliability and validity as gatekeepers: Cronbach’s alpha and composite reliability (threshold ~0.70) establish consistency; factor loadings diagnose whether items represent the construct; AVE (threshold >0.50) confirms convergent validity; and discriminant validity (Fornell–Larcker, cross-loadings, HTMT) ensures constructs are distinct. Only after these checks do bootstrapped results for path coefficients become meaningful.

What are the main thresholds and diagnostics for construct reliability and convergent validity?

Reliability is assessed using Cronbach’s alpha and composite reliability, with 0.70 used as a common cutoff. Convergent validity is assessed using AVE, with values above 0.50 indicating that the construct explains more variance in its indicators than error. Factor loadings are central to AVE: the session notes that while some items may fall below a loading cutoff, deletion should be guided by both statistical criteria and content validity (e.g., avoid removing items solely to inflate AVE).

How is discriminant validity established, and what do the three common methods check?

Fornell–Larcker compares the square root of AVE for each construct against its correlations with other constructs; the square root of AVE should be higher than inter-construct correlations. Cross-loadings require each indicator to load highest on its own construct rather than on competing constructs. HTMT (heterotrait–monotrait ratio) uses indicator correlations to quantify trait distinctiveness; the session uses conservative guidance such as HTMT < 0.85 (and also mentions ≤ 0.90). Using multiple methods helps catch different kinds of discriminant validity problems.

How does the session classify mediation in SmartPLS?

Mediation is evaluated through bootstrapped indirect effects (IV → mediator → DV). If the indirect effect is significant, mediation exists. Classification then depends on the direct effect (IV → DV) in the presence of the mediator: if direct and indirect effects are both significant, it’s partial mediation; if the direct effect becomes insignificant while the indirect effect remains significant, it’s complete mediation. If indirect effects are insignificant, mediation is not supported.
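That decision rule reduces to two booleans taken from the bootstrapped p-values. A sketch of the classification logic described above (illustrative Python; the function name is mine, not SmartPLS API):

```python
def classify_mediation(indirect_sig: bool, direct_sig: bool) -> str:
    """Classify mediation from bootstrapped significance results:
    indirect_sig: IV -> mediator -> DV indirect effect significant?
    direct_sig:   IV -> DV direct effect significant with mediator in?"""
    if not indirect_sig:
        return "no mediation"
    return "partial mediation" if direct_sig else "complete mediation"

classify_mediation(indirect_sig=True, direct_sig=False)  # complete mediation
```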

What does a significant moderation effect mean in this workflow, and how is it interpreted?

Moderation changes the strength or direction of the relationship between an independent variable and a dependent variable. The session builds moderation by adding an interaction term (e.g., role ambiguity moderating collaborative culture → organizational performance) using the product indicator approach for reflective constructs. A negative moderation coefficient indicates the moderator weakens the relationship. Slope analysis then visualizes how the DV changes at low vs high levels of the moderator (e.g., steeper slope at low role ambiguity implies a stronger positive effect of collaborative culture on performance).
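The interaction term and simple-slope logic can be sketched with a plain regression (illustrative Python with NumPy on simulated standardized scores; the variable names follow the session's example, but the data and coefficients are invented):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 300
cc = rng.standard_normal(n)   # collaborative culture (standardized)
ra = rng.standard_normal(n)   # role ambiguity (moderator)
# Built-in negative interaction: high role ambiguity weakens CC -> performance
perf = 0.5 * cc - 0.2 * ra - 0.3 * cc * ra + rng.standard_normal(n) * 0.5

# OLS with an interaction (product) term: perf ~ cc + ra + cc*ra
X = np.column_stack([np.ones(n), cc, ra, cc * ra])
b, *_ = np.linalg.lstsq(X, perf, rcond=None)

# Simple slopes of CC at low (-1 SD) and high (+1 SD) role ambiguity
slope_low = b[1] + b[3] * (-1)
slope_high = b[1] + b[3] * (+1)
```

Because the interaction coefficient is negative, the CC slope is steeper at low role ambiguity than at high role ambiguity, which is exactly the pattern a slope plot visualizes.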

Why do higher-order constructs often require two-stage or disjoint two-stage approaches?

Because standard reliability/validity outputs for the higher-order construct layer may not align with how the higher-order construct is actually formed from its subdimensions. For reflective–reflective higher-order models, the session uses a two-stage approach: export latent variable scores for the higher-order construct and re-import them so discriminant validity and convergent validity are computed using the correct indicator structure. For reflective–formative cases, it notes problems like near-zero explained variance under standard modeling and recommends disjoint two-stage approaches, then validates higher-order constructs using collinearity (VIF) and significance of outer weights while protecting content validity.
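The two-stage idea, plus the VIF check for a formative higher-order layer, can be sketched as follows (illustrative Python with NumPy; the dimension names ER/LR/PR and the use of indicator means as stand-ins for SmartPLS latent variable scores are assumptions for the sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 150
# Stage 1 (stand-in): first-order construct scores; here approximated
# as the mean of each dimension's four indicators.
dims = {name: rng.standard_normal((n, 4)) for name in ("ER", "LR", "PR")}
scores = {name: block.mean(axis=1) for name, block in dims.items()}

# Stage 2: re-import the scores as the *indicators* of the higher-order
# construct, so its outer model is estimated on the correct structure.
hoc_indicators = np.column_stack([scores["ER"], scores["LR"], scores["PR"]])

def vif(X: np.ndarray, j: int) -> float:
    """Variance inflation factor of column j regressed on the others."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    resid = X[:, j] - A @ coef
    r2 = 1 - resid.var() / X[:, j].var()
    return 1 / (1 - r2)

vifs = [vif(hoc_indicators, j) for j in range(3)]  # want low VIF (e.g., < 5)
```

Low VIFs indicate the subdimensions are not collinear, so each contributes distinct variance to the formative higher-order construct; significance of outer weights is then checked via bootstrapping as usual.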

Review Questions

  1. What specific quality checks must be satisfied in the outer model (reliability, convergent validity, discriminant validity) before bootstrapping results are interpreted?
  2. In SmartPLS mediation testing, what combination of direct and indirect effect significance corresponds to partial vs complete mediation?
  3. When HTMT is high for subdimensions of a higher-order construct, what troubleshooting steps does the session suggest (e.g., examining cross-loadings, loadings, and content validity implications)?

Key Points

  1. Map each questionnaire item to a SmartPLS indicator column (e.g., ER1–ER8) and keep row identifiers so data-entry errors can be traced back to the original questionnaire.

  2. Validate the outer model first: use Cronbach’s alpha and composite reliability (≥ 0.70), AVE (> 0.50), and discriminant validity checks (Fornell–Larcker, cross-loadings, HTMT).

  3. Treat factor loadings as evidence of indicator quality, but avoid removing items purely to meet thresholds; protect content validity and ensure each construct retains enough indicators.

  4. Use bootstrapping to test the inner model: interpret path coefficients with t-statistics and p-values, and report R² to quantify explained variance.

  5. Classify mediation by comparing indirect effects (significant vs not) and direct effects (significant vs not) when the mediator is included.

  6. Test moderation by adding an interaction term and interpreting the moderation coefficient sign; confirm interpretation with slope analysis at low vs high moderator values.

  7. For higher-order constructs, choose the correct modeling strategy (two-stage vs disjoint two-stage) based on whether second-order constructs are reflective or formative, and validate higher-order constructs using collinearity and outer-weight significance without unnecessary indicator deletion.

Highlights

SmartPLS results become defensible only after the outer model passes reliability and validity checks; structural path significance is not a substitute for measurement quality.
Discriminant validity is evaluated with three complementary lenses—Fornell–Larcker, cross-loadings, and HTMT—because each can reveal different distinctiveness failures.
Mediation classification hinges on whether the direct effect remains significant once the mediator is included: partial mediation keeps both direct and indirect effects significant; complete mediation removes the direct effect.
Moderation is interpreted through both statistics (significance and coefficient direction) and slope analysis, showing how the IV→DV relationship changes at low vs high moderator values.
Higher-order constructs often break standard reporting assumptions, so the session emphasizes two-stage transformations and careful validation rather than copying first-order procedures blindly.

Topics

Mentioned

  • PLS-SEM
  • SEM
  • EFA
  • CFA
  • SCM
  • AMOS
  • SPSS
  • AVE
  • HTMT
  • VIF
  • IV
  • DV
  • OCB
  • JB
  • R&D
  • p-value
  • t-statistics