My Data Collection is Over! How do I start the Data Analysis using #SmartPLS4?

Research With Fawad
6 min read

Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Import data into SmartPLS 4 using CSV/Excel (or SPSS) and treat data screening as a gatekeeper step by checking indicator min/max ranges and discrepancies before modeling.

Briefing

The session presents a practical SmartPLS 4 workflow for moving from raw survey data to a fully tested structural equation model, covering measurement quality (loadings, reliability, validity), higher-order constructs, mediation, and moderation in a single end-to-end process.

The session starts with model setup and data preparation. After collecting data, the first step is coding indicators consistently: each construct uses initials plus item numbers (e.g., Vision items V1–V5), with demographic variables like age, gender, and employment rank included alongside the questionnaire items. SmartPLS 4 accepts data in CSV/Excel or SPSS formats, and the workflow emphasizes data screening: checking minimum/maximum values and indicator statistics to catch discrepancies before analysis. Once imported, the project is organized around a workspace folder, and a new project is created with a PLS-SEM model type.
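The min/max screening step can be sketched in a few lines of Python run before import. The indicator names (V1–V3) and the 5-point Likert range below are illustrative assumptions, not values from the video:

```python
# Minimal data-screening sketch: report observed min/max per indicator and
# flag values outside the expected Likert range before importing into
# SmartPLS 4. Indicator names and the 1-5 range are illustrative.
EXPECTED_RANGE = (1, 5)

def screen_indicators(rows, expected=EXPECTED_RANGE):
    """Return {indicator: (min, max, n_out_of_range)}."""
    lo, hi = expected
    report = {}
    for col in rows[0]:
        values = [row[col] for row in rows]
        out_of_range = sum(1 for v in values if not lo <= v <= hi)
        report[col] = (min(values), max(values), out_of_range)
    return report

responses = [
    {"V1": 4, "V2": 5, "V3": 3},
    {"V1": 2, "V2": 7, "V3": 1},  # V2 = 7 is a data-entry error
    {"V1": 5, "V2": 4, "V3": 2},
]

report = screen_indicators(responses)
flagged = sorted(c for c, (_, _, out) in report.items() if out > 0)
print(report)
print("fix before modeling:", flagged)  # ['V2']
```

Any flagged indicator would be corrected in Excel/SPSS before the file is imported, mirroring the gatekeeper role the workflow assigns to screening.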

The core modeling work begins with the measurement model, which is treated as two layers: lower-order constructs first, then higher-order constructs. In the example, Internal Marketing and Internal Service Quality are higher-order constructs. Internal Marketing is modeled as reflective-formative at the higher level: Vision, Development, and Rewards are reflective subdimensions that combine to form the higher-order construct. Internal Service Quality is modeled as reflective-reflective: its subdimensions are reflective and interchangeable in the sense that removing one subdimension would not invalidate the higher-order construct.

On the SmartPLS canvas, constructs and indicators are dragged in, then connected to reflect the hypothesized relationships. The measurement model is run using the PLS-SEM algorithm, with reporting centered on outer loadings (indicator quality), construct reliability (Cronbach’s alpha and composite reliability, targeting >0.70), and construct validity. Convergent validity is checked via AVE (Average Variance Extracted), with a threshold of >0.50. Discriminant validity is assessed using HTMT and Fornell–Larcker logic: HTMT values should be below 0.90, and the square root of AVE for each construct should exceed its correlations with other constructs.
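As a sanity check on what SmartPLS reports, composite reliability and AVE can be computed by hand from standardized outer loadings (Cronbach's alpha additionally needs the raw item variances, so it is omitted here). The Vision loadings below are made-up illustrations, not the video's numbers:

```python
# Hand-computed composite reliability (rho_c) and AVE from standardized
# outer loadings, mirroring SmartPLS 4's construct-quality report.
# The Vision loadings (V1-V5) below are illustrative assumptions.
def composite_reliability(loadings):
    s = sum(loadings)
    error = sum(1 - l * l for l in loadings)  # indicator error variances
    return s * s / (s * s + error)

def average_variance_extracted(loadings):
    return sum(l * l for l in loadings) / len(loadings)

vision_loadings = [0.82, 0.79, 0.88, 0.75, 0.81]

cr = composite_reliability(vision_loadings)
ave = average_variance_extracted(vision_loadings)
print(f"CR  = {cr:.3f} (target > 0.70)")
print(f"AVE = {ave:.3f} (target > 0.50)")
```

With these loadings both thresholds pass, so the construct would be retained as-is; a weak loading would pull both CR and AVE down and flag the indicator for possible deletion.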

Higher-order constructs require extra validation steps. For reflective-reflective higher-order constructs, the same outer loading/reliability/validity checks apply. For reflective-formative higher-order constructs, the workflow shifts: collinearity is evaluated with VIF (acceptable when <5), then bootstrapping is used to test outer weights and outer loadings for the formative indicators (with significance and magnitude thresholds such as >0.50 for loadings). Latent variable scores are generated and re-imported so the higher-order constructs can be validated at the correct level.

After measurement quality is established, the structural model is assessed through bootstrapping. Direct effects are evaluated via path coefficients and one-tailed significance (p < 0.05), with attention to which relationships are significant (green) versus not (red). Mediation is handled by inspecting indirect effects; in the example, perceived organizational support is not a mediator while other paths through mediators are significant. Moderation is implemented by creating interaction effects between the predictor and moderators, then re-running bootstrapping to test whether interaction terms significantly affect the dependent variable. The session also demonstrates how to plot moderation using a Johnson-Neyman-style slope approach via the “Stats Tools Package” by James Gaskin.

Finally, the session ties results to reporting practice: measurement model tables (factor loadings, reliability, convergent and discriminant validity including HTMT and Fornell–Larcker), structural model metrics (VIF, R², Q² via PLS predict), and a results section organized as direct effects, mediation, then moderation. The takeaway is a repeatable checklist for producing publishable SmartPLS 4 outputs, including the special handling required for higher-order constructs and interaction effects.

Cornell Notes

The workflow shows how to analyze a complex PLS-SEM model in SmartPLS 4 from data import to hypothesis testing. It emphasizes measurement-model quality first: outer loadings, Cronbach’s alpha/composite reliability, convergent validity via AVE (>0.50), and discriminant validity using HTMT (<0.90) and Fornell–Larcker comparisons. Higher-order constructs are validated differently depending on type: reflective-reflective uses the standard reliability/validity checks, while reflective-formative requires VIF checks (<5) and bootstrapped outer weights/loadings. Once measurement is sound, bootstrapping is used for structural paths, indirect effects (mediation), and interaction terms (moderation). The result is a structured, report-ready sequence for R², Q², and hypothesis outcomes.
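The HTMT ratio used above can be written out directly: the mean heterotrait (between-construct) item correlation divided by the geometric mean of the two mean monotrait (within-construct) item correlations. The item correlations below are invented for illustration:

```python
# HTMT sketch for two constructs A and B with three items each.
# All item-pair correlations below are illustrative assumptions.
import math
from itertools import combinations

corr = {
    ("A1", "A2"): 0.70, ("A1", "A3"): 0.65, ("A2", "A3"): 0.68,  # within A
    ("B1", "B2"): 0.72, ("B1", "B3"): 0.66, ("B2", "B3"): 0.69,  # within B
    ("A1", "B1"): 0.40, ("A1", "B2"): 0.38, ("A1", "B3"): 0.35,
    ("A2", "B1"): 0.42, ("A2", "B2"): 0.36, ("A2", "B3"): 0.39,
    ("A3", "B1"): 0.41, ("A3", "B2"): 0.37, ("A3", "B3"): 0.34,
}

def mean(vals):
    return sum(vals) / len(vals)

def htmt(items_a, items_b):
    hetero = mean([corr[(a, b)] for a in items_a for b in items_b])
    mono_a = mean([corr[p] for p in combinations(items_a, 2)])
    mono_b = mean([corr[p] for p in combinations(items_b, 2)])
    return hetero / math.sqrt(mono_a * mono_b)

value = htmt(["A1", "A2", "A3"], ["B1", "B2", "B3"])
print(f"HTMT = {value:.3f} (discriminant validity if < 0.90)")
```

Here the between-construct correlations are clearly weaker than the within-construct ones, so HTMT lands well under the 0.90 cutoff.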

Why does the measurement model come before the structural model in SmartPLS 4, and what specific checks determine whether indicators and constructs are “good enough” to proceed?

The measurement model establishes that indicators reliably and validly represent latent constructs. The workflow runs PLS-SEM to inspect outer loadings (indicator quality), then checks construct reliability using Cronbach’s alpha and composite reliability (targeting >0.70). Convergent validity is assessed through AVE (Average Variance Extracted), with values >0.50 indicating that items converge on the construct. Discriminant validity is then tested using HTMT (values should be <0.90) and Fornell–Larcker logic (the square root of AVE for a construct should exceed its correlations with other constructs). Only after these pass does the analysis move to structural relationships and hypothesis tests.
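The Fornell–Larcker comparison described above is a simple pairwise check. The AVE values and construct correlations below are illustrative, not taken from the video:

```python
# Fornell-Larcker sketch: the square root of each construct's AVE must
# exceed its correlations with every other construct. AVE values and the
# construct correlation matrix are illustrative assumptions.
import math

ave = {"Vision": 0.66, "Development": 0.61, "Rewards": 0.58}
corr = {
    ("Vision", "Development"): 0.52,
    ("Vision", "Rewards"): 0.47,
    ("Development", "Rewards"): 0.55,
}

def fornell_larcker_ok(ave, corr):
    for (a, b), r in corr.items():
        if math.sqrt(ave[a]) <= r or math.sqrt(ave[b]) <= r:
            return False
    return True

print(fornell_larcker_ok(ave, corr))  # True -> constructs are distinct
```

A False result would mean some construct shares more variance with another construct than with its own items, signaling a discriminant validity problem.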

How does SmartPLS 4 validation differ between reflective-reflective and reflective-formative higher-order constructs?

Reflective-reflective higher-order constructs (e.g., Internal Service Quality in the example) are validated using the same measurement-model logic as standard reflective constructs: outer loadings, reliability (alpha/composite reliability), and validity (AVE and discriminant validity). Reflective-formative higher-order constructs (e.g., Internal Marketing) require additional steps because the higher-order construct is formed by subdimensions. The workflow checks collinearity among formative indicators using VIF (acceptable when <5). Then it uses bootstrapping to test outer weights (formative indicator contribution) and also checks outer loadings for significance and magnitude (loadings are expected to be >0.50 and significant). If weights/loadings are not adequate, the formative indicators may need revision or deletion.

What does “data screening” mean in this workflow, and where does it fit relative to model building?

Data screening is treated as a prerequisite to analysis, before running algorithms. The workflow stresses checking minimum and maximum values and ensuring indicators fall within expected ranges; discrepancies suggest problems that must be corrected in Excel/SPSS before importing or before trusting results. After importing, SmartPLS provides indicator statistics (means, medians, observed min/max) and indicator correlations, which can reveal issues. The model-building steps (creating the canvas, connecting paths, running PLS-SEM) come only after the dataset passes these checks.

How are mediation and moderation tested after bootstrapping in SmartPLS 4?

Mediation is assessed by examining indirect effects reported in the bootstrapping output. The workflow highlights that not all mediators necessarily carry the indirect effect: in the example, perceived organizational support does not mediate the relationship between the independent variable and organizational performance, while other mediators do. Moderation is implemented by creating interaction effects between the predictor and the moderator (SmartPLS generates the interaction term). Bootstrapping then tests whether the interaction path to the dependent variable is significant. In the example, both moderation terms (role conflict and role ambiguity) are insignificant in the initial run; the interpretive note is that a significant, negative role-ambiguity interaction would weaken the collaborative culture → organizational performance link.

What reporting structure does the session recommend for a thesis or paper using SmartPLS 4 outputs?

The recommended reporting sequence starts with the measurement model: factor loadings, reliability (Cronbach’s alpha and composite reliability), convergent validity (AVE), and discriminant validity (HTMT and/or Fornell–Larcker). Then comes the structural model analysis: report VIF values, R² for explained variance, and Q² for predictive relevance (via PLS predict). Finally, present hypothesis results in an organized order: direct effects first, then mediation (indirect effects), and moderation (interaction effects). The session also notes that results can be exported from SmartPLS into Excel/Word-friendly formats for clean tables.

Review Questions

  1. What thresholds are used for outer loadings, reliability (alpha/composite reliability), AVE, and HTMT in this workflow, and what do they imply if they fail?
  2. How do you validate a reflective-formative higher-order construct differently from a reflective-reflective one in SmartPLS 4?
  3. When interpreting moderation results, how does the sign of the interaction term affect the relationship between the predictor and dependent variable?

Key Points

  1. Import data into SmartPLS 4 using CSV/Excel (or SPSS) and treat data screening as a gatekeeper step by checking indicator min/max ranges and discrepancies before modeling.

  2. Build the measurement model first by adding lower-order constructs, connecting indicators to latent variables, and running the PLS-SEM algorithm to generate outer loadings and quality metrics.

  3. Use reliability and validity thresholds to decide whether to keep indicators: outer loadings should generally be strong, Cronbach’s alpha/composite reliability should exceed 0.70, and AVE should exceed 0.50.

  4. Assess discriminant validity with HTMT (<0.90) and Fornell–Larcker (square root of AVE greater than inter-construct correlations) to confirm constructs are distinct.

  5. Validate higher-order constructs at the correct level: reflective-reflective uses standard reflective checks, while reflective-formative requires VIF (<5) plus bootstrapped outer weights/loadings.

  6. Run bootstrapping for the structural model to test direct effects, then inspect indirect effects for mediation and interaction paths for moderation.

  7. Report results in a consistent order: measurement model quality first, then structural metrics (VIF, R², Q²), followed by direct effects, mediation, and moderation outcomes.

Highlights

SmartPLS 4 analysis is organized as a checklist: measurement-model quality (loadings, reliability, AVE, HTMT/Fornell–Larcker) must be established before interpreting structural paths.
Higher-order constructs change the validation rules: reflective-reflective higher-order constructs follow standard reflective assessment, while reflective-formative higher-order constructs require VIF checks and bootstrapped outer weights.
Mediation and moderation are both handled through bootstrapping outputs—indirect effects for mediation and interaction terms for moderation—so significance is read directly from the bootstrapped p-values.
The session ties statistical outputs to thesis-ready reporting: export tables from SmartPLS, then structure the results section as measurement model → structural model → hypothesis results (direct, mediation, moderation).

Topics

Mentioned

  • SmartPLS
  • James Gaskin
  • PLS-SEM
  • PLS
  • CSV
  • SPSS
  • AVE
  • HTMT
  • VIF