My Data Collection is Over! How do I start the Data Analysis using #SmartPLS4?
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
The session presents a practical SmartPLS 4 workflow for moving from raw survey data to a fully tested structural equation model in a single end-to-end process, covering measurement quality (loadings, reliability, validity), higher-order constructs, mediation, and moderation.
The session starts with model setup and data preparation. After collecting data, the first step is coding indicators consistently: each construct uses initials plus item numbers (e.g., Vision items V1–V5), with demographic variables like age, gender, and employment rank included alongside the questionnaire items. SmartPLS 4 accepts data in CSV/Excel or SPSS formats, and the workflow emphasizes data screening: checking minimum/maximum values and indicator statistics to catch discrepancies before analysis. Once imported, the project is organized around a workspace folder, and a new project is created with a PLS-SEM model type.
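The min/max screening step described above can be sketched as a short script. This is a minimal illustration, not part of the video: the column names (V1–V3) and the 1–5 Likert range are assumptions standing in for the actual questionnaire.

```python
# Minimal data-screening sketch (assumed item names V1..V3 and a 1-5 Likert scale).
import pandas as pd

def screen_indicators(df: pd.DataFrame, low: int = 1, high: int = 5) -> pd.DataFrame:
    """Return per-indicator min/max plus a flag for out-of-range values."""
    stats = df.agg(["min", "max"]).T
    stats["out_of_range"] = (stats["min"] < low) | (stats["max"] > high)
    return stats

# Example: V3 contains an impossible value of 55 (a typical data-entry slip).
data = pd.DataFrame({
    "V1": [4, 5, 3, 2],
    "V2": [1, 2, 2, 3],
    "V3": [5, 55, 4, 3],  # 55 is outside the 1-5 range and should be caught
})
flags = screen_indicators(data)
print(flags[flags["out_of_range"]].index.tolist())  # -> ['V3']
```

Running this before import into SmartPLS catches discrepancies at the "gatekeeper" stage, exactly where the workflow places data screening.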
The core modeling work begins with the measurement model, which is treated as two layers: lower-order constructs first, then higher-order constructs. In the example, Internal Marketing and Internal Service Quality are higher-order constructs. Internal Marketing is modeled as reflective-formative at the higher level: Vision, Development, and Rewards are reflective subdimensions that combine to form the higher-order construct. Internal Service Quality is modeled as reflective-reflective: its subdimensions are reflective and interchangeable in the sense that removing one subdimension would not invalidate the higher-order construct.
On the SmartPLS canvas, constructs and indicators are dragged in, then connected to reflect the hypothesized relationships. The measurement model is run using the PLS-SEM algorithm, with reporting centered on outer loadings (indicator quality), construct reliability (Cronbach’s alpha and composite reliability, targeting >0.70), and construct validity. Convergent validity is checked via AVE (Average Variance Extracted), with a threshold of >0.50. Discriminant validity is assessed using HTMT and Fornell–Larcker logic: HTMT values should be below 0.90, and the square root of AVE for each construct should exceed its correlations with other constructs.
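The reliability and convergent-validity thresholds above follow from standard formulas, which can be illustrated directly; the loading values and latent correlations below are invented for illustration, not taken from the session.

```python
# Hedged sketch: AVE and composite reliability from standardized outer loadings.
import numpy as np

def ave(loadings):
    """Average Variance Extracted: mean of squared standardized loadings."""
    l = np.asarray(loadings)
    return float(np.mean(l ** 2))

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    l = np.asarray(loadings)
    num = l.sum() ** 2
    return float(num / (num + np.sum(1 - l ** 2)))

vision = [0.82, 0.78, 0.85, 0.74]       # hypothetical outer loadings for one construct
print(round(ave(vision), 3))             # > 0.50 -> convergent validity holds
print(round(composite_reliability(vision), 3))  # > 0.70 -> construct is reliable

# Fornell-Larcker logic: sqrt(AVE) must exceed the construct's correlations with others.
corr_with_others = [0.61, 0.55]          # hypothetical latent correlations
print(np.sqrt(ave(vision)) > max(corr_with_others))  # True -> discriminant validity
```

SmartPLS reports these quantities directly; the sketch only makes explicit what the >0.50 (AVE) and >0.70 (reliability) cutoffs are being compared against.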
Higher-order constructs require extra validation steps. For reflective-reflective higher-order constructs, the same outer loading/reliability/validity checks apply. For reflective-formative higher-order constructs, the workflow shifts: collinearity is evaluated with VIF (acceptable when <5), then bootstrapping is used to test outer weights and outer loadings for the formative indicators (with significance and magnitude thresholds such as >0.50 for loadings). Latent variable scores are generated and re-imported so the higher-order constructs can be validated at the correct level.
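The VIF check for formative indicators can be made concrete with its definition: regress each indicator on the remaining indicators and compute 1 / (1 − R²). The data below are simulated, not from the example model.

```python
# Hedged sketch of the collinearity (VIF) check for formative indicators.
import numpy as np

def vif(X: np.ndarray) -> list:
    """Variance inflation factor for each column of X (n_samples x n_indicators)."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # design matrix with intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return out

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 2] += 0.4 * X[:, 0]                            # introduce mild correlation
print([round(v, 2) for v in vif(X)])                # all < 5 -> collinearity acceptable
```

Values above 5 would signal that a formative indicator is redundant with the others, which is why the workflow checks VIF before interpreting bootstrapped outer weights.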
After measurement quality is established, the structural model is assessed through bootstrapping. Direct effects are evaluated via path coefficients and one-tailed significance (p < 0.05), with attention to which relationships are significant (green) versus not (red). Mediation is handled by inspecting indirect effects; in the example, perceived organizational support is not a mediator while other paths through mediators are significant. Moderation is implemented by creating interaction effects between the predictor and moderators, then re-running bootstrapping to test whether interaction terms significantly affect the dependent variable. The session also demonstrates how to plot moderation using a Johnson-Neyman-style slope approach via the “Stats Tool Package” by James Gaskin.
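The interaction-term logic behind the moderation test can be sketched outside SmartPLS: mean-center the predictor and moderator, form their product, fit a linear model, and read off simple slopes at low/high moderator levels. All variables here are simulated and the names are illustrative, not from the video.

```python
# Sketch of a product-indicator moderation test with simple slopes (simulated data).
import numpy as np

rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)                  # predictor
m = rng.normal(size=n)                  # moderator
y = 0.5 * x + 0.3 * m + 0.4 * x * m + rng.normal(scale=0.5, size=n)

xc, mc = x - x.mean(), m - m.mean()     # mean-center before forming the product term
A = np.column_stack([np.ones(n), xc, mc, xc * mc])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b1, b2, b3 = beta                   # b3 is the interaction coefficient

# Simple slopes of x on y at -1 SD and +1 SD of the moderator:
for level in (-m.std(), m.std()):
    print(f"slope at moderator {level:+.2f}: {b1 + b3 * level:.2f}")
```

A positive interaction coefficient (b3) means the predictor's slope strengthens as the moderator rises, which is what the slope plot in the session visualizes.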
Finally, the session ties results to reporting practice: measurement model tables (factor loadings, reliability, convergent and discriminant validity including HTMT and Fornell–Larcker), structural model metrics (VIF, R², Q² via PLS predict), and a results section organized as direct effects, mediation, then moderation. The takeaway is a repeatable checklist for producing publishable SmartPLS 4 outputs, including the special handling required for higher-order constructs and interaction effects.
Cornell Notes
The workflow shows how to analyze a complex PLS-SEM model in SmartPLS 4 from data import to hypothesis testing. It emphasizes measurement-model quality first: outer loadings, Cronbach’s alpha/composite reliability, convergent validity via AVE (>0.50), and discriminant validity using HTMT (<0.90) and Fornell–Larcker comparisons. Higher-order constructs are validated differently depending on type: reflective-reflective uses the standard reliability/validity checks, while reflective-formative requires VIF checks (<5) and bootstrapped outer weights/loadings. Once measurement is sound, bootstrapping is used for structural paths, indirect effects (mediation), and interaction terms (moderation). The result is a structured, report-ready sequence for R², Q², and hypothesis outcomes.
Why does the measurement model come before the structural model in SmartPLS 4, and what specific checks determine whether indicators and constructs are “good enough” to proceed?
How does SmartPLS 4 validation differ between reflective-reflective and reflective-formative higher-order constructs?
What does “data screening” mean in this workflow, and where does it fit relative to model building?
How are mediation and moderation tested after bootstrapping in SmartPLS 4?
What reporting structure does the session recommend for a thesis or paper using SmartPLS 4 outputs?
Review Questions
- What thresholds are used for outer loadings, reliability (alpha/composite reliability), AVE, and HTMT in this workflow, and what do they imply if they fail?
- How do you validate a reflective-formative higher-order construct differently from a reflective-reflective one in SmartPLS 4?
- When interpreting moderation results, how does the sign of the interaction term affect the relationship between the predictor and dependent variable?
Key Points
1. Import data into SmartPLS 4 using CSV/Excel (or SPSS) and treat data screening as a gatekeeper step by checking indicator min/max ranges and discrepancies before modeling.
2. Build the measurement model first by adding lower-order constructs, connecting indicators to latent variables, and running the PLS-SEM algorithm to generate outer loadings and quality metrics.
3. Use reliability and validity thresholds to decide whether to keep indicators: outer loadings should generally be strong, Cronbach’s alpha/composite reliability should exceed 0.70, and AVE should exceed 0.50.
4. Assess discriminant validity with HTMT (<0.90) and Fornell–Larcker (square root of AVE greater than inter-construct correlations) to confirm constructs are distinct.
5. Validate higher-order constructs at the correct level: reflective-reflective uses standard reflective checks, while reflective-formative requires VIF (<5) plus bootstrapped outer weights/loadings.
6. Run bootstrapping for the structural model to test direct effects, then inspect indirect effects for mediation and interaction paths for moderation.
7. Report results in a consistent order: measurement model quality first, then structural metrics (VIF, R², Q²), followed by direct effects, mediation, and moderation outcomes.
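The HTMT criterion above can also be sketched numerically: HTMT is the mean between-block (heterotrait) correlation divided by the geometric mean of the within-block (monotrait) correlations. The two indicator blocks below are simulated from distinct factors, so the ratio should fall well under 0.90.

```python
# Hedged sketch of HTMT for two constructs measured by item blocks (simulated data).
import numpy as np

def htmt(items_a: np.ndarray, items_b: np.ndarray) -> float:
    """Heterotrait-monotrait ratio of correlations for two indicator blocks."""
    def mean_offdiag(c):
        iu = np.triu_indices_from(c, k=1)
        return c[iu].mean()
    ka = items_a.shape[1]
    full = np.corrcoef(np.column_stack([items_a, items_b]), rowvar=False)
    hetero = full[:ka, ka:].mean()              # correlations across the two blocks
    mono_a = mean_offdiag(full[:ka, :ka])       # correlations within block A
    mono_b = mean_offdiag(full[ka:, ka:])       # correlations within block B
    return hetero / np.sqrt(mono_a * mono_b)

rng = np.random.default_rng(2)
f1, f2 = rng.normal(size=(2, 500))              # two independent latent factors
a = f1[:, None] * 0.8 + rng.normal(scale=0.6, size=(500, 3))
b = f2[:, None] * 0.8 + rng.normal(scale=0.6, size=(500, 3))
print(htmt(a, b) < 0.90)                        # distinct factors -> below the cutoff
```

An HTMT at or above 0.90 would instead suggest the two constructs are empirically indistinguishable, which is the failure mode the checklist guards against.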