I have finished my research data collection! How do I start the data analysis using SmartPLS?
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.
Briefing
SmartPLS becomes usable for new researchers once the workflow is treated as a sequence: build the model on the canvas, run a measurement-model check for reliability and validity, then use bootstrapping to test the structural relationships (including mediation and moderation). The practical payoff is that researchers can move from “data collected” to “results that can be reported” without getting stuck on where to begin.
After collecting questionnaire data for multiple latent constructs (ethical responsibility, research and development responsibilities, philanthropic responsibilities, perceived quality, student satisfaction, student trust, student loyalty, university reputation, and university performance), the first operational step is organizing the dataset so SmartPLS can read it cleanly. Demographic variables go at the top of the spreadsheet (e.g., age, gender, education/program, university), and each latent construct is represented by its item indicators using consistent variable names with no spaces (underscore is recommended). If questionnaires are entered from paper, each respondent form should be numbered so errors can be traced back to a specific row.
Once the data are in Excel, the file is exported to CSV because SmartPLS imports comma-separated values. Inside SmartPLS, a new project is created with a meaningful name, then the CSV is imported. The software then provides descriptive statistics and flags missing values; SmartPLS is also presented as robust to normality concerns, so researchers are not expected to run normality checks the way they might in covariance-based approaches.
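The preparation rules above (underscored indicator names, numbered respondents, CSV export) can be sketched in a few lines of Python; the column names and output file name here are hypothetical examples, not names SmartPLS requires:

```python
import csv

# Hypothetical raw headers as they might come out of a survey tool or Excel sheet.
raw_headers = ["Respondent No", "Student Satisfaction 1",
               "Student Satisfaction 2", "Student Trust 1"]
rows = [
    [1, 4, 5, 4],
    [2, 3, 4, 5],
]

# Replace spaces with underscores so SmartPLS reads indicator names cleanly.
headers = [h.replace(" ", "_") for h in raw_headers]

# Export as comma-separated values for the SmartPLS import step.
with open("smartpls_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(rows)
```

The first column doubles as the respondent number, so any data-entry error can be traced back to a specific paper form.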
Model building starts with selecting the indicators for each construct and dragging them onto the canvas to create latent variables. Relationships are added by connecting the independent latent variable(s) to the dependent latent variable, which turns the model from “red” (not runnable) to “blue” (connected). A key conceptual distinction is that SmartPLS handles both the measurement model (outer model) and the structural model (inner model) within the same setup: the outer model verifies that items measure their constructs well, while the inner model tests how constructs relate.
For the measurement model, the workflow runs the PLS algorithm and then checks construct reliability and validity. Reliability is assessed using Cronbach’s Alpha and composite reliability, with the common reporting thresholds described as >0.70 and composite reliability emphasized as the more current standard. Convergent validity is evaluated using Average Variance Extracted (AVE), with AVE expected to exceed 0.50; the logic is that items should “converge” to represent their latent construct, reflected in factor loadings. Discriminant validity is checked using multiple criteria: Fornell–Larcker (square root of AVE should exceed inter-construct correlations), HTMT (reported as needing to be below about 0.85), and cross-loadings (each item should load higher on its own construct than on others). If discriminant validity problems appear, the guidance is to consider removing problematic indicators, but not so aggressively that the construct loses measurement coverage.
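SmartPLS computes these statistics for you, but the underlying formulas are simple enough to sketch. A minimal Python illustration of the reliability and convergent-validity measures, using hypothetical standardized loadings for one construct:

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of respondent scores per indicator (columns of one construct)."""
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]  # each respondent's summed score
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

def composite_reliability(loadings):
    """rho_c = (sum(lambda))^2 / ((sum(lambda))^2 + sum(1 - lambda^2))."""
    s = sum(loadings) ** 2
    e = sum(1 - l ** 2 for l in loadings)
    return s / (s + e)

def ave(loadings):
    """Average Variance Extracted: mean of the squared loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical standardized loadings for a construct's three indicators.
lam = [0.82, 0.78, 0.85]
print(round(composite_reliability(lam), 3))  # 0.858 -> above the 0.70 threshold
print(round(ave(lam), 3))                    # 0.668 -> above the 0.50 threshold
```

The thresholds map directly onto the checks: reliability passes when alpha and composite reliability exceed 0.70, and convergent validity passes when AVE exceeds 0.50.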
After the measurement model passes, bootstrapping is used to test structural paths. The session emphasizes selecting “path” bootstrapping and interpreting output via t-statistics and p-values (with significance tied to p<0.05 and t-values compared to the ~1.96 benchmark). The worked example includes an R² interpretation: the model’s R² for organizational performance is used to quantify how much variance is explained by its predictors.
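The bootstrapping logic itself is generic resampling, not something specific to SmartPLS. As a sketch under simplifying assumptions, the snippet below uses an OLS slope as a stand-in for a single path coefficient, resamples respondents with replacement, and forms a t-statistic as the original estimate divided by the bootstrap standard error (the data are hypothetical):

```python
import random
from statistics import mean, stdev

def slope(xs, ys):
    """OLS slope, standing in for one structural path coefficient."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def bootstrap_t(xs, ys, n_boot=5000, seed=1):
    """Resample respondents with replacement; t = estimate / bootstrap SE."""
    rng = random.Random(seed)
    est = slope(xs, ys)
    n = len(xs)
    boots = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx, by = [xs[i] for i in idx], [ys[i] for i in idx]
        if len(set(bx)) > 1:  # skip degenerate resamples with no x-variance
            boots.append(slope(bx, by))
    return est, est / stdev(boots)

# Hypothetical data with a strong predictor-outcome relationship.
xs = list(range(20))
ys = [2 * x + 0.5 * (-1) ** x for x in xs]
est, t = bootstrap_t(xs, ys, n_boot=1000)
print(t > 1.96)  # True: this path would be reported as significant at p < 0.05
```

Comparing |t| to ~1.96 is the two-tailed 5% benchmark the session uses when reading SmartPLS bootstrapping output.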
Finally, the same bootstrapping logic extends to mediation and moderation. Mediation is tested by examining total, direct, and specific indirect effects; partial mediation occurs when both direct and indirect effects are significant, while complete mediation occurs when the direct effect is not significant but the indirect effect is. Moderation is tested by adding a moderating effect (e.g., role ambiguity) using the product indicator approach, then bootstrapping the relevant path; the example concludes the moderation effect is not significant. The overall message is that SmartPLS analysis becomes straightforward when reliability/validity checks come first, and bootstrapping is used to confirm which relationships are statistically supported.
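The mediation decision rule can be written as a tiny classifier over the bootstrapped significance flags. The path values below are hypothetical; the specific indirect effect is the product of its two constituent paths:

```python
def classify_mediation(direct_sig, indirect_sig):
    """Decision rule from bootstrapped direct and specific indirect effects."""
    if indirect_sig and direct_sig:
        return "partial mediation"
    if indirect_sig and not direct_sig:
        return "complete mediation"
    if direct_sig:
        return "direct effect only (no mediation)"
    return "no effect"

# Hypothetical standardized paths: X -> M (a), M -> Y (b), X -> Y direct (c').
a, b, c_prime = 0.40, 0.35, 0.20
indirect = a * b            # specific indirect effect: 0.14
total = c_prime + indirect  # total effect: 0.34
print(classify_mediation(direct_sig=True, indirect_sig=True))   # partial mediation
print(classify_mediation(direct_sig=False, indirect_sig=True))  # complete mediation
```

This mirrors reading the SmartPLS output tables: check the specific indirect effect first, then let the direct effect's significance decide between partial and complete mediation.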
Cornell Notes
SmartPLS analysis is presented as a step-by-step pipeline: (1) prepare questionnaire data so each latent construct has correctly named indicators, (2) import the dataset as CSV into a new SmartPLS project, (3) build the measurement model on the canvas and connect constructs, then (4) run the PLS algorithm and check reliability/validity before testing relationships. Reliability is assessed with Cronbach’s Alpha and composite reliability (typically >0.70). Convergent validity uses AVE (typically >0.50), while discriminant validity is evaluated with Fornell–Larcker, HTMT (often <0.85), and cross-loadings. After the measurement model passes, bootstrapping tests path significance (p<0.05) and supports mediation (direct vs indirect effects) and moderation (product indicator approach).
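The two discriminant-validity decision rules summarized above can also be sketched directly. Both functions below are minimal illustrations with hypothetical constructs and correlation values, not SmartPLS internals:

```python
import math
from itertools import combinations

def fornell_larcker_ok(ave_by_construct, corr):
    """Pass if each construct's sqrt(AVE) exceeds its correlation with every other construct."""
    names = list(ave_by_construct)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = abs(corr[(a, b)])
            if r >= math.sqrt(ave_by_construct[a]) or r >= math.sqrt(ave_by_construct[b]):
                return False
    return True

def htmt(item_corr, items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs; flag discriminant problems if >= ~0.85."""
    def r(x, y):
        return item_corr.get((x, y), item_corr.get((y, x)))
    hetero = [r(a, b) for a in items_a for b in items_b]
    mono_a = [r(x, y) for x, y in combinations(items_a, 2)]
    mono_b = [r(x, y) for x, y in combinations(items_b, 2)]
    return (sum(hetero) / len(hetero)) / math.sqrt(
        (sum(mono_a) / len(mono_a)) * (sum(mono_b) / len(mono_b)))

# Hypothetical numbers: satisfaction (SAT) vs trust (TRU).
print(fornell_larcker_ok({"SAT": 0.62, "TRU": 0.58}, {("SAT", "TRU"): 0.55}))  # True
corrs = {("S1", "S2"): 0.7, ("T1", "T2"): 0.7,
         ("S1", "T1"): 0.4, ("S1", "T2"): 0.4, ("S2", "T1"): 0.4, ("S2", "T2"): 0.4}
print(round(htmt(corrs, ["S1", "S2"], ["T1", "T2"]), 3))  # 0.571, below the 0.85 cutoff
```

In words: Fornell–Larcker compares each construct's sqrt(AVE) against its correlations with other constructs, while HTMT compares between-construct item correlations to within-construct ones.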
What data-preparation rules make SmartPLS imports and model building smoother?
Why does SmartPLS require a measurement-model check before testing relationships?
How are reliability and convergent validity evaluated in SmartPLS?
What methods establish discriminant validity, and what do the decision rules look like?
How does bootstrapping determine whether structural paths are significant?
How are mediation and moderation tested in SmartPLS?
Review Questions
- What are the typical threshold values for Cronbach’s Alpha, composite reliability, and AVE, and how do they map to reliability vs convergent validity?
- In SmartPLS, what does it mean if discriminant validity fails under HTMT or cross-loadings, and what practical action is suggested?
- How do you distinguish partial mediation from complete mediation using direct and indirect effects in SmartPLS bootstrapping output?
Key Points
1. Prepare questionnaire data so each latent construct has correctly labeled indicator columns (no spaces; use underscores) and export as CSV for SmartPLS import.
2. Build the model by placing indicators onto the canvas as latent variables, renaming constructs, and connecting independent constructs to the dependent construct before running anything.
3. Run the PLS algorithm and evaluate the measurement model first: check Cronbach’s Alpha and composite reliability for reliability, then AVE for convergent validity.
4. Establish discriminant validity using Fornell–Larcker, HTMT (commonly <0.85), and cross-loadings, ensuring items load more strongly on their own construct than on others.
5. Use bootstrapping (path) to test structural relationships, interpreting significance with p-values (p<0.05) and t-statistics (around 1.96 as a benchmark).
6. Interpret explanatory power through R², which quantifies how much variance in the dependent construct is explained by its predictors.
7. Test mediation via total/direct/indirect effects (partial vs complete based on direct-effect significance), and test moderation by adding a moderating effect with the product indicator approach, then bootstrapping the moderating path.