I have finished my research data collection! How do I start the data analysis using SmartPLS?
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.
Briefing
SmartPLS becomes usable for new researchers once the workflow is treated as a sequence: build the model on the canvas, run a measurement-model check for reliability and validity, then use bootstrapping to test the structural relationships (including mediation and moderation). The practical payoff is that researchers can move from “data collected” to “results that can be reported” without getting stuck on where to begin.
After collecting questionnaire data for multiple latent constructs (ethical responsibility, research and development responsibilities, philanthropic responsibilities, perceived quality, student satisfaction, student trust, student loyalty, university reputation, and university performance), the first operational step is organizing the dataset so SmartPLS can read it cleanly. Demographic variables go at the top of the spreadsheet (e.g., age, gender, education/program, university), and each latent construct is represented by its item indicators using consistent variable names with no spaces (underscore is recommended). If questionnaires are entered from paper, each respondent form should be numbered so errors can be traced back to a specific row.
Once the data are in Excel, the file is exported to CSV because SmartPLS imports comma-separated values. Inside SmartPLS, a new project is created with a meaningful name, then the CSV is imported. The software then provides descriptive statistics and flags missing values; SmartPLS is also presented as robust to normality concerns, so researchers are not expected to run normality checks the way they might in covariance-based approaches.
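The preparation rules above (underscored indicator names, numbered respondents, CSV export) can be sketched in a few lines of Python; the column names and output file name here are hypothetical examples, not names SmartPLS requires:

```python
import csv

# Hypothetical raw headers as they might come out of a survey tool or Excel sheet.
raw_headers = ["Respondent No", "Student Satisfaction 1",
               "Student Satisfaction 2", "Student Trust 1"]
rows = [
    [1, 4, 5, 4],
    [2, 3, 4, 5],
]

# Replace spaces with underscores so SmartPLS reads indicator names cleanly.
headers = [h.replace(" ", "_") for h in raw_headers]

# Export as comma-separated values for the SmartPLS import step.
with open("smartpls_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(rows)
```

The first column doubles as the respondent number, so any data-entry error can be traced back to a specific paper form.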
Model building starts with selecting the indicators for each construct and dragging them onto the canvas to create latent variables. Relationships are added by connecting the independent latent variable(s) to the dependent latent variable, which turns the model from “red” (not runnable) to “blue” (connected). A key conceptual distinction is that SmartPLS handles both the measurement model (outer model) and the structural model (inner model) within the same setup: the outer model verifies that items measure their constructs well, while the inner model tests how constructs relate.
For the measurement model, the workflow runs the PLS algorithm and then checks construct reliability and validity. Reliability is assessed using Cronbach’s Alpha and composite reliability, with the common reporting thresholds described as >0.70 and composite reliability emphasized as the more current standard. Convergent validity is evaluated using Average Variance Extracted (AVE), with AVE expected to exceed 0.50; the logic is that items should “converge” to represent their latent construct, reflected in factor loadings. Discriminant validity is checked using multiple criteria: Fornell–Larcker (square root of AVE should exceed inter-construct correlations), HTMT (reported as needing to be below about 0.85), and cross-loadings (each item should load higher on its own construct than on others). If discriminant validity problems appear, the guidance is to consider removing problematic indicators, but not so aggressively that the construct loses measurement coverage.
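SmartPLS computes these statistics for you, but the underlying formulas are simple enough to sketch. A minimal Python illustration of the reliability and convergent-validity measures, using hypothetical standardized loadings for one construct:

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of respondent scores per indicator (columns of one construct)."""
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]  # each respondent's summed score
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

def composite_reliability(loadings):
    """rho_c = (sum(lambda))^2 / ((sum(lambda))^2 + sum(1 - lambda^2))."""
    s = sum(loadings) ** 2
    e = sum(1 - l ** 2 for l in loadings)
    return s / (s + e)

def ave(loadings):
    """Average Variance Extracted: mean of the squared loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical standardized loadings for a construct's three indicators.
lam = [0.82, 0.78, 0.85]
print(round(composite_reliability(lam), 3))  # 0.858 -> above the 0.70 threshold
print(round(ave(lam), 3))                    # 0.668 -> above the 0.50 threshold
```

The thresholds map directly onto the checks: reliability passes when alpha and composite reliability exceed 0.70, and convergent validity passes when AVE exceeds 0.50.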
After the measurement model passes, bootstrapping is used to test structural paths. The session emphasizes selecting “path” bootstrapping and interpreting output via t-statistics and p-values (with significance tied to p<0.05 and t-values compared to the ~1.96 benchmark). The worked example includes an R² interpretation: the model’s R² for organizational performance is used to quantify how much variance is explained by its predictors.
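The bootstrapping logic itself is generic resampling, not something specific to SmartPLS. As a sketch under simplifying assumptions, the snippet below uses an OLS slope as a stand-in for a single path coefficient, resamples respondents with replacement, and forms a t-statistic as the original estimate divided by the bootstrap standard error (the data are hypothetical):

```python
import random
from statistics import mean, stdev

def slope(xs, ys):
    """OLS slope, standing in for one structural path coefficient."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def bootstrap_t(xs, ys, n_boot=5000, seed=1):
    """Resample respondents with replacement; t = estimate / bootstrap SE."""
    rng = random.Random(seed)
    est = slope(xs, ys)
    n = len(xs)
    boots = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx, by = [xs[i] for i in idx], [ys[i] for i in idx]
        if len(set(bx)) > 1:  # skip degenerate resamples with no x-variance
            boots.append(slope(bx, by))
    return est, est / stdev(boots)

# Hypothetical data with a strong predictor-outcome relationship.
xs = list(range(20))
ys = [2 * x + 0.5 * (-1) ** x for x in xs]
est, t = bootstrap_t(xs, ys, n_boot=1000)
print(t > 1.96)  # True: this path would be reported as significant at p < 0.05
```

Comparing |t| to ~1.96 is the two-tailed 5% benchmark the session uses when reading SmartPLS bootstrapping output.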
Finally, the same bootstrapping logic extends to mediation and moderation. Mediation is tested by examining total, direct, and specific indirect effects; partial mediation occurs when both direct and indirect effects are significant, while complete mediation occurs when the direct effect is not significant but the indirect effect is. Moderation is tested by adding a moderating effect (e.g., role ambiguity) using the product indicator approach, then bootstrapping the relevant path; the example concludes the moderation effect is not significant. The overall message is that SmartPLS analysis becomes straightforward when reliability/validity checks come first, and bootstrapping is used to confirm which relationships are statistically supported.
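The mediation decision rule can be written as a tiny classifier over the bootstrapped significance flags. The path values below are hypothetical; the specific indirect effect is the product of its two constituent paths:

```python
def classify_mediation(direct_sig, indirect_sig):
    """Decision rule from bootstrapped direct and specific indirect effects."""
    if indirect_sig and direct_sig:
        return "partial mediation"
    if indirect_sig and not direct_sig:
        return "complete mediation"
    if direct_sig:
        return "direct effect only (no mediation)"
    return "no effect"

# Hypothetical standardized paths: X -> M (a), M -> Y (b), X -> Y direct (c').
a, b, c_prime = 0.40, 0.35, 0.20
indirect = a * b            # specific indirect effect: 0.14
total = c_prime + indirect  # total effect: 0.34
print(classify_mediation(direct_sig=True, indirect_sig=True))   # partial mediation
print(classify_mediation(direct_sig=False, indirect_sig=True))  # complete mediation
```

This mirrors reading the SmartPLS output tables: check the specific indirect effect first, then let the direct effect's significance decide between partial and complete mediation.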
Cornell Notes
SmartPLS analysis is presented as a step-by-step pipeline: (1) prepare questionnaire data so each latent construct has correctly named indicators, (2) import the dataset as CSV into a new SmartPLS project, (3) build the measurement model on the canvas and connect constructs, then (4) run the PLS algorithm and check reliability/validity before testing relationships. Reliability is assessed with Cronbach’s Alpha and composite reliability (typically >0.70). Convergent validity uses AVE (typically >0.50), while discriminant validity is evaluated with Fornell–Larcker, HTMT (often <0.85), and cross-loadings. After the measurement model passes, bootstrapping tests path significance (p<0.05) and supports mediation (direct vs indirect effects) and moderation (product indicator approach).
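The two discriminant-validity decision rules summarized above can also be sketched directly. Both functions below are minimal illustrations with hypothetical constructs and correlation values, not SmartPLS internals:

```python
import math
from itertools import combinations

def fornell_larcker_ok(ave_by_construct, corr):
    """Pass if each construct's sqrt(AVE) exceeds its correlation with every other construct."""
    names = list(ave_by_construct)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = abs(corr[(a, b)])
            if r >= math.sqrt(ave_by_construct[a]) or r >= math.sqrt(ave_by_construct[b]):
                return False
    return True

def htmt(item_corr, items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs; flag discriminant problems if >= ~0.85."""
    def r(x, y):
        return item_corr.get((x, y), item_corr.get((y, x)))
    hetero = [r(a, b) for a in items_a for b in items_b]
    mono_a = [r(x, y) for x, y in combinations(items_a, 2)]
    mono_b = [r(x, y) for x, y in combinations(items_b, 2)]
    return (sum(hetero) / len(hetero)) / math.sqrt(
        (sum(mono_a) / len(mono_a)) * (sum(mono_b) / len(mono_b)))

# Hypothetical numbers: satisfaction (SAT) vs trust (TRU).
print(fornell_larcker_ok({"SAT": 0.62, "TRU": 0.58}, {("SAT", "TRU"): 0.55}))  # True
corrs = {("S1", "S2"): 0.7, ("T1", "T2"): 0.7,
         ("S1", "T1"): 0.4, ("S1", "T2"): 0.4, ("S2", "T1"): 0.4, ("S2", "T2"): 0.4}
print(round(htmt(corrs, ["S1", "S2"], ["T1", "T2"]), 3))  # 0.571, below the 0.85 cutoff
```

In words: Fornell–Larcker compares each construct's sqrt(AVE) against its correlations with other constructs, while HTMT compares between-construct item correlations to within-construct ones.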
What data-preparation rules make SmartPLS imports and model building smoother?
Why does SmartPLS require a measurement-model check before testing relationships?
How are reliability and convergent validity evaluated in SmartPLS?
What methods establish discriminant validity, and what do the decision rules look like?
How does bootstrapping determine whether structural paths are significant?
How are mediation and moderation tested in SmartPLS?
Review Questions
- What are the typical threshold values for Cronbach’s Alpha, composite reliability, and AVE, and how do they map to reliability vs convergent validity?
- In SmartPLS, what does it mean if discriminant validity fails under HTMT or cross-loadings, and what practical action is suggested?
- How do you distinguish partial mediation from complete mediation using direct and indirect effects in SmartPLS bootstrapping output?
Key Points
1. Prepare questionnaire data so each latent construct has correctly labeled indicator columns (no spaces; use underscores) and export as CSV for SmartPLS import.
2. Build the model by placing indicators onto the canvas as latent variables, renaming constructs, and connecting independent constructs to the dependent construct before running anything.
3. Run the PLS algorithm and evaluate the measurement model first: check Cronbach’s Alpha and composite reliability for reliability, then AVE for convergent validity.
4. Establish discriminant validity using Fornell–Larcker, HTMT (commonly <0.85), and cross-loadings, ensuring items load more strongly on their own construct than on others.
5. Use bootstrapping (path) to test structural relationships, interpreting significance with p-values (p<0.05) and t-statistics (around 1.96 as a benchmark).
6. Interpret explanatory power through R², which quantifies how much variance in the dependent construct is explained by its predictors.
7. Test mediation via total/direct/indirect effects (partial vs complete based on direct-effect significance), and test moderation by adding a moderating effect with the product indicator approach, then bootstrapping the moderating path.