22. SEMinR Series. Evaluating Structural Model | Step 3: Explanatory Power
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Step 3 evaluates explanatory power using R square (R²) for each endogenous construct, reflecting the variance explained and in-sample predictive power.
Briefing
Step 3 of structural model evaluation focuses on explanatory power: how much of the variance in the endogenous constructs the model accounts for. Explanatory power is measured primarily with R square (R²), which ranges from 0 to 1. Higher R² values indicate stronger in-sample predictive power because R² is the proportion of variance explained in each endogenous construct. As a rule of thumb in many social science contexts, R² values of about 0.75, 0.50, and 0.25 are treated as substantial, moderate, and weak, respectively, though the transcript stresses that “acceptable” thresholds depend heavily on research context. It also notes a key statistical caveat: R² tends to rise as more predictor constructs are added, so comparisons should be made against similar studies and models of comparable complexity.
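To make "proportion of variance explained" concrete, here is a minimal sketch of the R² calculation. The construct scores and predicted values below are made up for illustration and do not come from the video; the function name `r_squared` is likewise an assumption, not part of SEMinR.

```python
def r_squared(observed, predicted):
    """Proportion of variance in `observed` accounted for by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    # Total variation of the observed scores around their mean
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    # Variation left unexplained by the predictions
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_residual / ss_total


# Hypothetical scores for one endogenous construct (illustration only)
observed = [2.0, 3.0, 4.0, 5.0, 6.0]
predicted = [2.5, 2.5, 4.0, 5.5, 5.5]

r2 = r_squared(observed, predicted)
print(f"R2 = {r2:.3f}")
```

In this toy case the residual sum of squares is 1.0 against a total of 10.0, so 90% of the variance is accounted for.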
Because R² can inflate with additional variables, adjusted R square is introduced as a more conservative alternative. Adjusted R² corrects for the number of explanatory variables relative to the data size, reducing the tendency to overstate explanatory power. Still, adjusted R² is not treated as a precise measure of how much variance in each endogenous construct is truly explained, which leads to a third metric: f square (f²) effect size.
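The correction can be sketched with the usual textbook formula for adjusted R², where n is the number of observations and k the number of predictors. The values used here (R² = 0.50, n = 100) are hypothetical, chosen only to show that the penalty grows with the number of predictors.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical case: same R2 and sample size, different predictor counts
adj_three = adjusted_r_squared(0.50, 100, 3)
adj_ten = adjusted_r_squared(0.50, 100, 10)
print(f"3 predictors:  adjusted R2 = {adj_three:.4f}")
print(f"10 predictors: adjusted R2 = {adj_ten:.4f}")
```

With the same raw R², the model with ten predictors ends up with the lower adjusted value, which is the conservative behavior described above.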
f² is used to quantify the contribution of each predictor construct. Conceptually, f² answers a counterfactual question: what happens to R² if a specific exogenous variable is removed from the model? The transcript links f² to the “size” of the predictor’s path contribution in the structural model assessment. Interpretation follows common benchmarks: f² values below 0.15 indicate a small effect, values from 0.15 to 0.35 indicate a medium effect, and values above 0.35 indicate a large effect.
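The counterfactual can be written as f² = (R² included − R² excluded) / (1 − R² included), the standard effect-size formula. The sketch below applies it and labels the result with the benchmarks quoted above. The R² inputs are hypothetical, and the function names are illustrative rather than taken from any library.

```python
def f_squared(r2_included, r2_excluded):
    """Effect size: change in R^2 when one predictor is dropped,
    scaled by the variance left unexplained by the full model."""
    return (r2_included - r2_excluded) / (1.0 - r2_included)


def classify_f_squared(f2):
    """Label an f2 value using the benchmarks quoted in the text."""
    if f2 < 0.15:
        return "small"
    if f2 <= 0.35:
        return "medium"
    return "large"


# Hypothetical R2 values with and without one predictor
f2 = f_squared(r2_included=0.60, r2_excluded=0.52)
print(f"f2 = {f2:.3f} ({classify_f_squared(f2)})")
```

Here the predictor accounts for a 0.08 drop in R² out of 0.40 unexplained variance, giving f² of about 0.20, a medium effect under these thresholds.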
A worked example is provided using an endogenous construct labeled “collaborative culture.” Running the model yields an R² of 0.608 for collaborative culture, meaning about 60.8% of the variance in collaborative culture is explained by three predictor variables included in the model. With only those three predictors present, the R² is described as moderate.
The transcript then turns to f² outputs for the exogenous variables. The effect size for “vision development rewards” on collaborative culture is characterized as small (below 0.15), implying that removing that predictor would cause only a minor drop in R². In contrast, removing “development and rewards” is described as having a medium (moderate) impact on R², consistent with f² falling in the 0.15–0.35 range. The practical takeaway is straightforward: R² tells how much variance the model explains overall for each endogenous construct, adjusted R² helps temper inflation from added predictors, and f² pinpoints which specific predictors meaningfully drive that explained variance.
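To see what "minor" and "moderate" drops mean for this example, the f² definition, f² = (R² included − R² excluded) / (1 − R² included), can be rearranged to give the implied fall in R². The sketch below uses the reported R² of 0.608 and the two benchmark boundaries. It is back-of-the-envelope arithmetic under that formula, not output from the video, and the function name is an assumption.

```python
def implied_r2_drop(r2_included, f2):
    """R^2 lost when a predictor is removed, rearranged from the f2 formula."""
    return f2 * (1.0 - r2_included)


r2_collaborative_culture = 0.608  # value reported in the example

drops = {}
for threshold in (0.15, 0.35):
    drops[threshold] = implied_r2_drop(r2_collaborative_culture, threshold)
    print(f"f2 = {threshold:.2f} -> R2 falls by about {drops[threshold]:.3f}")
```

Under these assumptions, a small effect corresponds to R² falling by less than roughly 0.059, and a medium effect to a fall of between roughly 0.059 and 0.137.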
The session closes by previewing reporting guidance in later videos, emphasizing that these metrics—R² for explanatory power and f² for effect size—must be interpreted and presented in line with study context and comparable model complexity.
Cornell Notes
Explanatory power in SEM structural model evaluation is assessed mainly through R square (R²) for each endogenous construct. R² (0–1) indicates the proportion of variance explained and is often treated as in-sample predictive power, but it increases when more predictors are added, so interpretation must be contextual. Adjusted R² corrects for the number of predictors relative to sample/data size, offering a more conservative view, though it still isn’t a precise variance-explained measure for each endogenous construct. To gauge the impact of individual predictors, f square (f²) effect size is used: it measures how much R² would change if an exogenous variable were removed. Benchmarks commonly used are f² < 0.15 (small), 0.15–0.35 (medium), and > 0.35 (large).
What does R square (R²) measure in Step 3 of structural model evaluation, and why does it matter?
Why can R² be misleading when comparing models, and how does adjusted R² address that?
How is f square (f²) interpreted, and what does it quantify?
In the example, what do the R² and f² results imply about collaborative culture?
How should researchers decide whether an R² value is “acceptable”?
Review Questions
- What is the difference between R² and f² in terms of what each metric tells you about model performance?
- Why is adjusted R² often considered more conservative than R², and what problem does it correct for?
- If a predictor has f² = 0.20, how would you classify its effect size using the thresholds provided?
Key Points
1. Step 3 evaluates explanatory power using R square (R²) for each endogenous construct, reflecting the variance explained and in-sample predictive power.
2. R² values must be interpreted relative to study context and comparable models because R² tends to increase as more predictors are added.
3. Adjusted R square offers a more conservative estimate by correcting for the number of explanatory variables relative to data size.
4. f square (f²) measures each predictor’s contribution by estimating how much R² would drop if that exogenous variable were removed.
5. Common f² benchmarks are f² < 0.15 (small), 0.15–0.35 (medium), and > 0.35 (large).
6. In the example, collaborative culture has R² = 0.608 (about 60.8% variance explained) from three predictors, described as moderate explanatory power.
7. Effect sizes in the example differ by predictor: one is small (vision development rewards) while another is medium (development and rewards).