Exploratory Factor Analysis (EFA): Concept, Terminologies, Assumptions, Running, Interpreting - SPSS
Based on the Research With Fawad video on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.
EFA tests whether theory-based item groupings form real latent factors by using factor loadings derived from item correlations.
Briefing
Exploratory factor analysis (EFA) is a data-reduction method used to test whether a set of questionnaire items actually clusters into a smaller number of underlying, unobservable constructs. It matters for scale development because it turns theory-driven item groupings into an evidence-based structure: if items intended to measure “ethical responsibilities,” “research and development responsibilities,” and “philanthropic responsibilities” don’t load together in the data, the proposed scale needs revision.
EFA works by examining systematic interdependence among observed variables—survey items respondents answer—and grouping them based on how strongly they correlate. The goal is to summarize information from many variables into fewer factors that can be named and interpreted. In the example used for scale development, the researcher created 19 items for “University social responsibility,” grouped theoretically into three dimensions: 7 ethical responsibility items, 6 research and development items, and 6 philanthropic responsibility items. EFA then checks whether the empirical correlations support that three-factor structure by looking for “factor loadings,” which represent how strongly each item relates to each extracted factor.
EFA is contrasted with confirmatory factor analysis (CFA). CFA is used when there’s a prior theory to test—relationships are confirmed rather than discovered. EFA is used when prior knowledge is limited or when no established scale exists, because it searches for latent patterns and reduces variables into a smaller set of composite factors.
Before running EFA, several diagnostics determine whether the data are suitable. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy should exceed 0.50 for factor analysis to be appropriate. Bartlett’s test of sphericity checks whether the correlation matrix differs significantly from an identity matrix (i.e., whether the variables are correlated enough to justify factor extraction). Communality reflects how much of an item’s variance is shared with the factor solution; low communality suggests the item fits poorly. The number of factors is guided by eigenvalues (typically retaining factors with eigenvalue > 1), supported by a scree plot and/or the percentage of variance explained (a common rule of thumb is retaining enough factors to explain roughly 60–70% of the total variance).
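These diagnostics can be sketched numerically from their textbook formulas. This is a minimal illustration, not SPSS output: the 4×4 correlation matrix `R` and sample size `n` are invented toy values.

```python
import numpy as np

# Toy correlation matrix and sample size -- invented for illustration only.
R = np.array([
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.2],
    [0.5, 0.4, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
])
n = 200                # assumed sample size
p = R.shape[0]         # number of items

# Bartlett's test of sphericity: chi-square statistic against an identity matrix.
chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
df = p * (p - 1) // 2  # compare chi2 to a chi-square with df degrees of freedom

# KMO: squared correlations relative to squared correlations plus squared
# partial correlations (off-diagonal elements only).
Rinv = np.linalg.inv(R)
partial = -Rinv / np.sqrt(np.outer(np.diag(Rinv), np.diag(Rinv)))
off = ~np.eye(p, dtype=bool)
kmo = (R[off] ** 2).sum() / ((R[off] ** 2).sum() + (partial[off] ** 2).sum())

# Kaiser criterion: retain factors with eigenvalue > 1; each eigenvalue
# divided by p is that factor's share of total variance.
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
retain = int((eigvals > 1).sum())
cum_variance = eigvals.cumsum() / p * 100
```

A KMO above 0.50 and a significant chi-square (at `df` degrees of freedom) would license the extraction step; `cum_variance` is what the "60–70% explained" rule of thumb is read from.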
Running EFA in SPSS involves selecting Analyze → Dimension Reduction → Factor, choosing principal component extraction, and applying rotation to clarify the factor structure. The session emphasizes Varimax rotation (orthogonal), which aims to produce a pattern where each item loads strongly on one factor and weakly on the others. Model fit is assessed by comparing the reproduced correlation matrix with the observed one; as a rule of thumb, fewer than 50% of the non-redundant residuals should have absolute values greater than 0.05.
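The extraction → rotation → fit-check sequence can be sketched as follows, assuming principal component extraction and the standard SVD-based Varimax algorithm. The 6×6 two-block correlation matrix is invented so the expected two-factor structure is easy to see.

```python
import numpy as np

# Invented two-block correlation matrix: items 1-3 form one cluster,
# items 4-6 another, with weak cross-correlations.
R = np.array([
    [1.0, 0.7, 0.7, 0.1, 0.1, 0.1],
    [0.7, 1.0, 0.7, 0.1, 0.1, 0.1],
    [0.7, 0.7, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.5, 0.5],
    [0.1, 0.1, 0.1, 0.5, 1.0, 0.5],
    [0.1, 0.1, 0.1, 0.5, 0.5, 1.0],
])
p = R.shape[0]

# Principal-component extraction: loadings are eigenvectors scaled by the
# square root of their eigenvalues; keep components with eigenvalue > 1.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
k = int((eigvals > 1).sum())
L = eigvecs[:, :k] * np.sqrt(eigvals[:k])

def varimax(Phi, iters=100, tol=1e-6):
    """Orthogonal Varimax rotation (standard SVD-based iteration)."""
    n_items, n_factors = Phi.shape
    Rot = np.eye(n_factors)
    d = 0.0
    for _ in range(iters):
        Lam = Phi @ Rot
        u, s, vt = np.linalg.svd(
            Phi.T @ (Lam ** 3 - Lam @ np.diag((Lam ** 2).sum(axis=0)) / n_items))
        Rot = u @ vt
        d_new = s.sum()
        if d > 0 and d_new / d < 1 + tol:
            break
        d = d_new
    return Phi @ Rot

loadings = varimax(L)

# Fit check: reproduced correlations vs. observed. Count the share of
# non-redundant residuals with absolute value above 0.05; the rule of
# thumb is that this share stays under 50%.
residuals = R - loadings @ loadings.T
upper = np.triu_indices(p, k=1)
pct_large = 100.0 * np.mean(np.abs(residuals[upper]) > 0.05)
```

After rotation, each row of `loadings` should show one strong and one weak value, mirroring the "load strongly on one factor, weakly on others" goal.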
In the worked example, the initial EFA output produced a three-factor solution consistent with expectations, with KMO = 0.931, a significant Bartlett’s test, and cumulative variance close to 60% for three factors. However, some items failed to load adequately (factor loadings below the 0.50 threshold), including rd1, rd2, and pr1. After removing those problematic items one by one and re-running the analysis, the factor structure became clean: ethical responsibilities, research and development responsibilities, and philanthropic responsibilities aligned with the intended three-factor model. Reporting then follows a structured template: document the extraction method, rotation, loading and communality criteria, KMO and Bartlett results, items removed and why, and the final factor loadings and variance explained.
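The one-at-a-time removal logic can be sketched like this. The item names echo the session’s scale (eth = ethical, rd = research & development, pr = philanthropic), but every loading value is invented; only the 0.50 cut-off and the remove-worst-then-re-run rule come from the source.

```python
# Hypothetical per-item loadings on each item's intended factor.
# All numeric values are invented for illustration.
THRESHOLD = 0.50

loadings = {
    "eth1": 0.78, "eth2": 0.74, "eth3": 0.71,
    "rd1": 0.42, "rd2": 0.38, "rd3": 0.66,
    "pr1": 0.44, "pr2": 0.69, "pr3": 0.72,
}

removed = []
while True:
    below = {item: val for item, val in loadings.items() if val < THRESHOLD}
    if not below:
        break
    worst = min(below, key=below.get)  # drop only the single weakest item
    removed.append(worst)
    del loadings[worst]
    # In practice, re-run the EFA here and refresh `loadings`: deleting an
    # item changes every other item's loading, which is why items are
    # removed one by one rather than all at once.
```

With these made-up values the loop removes rd2, rd1, and pr1 in turn, leaving only items at or above the threshold.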
Cornell Notes
Exploratory factor analysis (EFA) is used in scale development to test whether many questionnaire items cluster into a smaller set of latent constructs. It groups items based on correlations, using factor loadings to judge whether each item represents its intended factor. EFA is appropriate when there’s limited prior structure (unlike CFA, which confirms an existing theory). Before extraction, diagnostics such as KMO and Bartlett’s test determine whether the correlation matrix is suitable for factor analysis, and communality helps flag weak items. In the SPSS example, a three-factor solution emerged for “University social responsibility,” but several items with loadings below 0.50 were removed (rd1, rd2, pr1) to achieve a clear ethical, research & development, and philanthropic factor structure.
How does EFA reduce a large set of survey items into fewer constructs?
What diagnostics determine whether the data are suitable for EFA?
How is the number of factors chosen in EFA?
Why does rotation matter, and what’s the difference between Varimax and Oblimin?
What does it mean when an item has no or low factor loading, and what happens next?
What elements should be included when reporting EFA results?
Review Questions
- If KMO were below 0.50 and Bartlett’s test were not significant, what would that imply about attempting EFA?
- In the example, why were rd1, rd2, and pr1 removed—what specific loading behavior triggered deletion?
- How would you justify choosing three factors using eigenvalues, scree plot evidence, and percentage of variance explained?
Key Points
1. EFA tests whether theory-based item groupings form real latent factors by using factor loadings derived from item correlations.
2. KMO (>0.50) and a significant Bartlett’s test are key prerequisites; they indicate the correlation matrix is suitable for factor extraction.
3. Factor count is typically guided by eigenvalues > 1, supported by scree plot inspection and percentage of variance explained (often targeting ~60–70%).
4. Varimax rotation is used to produce a clearer, more interpretable factor structure by emphasizing strong loadings and reducing ambiguity.
5. Items with factor loadings below the chosen threshold (0.50 in the example) should be removed or reconsidered, especially when they fail to load on their intended factor.
6. Model fit is assessed by how well reproduced correlations match observed correlations; as a rule of thumb, fewer than 50% of non-redundant residuals should exceed 0.05 in absolute value.
7. Reporting should include diagnostics (KMO, Bartlett), extraction/rotation choices, criteria thresholds, items removed with reasons, and final factor loadings and variance explained.