Exploratory Factor Analysis (EFA): Concept, Key Terminologies, Assumptions, Running, Interpreting
Based on the Research With Fawad video on YouTube. If you find this content useful, support the original creator by watching, liking, and subscribing.
EFA reduces many correlated questionnaire items into fewer latent factors by grouping variables with strong intercorrelations.
Briefing
Exploratory Factor Analysis (EFA) is presented as a practical way to build and validate new measurement scales by compressing many questionnaire items into a smaller set of underlying, unobservable “latent” factors. Instead of treating dozens of observed variables as unrelated, EFA groups items that move together—based on their intercorrelations—so a construct like “service quality” can be represented by components such as teaching, administration, or leadership. That matters because researchers often need objective instruments for concepts that can’t be directly observed, and EFA helps reveal the factor structure when there’s limited prior knowledge about how items should cluster.
The transcript distinguishes EFA from Confirmatory Factor Analysis (CFA): EFA searches for latent patterns in the data, while CFA tests a pre-specified structure tied to hypotheses. For scale development, the workflow begins with data reduction—often using principal component analysis as the extraction method—then checks whether the dataset is suitable for factor modeling. Two core diagnostics are emphasized: the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett’s test of sphericity. KMO values between 0.5 and 1 indicate the sample is appropriate for factor analysis; values below 0.5 suggest the correlations are too weak or the sample is inadequate. Bartlett’s test should be significant, meaning the correlation matrix is not an identity matrix—there must be enough relationships among variables to justify factor extraction.
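Both suitability diagnostics can be computed directly from the correlation matrix. The sketch below is a minimal numpy/scipy implementation of Bartlett's sphericity test and the overall KMO measure, run on made-up data (six items driven by one common factor); the function names and the simulated dataset are illustrative, not from the transcript.

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(X):
    """Bartlett's test of sphericity: H0 is that the correlation
    matrix is an identity matrix (no usable intercorrelations)."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return stat, chi2.sf(stat, df)  # chi-square statistic, p-value

def kmo(X):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.corrcoef(X, rowvar=False)
    inv = np.linalg.inv(R)
    # Partial (anti-image) correlations from the inverse correlation matrix
    partial = -inv / np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    np.fill_diagonal(partial, 0)
    r2 = R ** 2
    np.fill_diagonal(r2, 0)
    return r2.sum() / (r2.sum() + (partial ** 2).sum())

# Illustrative data: six items driven by one common factor
rng = np.random.default_rng(0)
common = rng.normal(size=(300, 1))
X = common + 0.5 * rng.normal(size=(300, 6))
stat, p_value = bartlett_sphericity(X)
```

With strongly intercorrelated items like these, Bartlett's p-value should be near zero and KMO well above 0.5; with pure noise, KMO falls toward 0.5 and Bartlett's test becomes non-significant.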
Once suitability is confirmed, EFA relies on several interpretive statistics. Communality indicates how much of a variable's variance is explained by the extracted factors and should be sufficiently high (the transcript uses a threshold above 0.6). Uniqueness, computed as 1 minus communality, should be low, since high uniqueness implies an item isn't well explained by the factors. The number of factors to retain is guided by eigenvalues (Kaiser rule: retain factors with eigenvalues greater than 1) and supported by scree plots, where the "elbow" suggests a reasonable cutoff. Factor loadings, the correlations between items and factors, are then inspected to assign items to dimensions.
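These quantities fall out of an eigendecomposition of the correlation matrix. Below is a minimal numpy sketch of principal-component extraction with the Kaiser rule, using simulated two-factor data; the function name and dataset are assumptions for illustration only.

```python
import numpy as np

def pca_extraction(X):
    """Principal-component extraction: returns eigenvalues of the
    correlation matrix, unrotated loadings for the factors retained
    under the Kaiser rule (eigenvalue > 1), and each item's
    communality and uniqueness."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = int(np.sum(eigvals > 1))                 # Kaiser rule
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    communality = (loadings ** 2).sum(axis=1)    # variance shared with factors
    uniqueness = 1 - communality                 # variance the factors miss
    return eigvals, loadings, communality, uniqueness

# Illustrative data: two independent factors, three items each
rng = np.random.default_rng(1)
f = rng.normal(size=(300, 2))
X = np.hstack([f[:, [0]] + 0.5 * rng.normal(size=(300, 3)),
               f[:, [1]] + 0.5 * rng.normal(size=(300, 3))])
eigvals, loadings, communality, uniqueness = pca_extraction(X)
```

On this data the Kaiser rule retains two factors, and every item's communality clears the 0.6 threshold the transcript mentions.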
Rotation is treated as a key step for interpretability. Varimax, an orthogonal rotation, is recommended as a common choice in scale-development papers because it minimizes cross-loadings while keeping the factors uncorrelated. The transcript contrasts this with oblique rotations such as Direct Oblimin, which allow factors to correlate, and briefly notes other options such as Quartimax and Promax. In practice, items that fail to load clearly (for example, not loading above a minimum threshold such as 0.5) or that cross-load meaningfully on multiple factors are candidates for removal to achieve a cleaner factor structure.
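The rotation-then-screen workflow can be sketched in numpy. The `varimax` routine below is the standard SVD-based recipe for Kaiser's varimax criterion; in `screen_items`, the 0.5 minimum loading follows the transcript, but `cross_gap` is an illustrative heuristic for "cross-loads meaningfully" that I've assumed, not a threshold from the source.

```python
import numpy as np

def varimax(loadings, tol=1e-6, max_iter=100):
    """SVD-based varimax: orthogonally rotate the loading matrix to
    maximize the variance of squared loadings within each factor."""
    p, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3
                          - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p))
        rotation = u @ vt
        if s.sum() < objective * (1 + tol):  # converged
            break
        objective = s.sum()
    return loadings @ rotation

def screen_items(loadings, min_load=0.5, cross_gap=0.2):
    """Keep items whose strongest absolute loading reaches min_load and
    clearly exceeds the second strongest (cross_gap is a heuristic)."""
    keep = []
    for i, row in enumerate(np.abs(loadings)):
        top, second = np.sort(row)[::-1][:2]
        if top >= min_load and top - second >= cross_gap:
            keep.append(i)
    return keep

# Illustrative data: two independent factors, three items each
rng = np.random.default_rng(2)
f = rng.normal(size=(400, 2))
X = np.hstack([f[:, [0]] + 0.6 * rng.normal(size=(400, 3)),
               f[:, [1]] + 0.6 * rng.normal(size=(400, 3))])
w, v = np.linalg.eigh(np.corrcoef(X, rowvar=False))
unrotated = v[:, -2:] * np.sqrt(w[-2:])   # top two components
rotated = varimax(unrotated)
```

After rotation, each item should load cleanly on one factor, so `screen_items(rotated)` retains all six items here; an item failing both checks would be a deletion candidate.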
A worked example is described using a CSR scale with multiple dimensions (development responsibilities, ethical responsibilities, relational responsibilities, and information sharing responsibilities). After running EFA with principal component analysis and Varimax rotation, the solution yields four factors, with the retained components explaining a substantial portion of variance (about 57.8% in the example). The transcript ends with guidance on reporting: document the extraction and rotation method, minimum loading criteria, commonality/uniqueness checks, KMO and Bartlett’s test results, the final factor solution, and which items were deleted and why (typically due to poor loadings or problematic cross-loadings).
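A figure like "about 57.8% of variance explained" is just the sum of the retained components' eigenvalues divided by the number of items (each standardized item contributes 1 unit of variance). A minimal numpy sketch, using made-up data standing in for the CSR items rather than the real dataset:

```python
import numpy as np

def variance_explained_pct(X, n_keep):
    """Percent of total variance captured by the first n_keep principal
    components: each eigenvalue of the correlation matrix is one
    component's variance, and p standardized items total p."""
    R = np.corrcoef(X, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]  # largest first
    return 100 * eigvals[:n_keep].sum() / X.shape[1]

# Made-up data: four factors, three items each (not the CSR dataset)
rng = np.random.default_rng(3)
f = rng.normal(size=(250, 4))
X = np.repeat(f, 3, axis=1) + 0.9 * rng.normal(size=(250, 12))
pct = variance_explained_pct(X, 4)
```

Retaining all components always accounts for exactly 100%; the reported figure reflects how much the four retained components alone capture.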
Cornell Notes
Exploratory Factor Analysis (EFA) is used in scale development to reduce many questionnaire items into a smaller set of latent factors by grouping variables that show strong intercorrelations. The transcript emphasizes suitability checks before extraction: KMO (should be ≥ 0.5) and Bartlett’s test of sphericity (should be significant, meaning the correlation matrix is not an identity matrix). After extraction (often via principal component analysis) and rotation (commonly Varimax), researchers interpret communality, uniqueness, eigenvalues, and factor loadings to decide how many factors to keep and which items belong to each factor. Items that don’t load adequately or cross-load across multiple factors are typically removed to produce a clearer factor structure. This process yields an interpretable factor solution that can be reported transparently for instrument development.
What is the core purpose of EFA in scale development, and how does it differ from CFA?
Why do KMO and Bartlett’s test come before extracting factors?
How do communality and uniqueness guide item retention?
How is the number of factors selected in the transcript?
What role does rotation play, and why is Varimax emphasized?
When should items be deleted during EFA?
Review Questions
- What do KMO and Bartlett’s test each tell you about whether EFA is appropriate for a dataset?
- How do communality and uniqueness relate mathematically, and what direction of values is preferred for a good factor structure?
- If a factor has an eigenvalue of 0.9, what does the Kaiser rule suggest about retaining it, and why?
Key Points
1. EFA reduces many correlated questionnaire items into fewer latent factors by grouping variables with strong intercorrelations.
2. Use KMO (≥ 0.5) and a significant Bartlett’s test to confirm the correlation matrix is suitable for factor analysis.
3. Interpret communality (aiming for sufficiently high values such as > 0.6) and uniqueness (low values; uniqueness = 1 − communality) to judge item fit.
4. Select the number of factors primarily using eigenvalues greater than 1, with scree plots as a supporting check.
5. Apply rotation to improve interpretability; Varimax is emphasized for orthogonal, clearer factor separation, while Direct Oblimin allows correlated factors.
6. Remove items that don’t load clearly on any factor or that cross-load meaningfully across multiple factors to achieve a cleaner structure.
7. Report the extraction method, rotation method, loading thresholds, KMO/Bartlett results, factor-solution details, and item deletions with reasons.