15. SPSS Classroom - How to Handle Missing Data in SPSS - Series Mean & Linear Interpolation Methods

TL;DR

Listwise and pairwise deletion can waste substantial data, since missingness can remove entire surveys or reduce usable observations.


Briefing

Missing data can quietly distort survey results, and the most practical fix is often imputation rather than deleting incomplete cases. Listwise deletion drops a respondent from the analysis entirely when any value is missing, while pairwise deletion drops the respondent only from the calculations that involve the missing value. Either way, usable information is thrown away, especially when a single missed question causes an entire survey to be excluded. Prior research cited in the tutorial suggests that imputation can “remedy” missingness of up to roughly 20% to 30% while still producing good parameter estimates, making it a stronger default when missingness isn’t extreme.
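To make the cost of deletion concrete, here is a minimal Python sketch (not SPSS syntax) that counts how many respondents survive each strategy. The survey data, the item names `q1` to `q3`, and the helper functions are hypothetical and were chosen only for illustration.

```python
# Hypothetical survey: each dict is one respondent, None marks a skipped item.
responses = [
    {"q1": 4, "q2": 5, "q3": 3},
    {"q1": 2, "q2": None, "q3": 4},
    {"q1": 5, "q2": 4, "q3": None},
    {"q1": 3, "q2": 3, "q3": 5},
    {"q1": None, "q2": 2, "q3": 2},
]


def listwise_complete(rows):
    """Keep only respondents who answered every item."""
    return [r for r in rows if all(v is not None for v in r.values())]


def pairwise_complete(rows, a, b):
    """Keep respondents who answered both items of one specific pair."""
    return [r for r in rows if r[a] is not None and r[b] is not None]


listwise_n = len(listwise_complete(responses))
pairwise_n = {
    (a, b): len(pairwise_complete(responses, a, b))
    for a, b in [("q1", "q2"), ("q1", "q3"), ("q2", "q3")]
}

print(listwise_n)  # 2 of 5 respondents remain
print(pairwise_n)  # each pair keeps 3 of 5 respondents
```

In this toy data only three answers are missing, yet listwise deletion discards three of the five respondents, and each pairwise analysis still loses two.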

Imputation works by replacing each missing value with a numeric estimate. The tutorial contrasts two SPSS-friendly approaches: series mean imputation and linear interpolation. Series mean imputation fills gaps with the mean of the observed values for that indicator. It’s popular because it’s easy to run, but it comes with a cost: it reduces the variance of the variables involved and can mask individual differences between respondents. Linear interpolation takes a different tack. It looks at the last valid value before the missing entry and the next valid value after it, then inserts an estimated value between them—based on the assumption that the data change in a roughly linear way across the sequence.
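The variance cost of series mean imputation can be seen in a small Python sketch. This only illustrates the idea and is not what SPSS runs internally; the `scores` list and the `series_mean_impute` helper are hypothetical.

```python
from statistics import mean, variance


def series_mean_impute(series):
    """Replace each None with the mean of the observed (non-missing) values."""
    observed = [v for v in series if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in series]


# Hypothetical indicator with two missing responses.
scores = [2, None, 6, 4, None, 8]

observed = [v for v in scores if v is not None]
imputed = series_mean_impute(scores)

var_observed = variance(observed)  # sample variance of the four observed values
var_imputed = variance(imputed)    # sample variance after filling both gaps

print(imputed)
print(var_observed, var_imputed)
```

Both gaps receive the same value (the observed mean, 5), so the filled-in cases add nothing to the spread: the sample variance falls from about 6.67 to 4, even though no new information was collected.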

Both methods can be implemented in SPSS through the same workflow: go to Transform → Replace Missing Values. A dialog prompts users to select which indicators contain missing values to be imputed. SPSS’s default method is series mean; when applied, SPSS creates a new variable rather than overwriting the original, appending an underscore and a numeric suffix to the variable name (e.g., an indicator like 81 becomes 81_1). After selecting the indicators, users can simply accept the default and press OK to generate the imputed series-mean variables.

To switch to linear interpolation, users still start in Transform → Replace Missing Values, select the target indicator(s), and then change the method from the default. The tutorial emphasizes a key operational detail: after choosing “linear interpolation,” users must click the Change button for the method selection to take effect. Once the method is updated, pressing OK performs the imputation and again produces a new variable with the imputed values. The example walkthrough highlights the difference in outcomes: series mean produces the same repeated mean value for every missing spot, while linear interpolation inserts a value that sits between the neighboring observed points (e.g., a missing second value is replaced with a value interpolated from its neighbors rather than with the constant series mean).
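The contrast between the two outcomes can be sketched in Python. This is an illustration under stated assumptions, not SPSS's implementation: the `readings` series and both helpers are hypothetical, the cases are assumed to be in a meaningful order, and gaps at the very start or end of the series are left missing because they have no valid neighbor on one side.

```python
def linear_interpolate(series):
    """Fill interior gaps on a straight line between the nearest valid neighbors.

    Leading and trailing gaps lack a neighbor on one side and stay None.
    """
    result = list(series)
    valid = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(valid, valid[1:]):
        # Constant step between two consecutive valid observations.
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = series[left] + step * (i - left)
    return result


def series_mean_fill(series):
    """Fill every gap with the mean of the observed values, for comparison."""
    observed = [v for v in series if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in series]


# Hypothetical ordered series with one single gap and one double gap.
readings = [10, None, 20, 26, None, None, 35]

interpolated = linear_interpolate(readings)
mean_filled = series_mean_fill(readings)

print(interpolated)  # [10, 15.0, 20, 26, 29.0, 32.0, 35]
print(mean_filled)   # [10, 22.75, 20, 26, 22.75, 22.75, 35]
```

The interpolated values follow the local trend of the series, while the series mean inserts 22.75 everywhere, regardless of what the neighboring cases look like.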

Overall, the tutorial frames missing-data handling as a choice between discarding information and estimating it. When missingness is moderate, imputation—especially SPSS’s series mean or linear interpolation options—offers a straightforward way to preserve data volume while acknowledging the trade-offs each method introduces. For deeper guidance, it points viewers to a dedicated book on missing values.

Cornell Notes

The tutorial argues that deleting incomplete cases (listwise or pairwise deletion) often wastes too much data, since one missing response can remove an entire survey from analysis. Imputation is presented as a better alternative when missingness is not excessive: SPSS replaces missing values with numeric estimates. Two SPSS methods are emphasized. Series mean imputation fills gaps with the mean of the observed values for an indicator, but it can reduce variance and hide individual differences. Linear interpolation estimates missing values by using the last valid value before the gap and the next valid value after it, inserting a value between them under a linearity assumption. Both methods are implemented via Transform → Replace Missing Values, with linear interpolation requiring a Change click to apply the method.

Why does the tutorial discourage listwise or pairwise deletion for missing data?

Listwise or pairwise deletion discards information. If a respondent misses one question, listwise deletion can drop the entire survey from analysis; pairwise deletion can still reduce usable data across analyses. The tutorial cites prior research suggesting imputation can handle missingness of up to about 20% to 30% while maintaining good parameter estimates, making imputation preferable when missingness isn’t too large.

What is series mean imputation, and what drawback does it introduce?

Series mean imputation replaces each missing value with the mean of the observed values for that indicator (in SPSS, the default method in Transform → Replace Missing Values). The tutorial notes a key drawback: it reduces the variance of the variables involved and fails to account for individual differences among respondents because every missing entry gets the same mean-based estimate.

How does linear interpolation imputation estimate missing values?

Linear interpolation looks at the last valid value before the missing entry and the next valid value after it, then imputes a value between those two points. The method relies on the assumption that the data follow a roughly linear pattern across the sequence, so the gap can be filled by interpolation rather than a constant mean.
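Written as a formula, the interpolated value can be expressed as follows. The symbols are introduced here for illustration and do not come from the tutorial: a is the position of the last valid case before the gap, b is the position of the next valid case after it, and t is the position of the missing case.

```latex
\hat{y}_t = y_a + \frac{t - a}{b - a}\,\bigl(y_b - y_a\bigr), \qquad a < t < b
```

For example, if case 1 has the value 10, case 3 has the value 20, and case 2 is missing, the estimate is 10 + (1/2)(20 - 10) = 15.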

What SPSS steps create imputed variables using series mean?

In SPSS, go to Transform → Replace Missing Values. Select the indicator(s) with missing values. Since series mean is the default method, users can press OK to run it. SPSS creates new variables (rather than overwriting originals), appending an underscore and a numeric suffix to each name (e.g., 81 becomes 81_1).

What SPSS detail is required to switch from series mean to linear interpolation?

After selecting the indicator(s) in Transform → Replace Missing Values, choose “linear interpolation” as the method, but then click the Change button so the method actually updates. The tutorial warns that selecting linear interpolation without clicking Change may appear to do nothing. After clicking Change, pressing OK performs the imputation and generates the new interpolated variable.

Review Questions

When would imputation be favored over listwise or pairwise deletion, according to the tutorial’s reasoning?
Compare the assumptions behind series mean imputation versus linear interpolation and describe how each affects variance or individual differences.
In SPSS, what naming behavior occurs after Replace Missing Values, and what extra action is needed to apply linear interpolation?

Key Points

1. Listwise and pairwise deletion can waste substantial data, since missingness can remove entire surveys or reduce usable observations.
2. Imputation replaces missing values with numeric estimates and is often preferable when missingness is moderate.
3. Series mean imputation is easy to run in SPSS but can reduce variance and obscure individual differences because it inserts a constant mean-based value.
4. Linear interpolation estimates missing values using the last valid value before the gap and the next valid value after it, assuming a roughly linear trend.
5. In SPSS, both methods are accessed via Transform → Replace Missing Values and create new imputed variables rather than overwriting originals.
6. Switching to linear interpolation in SPSS requires selecting the method and clicking Change before pressing OK.

Highlights

Imputation is framed as a practical remedy for missing data, with cited research suggesting it can handle missingness of up to roughly 20%–30% while preserving good parameter estimates.

Series mean imputation fills gaps with a constant mean, which reduces variance and can hide respondent-specific patterns.

Linear interpolation fills gaps by inserting a value between neighboring observed points, relying on a linearity assumption.

In SPSS, linear interpolation only takes effect after clicking Change in the Replace Missing Values dialog.

Topics

Missing Data
Imputation Methods
SPSS Replace Missing Values
Series Mean
Linear Interpolation

Mentioned

SPSS