13. SPSS Classroom - Assess Respondent Misconduct in Survey Research
Based on Research With Fawad's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
A practical way to clean survey data starts with spotting respondents who either quit early or answered in a way that suggests they never read the questions. One quick check is to sort the last few columns of the dataset in ascending order to identify incomplete rows—cases where a respondent stopped answering partway through. If the missing data is limited (for example, the respondent skipped only the last one or two items), the response can often be retained because the rest of the answers may still be usable. But if the respondent left a large share of the questionnaire unanswered—on the order of 40–50%—the record should be deleted, on the judgment that this level of missingness makes the remaining answers unreliable.
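The same incomplete-row screen can be sketched outside SPSS. Here is a minimal pandas version, in which the column names, the toy data, and the exact 40% cutoff are illustrative assumptions rather than rules from the video:

```python
import pandas as pd

# Toy survey data: respondent ID plus five Likert items (None = unanswered).
df = pd.DataFrame({
    "id": [1, 2, 3],
    "q1": [5, 4, 3],
    "q2": [5, 4, None],
    "q3": [6, 4, None],
    "q4": [5, 4, None],
    "q5": [None, 4, None],
})

likert = ["q1", "q2", "q3", "q4", "q5"]

# Share of unanswered Likert items per respondent.
df["missing_share"] = df[likert].isna().mean(axis=1)

# Keep respondents with limited missingness (e.g. one skipped item);
# drop those with large gaps (>= 40% unanswered in this sketch).
kept = df[df["missing_share"] < 0.40]
dropped = df[df["missing_share"] >= 0.40]

print(kept["id"].tolist())     # [1, 2]
print(dropped["id"].tolist())  # [3]
```

Respondent 1 skipped only the final item and is retained; respondent 3 left most of the questionnaire blank and is removed, mirroring the keep-or-delete decision described above.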
Next comes “respondent misconduct,” where answers show suspicious uniformity. On a 1–7 Likert scale, it’s normal to see variation across items, because people rarely feel exactly the same way for every question. When a respondent selects nearly the same option for every item, it raises the likelihood that the person is not reading and is instead clicking through mechanically. To catch this, the transcript recommends adding attention checks to the survey—such as items that ask respondents to select a specific number on the 1–7 scale, or using reverse-coded questions that should produce different patterns if the respondent is actually processing the items.
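Such checks could be scored after data collection along these lines. The item names `ac1` and `q2r`, the sample data, and the reverse-coding gap threshold are hypothetical, not from the video:

```python
import pandas as pd

# Toy 1-7 Likert data. "ac1" is a hypothetical attention-check item whose
# instruction was "please select 4", and "q2r" is a reverse-coded twin of q2
# (an attentive answer to q2r should sit near 8 - q2 on a 1-7 scale).
df = pd.DataFrame({
    "id":  [1, 2, 3],
    "q1":  [6, 4, 7],
    "q2":  [5, 4, 7],
    "q2r": [3, 4, 7],
    "ac1": [4, 4, 7],
})

# Missed the explicit instruction?
df["failed_ac"] = df["ac1"] != 4

# Inconsistency on the reverse-coded pair: |q2 + q2r - 8| should be small
# for a respondent who is actually processing the items.
df["rc_gap"] = (df["q2"] + df["q2r"] - 8).abs()

# Flag respondents who failed the check or show a large reverse-coding gap
# (the gap threshold of 3 is an arbitrary illustration).
flagged = df[df["failed_ac"] | (df["rc_gap"] >= 3)]
print(flagged["id"].tolist())  # [3]
```

Note that a mid-scale straight-liner (respondent 2, all 4s) slips past the reverse-coded pair here, which is why the transcript pairs these checks with the standard-deviation screen described next.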
For a more quantitative screen, the transcript highlights using each respondent’s standard deviation across their Likert responses. Low variability is a red flag for straight-lining (answering the same way repeatedly). While SPSS can compute standard deviation, the workflow described uses Excel as a faster alternative: enter the standard deviation function in a new column, apply it across only the Likert items (excluding the respondent ID column), and then fill the formula down for all respondents. After calculating the standard deviation per row, sort the results from smallest to largest to find cases with extremely low variation.
The rule of thumb given in the transcript is to strongly consider deleting any respondent record with a standard deviation below 0.25, since such a value indicates little to no variation across the survey items. However, the transcript also stresses that there is no universal “golden rule”: the acceptable threshold depends on the survey’s size and context, and researchers should judge whether the respondent’s pattern is plausible. The key takeaway is not automatic deletion, but a structured decision process: remove clear dropouts, flag likely straight-liners, and use standard deviation plus attention checks to determine which records are valid enough to keep.
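The Excel workflow (row-wise standard deviation over the Likert columns, sorted ascending, inspected against the 0.25 rule of thumb) can be mirrored in pandas. This is a minimal sketch with made-up data; pandas' default sample standard deviation (`ddof=1`) matches Excel's STDEV.S:

```python
import pandas as pd

# Toy 1-7 Likert responses; column names are illustrative.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "q1": [4, 2, 7],
    "q2": [4, 5, 7],
    "q3": [4, 3, 7],
    "q4": [4, 6, 6],
    "q5": [4, 1, 7],
})

likert = ["q1", "q2", "q3", "q4", "q5"]

# Per-row sample standard deviation over the Likert items only
# (the "id" column is excluded, as in the Excel workflow).
df["sd"] = df[likert].std(axis=1)

# Sort smallest-first so near-zero-variation cases surface at the top,
# then flag rows below the 0.25 rule of thumb.
df = df.sort_values("sd")
df["straight_liner"] = df["sd"] < 0.25

print(df[["id", "sd", "straight_liner"]])
```

Respondent 1 (all 4s, sd = 0) is flagged; respondent 3 (mostly 7s with one 6, sd ≈ 0.45) sits above the cutoff, illustrating why the threshold is a warning sign to inspect rather than an automatic deletion rule.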
Cornell Notes
Survey data cleaning can target two main problems: incomplete responses and respondent misconduct. Incomplete cases can be found by sorting the last survey columns to identify rows where respondents stopped answering; keep records with only a small amount of missing data, but delete those with large gaps (e.g., 40–50% unanswered). Misconduct often appears as “straight-lining” on Likert scales, where a respondent picks nearly the same number for every item. Attention checks (specific-number prompts and reverse-coded items) help detect this behavior. For a quantitative screen, compute each respondent’s standard deviation across Likert items in Excel (excluding the ID column); values below 0.25 are a strong warning sign, though the threshold should be judged based on survey context.
- How can researchers quickly identify respondents who abandoned a questionnaire partway through?
- What pattern on a 1–7 Likert scale suggests respondent misconduct?
- What are attention checks, and how do they help detect misconduct?
- How can standard deviation be used to flag straight-lining in survey responses?
- What threshold is recommended for standard deviation, and why isn’t it automatic?
Review Questions
- What decision rule should be applied when a respondent skips only the last one or two survey items versus skipping 40–50% of the questionnaire?
- Why does low standard deviation across Likert items often indicate misconduct, and how would you calculate it in Excel?
- What kinds of attention checks (e.g., specific-number prompts or reverse-coded items) would you add to a 1–7 Likert survey to detect straight-lining?
Key Points
1. Sort the last survey columns to identify incomplete rows that signal early dropout.
2. Keep responses when missingness is limited (such as skipping only the last one or two items), but delete records with large missing portions (around 40–50%).
3. Treat near-identical Likert responses across all items as a misconduct red flag because real attitudes typically vary across questions.
4. Add attention checks, including specific-number selection prompts and reverse-coded questions, to verify respondents are reading.
5. Compute each respondent’s standard deviation across Likert items (excluding the ID column) to quantify straight-lining.
6. Use a standard deviation threshold around 0.25 as a strong warning sign, while still making context-dependent judgments rather than applying a universal rule.