How to transcribe interviews
Based on Qualitative Researcher Dr Kriukow's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Transcription decisions should be driven by the research question and the planned analysis method, not by universal rules.
Briefing
Qualitative transcription isn’t a one-size-fits-all task. The level of detail (what gets written down, what gets omitted, and how speech is cleaned up) should be chosen based on the study’s goals and the kind of analysis planned, because different research questions require different kinds of evidence. Attempts to create universal transcription rules have repeatedly failed, largely because qualitative research varies so widely. The choice also matters because transcription decisions can change what later arguments seem to support.
For most qualitative work that focuses on content (participants’ experiences, views, beliefs, and narratives), transcription can stay relatively readable and “clean.” In these studies, the transcript functions primarily as a record that helps researchers remember and understand what was said so they can code it and analyze themes, grounded theory categories, or phenomenological meaning. Delivery details such as stutters, “ums,” and pauses, along with nonverbal cues, are therefore usually not essential. Still, researchers should not discard them automatically: if something stands out as analytically relevant, such as an unusually long pause or visible anxiety, it should be indicated consistently, for example in brackets.
The main exception is discourse analysis, where the research question targets how meaning is constructed through language and interaction, not just what participants report. In discourse analysis, pauses, hesitations, stutters, and other interactional signals can be part of the evidence. That pushes transcription toward a much more fine-grained record, sometimes including line-by-line detail, because the analysis depends on timing, phrasing, and the structure of talk.
To help researchers justify transcription decisions later, the guidance offers a practical continuum between two extremes: “naturalism” and “denaturalism.” Naturalism treats speech as raw data, preserving every hesitation and filler word so the transcript mirrors natural conversation. Denaturalism, by contrast, cleans up speech so it reads smoothly and foregrounds intelligible content. Most researchers land somewhere in the middle: they lean toward cleaned-up transcripts for efficiency and selectively retain or mark details that carry interpretive weight.
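For researchers who work from a detailed first-pass transcript, moving along this continuum can be partly scripted. The sketch below is one possible illustration, not a method described in the video. It assumes timed pauses are written as seconds in parentheses, such as `(4.2)`, a convention borrowed from Jefferson-style notation. The filler list, the 3-second threshold, and the function name `denaturalize` are all hypothetical choices that a real study would set for itself.

```python
import re

# Hypothetical filler list; each study would define its own.
FILLERS = ("um", "uh", "erm", "er")
FILLER_RE = re.compile(r"\b(?:" + "|".join(FILLERS) + r")\b,?", re.IGNORECASE)

# Assumed notation: "(2.5)" marks 2.5 seconds of silence.
PAUSE_RE = re.compile(r"\((\d+(?:\.\d+)?)\)")


def denaturalize(line: str, long_pause: float = 3.0) -> str:
    """Clean a naturalistic line but keep long pauses as bracketed notes."""

    def keep_or_drop(match: re.Match) -> str:
        seconds = float(match.group(1))
        # Pauses at or above the threshold are treated as analytically
        # notable and marked in brackets; shorter ones are dropped.
        if seconds >= long_pause:
            return f"[long pause, {seconds:g}s]"
        return ""

    text = PAUSE_RE.sub(keep_or_drop, line)
    text = FILLER_RE.sub("", text)               # drop filler words
    text = re.sub(r"\s+", " ", text).strip()     # collapse leftover spaces
    text = re.sub(r"\s+([,.?!])", r"\1", text)   # no space before punctuation
    return text
```

Given the line `Um, I was (0.8) happy there, uh, until (4.2) the manager left.`, the function returns `I was happy there, until [long pause, 4.2s] the manager left.` Lowering `long_pause` moves the output toward the naturalistic end of the continuum, and raising it moves the output toward the denaturalized end. A script like this cannot judge analytic relevance, so the result still needs to be checked by a person against the recording.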
A workplace example illustrates the stakes. In a cleaned-up transcript, a colleague’s comment about having a partner can appear friendly and straightforward. Under a more naturalistic transcription approach, the same exchange can include hesitation, awkward laughter, and nervous glances—signals that shift the perceived meaning from casual to uncomfortable or even “creepy.” The key takeaway is that transcription choices shape context and interpretation, so researchers should use judgment grounded in their study aims.
Ultimately, transcription is framed as a decision process: use common sense as the starting point, then align the transcript’s level of detail with the analysis method and the evidence needed to answer the research question. Consistency in how cues are marked matters, but there is no universal “correct” approach—only a defensible one tailored to the study.
Cornell Notes
Transcription detail should match the research question and analysis method, not a universal template. Discourse analysis typically requires preserving interactional features—pauses, hesitations, stutters, and other delivery cues—because meaning is built through how people talk. Most other qualitative approaches (thematic analysis, grounded theory, phenomenology) focus on content, so transcripts often can be cleaned up and kept readable, with only notable nonverbal or interactional moments marked. A useful way to justify choices is to think along a continuum between naturalism (raw, detailed speech) and denaturalism (cleaned, intelligible speech). The “right” transcript is the one that captures what is needed for later coding and interpretation, consistently and defensibly.
Why do universal transcription guidelines tend to fail in qualitative research?
What transcription level is usually appropriate when the study focuses on content rather than interaction?
How does discourse analysis change what must be transcribed?
What is the naturalism vs. denaturalism continuum, and how does it guide transcription decisions?
How can transcription choices change the perceived meaning of the same interview exchange?
How should researchers justify their transcription choices later in their work?
Review Questions
- In a content-focused thematic analysis, what types of nonverbal or delivery cues would you include, and why?
- How would transcription requirements differ between a discourse analysis and a phenomenological study of lived experience?
- Where on the naturalism–denaturalism continuum would you place your transcript, and what evidence from your research question would justify that choice?
Key Points
1. Transcription decisions should be driven by the research question and the planned analysis method, not by universal rules.
2. Discourse analysis typically requires preserving interactional details like pauses, hesitations, and stutters because meaning is constructed through talk.
3. Most content-focused qualitative analyses can use cleaner, more readable transcripts, since coding often targets what participants say rather than how they deliver it.
4. Notable exceptions, such as unusually long pauses or visible anxiety, should be included or marked consistently, even in content-focused studies.
5. Naturalism and denaturalism form a continuum: naturalism preserves raw speech detail, while denaturalism cleans speech for intelligibility.
6. Transcription choices can change interpretation, so researchers should select the level of detail that best supports defensible later claims.
7. Consistency in notation (e.g., using brackets to flag cues) improves transparency and makes justification easier.