
How to transcribe interviews

5 min read

Based on qualitative researcher Dr Kriukow's video on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.

TL;DR

Transcription decisions should be driven by the research question and the planned analysis method, not by universal rules.

Briefing

Qualitative transcription isn’t a one-size-fits-all task. The level of detail—what gets written down, what gets omitted, and how speech is cleaned up—should be chosen based on the study’s goals and the kind of analysis planned, because different research questions require different kinds of evidence. Attempts to create universal transcription rules have repeatedly failed, largely because qualitative research varies widely, and transcription choices can change what later arguments seem to support.

For most qualitative work that focuses on content (participants’ experiences, views, beliefs, and narratives), transcription can stay relatively readable and “clean.” In these studies, the transcript primarily functions as a record to help researchers remember and understand what was said so they can code and analyze themes, grounded theory categories, or phenomenological meaning. That typically means nonverbal cues—stutters, “ums,” pauses, and other delivery details—are not essential. Still, researchers should not ignore them automatically: if something stands out as analytically relevant—such as an unusually long pause or visible anxiety—then those cues should be indicated consistently, for example using brackets.

The main exception is discourse analysis, where the research question targets how meaning is constructed through language and interaction, not just what participants report. In discourse analysis, pauses, hesitations, stutters, and other interactional signals can be part of the evidence. That pushes transcription toward a much more fine-grained record, sometimes including line-by-line detail, because the analysis depends on timing, phrasing, and the structure of talk.

To justify transcription decisions later, the guidance centers on a practical continuum between two extremes: “naturalism” and “denaturalism.” Naturalism treats speech as raw data, preserving every hesitancy and filler word so the transcript mirrors natural conversation. Denaturalism, by contrast, cleans up speech so it reads smoothly and focuses on intelligible content. Most researchers land somewhere in the middle—leaning toward cleaned-up transcripts for efficiency—while selectively retaining or marking details when they carry interpretive weight.
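
As a loose illustration of that continuum (not from the video), the denaturalizing end can even be mechanized: a short script that strips filler words and bracketed cues from a naturalistic transcript line, leaving the readable "clean" version. The filler list and the bracket convention for cues are assumptions made for this example, not a standard.

```python
import re

# Hypothetical filler inventory; real studies would define their own.
FILLERS = {"um", "uh", "er", "mm"}

def denaturalize(line: str) -> str:
    """Naively 'clean' one naturalistic transcript line."""
    # Drop bracketed cues such as [pause] or [laughs] (assumed convention).
    line = re.sub(r"\[[^\]]*\]", "", line)
    # Drop standalone filler words, ignoring case and trailing punctuation.
    words = [w for w in line.split() if w.strip(",.").lower() not in FILLERS]
    return " ".join(words)

naturalistic = "Um, I [pause] I guess it was, uh, fine?"
print(denaturalize(naturalistic))
```

Note what the cleaned output loses: the hesitation and the marked pause disappear, while the repeated "I" (a stutter) survives only by accident. That information loss is exactly why a discourse analyst would stay closer to the naturalistic end.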

A workplace example illustrates the stakes. In a cleaned-up transcript, a colleague’s comment about having a partner can appear friendly and straightforward. Under a more naturalistic transcription approach, the same exchange can include hesitation, awkward laughter, and nervous glances—signals that shift the perceived meaning from casual to uncomfortable or even “creepy.” The key takeaway is that transcription choices shape context and interpretation, so researchers should use judgment grounded in their study aims.

Ultimately, transcription is framed as a decision process: use common sense as the starting point, then align the transcript’s level of detail with the analysis method and the evidence needed to answer the research question. Consistency in how cues are marked matters, but there is no universal “correct” approach—only a defensible one tailored to the study.

Cornell Notes

Transcription detail should match the research question and analysis method, not a universal template. Discourse analysis typically requires preserving interactional features—pauses, hesitations, stutters, and other delivery cues—because meaning is built through how people talk. Most other qualitative approaches (thematic analysis, grounded theory, phenomenology) focus on content, so transcripts often can be cleaned up and kept readable, with only notable nonverbal or interactional moments marked. A useful way to justify choices is to think along a continuum between naturalism (raw, detailed speech) and denaturalism (cleaned, intelligible speech). The “right” transcript is the one that captures what is needed for later coding and interpretation, consistently and defensibly.

Why do universal transcription guidelines tend to fail in qualitative research?

Qualitative studies vary in purpose, interactional context, and analytic goals, so one fixed rule set can’t fit every design. Because transcription choices affect what evidence later analysis can use, a universal framework would either over-collect detail (wasting effort) or under-collect it (missing meaning). The guidance emphasizes that transcription must be tailored to the study’s aims and the kind of analysis planned.

What transcription level is usually appropriate when the study focuses on content rather than interaction?

When analysis targets what participants say—experiences, views, beliefs, and narratives—transcripts can prioritize readability and comprehension. Nonverbal cues like “ums,” stutters, and most pauses are often unnecessary for coding. Researchers should still include or mark exceptional moments that seem analytically meaningful (e.g., an unusually long pause or visible anxiety), using consistent notation such as brackets.

How does discourse analysis change what must be transcribed?

Discourse analysis treats how participants construct meaning through language and interaction as the evidence. That means transcription should capture fine-grained features: timing between utterances, hesitations, stutters, and other interactional cues. The transcript becomes a tool for analyzing narrative construction and the structure of talk, not just the content of statements.

What is the naturalism vs. denaturalism continuum, and how does it guide transcription decisions?

Naturalism preserves speech in a raw, detailed form—every hesitation, filler, and delivery feature—so the transcript mirrors natural conversation. Denaturalism cleans speech to make it more intelligible and readable, removing details that can clutter meaning for content-focused analysis. Most practical transcripts sit between these extremes, keeping the transcript “clean” by default while selectively retaining or marking details that matter for interpretation.

How can transcription choices change the perceived meaning of the same interview exchange?

A workplace example shows how. A cleaned transcript can present a colleague’s comment as friendly and normal. A more naturalistic transcript adds hesitation, awkward laughter, eye contact, and nervous glances, shifting the interpretation toward discomfort or creepiness. The lesson is that transcription isn’t neutral; it can alter context and therefore later analytic conclusions.

How should researchers justify their transcription choices later in their work?

Justification should tie directly to the study’s goals and the analysis method. Researchers can argue that the chosen level of detail was necessary to capture the evidence required for their analytic claims—content-focused approaches need less interactional detail, while discourse analysis needs more. Consistency in how notable cues are marked (e.g., brackets) supports transparency and defensibility.

Review Questions

  1. In a content-focused thematic analysis, what types of nonverbal or delivery cues would you include, and why?
  2. How would transcription requirements differ between a discourse analysis and a phenomenological study of lived experience?
  3. Where on the naturalism–denaturalism continuum would you place your transcript, and what evidence from your research question would justify that choice?

Key Points

  1. Transcription decisions should be driven by the research question and the planned analysis method, not by universal rules.
  2. Discourse analysis typically requires preserving interactional details like pauses, hesitations, and stutters because meaning is constructed through talk.
  3. Most content-focused qualitative analyses can use cleaner, more readable transcripts, since coding often targets what participants say rather than how they deliver it.
  4. Notable exceptions—such as unusually long pauses or visible anxiety—should be included or marked consistently, even in content-focused studies.
  5. Naturalism and denaturalism form a continuum: naturalism preserves raw speech detail, while denaturalism cleans speech for intelligibility.
  6. Transcription choices can change interpretation, so researchers should select the level of detail that best supports defensible later claims.
  7. Consistency in notation (e.g., using brackets to flag cues) improves transparency and makes justification easier.

Highlights

Universal transcription guidance is largely impossible because qualitative studies differ too much in goals and analytic needs.
Discourse analysis demands fine-grained transcription of interactional features; content-focused studies usually do not.
A naturalistic transcript can transform a seemingly normal exchange into one that signals discomfort or awkwardness.
Most researchers should aim for a middle position on the naturalism–denaturalism continuum, adding detail only when it matters for analysis.