Coding and thematic analysis explained in 5 minutes
Based on Qualitative Researcher Dr Kriukow's video on YouTube.
Briefing
Thematic analysis turns messy qualitative material into research-ready findings by coding the data first and then reorganizing those codes into themes that directly answer a research question. The core workflow is simple in concept: label what participants say or do in manageable chunks, review and clean up the code set as patterns appear, and finally group the organized codes into themes, the recurring topics and patterns that explain what the data means.
The process begins with coding, which is often intimidating only because of the name. In practice, coding means attaching short, specific labels to segments of data to reduce volume and make the material easier to work with. For interview data, that typically means describing what is said in a line, sentence, or paragraph. The goal is not to summarize too early, but to build a reliable “table of contents” of the dataset—so the researcher can return to the code list instead of rereading every transcript.
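As a minimal sketch of the “table of contents” idea, the snippet below stores a few coded segments and builds an index from each code to the places it appears. The transcript identifiers, excerpts, and code labels are invented for illustration and do not come from the video; in practice this bookkeeping is usually done in qualitative analysis software or a spreadsheet.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One chunk of data (a line, sentence, or paragraph) with its code label."""
    source: str  # hypothetical transcript identifier
    text: str    # the excerpt being labeled
    code: str    # short, specific label chosen by the researcher


# Hypothetical coded excerpts, invented for illustration only.
segments = [
    Segment("teacher_01", "I could not get half the class to log in.", "technical problems"),
    Segment("teacher_01", "I saved two hours a day by not commuting.", "no commute"),
    Segment("teacher_02", "The video platform kept crashing.", "technical problems"),
]


def build_code_index(coded):
    """Map each code to the segments that carry it, like a table of contents."""
    index = defaultdict(list)
    for seg in coded:
        index[seg.code].append(seg)
    return dict(index)


code_index = build_code_index(segments)
for code, hits in code_index.items():
    print(f"{code}: {[s.source for s in hits]}")
```

With an index like this, the researcher can look up every excerpt behind a code without rereading whole transcripts.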
As coding progresses—either across multiple files or after finishing a single dataset—patterns start to show up. Codes may repeat frequently, or codes may share characteristics that suggest they belong under broader ideas. At this point, the coding framework often looks messy, which is normal. The next step is to review and organize the codes in a common-sense way, essentially tidying the code list into workable groupings. For example, in a study about teachers’ experiences during the pandemic, a researcher might notice multiple codes related to “struggles and challenges” and group them together, while also creating separate groups for “perceived benefits” or “coping strategies.” This organization helps the researcher make sense of what the dataset is telling them.
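The tidying step can be sketched roughly as follows, assuming the codes have already been applied. The code labels and their assignment to groups are hypothetical; the grouping itself is a judgment the researcher makes, and the script only counts occurrences and flags codes that have not yet been placed anywhere.

```python
from collections import Counter

# Hypothetical codes applied across transcripts, one entry per coded segment.
applied_codes = [
    "technical problems", "isolation", "technical problems",
    "no commute", "peer support", "isolation", "technical problems",
]

# Researcher-defined groups; these assignments are decided by hand, not computed.
code_groups = {
    "struggles and challenges": ["technical problems", "isolation"],
    "perceived benefits": ["no commute"],
    "coping strategies": ["peer support"],
}

code_counts = Counter(applied_codes)


def summarize_groups(groups, counts):
    """Total how many coded segments fall under each group."""
    return {group: sum(counts[c] for c in codes) for group, codes in groups.items()}


def ungrouped_codes(groups, counts):
    """Return codes that were applied but not yet assigned to any group."""
    grouped = {c for codes in groups.values() for c in codes}
    return sorted(set(counts) - grouped)


group_totals = summarize_groups(code_groups, code_counts)
leftover = ungrouped_codes(code_groups, code_counts)
print(group_totals)
print("still to organize:", leftover)
```

Counts only show where the material clusters. They do not decide which groups matter for the research question.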
Only after codes are organized does thematic development begin. Although people talk about “emerging themes,” themes do not simply appear on their own. The researcher has to actively develop them by examining the grouped codes, comparing them against the research questions, and deciding what the codes collectively demonstrate. Themes function as the study’s interpretive layer: they are topics and patterns that show how the data answers what the research set out to investigate.
The final aim is to build a thematic framework that includes everything needed to tell the reader the story of the data. The framework should draw on the tools already built, the codes, so the reader can see what the researcher knows, how that knowledge is grounded in the coded material, and how the resulting themes respond to the research questions. In short, thematic analysis is a structured path from detailed labeling to organized interpretation, designed to produce clear, defensible answers from qualitative data.
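One way to picture such a framework is as a structure that links each theme to the research question it addresses and to the codes and excerpts behind it. The sketch below is an illustrative assumption: the theme names, research questions, and excerpts are hypothetical, and the themes are written by the researcher, not generated by the code.

```python
from dataclasses import dataclass, field


@dataclass
class Theme:
    """A researcher-developed theme tied to a research question and its codes."""
    name: str
    research_question: str
    codes: list = field(default_factory=list)


# Hypothetical excerpts filed under each code, invented for illustration.
coded_excerpts = {
    "technical problems": ["The video platform kept crashing."],
    "isolation": ["I missed talking to colleagues in the hallway."],
    "peer support": ["We set up a group chat to share lesson ideas."],
}

framework = [
    Theme(
        name="Remote teaching strained daily work",
        research_question="What challenges did teachers face during the pandemic?",
        codes=["technical problems", "isolation"],
    ),
    Theme(
        name="Teachers relied on each other to cope",
        research_question="How did teachers cope with those challenges?",
        codes=["peer support"],
    ),
]


def evidence_for(theme, excerpts):
    """Collect the excerpts behind a theme's codes so a claim can be traced."""
    return {code: excerpts.get(code, []) for code in theme.codes}


def unsupported_themes(themes, excerpts):
    """List themes that have no excerpts behind any of their codes."""
    return [t.name for t in themes if not any(excerpts.get(c) for c in t.codes)]


for theme in framework:
    print(theme.research_question, "->", theme.name, evidence_for(theme, coded_excerpts))
print("themes lacking evidence:", unsupported_themes(framework, coded_excerpts))
```

A check like `unsupported_themes` gives a quick way to spot any theme that is not yet grounded in coded material.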
Cornell Notes
Thematic analysis relies on a disciplined sequence: code first, then revise and organize codes, and finally develop themes that answer the research question. Coding means labeling segments of qualitative data (often interview lines, sentences, or paragraphs) with short names that reduce volume and create a “table of contents” for the dataset. As coding continues, repeating or related codes reveal patterns, but the code set may look messy and needs cleanup through common-sense grouping (e.g., challenges vs. benefits vs. coping strategies). Themes are not automatic; they are developed by mapping organized codes to the research questions and deciding what the dataset collectively demonstrates.
Why does thematic analysis always start with coding, and what does “coding” practically mean?
How should a researcher handle a coding framework that looks messy?
What triggers the move from coding to theme development?
What does it mean to “develop” themes rather than assume they will emerge?
How does the thematic framework ensure the reader understands the study’s conclusions?
Review Questions
- What are the concrete steps from coding to themes in thematic analysis, and what changes at each stage?
- How can a researcher decide when to reorganize codes, and what is an example of a common-sense grouping?
- Why are themes described as requiring development rather than simply emerging from the data?
Key Points
1. Thematic analysis follows a sequence: code first, then revise and organize codes, then develop themes that answer the research question.
2. Coding means labeling segments of qualitative data (often interview lines, sentences, or paragraphs) with specific names to reduce volume and improve manageability.
3. A well-built code list acts like a “table of contents,” helping researchers rely on codes instead of rereading full transcripts.
4. As coding progresses, repeating or related codes reveal patterns, but the code set may look messy and should be cleaned up through common-sense grouping.
5. Themes are not automatic; they are actively developed by mapping organized codes to the research questions and determining what patterns the dataset demonstrates.
6. The final thematic framework should tell a coherent story grounded in codes, making it clear what the researcher knows and how it answers the study’s aims.