Codes, Categories and Themes - Understand the difference
Based on Qualitative Researcher Dr Kriukow's video on YouTube.
Codes are the most specific analytic labels; categories are broader groupings; themes are the most inclusive, abstract organizing ideas.
Briefing
Codes, categories, and themes form a hierarchy—starting with the most specific “codes” and moving up to broader “categories” and the most inclusive “themes.” The practical problem is that researchers and supervisors often use these labels differently, sometimes even treating them as interchangeable in write-ups of findings. That mismatch matters because it affects how results are organized, how readers interpret the analysis, and how easily a student can defend their structure during supervision.
In the clearest, “ideal world” version, analysis begins with codes: small, specific labels attached to particular pieces of text. Codes function as analytic tools during data sorting—often created in qualitative software as the researcher tags excerpts. As analysis progresses, related codes are grouped into more inclusive units. Categories sit above codes, bundling several codes under a broader, less specific concept. Themes sit at the top of the hierarchy, representing the most abstract, overarching ideas that organize the study’s findings.
Yet published work doesn’t always follow that clean structure. Some authors use all three terms when presenting results, and some supervisors may push students toward categories and themes (and sometimes codes) at the findings stage. An example given for online education challenges illustrates the hierarchy: a theme like “challenges students face in online education” could include categories such as “personal challenges” and “institutional challenges.” Those categories, in turn, could be supported by codes such as “financial challenges” or “lack/shortage of teachers.”
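The online education example can be sketched as a small nested structure. This is only an illustration: the labels come from the example above, but the nested-dict layout, the `describe` helper, and the assignment of each code to a particular category are assumptions made for this sketch, not something the video specifies.

```python
# Illustrative only: labels are taken from the online education example.
# Which code sits under which category is an assumption for this sketch.
framework = {
    "challenges students face in online education": {        # theme
        "personal challenges": ["financial challenges"],      # category -> codes
        "institutional challenges": ["lack/shortage of teachers"],
    },
}


def describe(framework):
    """Return one line per label, indented by its level in the hierarchy."""
    lines = []
    for theme, categories in framework.items():
        lines.append(f"theme: {theme}")
        for category, codes in categories.items():
            lines.append(f"  category: {category}")
            for code in codes:
                lines.append(f"    code: {code}")
    return lines


print("\n".join(describe(framework)))
```

Reading the printed outline from the top down moves from the most inclusive label to the most specific one, which is the direction of the hierarchy described above.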
Confusion persists largely because the key distinction is one of timing and purpose. Codes are typically meant for the analysis stage: they summarize what participants say, are attached to extracts, and are used to build a coding framework. Once the final structure is ready for the dissertation or paper, the write-up should focus on themes (and whatever sub-levels the researcher uses), not on presenting raw analytic ingredients as if they were the final findings. To make that shift intuitive, the transcript uses a cooking analogy: codes are like ingredients while working (e.g., yeast, sugar, salt), but the final product is the theme (e.g., pizza). In other words, codes matter internally for building the framework; themes matter externally for reporting.
Where categories fit depends on the researcher’s chosen terminology. The transcript emphasizes consistency over universal correctness: different methodologies and authors may adopt different label systems. In Dr Kriukow’s own approach, categories are intentionally avoided in favor of themes and sub-themes. “Benefits” and “challenges” become themes; “personal” and “institutional” challenges become sub-themes; and any further detail is handled as additional sub-themes at deeper levels, all still called “sub-themes,” rather than by introducing “sub-sub-themes.” The takeaway is not that any one approach is inherently wrong, but that students should use one coherent set of terms, apply it consistently, and be prepared to justify it when supervisors request a different vocabulary. That justification matters because the same underlying concepts can be described with different labels across studies and across traditions such as grounded theory.
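One way to picture the point about consistency is that the hierarchy stays the same while only the labels change. The sketch below is hypothetical: the system names `themes_categories_codes` and `themes_subthemes` and the `label_for` helper are invented for illustration, and the idea of mapping a depth to a label is an assumption rather than a method from the video.

```python
# Hypothetical sketch: the same depth in a hierarchy gets a different label
# depending on which terminology system the researcher has chosen.
LEVEL_LABELS = {
    "themes_categories_codes": ["theme", "category", "code"],
    "themes_subthemes": ["theme", "sub-theme"],
}


def label_for(depth, system):
    """Return the label used at a given depth (0 = top) under a system."""
    labels = LEVEL_LABELS[system]
    # Deeper levels reuse the last label, so no "sub-sub-theme" is introduced.
    return labels[min(depth, len(labels) - 1)]


for depth in range(3):
    print(depth,
          label_for(depth, "themes_categories_codes"),
          label_for(depth, "themes_subthemes"))
```

Whichever system is chosen, applying a single mapping everywhere is what keeps the write-up consistent and makes it easier to explain how one vocabulary corresponds to another.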
Cornell Notes
Codes, categories, and themes can be arranged from most specific to most inclusive: codes are the smallest analytic labels, categories group related codes, and themes sit at the top as the most abstract organizing ideas. A major source of confusion is that some researchers and supervisors use these terms differently when reporting findings. Codes are typically tools used during data analysis (often inside software) and are most useful for building a coding framework; once writing up results, the focus shifts to themes and their sub-levels. One consistent approach described here uses themes and sub-themes only, intentionally avoiding “categories” to keep terminology simple and defensible. The core lesson: choose a terminology system and apply it consistently, then be ready to explain it to supervisors.
What is the most common hierarchy among codes, categories, and themes, and what changes as you move up it?
Why do codes often disappear from the final findings write-up even though they are central to analysis?
How does the cooking analogy clarify the code-to-theme transition?
What does it mean to use themes and sub-themes instead of categories, and how deep can the hierarchy go?
How should a student respond when a supervisor asks for categories and themes instead of the student’s terminology?
Review Questions
- If codes are primarily analytic tools, what should be the main focus when writing up findings, and why?
- In the online education example, how would you map a theme to categories (or sub-themes) and then to codes?
- What are the risks of mixing terminology systems (e.g., using categories in one place and not in another), and how does consistency address them?
Key Points
1. Codes are the most specific analytic labels; categories are broader groupings; themes are the most inclusive, abstract organizing ideas.
2. Codes are typically used during data analysis (including inside qualitative software) to build a coding framework, not as the primary unit of the final findings narrative.
3. Once the coding framework is complete, reporting usually shifts toward themes and sub-themes rather than presenting codes as if they were the final results.
4. Terminology varies across authors and methodologies, so “correctness” often depends on choosing a system and applying it consistently.
5. A practical approach described here uses themes and sub-themes only, intentionally avoiding “categories” to keep the hierarchy simple and defensible.
6. When supervisors use different terminology, students should be ready to justify their choices and explain how the labels correspond to the same underlying concepts.