How codes become themes (Part 2) | Generating thematic framework

TL;DR

Themes should be built to answer the study’s research questions, not just to reorganize codes mechanically.

Briefing

The central takeaway is that themes don’t emerge from codes in a single “correct” way; they’re shaped by the study’s research questions, and the resulting thematic framework must be immediately understandable to a reader. The shift from codes to themes happens when analysis starts asking, “Which parts of the coded data answer what the study is trying to find?” Answering that question drives the decisions to merge codes, rename them, or delete items that don’t help address the research questions.

Using a hypothetical study about mental health access and provision in Saudi Arabia, the analysis begins with a messy middle stage: a set of codes that are partly organized but still unclear. Codes include effects of mental health on social, family, professional, and psychological well-being; obstacles such as stigma, cost of treatment, and lack of knowledge; and “good things” that help, like supportive family, empathetic doctors, friends who can be talked to, and self-education through books or online materials. The key point is that this mixture is common—students often feel stuck—yet it may take only a short, focused restructuring effort to turn codes into a coherent thematic framework.

Before building frameworks, the guidance sets two reader-facing rules. First, the thematic framework must signal what the study is about: by looking at the themes a reader should understand the target of the research. Second, themes must be self-explanatory: theme labels should clarify what they refer to without requiring insider knowledge.

A common failure mode is keeping overly vague labels. “Cost of treatment,” for instance, is treated as a code-level label that doesn’t communicate meaning on its own. Moving it under a broader, reader-friendly theme like “obstacles/negative experiences” makes the framework interpretable. Similarly, “effects” scattered across multiple codes can be consolidated into a single theme such as “effects on mental health,” with sub-themes for social, family, professional, and psychological impacts.
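The regrouping described above can be sketched as a small data structure. This is only an illustration, not the workflow of any particular tool: the function name `build_framework` and the exact wording of the code labels are assumptions, while the theme names follow the examples in the text.

```python
# Illustrative code labels based on the example study (wording is assumed).
codes = [
    "social impacts",
    "family impacts",
    "professional impacts",
    "psychological impacts",
    "stigma",
    "cost of treatment",
    "lack of knowledge",
]

# The analyst's decision about which reader-friendly theme each code belongs to.
assignment = {
    "social impacts": "effects on mental health",
    "family impacts": "effects on mental health",
    "professional impacts": "effects on mental health",
    "psychological impacts": "effects on mental health",
    "stigma": "obstacles/negative experiences",
    "cost of treatment": "obstacles/negative experiences",
    "lack of knowledge": "obstacles/negative experiences",
}


def build_framework(codes, assignment):
    """Group code-level labels under the theme each one was assigned to."""
    framework = {}
    for code in codes:
        theme = assignment[code]  # every code must be placed somewhere
        framework.setdefault(theme, []).append(code)
    return framework


framework = build_framework(codes, assignment)
for theme, sub_themes in framework.items():
    print(theme, "->", sub_themes)
```

In this sketch "cost of treatment" no longer stands alone; it appears as a sub-theme whose role is clear from the theme above it.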

From there, two alternative thematic frameworks are built to demonstrate flexibility, and both are considered valid. Version 1 aligns tightly with the research question about improvement: “good things” and supportive experiences are transformed into “ways to improve mental health services and availability.” Participant statements about knowledgeable doctors, supportive family, and self-education are reinterpreted as actionable implications (e.g., ensuring professionals are educated, raising public awareness, and reducing shame). In this version, items that don’t directly help answer the questions, such as “first symptoms” or “history of illness in the family,” may be deleted or repurposed.

Version 2 takes a different route. Instead of converting supportive experiences into “ways to improve,” it keeps them as “factors that benefit mental health access and provision,” then categorizes them as internal versus external. External factors include supportive family, empathetic or knowledgeable doctors, and female doctors; internal factors include learning through books/online materials and shifting mindsets away from shame. Version 2 also keeps “history of illness in the family,” treating it as a likely positive influence because prior exposure may reduce stigma and increase acceptance.
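One way to see that both versions are valid readings of the same codes is to express each as a mapping from code to (theme, sub-theme), with `None` marking a deleted item. The sketch below is hypothetical: the helper `build`, the constant names, and the "uncategorized" bucket are assumptions, and the text does not say whether "history of illness in the family" counts as internal or external in Version 2. Only mappings stated in the text are included for Version 1.

```python
from collections import defaultdict

IMPROVE = "ways to improve mental health services and availability"
BENEFIT = "factors that benefit mental health access and provision"

# Version 1: supportive experiences become recommendations; None means deleted.
version_1 = {
    "knowledgeable doctors": (IMPROVE, "ensure professionals are educated about conditions"),
    "self-education through books/online materials": (IMPROVE, "raise public awareness"),
    "first symptoms": None,
    "history of illness in the family": None,
}

# Version 2: the same kind of codes stay as enabling factors, split internal/external.
version_2 = {
    "supportive family": (BENEFIT, "external"),
    "empathetic doctors": (BENEFIT, "external"),
    "knowledgeable doctors": (BENEFIT, "external"),
    "female doctors": (BENEFIT, "external"),
    "self-education through books/online materials": (BENEFIT, "internal"),
    "shifting mindset away from shame": (BENEFIT, "internal"),
    # Placement not stated in the text, so it is left uncategorized here (assumption).
    "history of illness in the family": (BENEFIT, "uncategorized"),
}


def build(mapping):
    """Turn a code -> (theme, sub-theme) mapping into a nested framework."""
    framework = defaultdict(lambda: defaultdict(list))
    for code, target in mapping.items():
        if target is None:
            continue  # the analyst decided this code does not serve the questions
        theme, sub_theme = target
        framework[theme][sub_theme].append(code)
    return {theme: dict(subs) for theme, subs in framework.items()}


framework_1 = build(version_1)
framework_2 = build(version_2)
```

The two results differ only because the interpretive decisions in the mappings differ; the building step itself is mechanical.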

Overall, the method emphasizes that theme construction is an interpretive, decision-driven process—guided by research questions, constrained by clarity for readers, and validated by whether the framework supports a coherent discussion of barriers, impacts, and improvement (or enabling factors) in the specific context studied.

Cornell Notes

Themes are built by reorganizing codes into a framework that directly answers the study’s research questions and is clear to a reader. A common mistake is using code-level labels (like “cost of treatment”) as themes; these need to be placed into meaningful categories such as “obstacles/negative experiences.” The process is flexible: two different thematic frameworks can both be correct if they are coherent and research-question-driven. In Version 1, supportive experiences are reframed as “ways to improve” services, while in Version 2 they remain “factors that benefit” access and are split into internal and external categories. The choice of what to keep, rename, merge, or delete depends on interpretation and how each decision supports the final discussion.

When does a code start becoming a theme in this approach?

The shift happens when analysis turns toward the research questions—deciding which codes are relevant, which should be merged, and which should be deleted because they don’t help answer the questions. There’s no single moment where a code automatically becomes a theme; instead, the research-question lens determines the restructuring.

Why is “cost of treatment” treated as a problem label, and what fixes it?

“Cost of treatment” is too vague for a reader: it signals a topic but not its meaning or role in the study. The fix is to relocate it under a broader, interpretable theme such as “obstacles/negative experiences,” so the reader can immediately see how it functions in the barriers to mental health access.

How does Version 1 convert participant “good things” into themes?

Version 1 aligns with the improvement-focused research question. Supportive experiences (empathetic doctors, knowledgeable professionals, supportive family, friends who can be talked to, and self-education) are reframed as “ways to improve mental health services and availability.” The method involves interpreting what participants said as implications—e.g., turning “knowledgeable doctors helped” into “ensure mental health professionals are educated about conditions,” and turning self-education into “raise awareness” for the public.

What changes in Version 2, and why is it still valid?

Version 2 keeps supportive experiences as “factors that benefit mental health access and provision” rather than forcing them into improvement recommendations. It then categorizes them as internal versus external (e.g., internal: learning from books/online materials and mindset shifts; external: supportive family, empathetic doctors, female doctors). It’s valid because it still produces a coherent framework that can be used in discussion, without requiring every code to become an action item.

Why might “history of illness in the family” be deleted in Version 1 but kept in Version 2?

In Version 1, it’s treated as not directly useful for answering the research questions, so it may be deleted. In Version 2, it’s kept because it can plausibly function as a positive factor: prior family experience with mental illness may reduce stigma, increase acceptance, and improve knowledge, thereby supporting access and provision.

Review Questions

What two reader-facing requirements should a thematic framework meet, and how do they influence theme naming?
Give one example of a code-level label that would likely need restructuring into a theme, and explain where it should go and why.
Compare Version 1 and Version 2: what interpretive decision changes between them, and how does that affect what gets kept or deleted?

Key Points

1. Themes should be built to answer the study’s research questions, not just to reorganize codes mechanically.
2. A thematic framework must quickly communicate what the study is about and what each theme refers to.
3. Code-level labels that lack meaning for outsiders (e.g., “cost of treatment”) should be placed under clearer, reader-friendly categories.
4. Two different thematic frameworks can both be correct if they are coherent and research-question-driven.
5. Version 1 reframes supportive experiences into improvement recommendations (“ways to improve”), while Version 2 treats them as enabling conditions (“factors that benefit”).
6. Internal/external categorization is a practical way to structure “good things” without forcing them into action-oriented wording.
7. Decisions to keep, rename, merge, or delete items should be justified by whether they help address barriers, impacts, and improvement (or enabling factors) in the specific context.

Highlights

The code-to-theme transition is guided by the research questions: relevance, merging, and deletion decisions happen when analysis starts asking what the study needs to answer.

“Cost of treatment” illustrates a common theme-label failure—moving it under “obstacles/negative experiences” makes it interpretable for readers.

Version 1 and Version 2 demonstrate that theme generation is flexible: supportive experiences can become either “ways to improve” or “factors that benefit,” depending on interpretation.

Internal vs external categorization can preserve participant meaning while still producing a structured thematic framework.

Items like “first symptoms” may be removed if they don’t help answer the research questions, while “history of illness in the family” may be kept when it plausibly functions as a positive influence.

Topics

Thematic Framework
Codes to Themes
Research Questions
Internal vs External
Theme Naming

Mentioned

NVivo