Qualitative data analysis - Coding Tutorial - Focused Codes | "From Codes to Themes" episode 2

TL;DR

Create a backup of initial codes before making changes, so deletions, renames, or merges can be reversed if needed.

Briefing

After initial coding produces a sprawling set of labels, focused coding turns that mess into a manageable, more coherent code system—often by reorganizing codes into broad categories (such as “good things” vs “bad things”) and then merging duplicates and renaming vague labels. The payoff is practical: fewer codes to track, clearer meaning behind each label, and a cleaner path toward later theme development.

The process starts with a deliberate safety net. Codes are first dumped into an “initial coding” folder as a backup, then copied into a “focused codes” folder where changes happen. That backup serves three purposes: it reduces stress when mistakes occur (deleting, renaming, or merging codes), it creates an audit trail for evidence—showing how later themes trace back to earlier codes—and it supports methodology write-ups by providing concrete examples of how the analytic decisions evolved.
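The backup-then-copy step can be sketched outside any particular tool. The snippet below is a minimal illustration in plain Python, not NVivo's interface: the dictionary layout and the excerpt strings are placeholder assumptions, and only the folder names and code labels come from the walkthrough.

```python
import copy

# Hypothetical code system: code label -> list of coded excerpts.
# The excerpts are placeholders, not data from the walkthrough.
initial_coding = {
    "most fun things": ["excerpt 1", "excerpt 2"],
    "enjoying danger": ["excerpt 3"],
}

# Deep copy, so edits to the working set never touch the backup.
focused_codes = copy.deepcopy(initial_coding)

# A change made in the working copy only: delete a code.
del focused_codes["enjoying danger"]
deleted_from_working_copy = "enjoying danger" not in focused_codes

# The backup is untouched, so the deletion can be reversed from it.
focused_codes["enjoying danger"] = list(initial_coding["enjoying danger"])
```

Because the working copy is a deep copy, the backup also stays available as the unedited record that an audit trail or methodology write-up can point to.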

Focused codes become necessary when the initial set grows overwhelming. In the example dataset, about 100 codes emerge from relatively simple interview material, and the problem isn’t just volume. As coding continues, researchers often create near-duplicate codes for the same idea, only realizing later that the overlap exists. Focused coding is the stage where those duplicates get cleaned up.

Timing is flexible. Focused coding can happen after all transcripts are initially coded, or earlier, after coding just three, four, or five transcripts, when the code list starts to feel unmanageable. The walkthrough describes the earlier option as “by definition” more focused, because codes get grouped, renamed, and combined while new coding continues.

The core organizing move is grouping codes into commonsense piles, not yet themes. A common starting point in this walkthrough is splitting everything into two broad buckets: “good things” and “bad things.” Positive and negative codes are dragged into the appropriate category, while background or factual codes may be kept temporarily. The method is intentionally subjective at this stage: group names can be anything that makes sense, and the goal is organization rather than theoretical precision.
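As a rough illustration of this sorting step, the sketch below files code labels into piles according to a manual assignment. It is plain Python rather than anything NVivo provides. The labels “long hours” and “participant age” and the “ungrouped” pile are invented for the example, and the assignment itself stands in for the researcher's subjective judgment.

```python
# Hypothetical code labels; "long hours" and "participant age" are made up.
codes = ["taking risks", "enjoying danger", "long hours", "participant age"]

# Manual, subjective assignment of codes to commonsense piles.
assignment = {
    "taking risks": "good things",
    "enjoying danger": "good things",
    "long hours": "bad things",
}

# Codes without an assignment (e.g. background or factual codes)
# are parked in a temporary "ungrouped" pile.
groups = {"good things": [], "bad things": [], "ungrouped": []}
for code in codes:
    groups[assignment.get(code, "ungrouped")].append(code)
```

The group names are only dictionary keys here, which mirrors the point that they can be anything that makes sense at this stage.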

Once codes sit inside “good” and “bad” groups, the work becomes refinement. Vague or misleading labels—like “most fun things”—are opened and checked against the underlying text, then replaced with more accurate codes (e.g., “taking risks”). Duplicates are merged (for instance, combining “enjoying danger” into “taking risks”), and sometimes a code is split when it seems to refer to different sources (personal risk-taking versus work or movie-set opportunities that enable risk). The walkthrough repeatedly emphasizes that quotes must not disappear: merging keeps the coded text attached to the new, cleaner label.
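The rule that quotes must not disappear can be expressed as a small merge operation. The following is a minimal sketch under stated assumptions: a code system modeled as a Python dictionary from label to excerpts, a hypothetical helper named merge_into, and placeholder excerpt strings. It does not reproduce NVivo's own merge feature.

```python
def merge_into(codes, source, target):
    """Move every excerpt from `source` to `target`, then drop `source`.

    If `target` does not exist yet, this behaves like a rename.
    Excerpts already present under `target` are not duplicated.
    """
    excerpts = codes.pop(source)
    merged = codes.setdefault(target, [])
    for excerpt in excerpts:
        if excerpt not in merged:
            merged.append(excerpt)


# Placeholder excerpts; only the code labels come from the walkthrough.
codes = {
    "most fun things": ["excerpt 1", "excerpt 2"],
    "enjoying danger": ["excerpt 2", "excerpt 3"],
}

# Rename the vague label after checking its underlying text.
merge_into(codes, "most fun things", "taking risks")

# Merge the duplicate into the cleaner label.
merge_into(codes, "enjoying danger", "taking risks")
```

After both calls the system holds a single “taking risks” code carrying all three excerpts, so the code count drops while no coded text is lost.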

As merging and renaming proceed, the number of codes drops sharply—from about 100 down to a more manageable set (roughly 30–40 in the “good” category in this example). The same cleanup logic is then applied to the “bad” category. The end goal is not just tidiness. These reorganized codes become a “table of contents” for understanding the dataset deeply enough to move on to developing themes.

Cornell Notes

Focused coding is the step after initial coding where a large, messy list of labels gets reorganized into a smaller, clearer code system. The workflow keeps an “initial coding” backup folder for stress reduction, audit trail evidence, and support for methodology write-ups, while changes happen in a separate “focused codes” folder. Focused coding can start after all transcripts are initially coded or earlier (after 3–5 transcripts) when the code list becomes overwhelming or duplicates appear. A common strategy is grouping codes into broad categories like “good things” and “bad things,” then cleaning within each group by renaming vague codes, merging duplicates, and sometimes splitting codes when they refer to different underlying ideas. The result is fewer, more meaningful codes that make later theme development feasible.

Why keep both an “initial coding” folder and a “focused codes” folder?

The backup (“initial coding”) protects against regret and stress when codes are deleted, renamed, or merged. It also functions as an audit trail: if someone later asks where themes came from, the original codes provide evidence. Finally, it supports writing the methodology/design chapter by letting the researcher show concrete examples of how coding decisions changed over time.

What triggers the move from initial coding to focused coding?

Focused coding becomes necessary when the code list grows too large and starts to feel unmanageable—like reaching around 100 codes from a small set of interviews. Another trigger is duplication: as coding continues, researchers may create different codes for the same idea and only later notice the overlap. Focused coding is the cleanup stage that reduces both volume and redundancy.

When should focused coding happen—at the end or midstream?

There’s no single rule. One option is to finish initial coding for all transcripts, then reorganize. Another is to switch earlier after coding 3–5 transcripts when the system feels out of control. Doing it earlier is described as more “focused” because codes get grouped, renamed, and combined while new transcripts are still being coded.

How do “good things” and “bad things” groups help before themes exist?

They provide a commonsense structure for organizing codes without forcing theme-level interpretation. Positive codes get placed under “good things,” negative codes under “bad things.” Researchers don’t need perfect group names yet; the point is to create piles that make the code system easier to navigate and later refine within each category.

What does cleanup inside a group look like in practice?

Cleanup involves opening vague codes to check what the underlying text actually says, then renaming them to reflect the real idea (e.g., replacing “most fun things” with “taking risks”). Duplicates are merged (e.g., combining “enjoying danger” into “taking risks”), and sometimes codes are separated when they refer to different sources—such as personal risk-taking versus work or movie-set conditions that enable risk. Merging must preserve the coded quotes under the new label.

How does this stage prepare the researcher for theme development?

By reducing and clarifying codes, focused coding makes the dataset easier to understand. Codes become a “table of contents” for what matters in the interviews. Once most codes fit into coherent groups and duplicates are cleaned, the researcher can more confidently identify patterns that can be turned into themes.

Review Questions

What are the three main reasons for keeping an “initial coding” backup folder separate from the “focused codes” workspace?
Describe two situations that make focused coding necessary, and explain how grouping into “good things” and “bad things” helps before themes are formed.
When merging or renaming codes, what must be checked to ensure the new code still accurately represents the underlying quotes?

Key Points

1. Create a backup of initial codes before making changes, so deletions, renames, or merges can be reversed if needed.
2. Use focused coding to reduce code overload and eliminate near-duplicate labels that emerge during detailed initial coding.
3. Choose timing based on workload: focused coding can start after all transcripts or earlier after 3–5 transcripts when the system feels unmanageable.
4. Group codes using commonsense categories (such as “good things” vs “bad things”) to organize the codebook without forcing theme-level interpretation.
5. Rename vague codes by checking the underlying coded text, so labels reflect what participants actually said.
6. Merge duplicates within each group, preserving all coded quotes under the revised code label.
7. Apply the same cleanup logic to both positive and negative groups so the code system becomes manageable before moving to theme development.

Highlights

Focused coding is about turning a large, duplicate-heavy code list into a smaller, more coherent system—often by reorganizing into “good things” and “bad things.”

Keeping an “initial coding” folder creates an audit trail and reduces stress when analytic decisions later need to be corrected.

Renaming and merging aren’t cosmetic: each vague code must be opened to verify what it really captures before it’s deleted or replaced.

Focused coding can begin midstream (after 3–5 transcripts) when the code list becomes overwhelming, not only after all transcripts are coded.

As codes get merged and clarified, the example drops from about 100 codes to roughly 30–40 in the “good” category, making later theme work feasible.

Topics

Focused Coding
Audit Trail
Code Organization
Merging Duplicates
Good vs Bad Codes

Mentioned

Dr Kriukow
NVivo