
Thematic analysis with ChatGPT - 3 ways to create and/or organize your themes in ChatGPT

5 min read

Based on Qualitative Researcher Dr Kriukow's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

ChatGPT can assist thematic analysis by reorganizing codes and themes, but the researcher should keep interpretive control over what themes mean and which ones are used.

Briefing

ChatGPT can speed up thematic analysis in three practical ways—first by drafting themes from a code list, second by sorting codes into already-chosen theme buckets, and third by clustering long lists of subthemes into cleaner categories. The most useful pattern across all three is keeping final interpretive control: ChatGPT can reorganize and propose structure, but the researcher still decides what themes mean and which ones best tell the study’s story.

The first approach—asking ChatGPT to generate themes directly from codes—is presented as the least preferred option. The core concern is that themes in qualitative research aren’t supposed to “emerge magically” from data; they’re selected and shaped by the researcher’s knowledge, study focus, and conceptual priorities. Still, the method can be demonstrated by providing ChatGPT with context: a hypothetical study of educational leaders’ lived experiences during crises (conflict, war, economic crisis, and other disruptions), plus a list of codes representing challenges, strategies, and suggestions. With that setup, ChatGPT returns candidate themes and assigns codes under each one. In the example, it produces broad, polished-sounding themes such as “Institutional preparedness and resilience,” “Equitable and Effective Education delivery,” and “Community collaboration,” along with code allocations. The critique is that these themes can come out too general or abstract, mixing different kinds of content (good practices, challenges, and recommendations) rather than separating them into more actionable groupings like challenges, strategies, and suggestions.
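For readers who prefer to script their prompts rather than paste into the chat window, the setup above can be sketched as a small prompt builder. This is a minimal illustration, not the prompt used in the video: the study description and codes are placeholder examples, and sending the prompt to ChatGPT (via the web UI or an API) is left out.

```python
def build_theme_generation_prompt(study_context, codes):
    """Assemble a theme-generation prompt from a study description and a code list."""
    code_lines = "\n".join(f"- {c}" for c in codes)
    return (
        f"Context: {study_context}\n\n"
        "Here is my list of qualitative codes:\n"
        f"{code_lines}\n\n"
        "Propose candidate themes for this study and assign each code to one theme."
    )

# Hypothetical study context and codes, mirroring the example in the text.
prompt = build_theme_generation_prompt(
    "A study of educational leaders' lived experiences during crises "
    "(conflict, war, economic crisis, and other disruptions).",
    ["lack of funding", "teacher burnout", "remote teaching strategies"],
)
```

Supplying the study context up front is what lets ChatGPT propose themes at all; without it, the model can only guess at the study's focus.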

The second approach is more aligned with how many researchers work: start with predefined themes and use ChatGPT to group codes into those categories. Here, the prompt is narrower—asking ChatGPT to place the same code list into buckets such as “challenges,” “strategies to overcome challenges,” and “suggestions.” This tends to be more convenient than generating themes from scratch, but it may still be of limited value if the researcher already understands their codes well. The example also notes that ChatGPT can struggle when codes are less descriptive than the model expects, leading to imperfect categorization that still requires human review.
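The narrower, bucket-sorting prompt can be sketched the same way. Again this is an illustrative reconstruction, assuming the researcher-defined categories named in the text; the exact wording used in the video may differ.

```python
def build_sorting_prompt(codes, buckets):
    """Ask ChatGPT to place each code into exactly one researcher-defined category."""
    bucket_lines = "\n".join(f"- {b}" for b in buckets)
    code_lines = "\n".join(f"- {c}" for c in codes)
    return (
        "Sort each of the following codes into exactly one of these categories:\n"
        f"{bucket_lines}\n\nCodes:\n{code_lines}\n\n"
        "Return the result as one 'category: code1; code2; ...' line per category."
    )

# Placeholder codes; the buckets match the categories described in the text.
prompt = build_sorting_prompt(
    ["lack of funding", "peer mentoring", "increase state support"],
    ["challenges", "strategies to overcome challenges", "suggestions"],
)
```

Asking for a fixed output format (one line per category) makes the reply easier to review and to parse, which matters later when checking the model's allocations.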

The third approach—grouping subthemes into higher-level categories—is described as the most consistently helpful. When a researcher has a long list of challenges (or strategies, or suggestions), ChatGPT can cluster them into several categories, producing a more readable structure. In the demonstration, a list of challenges gets grouped into categories such as “financial and resource challenges” and “staffing and personnel issues.” The output can include overlaps or awkward category boundaries, but it often provides a useful starting point. Researchers can then refine the number of groups, adjust labels, and resolve overlaps, using ChatGPT to maximize the value of their own work rather than outsourcing interpretation.
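If the model is asked to reply in a fixed `category: item1; item2` format, the grouped output can be pulled into a structure the researcher can inspect and edit. The reply text below is a made-up example in that assumed format, not actual ChatGPT output.

```python
def parse_grouped_reply(reply_text):
    """Parse 'Category: item1; item2' lines from a ChatGPT reply into a dict."""
    groups = {}
    for line in reply_text.splitlines():
        if ":" not in line:
            continue  # skip preamble or blank lines
        category, items = line.split(":", 1)
        groups[category.strip()] = [i.strip() for i in items.split(";") if i.strip()]
    return groups

# Hypothetical reply, echoing the category labels from the demonstration.
reply = (
    "Financial and resource challenges: lack of funding; outdated equipment\n"
    "Staffing and personnel issues: teacher burnout"
)
groups = parse_grouped_reply(reply)
```

Once the groupings are in a dict, renaming a category, merging two groups, or moving an item is a one-line edit rather than a re-prompt.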

Overall, the practical takeaway is staged assistance: use ChatGPT for organization and scaffolding—especially for clustering and reformatting—while keeping the thematic decisions, definitions, and narrative framing firmly in human hands.

Cornell Notes

ChatGPT can support thematic analysis by reorganizing code and theme structures in three ways: (1) generating themes from a code list, (2) sorting codes into predefined theme buckets, and (3) clustering subthemes into higher-level categories. Generating themes from scratch is treated as the least reliable because themes should reflect researcher-driven choices rather than “magic” emergence from data. Sorting codes into researcher-chosen categories is more practical, though it still needs checking—especially when codes are vague. Clustering subthemes into categories is the most consistently useful, giving a readable structure and a starting point that the researcher can refine. The key is using ChatGPT for scaffolding while retaining interpretive control.

Why is generating themes directly from a code list considered the least preferred workflow?

Themes are not supposed to “emerge magically” from the data. The workflow is criticized because it can produce broad, abstract themes that don’t match the researcher’s intended story or level of specificity. In the example, ChatGPT outputs polished themes like “Institutional preparedness and resilience” and “Equitable and Effective Education delivery,” but the critique is that these can be too general and may mix different content types (challenges, good practices, and suggestions) rather than separating them into clearer, actionable groupings.

How does the “predefined themes” approach change what ChatGPT is asked to do?

Instead of asking ChatGPT to invent themes, the prompt instructs it to sort an existing code list into researcher-defined categories. The example uses buckets such as “challenges,” “strategies to overcome challenges,” and “suggestions.” This makes the task more constrained and often more convenient, but it can still be imperfect if the codes are less descriptive than the model expects, requiring human review and adjustment.

What makes grouping subthemes into categories a particularly good use of ChatGPT?

It turns long lists into a clearer structure. The workflow asks ChatGPT to group a list of challenges (or other items) into several categories, which improves readability and helps the researcher see possible higher-level organization. In the demonstration, challenges are clustered into categories like “financial and resource challenges” and “staffing and personnel issues,” providing a starting point even when overlaps occur.

What kind of problems can appear when ChatGPT groups subthemes into categories?

The output can include too many overlaps between categories or category boundaries that don’t feel clean. The suggested response is iterative refinement: the researcher can later reduce the number of groups, fix overlaps, and adjust labels. The value lies in using ChatGPT as an organizer that accelerates early structuring, not as the final authority on meaning.
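One concrete refinement check is to flag items the model has placed in more than one category, so the researcher can decide where each one belongs. A minimal sketch, assuming the groupings are held in a category-to-items dict (the sample data is invented):

```python
def find_overlaps(groups):
    """Return items assigned to more than one category, with the categories listed."""
    seen = {}
    for category, items in groups.items():
        for item in items:
            seen.setdefault(item, []).append(category)
    return {item: cats for item, cats in seen.items() if len(cats) > 1}

# Hypothetical groupings with one deliberate overlap.
groups = {
    "Financial and resource challenges": ["lack of funding", "teacher burnout"],
    "Staffing and personnel issues": ["teacher burnout", "high turnover"],
}
overlaps = find_overlaps(groups)
```

A flagged overlap is a prompt for judgment, not an error: the researcher decides whether the item genuinely belongs in both categories or should be moved.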

What role does researcher control play across all three workflows?

Across the approaches, the researcher remains responsible for thematic decisions. Even when ChatGPT proposes themes or categories, the researcher decides what themes are used, how they are defined, and how they support the study’s narrative. The workflow is framed as maximizing the value of the researcher’s own work—using ChatGPT for organization and scaffolding while retaining interpretive ownership.

Review Questions

  1. When would it be better to ask ChatGPT to generate themes from codes versus sorting codes into predefined categories? Why?
  2. What types of outputs from ChatGPT are most likely to require human correction in thematic analysis?
  3. How can clustering subthemes into categories improve the readability and usefulness of a thematic framework?

Key Points

  1. ChatGPT can assist thematic analysis by reorganizing codes and themes, but the researcher should keep interpretive control over what themes mean and which ones are used.
  2. Generating themes directly from a code list is convenient but often produces overly general or abstract themes that may not match the study’s intended story.
  3. Sorting codes into researcher-defined buckets (e.g., challenges, strategies, suggestions) is usually more aligned with qualitative practice and easier to validate.
  4. Grouping subthemes into higher-level categories is the most consistently useful workflow for turning long lists into a clearer structure.
  5. ChatGPT’s category outputs can include overlaps or awkward groupings; iterative refinement by the researcher is expected.
  6. The quality of ChatGPT’s grouping depends on how descriptive and well-formed the underlying codes are.

Highlights

ChatGPT can draft themes from codes, but the workflow is criticized because themes should reflect researcher-driven choices rather than “emerge” automatically.
A more reliable use is to provide predefined theme buckets and ask ChatGPT to sort codes into them—still requiring review when codes are vague.
Clustering subthemes into categories is presented as the best fit: it creates a cleaner structure and offers a useful starting point that the researcher can refine.
