
Thematic analysis with ChatGPT | PART 2- Coding qualitative data with ChatGPT

5 min read

Based on Qualitative Researcher Dr Kriukow's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

After initial coding, reorganize codes into groups before attempting final themes, using both manual judgment and ChatGPT-assisted sorting.

Briefing

Qualitative thematic analysis with ChatGPT doesn’t replace the core work of coding—it mainly accelerates the messy middle: reorganizing a large list of initial codes into workable groups. After generating initial codes with ChatGPT, the next step is to sort those codes into meaningful clusters (not yet final themes), then tighten the categories so they align with the study’s research questions—especially when the analysis includes both positive and negative influences on outcomes like job satisfaction.

The workflow starts with a practical audit of the coded material. Codes are color-coded so researchers can trace each code back to the interview it came from. Just as important, quotes must remain attached to codes; without the original quotations, later theme-building becomes hollow because there’s no evidence to support claims. The transcript emphasizes using search tools (e.g., Microsoft Word “Find”) to locate the quotes for any given code name, since ChatGPT reorganization will require repeatedly checking that the right evidence is still connected to the right label.
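The code-to-quote lookup described above can be sketched in a few lines. This is a minimal illustration, not part of the transcript's workflow: the code labels and quotes below are hypothetical, and the function simply mimics what a Word "Find" search on a code name achieves.

```python
# Hypothetical index: each code label stays attached to its supporting quotes,
# so evidence remains retrievable while codes are regrouped.
code_quotes = {
    "supportive colleagues": [
        "My team always backs me up when deadlines slip.",
    ],
    "desire for something new": [
        "I wanted a role where I could learn something different.",
    ],
}

def find_quotes(code_label, index):
    """Return the quotes attached to a code label (case-insensitive),
    mimicking a 'Find' search on the code name."""
    label = code_label.strip().lower()
    return [quote
            for key, quotes in index.items()
            if key.lower() == label
            for quote in quotes]
```

If a lookup returns an empty list, the label has drifted from the evidence and needs repair before any further regrouping.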

Before reorganizing, researchers also need to ensure the code list is complete and faithful to the data. If ChatGPT missed a relevant angle—such as a code that captures a limitation or challenge—researchers should add it manually. The guidance is blunt: too many codes are preferable to too few, because ChatGPT can only classify what it sees. This stage also involves scanning code summaries and deciding whether they fit the intended direction of the analysis (for example, the transcript’s focus includes both positive and negative factors affecting job satisfaction).

For grouping, the transcript recommends a hybrid approach: do some manual organization, but use ChatGPT to speed up first-pass sorting. A sample prompt assigns every code to one of three buckets: “challenges” (codes implying negative effects on job satisfaction), “positive factors” (codes implying positive effects), or “other” (codes that don’t clearly fit either). ChatGPT can be fast, but accuracy still needs human correction. The transcript shows this with misplacements—like a code that relates to “desire for something new” being incorrectly treated as a challenge—requiring researchers to move codes to the right group after checking the associated quotes.
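A first-pass sorting prompt of the kind described above could be assembled like this. The bucket names follow the transcript; the exact wording is illustrative, not the transcript's verbatim prompt.

```python
def build_grouping_prompt(codes):
    """Assemble a prompt that forces ChatGPT to place EVERY code into
    one of three buckets, so it does not stop after a few entries."""
    header = (
        "Assign every code below to exactly one group:\n"
        "1. Challenges (implies a negative effect on job satisfaction)\n"
        "2. Positive factors (implies a positive effect on job satisfaction)\n"
        "3. Other (does not clearly fit either group)\n"
        "Do not skip any code.\n\n"
        "Codes:\n"
    )
    return header + "\n".join(f"- {code}" for code in codes)
```

The explicit "do not skip any code" instruction matters: without it, models tend to classify only the first handful of entries in a long list.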

Once codes sit in the right buckets, the next refinement is “clean-up” within each group: fix category errors, keep wording consistent, and only merge or rename codes carefully. Renaming creates a traceability problem—if a code label disappears, the researcher may lose the ability to find its quotes later. The workaround is to preserve the old wording in a comment or note before combining codes, so the evidence remains retrievable.
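The workaround for the renaming problem can be made mechanical: record the old labels whenever codes are merged, so their quotes stay findable under the original wording. This is a sketch with hypothetical group and label names, standing in for a comment or note in the actual document.

```python
def merge_codes(groups, group, old_labels, new_label, notes):
    """Merge several code labels within a group into one new label,
    recording the old labels in `notes` so their quotes can still be
    found later by searching for the original wording."""
    codes = groups[group]
    for old in old_labels:
        codes.remove(old)
    codes.append(new_label)
    notes[new_label] = list(old_labels)  # audit trail of prior wording
    return groups, notes
```

After a merge, a researcher who cannot find quotes under the new label can look up `notes[new_label]` and search the transcript for each original label instead.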

Finally, the transcript describes optional sub-grouping to support later theme development. For example, positive factors can be split into “personal” versus “external/workplace” influences, using additional ChatGPT prompts as brainstorming aids. The overall message is iterative and flexible: ChatGPT helps propose structure, but researchers must verify fit against the quotes, correct misclassifications, and maintain audit trails so themes can be defended with evidence.

Cornell Notes

After generating initial codes with ChatGPT, the next stage is to reorganize those codes into meaningful groups before forming final themes. Researchers must keep quotes traceable to codes—color-coding and using search tools (like Word’s Find) help locate the exact quotations later. ChatGPT can quickly assign each code to broad buckets such as “challenges,” “positive factors,” and “other,” but misclassifications are expected and must be corrected by checking the associated quotes. Researchers then clean up within groups, fix wording carefully, and merge codes only with a record of prior labels so evidence remains findable. Optional sub-grouping (e.g., personal vs external influences) can further prepare the path to final themes tied to research questions.

Why does quote traceability matter during code grouping, and how is it maintained?

Quote traceability is what makes later themes defensible. If codes are reorganized without keeping the original quotations attached, the analysis loses evidence. The transcript stresses using color-coding to track which interview each code came from and using a “Find” function (e.g., in Microsoft Word) to locate quotes by code name. When a code is moved or renamed, the researcher must still be able to search for the quote tied to the original label.

What are the three broad groups used to organize codes in the example workflow?

The example prompt forces every code into one of three buckets: (1) “challenges” for codes that suggest negative effects on job satisfaction, (2) “positive factors” for codes that suggest positive effects, and (3) “other” for codes that don’t clearly match the first two. The prompt also instructs ChatGPT to assign every code so it doesn’t stop after only a few entries.

What kinds of errors should researchers expect from ChatGPT during grouping?

Misclassification is common because code names and summaries can be ambiguous. The transcript gives a concrete example: a code about “desire for something new” was incorrectly placed into “challenges,” even though it aligns with positive experiences (seeking novel and engaging opportunities). The fix is manual: copy the code name, search for its quotes, read the quotations, then move the code to the correct group.

How should researchers handle renaming or merging codes without breaking the evidence trail?

Renaming can break quote retrieval because the quote lookup depends on the code label. The transcript recommends keeping the old wording in a comment or note before combining codes. That way, the researcher can still search for the original label to find the correct quotations, even after the code name changes.

Why is adding additional codes sometimes necessary even when ChatGPT generated an initial list?

ChatGPT may miss relevant angles present in the data. The transcript describes adding a new code when quotes suggest a limitation or challenge not captured in the initial set. The guidance is that having too many codes is safer than having too few, because researchers can later merge or refine categories once the full range of evidence is represented.

How can sub-grouping (like personal vs external factors) support later theme development?

Sub-grouping helps organize broad categories into more interpretable structures that can become sub-themes or feed directly into final themes. The transcript suggests using additional ChatGPT prompts to split positive factors into “personal” (actions, attitudes, individual qualities) versus “external” (workplace-provided qualities independent of the employee). Results may be imperfect, so researchers should treat this as brainstorming and then decide based on what the quotes actually support.

Review Questions

  1. How would you verify that every grouped code still has an associated quote you can retrieve later?
  2. What steps would you take if ChatGPT places a clearly positive code into the “challenges” bucket?
  3. What record-keeping practice would you use before merging two codes with different labels?

Key Points

  1. After initial coding, reorganize codes into groups before attempting final themes, using both manual judgment and ChatGPT-assisted sorting.

  2. Keep quotes traceable to codes at every step; use color-coding and search tools to find quotations by code name.

  3. Add missing codes when the data suggests additional positive/negative angles; too few codes can limit later theme quality.

  4. Expect and correct ChatGPT misclassifications by checking the actual quotes tied to each code.

  5. Maintain consistent wording when possible; if merging or renaming, record the old label so quotes remain findable.

  6. Use broad buckets (challenges, positive factors, other) as a first-pass structure, then refine with sub-grouping aligned to research questions.

  7. Treat sub-grouping prompts (e.g., personal vs external) as iterative brainstorming that must be validated against the quotations.

Highlights

The workflow hinges on quote traceability: reorganizing codes without preserving access to the original quotations undermines theme-building.
A practical ChatGPT prompt can force complete coverage by assigning every code to “challenges,” “positive factors,” or “other,” but human correction is still required.
Renaming codes can break quote lookup; the transcript recommends leaving a comment with the prior label before merging categories.
ChatGPT speeds up first-pass grouping, yet the accuracy check—reading quotes and moving misfit codes—remains a manual, evidence-driven step.
