"Advanced Data Analysis" with ChatGPT-4 | Cross-case and Within-case thematic analysis

TL;DR

Use ChatGPT-4 via ChatGPT Plus’s “Advanced Data Analysis” to upload transcripts directly and avoid copy-paste limits.

Briefing

The core takeaway is a practical workflow for using ChatGPT-4 (via ChatGPT Plus’s “Advanced Data Analysis” feature) to perform both within-case and cross-case thematic analysis. Themes are first extracted from each interview separately, then merged into a shared thematic framework, and finally probed for differences by gender and age. The value is speed and structure: individual transcripts are summarized into leadership-style, challenge, and strategy themes, and those outputs feed a cross-case synthesis that can highlight patterns across participants.

The process starts with preparing interview transcripts that already include demographic markers. In the example, four leader interviews are uploaded, with each transcript labeled for gender (female/male) and age (e.g., “female 49 years old”). The demographic tagging matters because it gives the model a basis for later comparisons, not just a way to summarize leadership narratives.
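The tagging step above can be sketched in code. This is a minimal illustration, not the presenter's method: the `Transcript` class, the `tag_transcript` and `build_upload_file` helpers, the participant IDs, the second participant's age, and the sample text are all hypothetical placeholders. Only the "female 49 years old" style of label comes from the example.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    participant: str  # hypothetical ID such as "P1"
    gender: str       # e.g. "female" or "male"
    age: int
    text: str         # the interview text itself


def tag_transcript(t: Transcript) -> str:
    """Prepend a plain-language demographic header to one transcript."""
    header = f"Participant {t.participant} ({t.gender}, {t.age} years old)"
    return f"{header}\n\n{t.text.strip()}\n"


def build_upload_file(transcripts: list[Transcript]) -> str:
    """Join tagged transcripts into one document with clear separators."""
    return "\n---\n\n".join(tag_transcript(t) for t in transcripts)


# Placeholder data for illustration only (not from the actual interviews).
sample = [
    Transcript("P1", "female", 49, "I tend to lead by listening first."),
    Transcript("P2", "male", 35, "My biggest challenge has been delegation."),
]
document = build_upload_file(sample)
print(document)
```

Keeping the label in a fixed position at the top of each transcript makes it easy to check by eye that every participant carries both markers before the file is uploaded.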

A targeted prompt then instructs the system to analyze each transcript one at a time and list what can be said on key topics: leadership style, challenges, strategies, and related guidance. The “Advanced Data Analysis” option processes the file directly (avoiding copy-paste) and returns a breakdown per participant, which functions as the within-case analysis. Notably, the output also includes extra content, such as advice for aspiring leaders, even though that wasn’t requested, which suggests the model may expand into adjacent themes when it sees leadership-related material.

After the within-case outputs are generated, the workflow shifts to cross-case analysis. The next prompt asks for a “common thematic framework” that combines the separate participant frameworks into one integrated set of themes. The resulting synthesis includes shared categories such as leadership styles, challenges, and organizational challenges, and it may also comment on differences that emerge across gender and age.

To sharpen the comparison, a follow-up prompt explicitly requests insights related to gender and/or age. This step yields theme-level comparisons of what female and male participants emphasized, and it can incorporate combined gender-and-age insights. The presenter emphasizes that results can be inconsistent: running the same prompt on the same data may produce different levels of specificity, so repeating the request can sometimes yield a better comparison.
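One way to make the run-to-run variability described above visible is to record the themes from each repeated run and count how often each one recurs. The sketch below assumes the researcher has copied theme labels out of each run by hand; the `theme_stability` helper and the theme labels are hypothetical, and exact string matching is a simplification (real outputs would need manual grouping of near-synonyms).

```python
from collections import Counter


def theme_stability(runs: list[list[str]]) -> dict[str, float]:
    """Return, for each theme, the share of runs in which it appeared.

    Labels are lowercased and stripped so that trivial formatting
    differences between runs are not counted as different themes.
    """
    if not runs:
        return {}
    counts: Counter[str] = Counter()
    for themes in runs:
        # A set ensures a theme is counted at most once per run.
        counts.update({t.strip().lower() for t in themes})
    return {theme: n / len(runs) for theme, n in counts.items()}


# Placeholder theme labels for three repeated runs of the same prompt.
runs = [
    ["Collaboration", "Work-life balance", "Mentoring"],
    ["collaboration", "Mentoring"],
    ["Collaboration", "Delegation"],
]
stability = theme_stability(runs)
for theme, share in sorted(stability.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{theme}: {share:.2f}")
```

Themes that appear in every run are better candidates for follow-up than themes that surface only once, though the final call remains a matter of researcher judgment.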

A key methodological caution runs through the workflow: overly strict instructions that demand full “thematic analysis” rigor (including detailed coding stages) can overwhelm the model or cause it to fail. The workaround is to use simpler, “common sense” language, asking for analysis and topic-based listing rather than demanding the entire formal coding pipeline in one go. The tool is positioned as a strong foundation for further investigation, not a replacement for researcher judgment: its outputs can vary unpredictably, and it should not be relied on to complete an entire thesis end-to-end.
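The three stages, kept in plain language as recommended above, might be written down as a small set of prompt templates. This is a sketch under stated assumptions: the wording paraphrases the prompts described in this summary rather than quoting the presenter, and the function names are hypothetical. The prompts are meant to be pasted into the chat one at a time, not sent together.

```python
# Topics named in the workflow; adjust to the study's own research questions.
TOPICS = ["leadership style", "challenges", "strategies"]


def within_case_prompt(topics: list[str]) -> str:
    """Stage 1: analyze each transcript separately, topic by topic."""
    topic_list = ", ".join(topics)
    return (
        "Analyze each interview transcript in the uploaded file one at a time. "
        f"For each participant, list what can be said about: {topic_list}."
    )


def cross_case_prompt() -> str:
    """Stage 2: merge the per-participant results into one framework."""
    return (
        "Combine the separate participant frameworks into one "
        "common thematic framework."
    )


def demographic_prompt(dimensions: tuple[str, ...] = ("gender", "age")) -> str:
    """Stage 3: ask explicitly for group differences."""
    return f"Are there any insights related to {' and/or '.join(dimensions)}?"


def staged_prompts(topics: list[str] = TOPICS) -> list[str]:
    """Return the three prompts in the order they should be used."""
    return [within_case_prompt(topics), cross_case_prompt(), demographic_prompt()]


for stage, prompt in enumerate(staged_prompts(), start=1):
    print(f"Stage {stage}: {prompt}")
```

Each template deliberately avoids asking for formal coding stages, in line with the caution that overly strict instructions can cause the model to fail.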

Overall, the transcript presents a repeatable, prompt-driven approach: within-case extraction per participant, cross-case synthesis into a shared framework, and targeted demographic comparison—while managing model limitations through pragmatic prompting and iterative refinement.

Cornell Notes

ChatGPT-4 with the “Advanced Data Analysis” option can support thematic analysis in three stages: within-case, cross-case, and demographic comparison. First, each interview transcript is uploaded with demographic labels (gender and age), then analyzed separately to extract themes about leadership style, challenges, and strategies. Next, the outputs are combined into a common thematic framework that synthesizes patterns across participants. Finally, a focused prompt asks for differences related to gender and/or age, producing theme-level comparisons. Results can vary across runs, so repeating prompts and using simpler instructions (rather than demanding full formal coding) can improve the usefulness of the output.

How does the workflow handle within-case thematic analysis for multiple participants?

Each interview transcript is processed separately. The prompt instructs the system to analyze one transcript at a time and list what can be said on the specified topics (leadership style, challenges, strategies). Because the transcripts include demographic markers (gender and age), the within-case outputs can later be compared across participants.

What changes when moving from within-case analysis to cross-case synthesis?

Instead of analyzing each transcript in isolation, the next prompt asks for a “common thematic framework” that combines the four separate participant frameworks. This step merges themes into shared categories such as leadership styles, challenges, and organizational challenges, producing a cross-case view rather than four separate summaries.

Why include demographic information inside the transcripts, and how is it used later?

Demographic labels (e.g., “female 49 years old”) give the model explicit metadata on which to base comparisons. After the common framework is built, a follow-up prompt requests insights specifically about differences related to gender and/or age, enabling theme-level comparisons such as what female vs. male participants emphasized.

What prompting strategy helps avoid model overload during qualitative analysis?

Using simpler, “common sense” instructions—asking for analysis and topic-based listing—reduces the risk of the system attempting a full, formal thematic analysis pipeline (coding stages, detailed rigor) all at once. The transcript notes that overly structured demands can overwhelm the model and cause it to break down, so the process is better broken into stages.

How should researchers treat the model’s demographic comparisons?

As a starting point, not a final authority. The transcript warns that outputs can be random and inconsistent even with the same data and prompt. Re-running the prompt can sometimes produce more specific comparisons, but researcher judgment is still required and the tool should not replace full thesis-level work.

Review Questions

What are the three distinct stages in the workflow, and what prompt shift occurs between each stage?
How does embedding gender and age labels in transcripts enable later cross-case comparisons?
Why might demanding full formal thematic coding in one prompt cause failures, and what alternative wording is suggested?

Key Points

1. Use ChatGPT-4 via ChatGPT Plus’s “Advanced Data Analysis” to upload transcripts directly and avoid copy-paste limits.
2. Tag each transcript with demographic metadata (at least gender and age) so later prompts can compare groups meaningfully.
3. Run within-case analysis by instructing the model to analyze each transcript separately and extract themes on leadership style, challenges, and strategies.
4. Build cross-case analysis by prompting for a common thematic framework that combines the individual participant frameworks into shared themes.
5. Add a targeted follow-up prompt to request differences related to gender and/or age, enabling theme-level comparisons.
6. Keep prompts pragmatic: simpler instructions reduce the risk of the model attempting an entire formal coding pipeline and overwhelming itself.
7. Expect variability across runs; repeat prompts and treat outputs as a foundation requiring researcher verification.

Highlights

Within-case analysis is done by analyzing each interview transcript separately, producing participant-level themes on leadership style, challenges, and strategies.

Cross-case synthesis comes next: a prompt asks for a single common thematic framework that merges the four participant frameworks into shared categories.

Demographic comparisons are enabled by embedding gender and age labels in the transcripts, then prompting specifically for gender/age differences at the theme level.

Topics

Within-Case Thematic Analysis
Cross-Case Synthesis
Demographic Comparison
Prompt Engineering
Qualitative Workflow