"Advanced Data Analysis" with ChatGPT-4 | Cross-case and Within-case thematic analysis
Based on Qualitative Researcher Dr Kriukow's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
The core takeaway is a practical workflow for using ChatGPT-4 (via ChatGPT Plus’s “Advanced Data Analysis” feature) to perform both within-case and cross-case thematic analysis: first extracting themes from each interview separately, then merging them into a shared thematic framework, and finally probing differences by gender and age. The value is speed and structure. Individual transcripts are summarized into leadership-style, challenge, and strategy themes, and those outputs feed a cross-case synthesis that can highlight patterns across participants.
The process starts with preparing interview transcripts that already include demographic markers. In the example, four leader interviews are uploaded, with each transcript labeled for gender (female/male) and age (e.g., “female 49 years old”). The demographic tagging matters because it gives the model a basis for later comparisons, not just a way to summarize leadership narratives.
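The labeling step can be sketched in a few lines of Python. This is only an illustration of one way to prepare the upload: the `Transcript` fields, the `P1`-style participant IDs, the `---` divider, and the choice to combine everything into a single file are all assumptions, not details taken from the video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    participant_id: str  # hypothetical identifier, e.g. "P1"
    gender: str
    age: int
    text: str


def with_demographic_header(t: Transcript) -> str:
    """Prefix the transcript with a one-line demographic label."""
    header = f"Participant {t.participant_id} ({t.gender}, {t.age} years old)"
    return f"{header}\n{t.text.strip()}\n"


def build_upload_file(transcripts: list[Transcript]) -> str:
    """Join the labeled transcripts into one document, separated by a divider."""
    return "\n---\n".join(with_demographic_header(t) for t in transcripts)
```

Keeping the label on the first line of each transcript means the demographic information travels with the text, so a later prompt can refer to gender or age without restating who is who.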
A targeted prompt then instructs the system to analyze each transcript one at a time and list what can be said on key topics: leadership style, challenges, strategies, and related guidance. The “Advanced Data Analysis” option processes the uploaded file directly (avoiding copy-paste) and returns a breakdown per participant, which functions as the within-case analysis. Notably, the model also returns extra content, such as advice for aspiring leaders, even though that wasn’t requested, suggesting it may expand into adjacent themes when it sees leadership-related material.
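A within-case prompt of this kind could be assembled as follows. The wording is hypothetical and is not the presenter's actual prompt; the optional closing sentence is an assumption about how one might discourage unrequested sections, with no guarantee the model will comply.

```python
TOPICS = ["leadership style", "challenges", "strategies"]


def within_case_prompt(topics: list[str], allow_extra_sections: bool = False) -> str:
    """Build a plain-language within-case prompt for the uploaded transcripts."""
    parts = [
        "Please analyze the uploaded interview transcripts one at a time.",
        "For each participant, list what can be said about: "
        + ", ".join(topics) + ".",
    ]
    if not allow_extra_sections:
        # Assumption: an explicit limit may reduce unrequested extras.
        parts.append("Please do not add sections beyond these topics.")
    return " ".join(parts)


print(within_case_prompt(TOPICS))
```

The resulting text is pasted into the chat alongside the uploaded file; nothing here calls the model programmatically.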
After the within-case outputs are generated, the workflow shifts to cross-case analysis. The next prompt asks for a “common thematic framework” that combines the separate participant frameworks into one integrated set of themes. The resulting synthesis includes shared categories such as leadership styles, challenges, and organizational challenges, and it may also comment on differences that emerge across gender and age.
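To check the model's combined framework by hand, a researcher could tabulate which participants raised each theme. The sketch below is a researcher-side illustration, not how ChatGPT builds its synthesis, and the participants and themes in `within_case` are invented placeholders.

```python
from collections import defaultdict

# Hypothetical within-case results: participant -> category -> themes.
within_case = {
    "P1": {"leadership style": ["Collaborative"], "challenges": ["time pressure"]},
    "P2": {"leadership style": ["collaborative", "directive"],
           "challenges": ["staff turnover"]},
}


def common_framework(cases):
    """Return category -> theme -> sorted list of participants who raised it."""
    framework = defaultdict(lambda: defaultdict(set))
    for participant, categories in cases.items():
        for category, themes in categories.items():
            for theme in themes:
                # Normalize case and whitespace so identical themes merge.
                framework[category][theme.strip().lower()].add(participant)
    return {
        category: {theme: sorted(people) for theme, people in themes.items()}
        for category, themes in framework.items()
    }


framework = common_framework(within_case)
```

Themes listed against several participants are candidates for the shared framework; themes tied to a single participant stay visible as within-case detail.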
To sharpen the comparison, a follow-up prompt explicitly requests insights related to gender and/or age. This step yields theme-level comparisons of what female and male participants emphasized, and it can also combine gender and age in a single insight. The presenter emphasizes that results can be inconsistent: running the same prompt on the same data may produce different levels of specificity, so repeating the request can sometimes yield a better comparison.
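One way to sanity-check a demographic comparison is to count themes per group yourself. The following is a minimal sketch with invented participants and themes; the age cut-off of 45 is an arbitrary assumption chosen only to show the grouping.

```python
from collections import defaultdict

# Hypothetical participants and themes, invented for illustration only.
participants = {
    "P1": {"gender": "female", "age": 49, "themes": {"collaboration", "mentoring"}},
    "P2": {"gender": "male", "age": 52, "themes": {"collaboration", "delegation"}},
    "P3": {"gender": "female", "age": 35, "themes": {"mentoring"}},
}


def themes_by_group(people, key):
    """Map each group label to theme -> number of participants mentioning it."""
    counts = defaultdict(lambda: defaultdict(int))
    for info in people.values():
        for theme in info["themes"]:
            counts[key(info)][theme] += 1
    return {group: dict(themes) for group, themes in counts.items()}


by_gender = themes_by_group(participants, key=lambda p: p["gender"])
# Arbitrary cut-off at 45, purely to demonstrate grouping by age band.
by_age_band = themes_by_group(
    participants, key=lambda p: "under 45" if p["age"] < 45 else "45 and over"
)
```

With only a handful of interviews, such counts describe the sample rather than support general claims about gender or age.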
A key methodological caution runs through the workflow: overly strict instructions that demand full “thematic analysis” rigor (including detailed coding stages) can overwhelm the model or cause it to fail. The workaround is to use simpler, “common sense” language, asking for analysis and topic-based listing rather than demanding the entire formal coding pipeline in one go. The tool is positioned as a strong foundation for further investigation, not a replacement for researcher judgment: its output varies unpredictably between runs, and it should not be relied on to complete an entire thesis end-to-end.
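Because the same prompt can return different themes on different runs, it may help to quantify how much two runs agree. The sketch below uses Jaccard similarity (shared themes divided by all themes seen); this measure is an assumption of this example rather than something the video proposes, and the theme lists are invented.

```python
def normalize(themes):
    """Lowercase and trim theme labels so trivial differences do not count."""
    return {t.strip().lower() for t in themes}


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of themes common to both runs, between 0.0 and 1.0."""
    if not a and not b:
        return 1.0  # two empty runs are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical theme lists copied from two runs of the same prompt.
run_1 = normalize(["Collaboration", "Time pressure", "Mentoring"])
run_2 = normalize(["collaboration", "mentoring", "Delegation"])

overlap = jaccard(run_1, run_2)  # 2 shared themes out of 4 distinct -> 0.5
only_in_one_run = run_1 ^ run_2  # themes worth checking against the transcripts
```

A low overlap is a cue to reread the transcripts for the disputed themes instead of trusting either run.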
Overall, the video presents a repeatable, prompt-driven approach: within-case extraction per participant, cross-case synthesis into a shared framework, and targeted demographic comparison, with model limitations managed through pragmatic prompting and iterative refinement.
Cornell Notes
ChatGPT-4 with the “Advanced Data Analysis” option can support thematic analysis in three stages: within-case, cross-case, and demographic comparison. First, the interview transcripts are uploaded with demographic labels (gender and age) and analyzed one at a time to extract themes about leadership style, challenges, and strategies. Next, the outputs are combined into a common thematic framework that synthesizes patterns across participants. Finally, a focused prompt asks for differences related to gender and/or age, producing theme-level comparisons. Results can vary across runs, so repeating prompts and using simpler instructions (rather than demanding full formal coding) can improve the usefulness of the output.
How does the workflow handle within-case thematic analysis for multiple participants?
What changes when moving from within-case analysis to cross-case synthesis?
Why include demographic information inside the transcripts, and how is it used later?
What prompting strategy helps avoid model overload during qualitative analysis?
How should researchers treat the model’s demographic comparisons?
Review Questions
- What are the three distinct stages in the workflow, and what prompt shift occurs between each stage?
- How does embedding gender and age labels in transcripts enable later cross-case comparisons?
- Why might demanding full formal thematic coding in one prompt cause failures, and what alternative wording is suggested?
Key Points
1. Use ChatGPT-4 via ChatGPT Plus’s “Advanced Data Analysis” to upload transcripts directly and avoid copy-paste limits.
2. Tag each transcript with demographic metadata (at least gender and age) so later prompts can compare groups meaningfully.
3. Run within-case analysis by instructing the model to analyze each transcript separately and extract themes on leadership style, challenges, and strategies.
4. Build cross-case analysis by prompting for a common thematic framework that combines the individual participant frameworks into shared themes.
5. Add a targeted follow-up prompt to request differences related to gender and/or age, enabling theme-level comparisons.
6. Keep prompts pragmatic: simpler instructions reduce the risk that the model attempts an entire formal coding pipeline at once and fails.
7. Expect variability across runs; repeat prompts and treat outputs as a foundation requiring researcher verification.