How to ACTUALLY use ChatGPT and Gemini as a researcher (advanced tactics)

Academic English Now · 5 min read

Based on Academic English Now's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing.

TL;DR

ChatGPT and Gemini can miss recent scholarship because their training knowledge has a cutoff (around 2021–2023), and without the right inputs their answers tend to be generic and prone to hallucination; effective research prompts supply four ingredients the models lack: context, facts, precision, and process.

Briefing

Using ChatGPT or Gemini for research only works when the prompts supply the missing ingredients the models don’t have: personal context, field-specific facts, and a step-by-step process. Without those inputs, the outputs tend to be generic, vague, and prone to hallucinations—because the systems are trained on older data, don’t know who the researcher is, and are optimized to guess rather than admit uncertainty.

A key limitation is the knowledge cutoff: Gemini and ChatGPT rely on training data that stops around 2021–2023, so they may miss the latest findings or the most recent systematic literature reviews in a given specialty. Even when the model has broad background knowledge, it still lacks the “who” and “where” of the user—what discipline they’re in, what stage of research they’re at (first-year PhD, assistant professor, poster stage), and what constraints or goals shape their work. The result is predictable: prompts that don’t provide context force the model to respond as if it’s guessing what the user needs.

Hallucinations follow from training incentives. The models are rewarded for producing plausible answers, which can make them favor guessing over uncertainty. When challenged, they may still generate fabricated details unless the prompt explicitly demands factual grounding. This is why treating AI like a search engine or an oracle is risky. Researchers often misuse it by asking one-shot questions, accepting mediocre outputs without iterating, or using AI as a substitute for judgment—especially early-career researchers who may not yet have enough domain knowledge to detect errors.

The practical fix is prompt engineering built around four pillars: context, facts, precision, and process. Context means telling the model who the researcher is and what they’re working on. Facts means supplying the relevant source material—papers, documents, or other text—so the model can work from the user’s actual evidence rather than its outdated training base. Precision means narrowing the prompt so the model can’t “interpret” the request into something else. Process means guiding the model through the workflow a researcher would normally follow, rather than letting it improvise.
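
To make the four pillars concrete, here is a minimal Python sketch of how such a prompt might be assembled. The function name, section labels, and file path are illustrative, not from the video:

```python
# Minimal sketch: assemble a prompt from the four pillars.
# Function name, labels, and the file path are illustrative.

def build_research_prompt(context: str, facts: str,
                          precision: str, process: str) -> str:
    """Combine context, facts, precision, and process into one prompt."""
    return "\n\n".join([
        f"WHO I AM (context):\n{context}",
        f"SOURCE MATERIAL (facts) - use this text as the only evidence:\n{facts}",
        f"TASK (precision):\n{precision}",
        f"HOW TO WORK (process):\n{process}",
    ])

prompt = build_research_prompt(
    context="First-year PhD student in linguistics.",
    facts=open("recent_review_paper.txt").read(),  # pasted or uploaded paper text
    precision="List at most five research gaps stated in the source material.",
    process="Act as my supervisor; run one task at a time and wait for my reply.",
)
print(prompt)
```

Labeling each section explicitly is a design choice: it makes omissions visible, so a missing pillar is caught before the prompt is ever sent.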

The transcript demonstrates the difference with a linguistics example. A generic prompt like “Can you help me find a good research question in linguistics?” produces output that’s broad and unhelpful, likened to searching for a needle in a haystack. The improvement comes after the user adds: (1) personal context (first-year PhD in linguistics), (2) a narrower topic (English as a lingua franca), (3) an attached recent paper to ground the response in current framing, and (4) explicit instructions for the model to act as a supervisor and run tasks one at a time. The revised interaction yields a structured first step: identifying research gaps from the provided text, choosing one gap, and then specifying the context for that gap.
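
Put together, the improved interaction might look like the sketch below, sent through the OpenAI Python SDK. The exact wording from the video is not quoted; the model name and file path are placeholders:

```python
# A reconstruction of the improved prompt; exact wording from the video
# is not quoted. Model name and file path are placeholders.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

paper_text = open("lingua_franca_review.txt").read()  # the attached recent paper

improved_prompt = f"""I am a first-year PhD student in linguistics, focusing on
English as a lingua franca. Act as my supervisor and guide me one task at a time.

Task 1: From the paper below, identify the research gaps it mentions.
Use only this text as your evidence; do not invent findings.

--- PAPER ---
{paper_text}"""

resp = client.chat.completions.create(
    model="gpt-4o",  # any current chat model; the name is illustrative
    messages=[{"role": "user", "content": improved_prompt}],
)
print(resp.choices[0].message.content)
```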

In short, the models become genuinely useful when they’re treated like research assistants that need briefing materials and a workflow—not like independent experts with infinite knowledge.

Cornell Notes

ChatGPT and Gemini often produce generic or unreliable research help because they lack up-to-date knowledge, don’t know the user’s background, and can generate confident guesses that aren’t grounded in facts. The transcript’s core method is to prompt with four pillars: context (who the researcher is and their stage), facts (attach relevant papers or documents), precision (avoid vague requests), and process (give step-by-step instructions). A linguistics example shows that a vague prompt yields unhelpful output, while adding a persona, narrowing the topic, attaching a recent paper, and requesting a task-by-task workflow produces actionable results like identifying research gaps. The approach matters because it reduces hallucinations and turns AI from an oracle into a guided research partner.

Why do ChatGPT and Gemini outputs often feel generic or off-target for research?

They’re trained on older data with a knowledge cutoff (roughly 2021–2023), so they may miss newer literature. They also don’t know the user’s field, identity, or research stage, so the model can’t tailor recommendations to the actual constraints of a first-year PhD versus a later-stage project. Without that context, the model effectively guesses what the user wants, which leads to vague or irrelevant suggestions.

What drives hallucinations, and how can prompts reduce them?

The models are trained with incentives that reward producing plausible answers rather than admitting uncertainty. That makes them more likely to “fill in” missing details. Prompts can counter this by requiring factual grounding—e.g., attaching the relevant paper or documents and instructing the model to use them as the basis for responses—so the output is anchored to provided evidence rather than speculation.
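
One way to encode that grounding requirement is a system message that restricts the model to the supplied source. A minimal sketch using the OpenAI Python SDK follows; the model name and file path are illustrative, and such instructions reduce fabrication rather than eliminate it:

```python
# Sketch: a system message that forbids answers beyond the supplied source.
# Model name and file path are illustrative; this reduces, not eliminates,
# fabrication.
from openai import OpenAI

client = OpenAI()

source = open("attached_paper.txt").read()

messages = [
    {"role": "system", "content": (
        "Answer only from the SOURCE text the user provides. "
        "If the SOURCE does not cover something, say so explicitly "
        "instead of guessing, and quote the SOURCE when making claims."
    )},
    {"role": "user", "content": (
        f"SOURCE:\n{source}\n\n"
        "Question: What research gaps does this paper identify?"
    )},
]

resp = client.chat.completions.create(model="gpt-4o", messages=messages)
print(resp.choices[0].message.content)
```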

What’s wrong with using AI like a one-shot search engine or oracle?

Treating AI as a search engine encourages one-question, one-answer behavior, even when the response is mediocre. Treating it as an oracle assumes it has infinite, correct knowledge and can replace researcher judgment. The transcript argues that better results come from iterative conversation and from using AI as a guided assistant that follows a research workflow rather than delivering final answers immediately.
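
In code, iteration means keeping the conversation history and refining the request instead of accepting the first answer. Here is a sketch using the google-generativeai package (newer projects use the google-genai SDK); the model name and prompt wording are illustrative:

```python
# Sketch of iterating instead of one-shot querying, using the
# google-generativeai package. Model name and wording are illustrative.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-pro")
chat = model.start_chat()  # the chat object keeps the conversation history

first = chat.send_message(
    "Suggest research questions on English as a lingua franca."
)
print(first.text)  # likely broad: refine rather than stopping here

second = chat.send_message(
    "These are too broad. I am a first-year PhD student; narrow them to "
    "questions answerable with a small corpus study, and explain why "
    "each one is feasible at that scale."
)
print(second.text)
```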

How do context and facts work together in the improved prompt?

Context tells the model who the researcher is (e.g., a first-year PhD student) and what they’re focusing on (e.g., English as a lingua franca). Facts provide the model with the actual source material to reason from—such as attaching a recent paper that summarizes the topic. Together, they reduce the model’s need to guess and increase the relevance of the generated research gaps or questions.

Why does precision in the prompt matter as much as the topic?

Vague prompts leave too much room for interpretation. The transcript compares this to talking to a person who misunderstands because the request is unclear. With AI, that ambiguity can produce output that doesn’t match the intended task or format. Precision means specifying what to do (e.g., one task at a time), what to produce (e.g., identify gaps first), and how to present it (e.g., not overwhelming length).
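
The contrast is easiest to see side by side. Both requests below target the same goal, but only the second leaves the model little room to reinterpret the task; neither string is quoted from the video:

```python
# Illustrative contrast; neither string is quoted from the video.
vague = "Can you help me find a good research question in linguistics?"

precise = (
    "From the attached paper only, list the three research gaps it states "
    "most explicitly. For each gap: one sentence naming it, one sentence "
    "on why it matters. Keep the whole answer under 120 words. Do not "
    "propose research questions yet; that is the next task."
)
```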

What does a “process” prompt look like in practice?

Instead of asking for a final research question immediately, the prompt instructs the model to act like a supervisor and run a sequence of tasks. In the example, the model is told to identify research gaps from the attached text, choose one gap, and then explain the specific context for that gap. This turns the interaction into a guided workflow the researcher can follow and complete.
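
Spelled out, such a process prompt might look like the following sketch, an illustration of the pattern rather than the exact prompt from the video:

```python
# Illustration of the "process" pattern; not the exact prompt from the video.
supervisor_prompt = """Act as my PhD supervisor. We will work through these
tasks strictly one at a time: finish a task, then stop and wait for my reply.

1. Identify the research gaps stated in the attached text.
2. After I pick one gap, explain the specific context around it.
3. Only then, help me draft a research question targeting that gap.

Start with task 1 now."""
```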

Review Questions

  1. What specific limitations (knowledge cutoff, lack of user context, and hallucination incentives) explain why generic prompts underperform?
  2. How does attaching a recent paper change the quality of AI-generated research gaps compared with asking a broad question?
  3. Design a two-step prompt for a research assistant: what context, facts, precision constraints, and process instructions would you include?

Key Points

  1. ChatGPT and Gemini can miss recent scholarship because their training knowledge has a cutoff (around 2021–2023).

  2. AI doesn’t know the researcher’s identity, discipline, or career stage unless the prompt supplies that context.

  3. Hallucinations are partly driven by training incentives that reward plausible guessing; grounding prompts in provided documents helps reduce fabrication.

  4. Avoid treating AI as a one-shot oracle; iterate with follow-up context and task refinement.

  5. Use precision to prevent misinterpretation—specify the task, output format, and how much detail to produce.

  6. Provide facts by uploading or pasting relevant papers or texts so the model can reason from current material.

  7. Guide the model with a research workflow (task-by-task instructions) rather than asking for a final answer immediately.

Highlights

Generic prompts like “find a good research question in linguistics” tend to produce broad, unhelpful output because they supply neither user context nor grounded facts.
Attaching a recent paper and instructing the model to use it as the basis for interaction turns vague suggestions into concrete steps like identifying research gaps.
The most effective prompts treat AI like a supervisor running a workflow—one task at a time—rather than an oracle delivering final answers.
Hallucinations persist when prompts allow guessing; requiring evidence-based responses and supplying source material reduces that risk.
