How to do a Literature Review | Step by Step

TL;DR

Draft working chapter headings first, then treat them as flexible placeholders while you build the related-work structure.

Briefing

A literature review becomes manageable when it’s built like a pipeline: start with thesis-level headings, break each heading into focused sub-questions, then search, screen, summarize, and synthesize until each paragraph has clear claims backed by clustered evidence. The core idea is to avoid drowning in papers by narrowing the scope to one subtopic at a time—so the review grows in organized layers rather than as a chaotic pile of reading notes.

The process begins by drafting the chapter structure for the thesis, including likely “main headings” and related-work sections. Instead of treating these as final, the approach treats them as working placeholders. From there, each main topic is converted into a set of subtopics framed as questions. For an injury prediction and prevention section, example sub-questions include how frequently runners get injured (and how that varies by distance or runner type), what risk factors drive those injuries, which prevention techniques already have support, and what advances exist in computer science that could enable prediction using machine learning.

Once a subtopic is chosen—say, “how frequent are running-related injuries?”—the next step is keyword brainstorming. The goal is to generate search terms that capture both the population and the outcome, such as “running,” “marathon,” “runner/marathoner,” and injury incidence concepts like “incidence,” “frequency,” or “injury incidence.” With keywords in hand, the workflow shifts to database searching (the transcript mentions PubMed as an example, but the method is database-agnostic). Search results can be overwhelming, so screening happens in two passes: first the title, then the abstract. Papers that don’t match the subtopic are rejected early, and selected papers are stored in a structured folder system (e.g., a folder for “injury prevention,” with subfolders for each subtopic).
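As a rough illustration of the keyword step, the brainstormed terms can be combined into a single Boolean search string: synonyms for the same concept are joined with OR, and the population and outcome concepts are joined with AND. The helper name `build_query` is hypothetical, and the exact syntax a given database accepts should be checked against its own search help.

```python
def build_query(population, outcome):
    """Join synonyms with OR inside each concept, then AND the two concepts."""
    def group(terms):
        # Quote multi-word phrases so they are searched as one unit.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        return "(" + " OR ".join(quoted) + ")"
    return group(population) + " AND " + group(outcome)

# Terms taken from the injury-frequency example above.
population = ["running", "marathon", "runner", "marathoner"]
outcome = ["incidence", "frequency", "injury incidence"]

query = build_query(population, outcome)
print(query)
```

Keeping the two term lists separate makes it easy to add or drop a synonym and regenerate the query when a search returns too many or too few results.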

After selecting papers, the researcher creates a lightweight summary document populated with brief, subtopic-specific findings (typically only a couple of sentences per paper) so patterns can emerge without writing full paragraphs too soon. Then comes citation tracing: using tools like Google Scholar (and similar platforms) to see which later papers cite the chosen papers and which references the chosen papers themselves cite. This step helps expand the set of relevant literature, but it also requires restraint. Citation tracing can explode into thousands of documents, so when a systematic review already covers a group of studies, the review itself is usually treated as sufficient, and an individual study it references is read only when that specific study is directly needed.
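One way to picture the restraint described here is a breadth-first walk over citation links with two limits: how many hops away from the selected papers to go, and how many papers to collect in total. This is only a sketch. The `links` map, the paper identifiers, and the `max_depth` and `max_papers` defaults are all made up for illustration, and in practice the links would be followed by hand in a tool such as Google Scholar.

```python
from collections import deque

def trace_citations(seeds, neighbors, max_depth=1, max_papers=50):
    """Breadth-first walk over citation links, capped by depth and total size."""
    found = list(seeds)
    seen = set(seeds)
    queue = deque((s, 0) for s in seeds)
    while queue:
        paper, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand papers at the depth limit
        for other in neighbors.get(paper, []):
            if other in seen:
                continue
            if len(found) >= max_papers:
                return found  # hard cap reached: stop tracing
            seen.add(other)
            found.append(other)
            queue.append((other, depth + 1))
    return found

# Hypothetical link map: each key lists papers it cites or is cited by.
links = {
    "seed": ["ref1", "ref2", "citing1"],
    "ref1": ["ref3", "ref4"],
    "citing1": ["citing2"],
}

one_hop = trace_citations(["seed"], links, max_depth=1)
two_hops = trace_citations(["seed"], links, max_depth=2, max_papers=5)
print(one_hop)
print(two_hops)
```

With a depth of one, only papers directly linked to the seed are collected. Raising the depth widens the net quickly, which is why the total cap is there as a second brake.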

Reading follows the same scope logic. For “frequency of injuries,” the emphasis is on results and what the studies found, not on replicating the study methods. For computer science subtopics—where machine learning approaches may be central—the methodology sections become more important because the thesis may depend on understanding how those techniques work.

Finally, synthesis turns summaries into paragraph-ready claims. The transcript illustrates aggregating multiple papers into a handful of main points, such as: injury incidence is high across runner distances; ultramarathoners show high rates of injuries requiring medical care; time-loss injuries follow a U-shaped relationship with running distance; novice runners experience the highest injury rates; and injury incidence is linked to dropping out of running. Each paragraph is then shaped with a topic sentence and a concluding sentence that highlights the remaining gap, often setting up the next section (e.g., moving from incidence to risk factors and then to prevention). The overall payoff is a repeatable method for building a literature review one subtopic at a time, with evidence organized well enough to write confidently.
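The aggregation step can be sketched as grouping per-paper notes by theme and keeping the themes that recur. In the sketch below the paper identifiers are placeholders rather than real studies, the theme labels paraphrase the main points listed above, and the `min_support` threshold of two papers is an assumption, not a rule from the transcript.

```python
from collections import defaultdict

# Placeholder notes: (paper id, theme tag assigned while summarizing).
notes = [
    ("paper_a", "high incidence across distances"),
    ("paper_b", "novice runners have highest rates"),
    ("paper_c", "high incidence across distances"),
    ("paper_d", "U-shaped time-loss pattern"),
    ("paper_e", "novice runners have highest rates"),
    ("paper_f", "high incidence across distances"),
]

def cluster_by_theme(notes):
    """Map each theme to the list of papers that support it."""
    clusters = defaultdict(list)
    for paper, theme in notes:
        clusters[theme].append(paper)
    return dict(clusters)

def main_points(clusters, min_support=2):
    """Keep themes backed by at least min_support papers, strongest first."""
    kept = [(t, p) for t, p in clusters.items() if len(p) >= min_support]
    return sorted(kept, key=lambda item: len(item[1]), reverse=True)

clusters = cluster_by_theme(notes)
points = main_points(clusters)
for theme, papers in points:
    print(f"{theme}: {', '.join(papers)}")
```

A theme supported by a single paper is filtered out here, but it may still deserve a sentence if that one study is especially relevant, so the threshold is best treated as a prompt for judgment.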

Cornell Notes

The literature review workflow centers on building thesis paragraphs from focused subtopics rather than reading everything at once. First, draft chapter headings and then translate each main topic into sub-questions (e.g., injury frequency, risk factors, prevention, and computer science advances for prediction). For one subtopic, brainstorm keywords, search a database, and screen papers using titles and abstracts, storing relevant studies in a structured folder system. Summarize each selected paper briefly from the abstract, then expand coverage through citation tracing while avoiding runaway reading. After reading the most relevant sections (results for sports-science questions such as injury frequency; methods for computer science topics), synthesize recurring findings into a small set of main points, then write a paragraph using a topic sentence and a gap-focused concluding sentence.

How does the method prevent a literature review from becoming unmanageable?

It narrows scope to one subtopic at a time. After choosing a main heading (like injury prediction and prevention), the process breaks it into sub-questions (such as injury frequency). Search results are screened immediately using titles and abstracts, and only papers relevant to that subtopic are kept. Selected papers are organized into folders by topic and subtopic, and synthesis happens by aggregating a small set of recurring findings into main points—rather than trying to write full paragraphs for every paper.
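The topic-and-subtopic folder layout could be set up with a few lines of Python. This is a minimal sketch: the folder names are illustrative slugs based on the injury example, and the folders are created inside a temporary directory so the snippet leaves nothing behind. A real project would pass its own root folder instead.

```python
from pathlib import Path
import tempfile

# Illustrative outline: one main topic with its sub-questions as subfolders.
outline = {
    "injury-prevention": [
        "injury-frequency",
        "risk-factors",
        "prevention-techniques",
        "machine-learning-prediction",
    ],
}

def create_folders(root, outline):
    """Create one folder per topic and one subfolder per subtopic."""
    created = []
    for topic, subtopics in outline.items():
        for sub in subtopics:
            path = Path(root) / topic / sub
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created

with tempfile.TemporaryDirectory() as root:
    created = create_folders(root, outline)
    all_exist = all(p.is_dir() for p in created)
    names = sorted(p.name for p in created)

print(names)
```

Because `mkdir` is called with `exist_ok=True`, the function can be rerun safely whenever a new sub-question is added to the outline.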

What does “subtopic-first” look like in practice for injury frequency?

The transcript uses “how frequent are running-related injuries?” as an example subtopic. Keywords are brainstormed to capture both the population and the outcome—e.g., “running,” “marathon,” “runner/marathoner,” and injury incidence terms like “incidence” or “frequency.” The search then targets papers that match that specific question, and summaries are written to reflect only the main findings relevant to injury frequency.

Why do titles and abstracts come before deeper reading?

Search results can run to hundreds of articles, so early screening prevents wasted effort. Titles and abstracts are used to decide relevance quickly. Papers that don’t fit the subtopic are rejected, while relevant ones are saved and later summarized. This keeps the review from turning into scattered reading across unrelated studies.
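The two-pass order can be illustrated with a small filter that looks at titles first and only then at the abstracts of the papers that survived. This is a crude keyword stand-in for what is really a human judgment. The three records are invented placeholders, not real papers, and the rule used here (a population term in the title, an outcome term in the abstract) is just one assumed way to encode "matches the subtopic".

```python
def mentions(text, terms):
    """True if the text contains at least one of the terms (case-insensitive)."""
    lowered = text.lower()
    return any(term in lowered for term in terms)

def screen(papers, population, outcome):
    """Pass 1 checks titles; pass 2 checks abstracts of the survivors only."""
    title_pass = [p for p in papers if mentions(p["title"], population)]
    return [p for p in title_pass if mentions(p["abstract"], outcome)]

# Invented placeholder records for illustration only.
papers = [
    {"id": "p1", "title": "Injuries in marathon runners",
     "abstract": "We report injury incidence over one season."},
    {"id": "p2", "title": "Shoe design trends",
     "abstract": "A review of the incidence of new materials."},
    {"id": "p3", "title": "Training load in runners",
     "abstract": "We describe weekly mileage patterns."},
]

population = ["running", "marathon", "runner"]
outcome = ["incidence", "frequency"]

kept_ids = [p["id"] for p in screen(papers, population, outcome)]
print(kept_ids)
```

Here `p2` is dropped at the title pass and `p3` at the abstract pass, so abstracts are only ever examined for papers whose titles already looked relevant.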

What is citation tracing, and how is it used without getting overwhelmed?

Citation tracing follows the network of papers: it checks who cited the selected papers and which references those papers cite, often using Google Scholar. It helps uncover newer or related studies that weren’t found in the initial keyword search. The transcript warns against runaway expansion—especially when systematic reviews already summarize many studies—so additional reading should be targeted only when a specific referenced study is needed.

How should reading emphasis change across different subtopics?

Reading emphasis depends on what the thesis needs from that subtopic. For sports-science questions like injury frequency, results sections matter most because the goal is understanding what was found, not reproducing the study’s data collection. For computer science questions (e.g., machine learning advances for prediction), methodology sections become crucial because the thesis may need to understand how the techniques work in detail.

How are main points turned into a paragraph that fits academic writing expectations?

After building a summary document, the researcher extracts a small set of main points that recur across papers (e.g., high incidence across distances, higher medical-care rates in ultramarathoners, a U-shaped pattern for time-loss injuries, highest rates in novice runners, and links between injury incidence and dropping out). Then the paragraph is structured with a topic sentence that signals the paragraph’s focus and a concluding sentence that points to the literature gap, often setting up the next section on risk factors or prevention.

Review Questions

When choosing a subtopic, what specific questions should guide keyword selection and screening criteria?
How would you decide whether to read methodology deeply or focus mainly on results for a given subtopic?
What strategies in citation tracing help balance completeness with the risk of reading overload?

Key Points

1. Draft working chapter headings first, then treat them as flexible placeholders while you build the related-work structure.
2. Convert each main topic into sub-questions so each paragraph can be anchored to a clear, narrow research focus.
3. Use keyword brainstorming to cover both the population (e.g., runners, marathoners) and the outcome (e.g., injury incidence/frequency).
4. Screen papers early by reading titles and abstracts, and reject irrelevant results before full reading begins.
5. Store papers by topic and subtopic in a structured folder system to prevent mixing evidence across paragraphs.
6. Use citation tracing to expand coverage, but avoid reading every paper inside systematic reviews unless a specific study is essential.
7. Synthesize by extracting a small set of recurring main points, then write each paragraph with a topic sentence and a gap-focused concluding sentence.

Highlights

The workflow is designed to build one paragraph at a time by selecting one subtopic, screening papers for that subtopic only, and synthesizing recurring findings into a handful of main points.

Citation tracing (who cited whom, and who is cited) is used to find additional relevant studies, but it must be constrained to prevent thousands of papers from derailing progress.

Reading priorities shift by subtopic: for sports-science questions such as injury frequency, the results matter most, while computer science sections require deeper attention to methodology.

Paragraphs are assembled from aggregated evidence: main points first, then a topic sentence and a concluding sentence that highlights what remains missing in the literature.

Topics

Literature Review
Thesis Writing
Keyword Searching
Citation Tracing
Paragraph Synthesis

Mentioned

Ciara Feely

How to do a Literature Review | Step by Step | PhD Thesis Writing