I Compared Every Popular AI Literature Review Tool So You Don't Have To
Based on Andy Stapleton's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
AI literature-review tools can generate usable drafts, but performance varies sharply across the basics: how many relevant citations they pull, how much coherent text they produce, whether the writing sounds like graduate-level academic prose, and—crucially—whether the output can be exported into formats researchers can actually edit.
In side-by-side tests using the same prompt about "self-healing nanocomposite transparent electrodes," Gemini and Scispace led on reference volume: Gemini produced the most references (36), with Scispace close behind (28). At the other end, Answer this delivered only six references, even when pushed to maximize citations, which made it the weakest option for building a citation-rich literature review. Reference count matters because a literature review is meant to synthesize a field, not just generate a few paragraphs of generic commentary.
Length and density separated the tools further. Thesis AI produced by far the longest output (described in the video as "girthy"), at a scale consistent with a thesis-level literature review: roughly 23,000 words for the generated section. Scispace and Gemini also produced substantial drafts, while Answer this was far shorter and didn't "try its hardest" to fill out the requested material. The practical takeaway: short outputs may be fine for quick orientation, but they rarely provide enough structure and thematic coverage to serve as the foundation for a real academic write-up.
Readability became the deciding factor for which tool felt most like something a researcher could plausibly adapt. Thesis AI scored best for sounding academically appropriate, even if some sentences ran long. Other tools were criticized for "thesaurus-y" word choices and for terminology that didn't match how the specific field typically writes; Answer this was singled out as the least readable, in part because its phrasing leaned on uncommon or unnatural wording.
Exportability, meaning how easily the draft can be moved into a working document, was another major differentiator. Thesis AI stood out as the most usable: it offered multiple export options, including PDF alongside formats that fit common academic editing pipelines, such as DOCX for Word, LaTeX for Overleaf, and Markdown for notebook-style tools. Scispace did well across the other categories, but its export options were limited: downloading certain formats required paying, and it lacked the save-into-Word/Overleaf convenience the workflow demands. ChatGPT was considered difficult to extract cleanly for editing, while Manis and Gemini were more workable but still not as seamless as Thesis AI.
Finally, AI-detection results were uniformly poor: every tool's output was flagged as AI-generated with 100% confidence in the test used. The conclusion was not that these tools are unusable, but that they should be treated as starting points for human rewriting, not text to submit as-is. Across the full set of criteria, the strongest overall choice for a researcher building an editable, readable, thesis-ready literature review was Thesis AI, with Scispace and Gemini as top alternatives for maximizing reference volume and getting a substantial draft quickly.
Cornell Notes
The tests compared six AI literature-review tools on citation coverage, draft length, readability, export/editing options, and AI-detection risk. Gemini and Scispace led on the number of references pulled (36 and 28), while Answer this lagged badly with only six references. Thesis AI produced the longest, densest draft and scored best on academic readability, using language that felt closer to what a graduate-level writer would actually use. Thesis AI also won on exportability, offering multiple editable formats and Overleaf integration, making it easier to turn the output into a working document. All tools were flagged as AI-generated by the detection check used, so outputs still require substantial human rewriting.
Which tools delivered the most citations, and why does that matter for a literature review?
How did draft length differ, and what does that imply for real academic use?
Which tool sounded most like graduate-level academic writing, and what was the criticism of others?
Why was exportability treated as a deciding criterion?
What did the AI-detection check reveal, and how should that affect usage?
If someone prioritizes different goals—citations, readability, or workflow—what trade-offs emerged?
Review Questions
- If a researcher's top priority is citation coverage, which tools should they try first, and what citation counts support that choice?
- What combination of factors made Thesis AI the strongest overall option, beyond just producing a long draft?
- How should the fact that every tool's output was flagged as AI-generated change how a researcher uses these drafts in an academic workflow?
Key Points
1. Gemini and Scispace produced the most references (36 and 28), while Answer this produced only six, making it the weakest for citation-rich reviews.
2. Thesis AI generated the longest, densest draft (around 23,000 words), aligning with thesis-level literature review expectations.
3. Thesis AI scored best for readability, using academic language closest to typical graduate writing; other tools were criticized for unnatural or overly "thesaurus-y" phrasing.
4. Thesis AI led on exportability, offering multiple editable formats and Overleaf integration, which supports real editing workflows.
5. ChatGPT was less practical for academic use because extracting its content for editing was difficult.
6. AI-detection testing flagged every tool's output as AI-generated with 100% confidence, so drafts require substantial human rewriting before submission.
7. Recommendations split by need: Thesis AI for editable, readable thesis-style drafts; Scispace and Gemini for reference volume and a substantial draft quickly.