
Literature Review Using SciSpace Agent: In-Depth Walkthrough

SciSpace · 6 min read

Based on SciSpace's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Start with deep domain discovery (deep preview) to ground the research question in the field’s major concepts and research directions before collecting large paper sets.

Briefing

A research agent workflow in SciSpace is built to turn a vague topic into a structured literature review—starting with a domain-grounded research question and ending with review-ready reports (including LaTeX/PDF) that carry citations, tables, and research-gap analysis. The core idea is that strong literature reviews depend less on “asking for papers” and more on iterative control: define the domain, run multi-database searches with query diversity, extract structured data into columns, then generate review outputs that map claims back to specific sources.

The process begins with forming a research question. Instead of jumping straight into paper lists, the workflow pushes users to first understand the domain via SciSpace’s “deep preview” option. Users run targeted queries (example given: algorithmic improvements in reinforcement learning from human feedback for LLM training). Deep preview expands the search through multiple elaborated queries, pulls in citation trails and references, and returns a large, relevance-ranked set (the walkthrough cites 572 papers, with the most relevant rising to the top). The results are not just a reading list: they include follow-up questions and categorization signals, plus table-like summaries that break the field into components such as optimization efficiency, feedback mechanism robustness, and architectural adaptability. The agent also surfaces strengths and weaknesses, helping users translate domain understanding into clearer research directions.
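
As a rough illustration of what deep preview does, the sketch below (in Python) expands one topic into several elaborated queries, follows citation trails from the most relevant hits, and returns a relevance-ranked pool. Every name here (Paper, expand_query, deep_preview, the search callable) is a hypothetical stand-in for the behavior described in the walkthrough, not SciSpace's actual API.

    # Illustrative sketch of a deep-preview-style expansion; all names are hypothetical.
    from dataclasses import dataclass, field

    @dataclass
    class Paper:
        title: str
        relevance: float = 0.0                           # assumed 0-1 relevance from the search backend
        references: list = field(default_factory=list)   # cited titles, for citation-trail expansion

    def expand_query(topic: str) -> list[str]:
        # A deep preview fires several elaborated variants of the user's topic.
        return [
            topic,
            f"{topic} survey",
            f"{topic} benchmark comparison",
            f"{topic} open problems and limitations",
        ]

    def deep_preview(topic: str, search) -> list[Paper]:
        # 'search' is a stand-in for any paper-search backend returning Paper objects.
        pool: dict[str, Paper] = {}
        for query in expand_query(topic):
            for paper in search(query):
                pool[paper.title] = paper                # deduplicate by title
        # Citation-trail expansion: follow references of the most relevant hits.
        for paper in sorted(pool.values(), key=lambda p: p.relevance, reverse=True)[:20]:
            for ref_title in paper.references:
                pool.setdefault(ref_title, Paper(ref_title))
        # Return the pool ranked by relevance, most relevant first.
        return sorted(pool.values(), key=lambda p: p.relevance, reverse=True)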

From there, the workflow moves to a comprehensive multi-database search. Earlier SciSpace literature review tools relied heavily on SciSpace Academic, but the agent integrates additional sources such as Google Scholar, PubMed, and arXiv. For each selected research question, it fires multiple queries per database, using Boolean-style prompts where appropriate (especially for Google Scholar) and applying database-specific filters (arXiv filters are emphasized for AI workflows; PubMed filters are positioned for biomedical/chemical domains). Results are combined and reranked, with the system selecting the top 100 and attaching a relevance score plus reasoning. The walkthrough stresses that query breadth matters: humans miss keywords, while the agent can generate multiple query variants across themes.
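
A minimal Python sketch of this multi-database stage, assuming one adapter per source and a simple relevance-sorted top-100 cut; the example Boolean strings and the adapters mapping are illustrative assumptions, not the product's real query templates.

    # Hypothetical sketch of multi-database querying and reranking.
    def build_queries(question: str) -> dict[str, list[str]]:
        return {
            "google_scholar": [
                '("RLHF" OR "reinforcement learning from human feedback") AND "LLM training"',
                '"reward model" AND ("preference optimization" OR "policy optimization")',
            ],
            "arxiv": ["RLHF algorithmic improvements", "preference optimization for LLMs"],
            "pubmed": [question],   # biomedical filter path; usually less relevant for AI topics
        }

    def multi_database_search(question: str, adapters: dict) -> list[dict]:
        # 'adapters' maps database name -> callable(query) -> list of result dicts.
        results = []
        for db, queries in build_queries(question).items():
            if db not in adapters:
                continue
            for q in queries:
                results.extend(adapters[db](q))
        # Combine, deduplicate by title, rerank, and keep the top 100.
        unique = {r["title"]: r for r in results}
        ranked = sorted(unique.values(), key=lambda r: r.get("relevance", 0.0), reverse=True)
        return ranked[:100]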

Next comes result analysis, where the system shifts from “papers” to “data.” Users can add columns to extract structured fields quickly—methods, limitations, future scope, evaluation metrics, and more—often leveraging PDFs when available. A key feature highlighted is research-gap extraction by mining limitations and future directions, then producing “extract insights” outputs that summarize challenges with tight citation grounding. The agent’s approach aims to reduce hallucinations by restricting insights to highly relevant, top-ranked papers and maintaining one-to-one citation mapping.
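
The column step can be pictured as a small extraction schema: one prompt per column, one structured row per paper. The COLUMNS list below mirrors the fields named above, but extract_columns and the ask callable are hypothetical placeholders for an LLM-over-PDF question-answering call, not SciSpace internals.

    # Illustrative column-extraction schema; names are assumptions for the sketch.
    COLUMNS = ["methods", "limitations", "future_scope", "evaluation_metrics"]

    def extract_columns(paper_text: str, ask) -> dict[str, str]:
        # 'ask' is any question-answering function over the paper's full text.
        row = {}
        for column in COLUMNS:
            prompt = f"Summarize the paper's {column.replace('_', ' ')} in 1-2 sentences."
            row[column] = ask(paper_text, prompt)
        return row

    def build_table(papers: list[dict], ask) -> list[dict]:
        # One structured row per paper; limitations/future_scope feed gap analysis later.
        return [{"title": p["title"], **extract_columns(p["full_text"], ask)} for p in papers]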

Finally, the workflow generates custom literature review reports with citations and flexible formatting. Outputs can be written in Markdown or LaTeX, including two-column layouts and multiple citation styles (example: IEEE-style is mentioned as a default). The walkthrough demonstrates a critical review with added columns (e.g., misalignment risks, key limitations, criticisms, proposed solutions, fundamental problems), LaTeX sectioning for accuracy, and compilation to PDF. It also shows other review types: scoping reviews (with full-text PDF download and visualizations) and systematic literature reviews using a PRISMA-style plan (research question → criteria → multi-database search → screening → deduplication → full-text extraction → final PRISMA report). The overall takeaway is a research-assistant model: iterative prompts, structured extraction, and controlled report generation that can be refined through follow-up questions when outputs need correction.

Cornell Notes

SciSpace’s literature-review agent workflow helps users move from domain understanding to review-ready outputs by combining deep domain discovery, multi-database searching, structured data extraction, and citation-backed writing. It starts with “deep preview” to ground a research question in the field’s main concepts and research directions, then runs comprehensive searches across sources like Google Scholar, PubMed, and arXiv using multiple Boolean-style queries per database. Results are combined and reranked (top papers selected), and users add columns to extract methods, limitations, future scope, and other fields from PDFs. Finally, the agent produces custom literature review reports in Markdown or LaTeX (including IEEE-style and two-column formats) and supports review types such as critical reviews, scoping reviews, and systematic literature reviews with PRISMA-style planning.

How does the workflow turn a broad topic into a research question that’s grounded in the actual field?

It starts with SciSpace “deep preview” queries to build domain understanding before paper-heavy work. Deep preview expands a user’s query into multiple elaborated searches, follows citations/references, and returns a large set of relevant papers (the walkthrough cites 572 papers). The output includes follow-up questions and categorization signals, plus table-like summaries that break the field into components (e.g., optimization efficiency, feedback mechanism robustness, architectural adaptability). By reading these structured summaries and the strengths/weaknesses, users can translate domain knowledge into clearer research directions and then generate research questions across major themes.

Why does the agent emphasize multiple queries per database instead of a single keyword search?

Query diversity improves coverage. The walkthrough describes firing multiple queries per database (e.g., several variants for SciSpace academic and full-text search, multiple Boolean queries for Google Scholar, and additional variants for arXiv). This matters because humans often miss relevant keywords and synonyms; the agent can generate broader keyword combinations and theme coverage. After collecting results, it combines them and reranks, selecting the top 100 with a relevance score and reasoning, creating a stronger base for later extraction and gap analysis.

What does “result analysis” mean in this workflow, and how do columns change the quality of the output?

Result analysis shifts from reading papers to extracting structured data. Users add columns such as methods, limitations, future scope, evaluation metrics, and application domains. Column extraction is treated like data extraction from each paper (often using the PDF when available), enabling fast, parallelizable organization. This structure supports downstream tasks like research-gap detection: limitations and future directions become direct signals for what’s missing in the literature, rather than relying on unstructured summaries.

How does the workflow generate research gaps with lower risk of unsupported claims?

It mines limitations and future scope from the extracted columns and then uses an “extract insights” step that summarizes challenges with citation backing. The walkthrough highlights that insights are heavily citation-backed and tied to highly relevant, top-ranked papers (e.g., top 30 used for extract insights), with one-to-one mapping between claims and sources. This constraint is presented as a way to reduce hallucinations compared with generic summarization.
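
A hedged sketch of that citation-grounding constraint: restrict the evidence pool to the top-ranked rows (top 30 in the walkthrough) and keep a one-to-one mapping from each claim to its source. The row fields reuse the columns extracted earlier; the function itself is illustrative, not the product's implementation.

    # Sketch of citation-grounded insight extraction over pre-ranked rows.
    def extract_insights(rows: list[dict], top_k: int = 30) -> list[dict]:
        # rows are pre-ranked by relevance; only the top_k papers feed the insight step.
        evidence = rows[:top_k]
        insights = []
        for row in evidence:
            gap = row.get("limitations", "") or row.get("future_scope", "")
            if gap:
                insights.append({
                    "claim": gap,               # the challenge / gap statement
                    "citation": row["title"],   # one-to-one mapping back to its source
                })
        return insights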

What report formats and citation controls are supported for the final literature review?

The workflow can generate reports in Markdown or LaTeX, including LaTeX compiled to PDF. It supports citation formatting instructions such as IEEE-style (mentioned as a default) and other styles like MLA and APA. It also supports layout templates (two-column vs single-column) and journal/conference formats (ACM formats are mentioned). For accuracy, it can write LaTeX sections separately and reference them via \input commands, then compile and verify the PDF.
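
The section-by-section approach can be sketched as follows, assuming a two-column IEEEtran layout and a local pdflatex installation; file names, the preamble, and the section contents are placeholders rather than the agent's actual templates.

    # Minimal sketch of section-by-section LaTeX assembly and compilation.
    import subprocess
    from pathlib import Path

    PREAMBLE = r"""\documentclass[conference]{IEEEtran}
    \title{Literature Review (placeholder)}
    \author{Generated draft}
    \begin{document}
    \maketitle
    """

    def write_report(sections: dict[str, str], outdir: str = "report") -> None:
        out = Path(outdir)
        out.mkdir(exist_ok=True)
        body = PREAMBLE
        for name, latex in sections.items():
            (out / f"{name}.tex").write_text(latex)      # each section written separately
            body += f"\\input{{{name}}}\n"               # referenced from the main file
        body += "\\end{document}\n"
        (out / "main.tex").write_text(body)
        # Compile and verify the PDF (requires a LaTeX installation with IEEEtran).
        subprocess.run(["pdflatex", "-interaction=nonstopmode", "main.tex"], cwd=out, check=True)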

How do scoping reviews and systematic literature reviews differ in the agent workflow?

A scoping review emphasizes breadth and often includes full-text PDF download to populate extracted fields (methods, evaluation metrics, application domains). It can also produce visualizations (e.g., complexity/performance themes and an importance vs urgency graph) and output a scoping review document with citations. A systematic literature review follows a PRISMA-style plan: define the research question, finalize inclusion/exclusion criteria, run multi-database search, screen titles/abstracts, download and screen full text, deduplicate, extract data, and then write the final PRISMA report. The walkthrough also notes that users may approve or modify steps via dialog prompts during planning.
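
One way to picture the PRISMA-style plan is as an ordered pipeline in which each step can be approved or modified before it runs. The step names below follow the walkthrough; the handler mechanism is an assumption made for illustration.

    # Hedged sketch of the PRISMA-style plan as a sequential pipeline.
    PRISMA_STEPS = [
        "define research question",
        "finalize inclusion/exclusion criteria",
        "run multi-database search",
        "screen titles and abstracts",
        "download and screen full text",
        "deduplicate records",
        "extract data",
        "write final PRISMA report",
    ]

    def run_systematic_review(state: dict, handlers: dict) -> dict:
        # 'handlers' maps each step name to a callable(state) -> state; in the walkthrough,
        # the user can approve or modify each step via dialog prompts before it runs.
        for step in PRISMA_STEPS:
            print(f"Running step: {step}")
            state = handlers[step](state)
        return state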

Review Questions

  1. What specific steps in the workflow help ensure the research question matches the field’s current directions (not just the user’s interests)?
  2. How does reranking and top-paper selection (e.g., top 100, top 30 for extract insights) influence the reliability of research-gap summaries?
  3. When adding columns like limitations and future scope, what downstream tasks become easier and more systematic?

Key Points

  1. Start with deep domain discovery (deep preview) to ground the research question in the field’s major concepts and research directions before collecting large paper sets.

  2. Run multi-database searches with multiple query variants per source, using Boolean-style prompts where they work best (notably for Google Scholar).

  3. Combine and rerank results, then use the top-ranked papers as the structured base for extraction and synthesis.

  4. Add columns to extract structured fields from PDFs (methods, limitations, future scope, evaluation metrics), enabling fast, parallel analysis.

  5. Generate research gaps by mining limitations and future directions, then summarize them through citation-backed “extract insights” to reduce unsupported claims.

  6. Produce final review outputs in Markdown or LaTeX (including IEEE-style/two-column templates) so citations and formatting match manuscript needs.

  7. Use different review modes—critical, scoping, systematic (PRISMA-style)—depending on whether breadth, evaluation, or strict screening is the priority.

Highlights

Deep preview expands a single query into multiple elaborated searches, citation/reference expansion, and a large relevance-ranked set (the walkthrough cites 572 papers) to help users define research themes.
The workflow’s “columns” turn literature review from reading into structured data extraction—making limitations and future scope directly usable for research-gap detection.
Extract insights is designed to be citation-grounded by restricting summaries to highly relevant top-ranked papers, aiming to lower hallucination risk.
LaTeX report generation supports template-driven formatting (two-column layouts and citation styles like IEEE), with section-by-section writing to improve accuracy.
Scoping reviews can automatically download full-text PDFs, extract fields, and generate visualizations; systematic reviews follow a PRISMA-style screening and data-extraction plan.

Topics

  • Research Question Formation
  • Multi-Database Search
  • Reranking and Relevance
  • Research Gap Extraction
  • LaTeX Literature Review
  • PRISMA Systematic Review

Mentioned

  • SciSpace
  • LLM
  • RLHF
  • PRISMA
  • PDF
  • Markdown
  • IEEE
  • APA
  • MLA
  • arXiv