First Time User's Guide to Connected Papers for Research

Andy Stapleton · 5 min read

Based on Andy Stapleton's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Connected Papers builds a similarity-based visual graph from a seed paper searched by title, DOI, or URL.

Briefing

Connected Papers turns the messy task of finding relevant research into a visual, similarity-based map. Instead of reading dozens of papers just to guess what matters, it builds a graph from a starting point—searched by title, DOI, or URL—and then clusters related work so researchers can quickly see where a field’s ideas concentrate and how different papers connect.

The interface settles into a three-panel layout: a list of retrieved papers, a central network graph, and a side panel showing the selected paper’s abstract. The seed paper (the origin) appears in purple. The graph is not a traditional citation tree: papers are positioned and grouped by similarity, which is derived from citation patterns across many works. Clusters therefore represent shared research themes rather than direct “paper A cites paper B” relationships.

Three visual cues guide exploration. First, line thickness reflects how strongly papers are connected to the seed and to each other, producing tight clusters for closely related topics while leaving “lonely” papers isolated when they don’t match the field’s dominant similarity neighborhoods. Second, node size indicates the number of citations: heavily cited papers appear larger, signaling broad attention in the scientific community. Third, color encodes publication time: lighter nodes are older, while darker nodes are more recent. The practical takeaway is to look for papers that are both large and dark—highly cited and relatively new—because they often combine influence with current relevance.
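The "large and dark" heuristic can be approximated outside the tool once citation counts and years are known. The following is a minimal sketch using made-up papers and an arbitrary scoring rule (citations per year of age). The `Paper` class, the `triage_score` function, the sample data, and the fixed reference year of 2024 are all assumptions for illustration, not anything Connected Papers exposes.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    year: int
    citations: int


def triage_score(paper, current_year):
    """Citations per year of age: favors papers that are both cited and recent."""
    age = max(current_year - paper.year + 1, 1)  # avoid division by zero
    return paper.citations / age


# Hypothetical sample data, not taken from any real graph.
papers = [
    Paper("Old classic", 2005, 900),
    Paper("Recent hit", 2021, 400),
    Paper("Recent, little cited", 2022, 5),
]

ranked = sorted(papers, key=lambda p: triage_score(p, current_year=2024), reverse=True)
for p in ranked:
    print(p.title, round(triage_score(p, 2024), 1))
```

With this rule, a recent paper with many citations outranks an older paper with a higher raw count, which mirrors the advice to favor nodes that are both large and dark.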

Connected Papers also supports workflow decisions. In list view, results can be sorted by similarity to the origin, year, citations, or references, and the set can be downloaded as a BibTeX (bib) file for import into reference managers. Two sections are especially useful: “prior works” highlights earlier foundational papers that many heavily cited works build on, while “derivative works” surfaces newer papers that cite multiple items in the graph, pointing to where the research is heading.

A less intuitive feature involves blue-highlighted papers. Blue indicates a paper is cited by the “selective derivative works” in the network, meaning it sits on a direct citation path from highly connected later work. Even though the layout is similarity-based, this blue cue helps identify items that are both central to the cluster and directly referenced by other important papers.

The tool’s right panel makes it easy to open abstracts and jump to external sources such as Semantic Scholar, publisher pages, Google Scholar, and PubMed. It also offers filters for keyword, open access/PDF availability, and year. Finally, researchers can refine the map by creating new graphs from a selected paper (“open graph”) or adding multiple origins (“add origin”) to thicken and expand the network; origins can be removed to recalculate the graph. The result is a fast, repeatable method for building a reading list and aligning new research with established and emerging work.

Cornell Notes

Connected Papers builds a visual graph of research around a starting paper (title, DOI, or URL). The graph is similarity-based, not a direct citation tree: papers cluster because their citation patterns suggest related themes. Three cues drive interpretation: thicker links mean stronger connection, node size reflects citation count, and node color reflects publication recency (darker = more recent). The interface also provides “prior works” (foundational earlier papers) and “derivative works” (newer papers citing many items in the cluster), plus BibTeX download for reference managers. Blue nodes flag papers that are cited by selective derivative works, helping identify items that are both central and directly referenced by influential later research.

How does Connected Papers decide which papers belong together in the graph if it isn’t drawing direct citation links?

Papers are arranged by similarity. The tool infers similarity by looking at citation patterns across a range of papers; works that tend to be cited together get grouped into the same research neighborhood. That’s why the network behaves like a clustering of related ideas rather than a literal “who cites whom” tree.
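As a rough illustration of "cited together", the sketch below scores two papers by the overlap between the sets of papers that cite them (a Jaccard measure of co-citation). The toy citation data and the function names are invented for this example; the source does not describe the actual metric Connected Papers uses, which may differ.

```python
# Hypothetical toy data: each citing paper maps to the set of papers it cites.
citing_to_cited = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "B"},
    "P3": {"B", "C"},
    "P4": {"D"},
}


def citers_of(paper, citing_to_cited):
    """Return the set of papers that cite `paper`."""
    return {citing for citing, cited in citing_to_cited.items() if paper in cited}


def cocitation_similarity(a, b, citing_to_cited):
    """Jaccard overlap of the sets of papers citing a and b (0.0 to 1.0)."""
    citers_a = citers_of(a, citing_to_cited)
    citers_b = citers_of(b, citing_to_cited)
    union = citers_a | citers_b
    if not union:
        return 0.0
    return len(citers_a & citers_b) / len(union)


print(cocitation_similarity("A", "B", citing_to_cited))  # share two of three citers
print(cocitation_similarity("A", "D", citing_to_cited))  # no shared citers
```

Note that A and B never need to cite each other to score as similar, which is why the resulting map is not a citation tree.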

What do node size, link thickness, and color each tell a researcher?

Node size represents the number of citations—larger nodes are more heavily cited. Link thickness reflects how strongly papers connect within the similarity network, with thicker lines indicating stronger relationships. Color encodes publication time: lighter nodes are older, while darker nodes are more recent. A common search strategy is to prioritize large, dark nodes (highly cited and relatively new).

What’s the difference between “prior works” and “derivative works,” and when should each be used?

“Prior works” points backward: it surfaces earlier papers that appear foundational to the selected work, especially those that are heavily cited. “Derivative works” points forward: it highlights newer papers that cite many papers in the graph, suggesting where the field is moving. Use prior works to understand roots; use derivative works to find current momentum.
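The two lists can be pictured as counts over citation links. The sketch below follows the description above: prior works are papers outside the graph that many graph papers cite, and derivative works are papers outside the graph that cite many graph papers. The edge list, the paper names, and the `min_links` threshold are assumptions made up for illustration, not the tool's real data or cutoffs.

```python
from collections import Counter

# Hypothetical data: papers shown in the graph, and (citing, cited) pairs.
graph_papers = {"G1", "G2", "G3"}
citations = [
    ("G1", "Old1"), ("G2", "Old1"), ("G3", "Old1"),
    ("G1", "Old2"),
    ("New1", "G1"), ("New1", "G2"), ("New1", "G3"),
    ("New2", "G3"),
]


def prior_works(graph_papers, citations, min_links=2):
    """Papers outside the graph that at least `min_links` graph papers cite."""
    counts = Counter(
        cited for citing, cited in citations
        if citing in graph_papers and cited not in graph_papers
    )
    return [paper for paper, n in counts.most_common() if n >= min_links]


def derivative_works(graph_papers, citations, min_links=2):
    """Papers outside the graph that cite at least `min_links` graph papers."""
    counts = Counter(
        citing for citing, cited in citations
        if cited in graph_papers and citing not in graph_papers
    )
    return [paper for paper, n in counts.most_common() if n >= min_links]


print(prior_works(graph_papers, citations))
print(derivative_works(graph_papers, citations))
```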

Why do some papers appear in blue, and what does that imply for reading priority?

Blue indicates a paper is cited by the “selective derivative works” within the network. Even though the layout is similarity-based, this blue cue signals direct citation by highly connected later papers. That combination often makes the blue paper a strong candidate for deeper reading because it sits on a direct line from influential newer work.

How can adding multiple origins change the graph, and what does that accomplish?

Selecting “add origin” adds additional seed papers to the same network. The graph recalculates to find papers similar to all included origins, producing a thicker, more connected map. Removing an origin returns the graph to focus on the remaining seeds, which can change which clusters dominate.
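One way to picture the recalculation is to rank candidates by their average similarity to every origin. This is only a sketch under that assumption: the similarity scores, the paper labels, and the averaging rule are hypothetical, and the source does not say how Connected Papers combines multiple origins.

```python
# Hypothetical pairwise similarity scores between origins and candidate papers.
similarity = {
    ("O1", "X"): 0.9, ("O2", "X"): 0.1,
    ("O1", "Y"): 0.6, ("O2", "Y"): 0.7,
    ("O1", "Z"): 0.2, ("O2", "Z"): 0.3,
}


def rank_candidates(origins, candidates, similarity):
    """Order candidates by mean similarity to all origins, highest first."""
    def mean_similarity(candidate):
        total = sum(similarity.get((origin, candidate), 0.0) for origin in origins)
        return total / len(origins)

    return sorted(candidates, key=mean_similarity, reverse=True)


print(rank_candidates(["O1"], ["X", "Y", "Z"], similarity))
print(rank_candidates(["O1", "O2"], ["X", "Y", "Z"], similarity))
```

In this toy data, X leads when O1 is the only origin, but Y takes the top spot once O2 is added, because Y is reasonably close to both seeds. Removing O2 restores the original order.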

What practical steps help turn the graph into an actionable reading list?

Researchers can click papers to view abstracts, then use the list view to sort by similarity, year, citations, or references. They can download the results as a BibTeX (bib) file for import into reference managers. External links (e.g., Semantic Scholar, Google Scholar, PubMed, publisher pages) help verify and locate full text, while filters (keyword, open access/PDF availability, year) narrow the set.
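Once the BibTeX file is downloaded, the reading list can also be sorted offline. The sketch below parses a small, made-up BibTeX string with regular expressions and orders entries newest first. The sample entries and field names are assumptions about what an export might contain, and the parser only handles simple `name = {value}` fields on separate lines; a dedicated BibTeX library would be a safer choice for real files.

```python
import re

# Hypothetical sample; a real export may use different keys and fields.
SAMPLE_BIB = """
@article{smith2019,
  title = {Graph methods for literature review},
  year = {2019}
}
@article{lee2022,
  title = {Mapping research fields},
  year = {2022}
}
"""

# Matches "@type{key, ... }" where the closing brace sits on its own line.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
# Matches one "name = {value}" field per line, with an optional trailing comma.
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{(.*?)\}\s*,?\s*$", re.MULTILINE)


def parse_bib(text):
    """Return a list of dicts, one per entry, with lowercased field names."""
    entries = []
    for entry_type, key, body in ENTRY_RE.findall(text):
        fields = {name.lower(): value for name, value in FIELD_RE.findall(body)}
        fields["key"] = key
        fields["type"] = entry_type
        entries.append(fields)
    return entries


entries = parse_bib(SAMPLE_BIB)
newest_first = sorted(entries, key=lambda e: int(e.get("year", 0)), reverse=True)
for entry in newest_first:
    print(entry["year"], entry["title"])
```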

Review Questions

  1. In what way is Connected Papers’ graph interpretation different from a citation graph, and how does that affect how you should read clusters?
  2. If a paper is small but very dark (recent), what might that suggest compared with a large but light (older) paper?
  3. How would you use “prior works” and “derivative works” together to plan both background reading and future-looking research?

Key Points

  1. Connected Papers builds a similarity-based visual graph from a seed paper searched by title, DOI, or URL.
  2. The central network is not a direct citation tree; clusters reflect related research themes inferred from citation patterns.
  3. Thicker links indicate stronger connections, node size reflects citation count, and node color encodes publication recency (darker is newer).
  4. “Prior works” helps identify foundational earlier papers, while “derivative works” surfaces newer papers that cite many items in the cluster.
  5. Blue-highlighted papers are cited by selective derivative works, signaling direct relevance to influential later research.
  6. List view supports sorting by similarity, year, citations, or references, and results can be exported as a BibTeX (bib) file for reference managers.
  7. Origins can be added or removed to recalculate the graph around one or multiple starting papers.

Highlights

Connected Papers clusters papers by similarity inferred from citation patterns, not by direct “cites” relationships.
The fastest way to triage relevance is to look for large, dark nodes—highly cited and relatively recent.
“Prior works” and “derivative works” provide a built-in path for reading both backward (foundations) and forward (where the field is going).
Blue nodes flag papers that are directly cited by selective derivative works, combining similarity context with direct citation signals.
