Efficiently Collect and Organise Information from Research Papers - Protolyst Workflow

Protolyst · 5 min read

Based on Protolyst's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Create a “sources” table for uploading research PDFs and a separate “topics and themes” table to maintain the tag vocabulary, then capture key snippets as tagged “atoms” that can be filtered and searched.

Briefing

A Protolyst workflow can turn a pile of research PDFs into a searchable knowledge base by separating “sources” (papers) from “topics and themes” (tags) and then extracting key snippets as reusable “atoms.” The setup starts with two tables: one for uploading papers and one for maintaining a controlled list of tags. As reading progresses, highlighted text is captured into atoms that can be viewed instantly anywhere in the workspace, then labeled with the relevant tags so insights from many papers accumulate under the same themes.

The process begins by creating a fresh workspace and adding a “sources” table where PDFs are dragged and dropped. Each uploaded paper becomes a row in the table. Next comes a “topics and themes” table that holds the tag pages used across the workspace. Initial themes, such as hydrogel-related categories, are added up front and can be expanded later. When a paper is opened, selecting text and clicking the “capture atom” action lifts that snippet out of the PDF, so it appears as an atom that can be referenced without reopening the original file and scrolling to the highlight.

Atoms become useful because they’re tagged. After capturing a snippet, the user assigns one or more tags by selecting from the “topics and themes” table (either by browsing the tag list or typing to find a tag). The atom then records its connections: the first tag shown links back to where the atom originated in the sources, and additional tags appear after a hashtag. As more atoms are captured across multiple papers, the sources table effectively becomes a catalog of extracted insights, while the topics table becomes an index of knowledge grouped by theme.
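The relationship between papers, tags, and atoms can be pictured as a small data model. The sketch below is a conceptual illustration in Python, not Protolyst's API or internal design: the `Atom` class, the `index_by_tag` helper, the file names, and the sample snippets are all hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Atom:
    """A captured snippet (hypothetical model, not Protolyst's real schema)."""
    text: str    # the highlighted text lifted from the PDF
    source: str  # the paper (a row in "sources") the snippet came from
    tags: set[str] = field(default_factory=set)  # pages in "topics and themes"


def index_by_tag(atoms):
    """Group atoms under each of their tags, like a theme-based index."""
    index = defaultdict(list)
    for atom in atoms:
        for tag in sorted(atom.tags):
            index[tag].append(atom)
    return dict(index)


# Placeholder data invented for illustration only.
atoms = [
    Atom("Hydrogels swell in water.", "paper_a.pdf", {"hydrogel properties"}),
    Atom("Samples were freeze-dried.", "paper_b.pdf", {"methods", "hydrogel properties"}),
]
by_tag = index_by_tag(atoms)
```

Here `by_tag["hydrogel properties"]` collects snippets from both papers, which mirrors how the topics table gathers insights across sources under one theme.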

When new themes emerge mid-reading, tags can be added on the fly. Typing a new tag name (for example, “scaffold”) creates a new page in the topics and themes table, and the current atom can be immediately tagged with it. This keeps the system flexible without losing structure.
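The “create if missing, then apply” behaviour described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how Protolyst implements it: the `tag_atom` function is hypothetical, and the lower-casing and whitespace trimming are assumptions added to keep the vocabulary free of near-duplicates.

```python
def tag_atom(atom_tags: set, vocabulary: set, name: str) -> bool:
    """Apply a tag to an atom, adding it to the vocabulary first if it is new.

    Returns True when a new tag had to be created.
    Normalizing the name (strip + lower) is an assumption of this sketch.
    """
    normalized = name.strip().lower()
    created = normalized not in vocabulary
    vocabulary.add(normalized)   # no-op if the tag already exists
    atom_tags.add(normalized)    # tag the current atom immediately
    return created


# Hypothetical starting vocabulary and an untagged atom.
vocabulary = {"hydrogel", "methods"}
atom_tags = set()

created_new = tag_atom(atom_tags, vocabulary, "scaffold")     # new tag page
created_again = tag_atom(atom_tags, vocabulary, "Scaffold ")  # already exists
```

The first call reports that a new tag was created; the second reuses the existing one, so the vocabulary stays controlled even when tags are added mid-reading.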

Beyond browsing, the workflow adds analytical power through atom properties and filters. An additional “atoms” property can be configured with filters based on tags (such as showing only atoms tagged with “methods” across all sources). Multiple filters can be combined to narrow results by tag combinations. Finally, atoms are searchable: a keyword search scans both pages and captured atoms, letting the user quickly find a snippet (e.g., about “drying out” hydrogels), jump back to the original capture location, and regain surrounding context.
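Tag filters and keyword search amount to two simple queries over the captured atoms. The sketch below is a conceptual illustration, not Protolyst's implementation: the function names and sample atoms are hypothetical, combining filters with AND semantics is an assumption, and the search covers only atoms, whereas the workflow's search also scans pages.

```python
def filter_by_tags(atoms, required):
    """Keep atoms that carry every tag in `required` (AND semantics, assumed)."""
    required = set(required)
    return [a for a in atoms if required <= set(a["tags"])]


def search(atoms, keyword):
    """Case-insensitive substring search over atom text."""
    needle = keyword.lower()
    return [a for a in atoms if needle in a["text"].lower()]


# Placeholder atoms invented for illustration only.
atoms = [
    {"text": "Gels were cast in moulds.", "source": "paper_a.pdf", "tags": ["methods"]},
    {"text": "Hydrogels risk drying out in air.", "source": "paper_b.pdf", "tags": ["hydrogel"]},
    {"text": "Scaffolds were printed layer by layer.", "source": "paper_c.pdf",
     "tags": ["methods", "scaffold"]},
]

methods_only = filter_by_tags(atoms, ["methods"])
methods_and_scaffold = filter_by_tags(atoms, ["methods", "scaffold"])
hits = search(atoms, "drying out")
```

Because each atom keeps its `source`, a search hit still points back to the paper it came from, which is what makes returning to the original context possible.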

In short, disciplined tagging combined with atom extraction turns scattered reading into an organized, queryable research memory that supports both thematic synthesis and fast retrieval when a specific detail needs to be found again later.

Cornell Notes

The workflow organizes academic reading in Protolyst by separating PDFs (“sources”) from a controlled set of tags (“topics and themes”). Key text is extracted from papers into “atoms” using a capture action, so highlights can be revisited instantly without reopening and scrolling through PDFs. Each atom is labeled with one or more tags, allowing insights from many papers to accumulate under shared themes like hydrogel-related topics. As new themes appear, new tag pages can be created on the fly and immediately used. Atom properties, tag-based filters, and keyword search make it possible to retrieve specific methods or details (such as “drying out”) and jump back to the original context quickly.

How do the two tables—“sources” and “topics and themes”—work together to manage research notes?

The “sources” table is where PDFs are uploaded; each paper becomes a row. The “topics and themes” table stores the tag pages that will be used across the workspace. When text is captured from a paper into an atom, tags are selected from the “topics and themes” table, which ensures consistent labeling and lets atoms be grouped and browsed by theme later.

What is an “atom,” and why does capturing highlighted text matter for long-term research?

An atom is a lifted snippet of highlighted text from a PDF. After clicking the capture atom button, the snippet appears as a standalone item in the workspace. This avoids repeatedly reopening PDFs and hunting for the same highlight later; the atom can be referenced anywhere and remains connected to its source.

How does tagging change what you can do with captured information?

Tags connect each atom to one or more themes. In the sources table, atoms show their origin and then display additional tags after a hashtag. In the topics and themes table, atoms sharing the same tag are pulled together, effectively creating a theme-based index of extracted insights across multiple papers.

Why add new tags “on the fly,” and how is that done in this workflow?

New research questions often emerge while reading, so the tag set needs to evolve. The workflow allows typing a new tag name (e.g., “scaffold”), then creating it as a new page inside the “topics and themes” table. Once created, the current atom can be tagged with it immediately, and the tag appears in the topics table with the associated atom.

How do atom filters and additional “atoms” properties help with targeted retrieval?

An additional atoms property can be configured with filters based on tags. For example, filtering by a “methods” tag displays only atoms tagged with methods across all relevant sources. Multiple filters can be combined by adding more tag-based criteria, enabling focused views like “methods + a specific theme.”

How does keyword search support finding forgotten details?

A keyword search scans both pages and captured atoms. If a user remembers a concept but not where it was tagged, typing a phrase like “drying out” can locate the matching atom quickly. Selecting it then provides a path back to the original capture context on the source page for more surrounding detail.

Review Questions

  1. What are the roles of the “sources” table versus the “topics and themes” table in keeping research organized?
  2. Describe how an atom is created and how tags are applied to it.
  3. How would you use tag-based filters and keyword search to find a specific type of information (e.g., methods) across many papers?

Key Points

  1. Create a “sources” table for uploading research PDFs and a separate “topics and themes” table to maintain the tag vocabulary.
  2. Extract important snippets from PDFs into reusable “atoms” so highlights can be revisited without reopening and scrolling through the original files.
  3. Tag each atom using pages from the “topics and themes” table to group insights across many papers under consistent themes.
  4. Add new tag pages during reading when new themes emerge, then immediately apply them to relevant atoms.
  5. Use additional atoms properties with tag filters to build focused views (such as showing only atoms tagged with “methods”).
  6. Combine multiple tag filters to narrow results to specific tag combinations.
  7. Rely on keyword search across both pages and atoms to quickly rediscover details and jump back to the original context.

Highlights

Two-table structure: papers live in “sources,” while all tag pages live in “topics and themes,” keeping labeling consistent.
Capturing highlights as atoms turns scattered reading into instantly retrievable snippets across the workspace.
Tag-based filters let the same set of papers produce different thematic views—like a “methods-only” perspective.
Keyword search can locate a forgotten detail (e.g., “drying out”) by searching atoms, then route back to the original context.
