Get AI summaries of any video or article — Sign up free
Graph Analysis - Co Citations thumbnail

Graph Analysis - Co Citations

6 min read

Based on Obsidian Community Talks's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Co-citation turns Obsidian’s link graph into a relationship engine by listing concepts that co-occur in the same snippets.

Briefing

Co-citation analysis is positioned as a practical way to answer “relationship” questions inside an Obsidian vault—without forcing users to manually restructure notes. The core idea: when a single note cites two concepts, those concepts become co-cited, and a dedicated panel can surface the exact snippets where both appear together. That turns scattered daily-note facts into a searchable map of second-order connections—useful for questions like “What does Biden think about global tax?” or “What’s the relationship between Biden and Macron?”

The talk starts with a problem many Obsidian users face: where to store factual snippets so they remain retrievable later. Facts can be anything from life events and news to descriptions of systems, story events, or tabletop RPG happenings. Using a news example about the G20 approving a global tax on October 30, the speaker models the information as a graph of entities (global tax, Joe Biden, Macron, G20, the EU, and so on). The key retrieval needs go beyond “What happened on that date?” People also want to understand why certain entities matter, and—hardest of all—what the relationship is between two concepts.

Daily notes seem like the obvious place to put time-stamped facts, because they minimize “where should I put this?” decisions. But daily-note backlinks don’t naturally reveal relationships between entities. If a daily note links to many people and topics, the backlinks list becomes a long, date-heavy trail with no direct links between the concepts themselves. Even if users can navigate from one entity to the daily note, they still have to sift through many entries to reconstruct why two concepts are connected.

The proposed fix is co-citation. Instead of asking users to traverse backlinks manually, the co-citation panel for a target concept (like “global tax”) lists other concepts that appear together in the same source snippets. This is described as a “second order backlinks” view: the information is stored in the original date-based note, but the panel re-indexes it so relationships become immediately visible. The workflow becomes: open the concept node, expand the co-citation list, and jump straight to the snippets where both concepts co-occur.

A major technical emphasis is scoring strength. Simply knowing two entities appear in the same note isn’t enough; the plugin assigns higher scores when the concepts occur in the same sentence, lower scores when they’re in the same paragraph, and even weaker scores as distance increases (down to cases where connections are considered too weak to matter). The algorithm also incorporates structure signals like headings and outline hierarchies, and supports unresolved links, files, images, and decks.

Demos show how this helps with real vault navigation: expanding a Biden node reveals clusters of co-cited entities (including other politicians) with numeric relevance scores, and a “freeze” option lets users keep a co-citation view stable while exploring tangential notes. Additional examples include browsing conceptual relationships (humans to brain-related ideas, happiness to personality and cited works) and applying the same mechanism to research topics like the bias–variance tradeoff. The overall message is that co-citation turns Obsidian’s link graph into a usable relationship engine, reducing manual overhead while keeping the underlying note structure simple.

Cornell Notes

Co-citation analysis in Obsidian is built to answer relationship questions between concepts. When a note contains links to both concept A and concept B, those concepts are treated as co-cited, and a co-citation panel lists the snippets where they appear together. This creates a “second-order backlinks” workflow: users open a concept node and immediately see other entities strongly connected to it, then jump to the exact supporting sentences. Relevance is ranked using a scoring scheme that favors same-sentence co-occurrence, then same-paragraph, then weaker structural proximity. The result is faster retrieval of “what connects these ideas?” compared with scanning daily-note backlinks or building complex query views.

Why are daily notes alone a weak tool for relationship questions between entities?

Daily notes work well for time-based retrieval (“what happened on Oct 30?”) because the note is easy to find. But they don’t create direct links between the entities mentioned inside the note. If a daily note links to many concepts (e.g., global tax, Joe Biden, Macron, G20, EU), backlinks from “global tax” lead to a long list of daily notes, and backlinks from “Macron” don’t reveal which specific snippets connect Macron to Biden. To reconstruct relationships, users must open notes and inspect backlinks manually, which becomes high-overhead when entities recur across many dates.

What does “co-citation” mean in this system, and how does it answer relationship questions?

Co-citation happens when a single note cites both concepts A and B—meaning that note links to both entities. The co-citation panel for a target concept re-indexes the vault so it can list the other concepts that co-occur with it in the same snippets. For example, opening the “global tax” node and expanding its co-citations reveals snippets where “global tax” and “Joe Biden” appear together, letting users answer “What does Biden think about global tax?” by jumping directly to the supporting sentence(s).

How does the plugin decide whether two co-cited concepts are strongly or weakly related?

It assigns a strength score based on proximity in the text. The highest score is given when both entities appear in the same sentence, since that implies a direct textual connection. Scores drop when entities appear in the same paragraph, and drop further as the distance increases (the talk describes thresholds like 0.85 for very close sentence adjacency, then 0.7, 0.5, 0.25, and below 0.25 as weak connections). This ranking helps users focus on the most meaningful relationships rather than treating any co-occurrence as equally relevant.

What structural signals beyond sentences can improve co-citation scoring?

Beyond sentence-level proximity, the algorithm uses headings and outline hierarchies. Outline hierarchy signals are described as useful for querying and for capturing relationships implied by document structure (e.g., concepts grouped under the same section). This means co-citation strength can reflect not only raw adjacency but also how the author organized the content.

How does the “freeze” option change the research workflow?

Freezing keeps the co-citation panel stable while the user navigates elsewhere. In the demo, after freezing on a relationship view (e.g., between Trump and Biden), the panel doesn’t change even when the user opens a different note (like an article about Trump and the election). That allows side research—such as checking connections to China—without losing the original relationship context.

How can co-citation help with non-news, research-style knowledge?

The talk demonstrates applying co-citation to conceptual and academic domains. In a “human” example, co-citations quickly surface related ideas like the human brain and terms tied to development (e.g., ontogenesis). In a machine learning example, co-citation reveals papers and concepts connected to the bias–variance tradeoff, including related notions like consistency and inequality (e.g., “magenta’s inequality” mentioned as a useful, famously biased concept). The same relationship engine works across personal notes, books, and research literature.

Review Questions

  1. When you store a factual snippet in a daily note, what specific limitation prevents backlinks from directly answering “relationship between two concepts” questions?
  2. Explain the scoring logic for co-citation strength. Why does same-sentence co-occurrence matter more than same-note co-occurrence?
  3. Describe how “freeze” supports exploratory research without breaking the relationship view you started with.

Key Points

  1. 1

    Co-citation turns Obsidian’s link graph into a relationship engine by listing concepts that co-occur in the same snippets.

  2. 2

    Daily notes minimize placement effort but create high overhead for relationship questions because they don’t directly connect entities to each other.

  3. 3

    A co-citation panel provides a second-order backlinks workflow: open a concept node, expand co-citations, and jump to the exact supporting sentences.

  4. 4

    Relevance scoring ranks connections by textual proximity—highest for same-sentence co-occurrence and lower as distance increases (sentence → paragraph → weaker proximity).

  5. 5

    Headings and outline hierarchies are used as additional structure signals to strengthen meaningful connections.

  6. 6

    The system supports more than text links, including unresolved links, files, images, and decks, enabling relationship browsing across media and references.

  7. 7

    A “freeze” option preserves the co-citation view while users explore tangential notes, supporting iterative research.

Highlights

Co-citation is defined as a note linking to both concepts A and B; the panel then surfaces the snippets where both appear together, making relationship questions fast to answer.
The scoring scheme prioritizes same-sentence co-occurrence, then same-paragraph, then weaker structural proximity—so users can ignore low-confidence connections.
Freezing keeps the co-citation results stable while navigating elsewhere, enabling side research without losing the original relationship context.
Co-citation isn’t limited to news facts; it also links conceptual clusters in areas like psychology and machine learning (e.g., bias–variance tradeoff).

Topics

  • Co-Citation Operator
  • Graph Analysis
  • Obsidian Vault
  • Second-Order Backlinks
  • Relevance Scoring