Graph Analysis - The Basics

TL;DR

Graph Analysis distinguishes structural algorithms (link-structure only) from NLP algorithms (word/content-based), and the best workflow depends on which signal you want.


Briefing

Graph Analysis for Obsidian turns a vault’s link structure—and, when enabled, note text—into a set of measurable relationships that help surface “what connects to what” across hundreds or thousands of notes. The practical takeaway is that the plugin’s numbers matter less than the rankings and relative patterns: if one note consistently scores higher than another, that difference often points to a meaningful connection worth checking.

After installing Graph Analysis from Obsidian’s community plugins and opening the “Graph Analysis View,” users can run algorithms against a selected note and see ranked results in a new pane. The plugin distinguishes between structural algorithms, which rely only on the graph of links between notes, and natural-language-processing (NLP) algorithms, which analyze the actual words inside notes. A key usability point is that many algorithms can generate large tables of values; the workflow is designed for scanning relationships—sorting, jumping to linked notes, and using hover previews—rather than treating exact scores as absolute truth.

Structural algorithms come in several families. Link prediction algorithms estimate how likely any other note is to be connected to the currently focused note based purely on graph structure; results can be cross-checked with icons that indicate whether a predicted connection already exists. Similarity algorithms also work from structure, measuring how alike two notes are by comparing neighbor overlap; one example uses a neighbor-overlap ratio that ranges from 0 to 1, highlighting cases where notes are similar enough that a link “should” exist even if it doesn’t. Centrality algorithms identify influential nodes in the network. The plugin’s HITS implementation reports both hub and authority scores: authority rises for notes with many incoming links, while hub rises for notes with many outgoing links. These HITS results are global, meaning they don’t change when the focused note changes.

Community detection algorithms cluster notes into groups based on connectivity patterns. Label propagation assigns each node a starting label and iteratively spreads the most common labels through the graph; after a configurable number of iterations, notes settle into communities. A second community method, Louvain, behaves differently: it’s local to the currently focused note and includes randomness, so refreshes can shift membership. The clustering coefficient algorithm adds another structural lens by estimating how likely a node’s neighbors are connected to one another—capturing “triangle” behavior that can hint at tightly knit subtopics.
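
The clustering coefficient idea above can be sketched in a few lines. This is a minimal illustration, not the plugin's code: it assumes an undirected graph stored as a dict of neighbor sets, and the note names and the clustering_coefficient helper are hypothetical.

```python
from itertools import combinations


def clustering_coefficient(graph, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = graph[node] - {node}
    if len(neighbors) < 2:
        return 0.0  # no pairs of neighbors, so no triangles are possible
    pairs = list(combinations(sorted(neighbors), 2))
    linked = sum(1 for a, b in pairs if b in graph[a])
    return linked / len(pairs)


# Hypothetical vault: "language", "brain" and "humans" form one triangle.
vault = {
    "language": {"brain", "humans", "culture"},
    "brain": {"language", "humans"},
    "humans": {"language", "brain"},
    "culture": {"language"},
}
coefficient = clustering_coefficient(vault, "language")
```

Here one of the three neighbor pairs of "language" is linked, so the coefficient is one third. A value near 1 would suggest a tightly knit subtopic around the note.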

A “freeze” control helps manage non-global algorithms by locking results to the currently focused note, so users can browse elsewhere without the ranking changing. For content-based analysis, Graph Analysis can also run NLP algorithms, but it requires a separate NLP plugin. Once enabled (and after an initial indexing delay), NLP features include bag-of-words similarity, a second content-based similarity method (provided by the NLP library), a Monte Carlo method, and sentiment analysis. Sentiment analysis is global and assigns each note a positive/negative score, making it easy to spot outliers such as highly negative notes.

Finally, Graph Analysis includes practical configuration: selecting which algorithms appear by default, excluding notes via tags or regular expressions (including matching against full file paths), optionally including non-Markdown files (images, videos, PowerPoint, etc.), and optionally including unresolved links. The overall message is straightforward: start with quick structural insights, then layer NLP for content-level signals when needed—using relative comparisons to guide exploration rather than obsessing over exact values.
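
The exclusion settings can be pictured as a simple filter. The sketch below is an assumption-laden illustration: is_excluded, the tag names, and the paths are hypothetical, and it uses Python's re module, whereas the plugin's own regular-expression dialect may differ.

```python
import re


def is_excluded(path, tags, excluded_tags, excluded_patterns):
    """True if a note carries an excluded tag or its full path matches a pattern."""
    if set(tags) & set(excluded_tags):
        return True
    return any(re.search(pattern, path) for pattern in excluded_patterns)


# Hypothetical notes: full file path -> tags found in the note.
notes = {
    "Daily/2023-01-01.md": ["#journal"],
    "Topics/brain.md": ["#concept"],
    "Templates/meeting.md": [],
}
kept = [
    path
    for path, tags in notes.items()
    if not is_excluded(path, tags, {"#journal"}, [r"^Templates/"])
]
```

Matching against the full path is what lets a single pattern such as ^Templates/ drop an entire folder, while a tag rule only drops notes that carry the tag.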

Cornell Notes

Graph Analysis for Obsidian uses algorithms to measure relationships between notes, primarily from the vault’s link structure and optionally from note text. Structural methods include link prediction, similarity (neighbor overlap), centrality (HITS hub/authority), community detection (label propagation and Louvain), and clustering coefficient. NLP methods require a separate NLP plugin and add content-based similarity (bag of words, Monte Carlo, and another library-provided similarity) plus global sentiment scoring. Across algorithms, the plugin’s most useful output is the relative ranking—how one note compares to others—rather than the exact numeric values. Controls like sorting, hover previews, and “freeze” help users explore without getting overwhelmed.

How can link prediction help a user decide which notes to connect next?

Link prediction ranks other notes by how likely they are to be connected to the currently focused note using only the vault’s link structure. Results can be validated with an icon that indicates whether the predicted pair is already linked. For example, the language note ranks “humans” and “brain” highly, and those connections align with intuitive topical overlap; hovering reveals shared neighbors that explain why the model expects a link.
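
As a rough illustration of structure-only link prediction, the sketch below scores candidates with the Adamic-Adar measure, one common choice. The source does not name the exact measure used, so treat the measure, the helper names, and the toy vault as assumptions.

```python
import math


def adamic_adar(graph, a, b):
    """Sum over shared neighbors, weighting rarely linked neighbors more."""
    shared = graph[a] & graph[b]
    return sum(1.0 / math.log(len(graph[z])) for z in shared if len(graph[z]) > 1)


def rank_candidates(graph, focus):
    """Rows of (note, score, already_linked), best score first."""
    rows = [
        (other, adamic_adar(graph, focus, other), other in graph[focus])
        for other in graph
        if other != focus
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical undirected vault graph (note -> set of linked notes).
vault = {
    "language": {"speech", "grammar"},
    "brain": {"speech", "grammar", "neurons"},
    "humans": {"speech"},
    "speech": {"language", "brain", "humans"},
    "grammar": {"language", "brain"},
    "neurons": {"brain"},
}
ranking = rank_candidates(vault, "language")
```

In this toy vault "brain" ranks first for "language" because the two share both "speech" and "grammar", and the already_linked flag plays the role of the icon that shows whether the link exists yet.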

What’s the difference between similarity and link prediction in Graph Analysis?

Link prediction estimates how likely it is that two notes should be connected. Similarity algorithms instead measure how alike two notes are by comparing their neighborhoods in the link graph. In the Jaccard-style similarity described, the score is the number of shared neighbors divided by the number of distinct neighbors across both notes, producing values from 0 to 1; notes can score highly even when they are not directly linked.
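
The neighbor-overlap ratio is commonly known as the Jaccard index, and it is short enough to write out. This is a minimal sketch assuming an undirected graph stored as neighbor sets; the jaccard helper and the note names are hypothetical.

```python
def jaccard(graph, a, b):
    """Shared neighbors divided by all distinct neighbors of the two notes."""
    union = graph[a] | graph[b]
    if not union:
        return 0.0  # two isolated notes have nothing to compare
    return len(graph[a] & graph[b]) / len(union)


# Hypothetical vault: "language" and "brain" are not linked to each other.
vault = {
    "language": {"speech", "grammar"},
    "brain": {"speech", "grammar", "neurons"},
    "speech": {"language", "brain"},
    "grammar": {"language", "brain"},
    "neurons": {"brain"},
}
score = jaccard(vault, "language", "brain")
```

The two notes share two of three distinct neighbors, so the score is two thirds even though no direct link exists between them.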

Why do HITS results stay the same when switching the focused note?

HITS is a global centrality algorithm in Graph Analysis. Because it doesn’t depend on the currently focused note, the hub and authority scores remain stable as users click around. Hub scores reflect notes with many outgoing links, while authority scores reflect notes with many incoming links. The default sort emphasizes authority, so a note like “brain” can appear at the top due to many incoming links.
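
The sketch below shows the textbook HITS iteration on a tiny directed graph, to make the hub/authority split concrete. It is not the plugin's implementation; the hits function, the iteration count, and the note names are assumptions for illustration.

```python
import math


def hits(out_links, iterations=50):
    """Return (hub, authority) score dicts for a directed link graph."""
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority: sum of hub scores of the notes that link to this note.
        auth = {n: 0.0 for n in nodes}
        for source, targets in out_links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub: sum of authority scores of the notes this note links to.
        hub = {n: sum(auth[t] for t in out_links.get(n, ())) for n in nodes}
        # Normalize both vectors so scores stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth


# Hypothetical vault: note -> notes it links out to.
out_links = {
    "language": ["brain", "humans"],
    "memory": ["brain"],
    "humans": ["brain"],
    "brain": [],
}
hub, auth = hits(out_links)
```

Nothing in the computation refers to a focused note, which is why the scores stay the same while browsing. In this toy graph "brain" ends up with the top authority score and "language" with the top hub score.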

How does label propagation form communities, and what does changing iterations do?

Label propagation starts by assigning each node its own label, then repeatedly updates each node to adopt the most popular label among its neighbors. With more iterations, dominant labels spread further, producing larger communities; with fewer iterations, communities remain smaller and more numerous because labels haven’t had time to consolidate. The plugin also lets users sort and inspect communities, such as a large “happiness” cluster.
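
The effect of the iteration setting can be seen in a small sketch. It assumes synchronous updates and breaks ties by picking the alphabetically smallest label so the result is repeatable; real implementations often break ties randomly, and the function name and toy graph are hypothetical.

```python
from collections import Counter


def label_propagation(graph, iterations):
    """Each note starts with its own label, then adopts its neighbors' most common one."""
    labels = {node: node for node in graph}
    for _ in range(iterations):
        updated = {}
        for node, neighbors in graph.items():
            if not neighbors:
                updated[node] = labels[node]
                continue
            counts = Counter(labels[nb] for nb in neighbors)
            best = max(counts.values())
            # Deterministic tie-break (an assumption made for this sketch).
            updated[node] = min(lab for lab, c in counts.items() if c == best)
        labels = updated
    return labels


# Hypothetical vault: two triangles (a, b, c) and (d, e, f) joined by the link c-d.
vault = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
communities_by_iterations = {
    n: len(set(label_propagation(vault, n).values())) for n in (0, 1, 2, 3, 10)
}
```

On this graph the community count falls from 6 with no iterations to 4, then 3, then settles at 2 (one per triangle), which mirrors the "fewer iterations, more and smaller communities" behavior described above.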

What does the “freeze” icon accomplish for non-global algorithms?

For algorithms whose results depend on the currently focused note (non-global), clicking around normally changes the ranking. Freezing locks the results to the note that was focused when freeze was activated, so users can browse other notes without recomputation. Unfreezing restores dynamic behavior and refreshes results based on the new focus.
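
One way to picture the freeze behavior is a view that ignores focus changes while a flag is set. The AnalysisView class and its method names below are hypothetical and only model the behavior described, not the plugin's internals.

```python
class AnalysisView:
    """Toy model of a results pane for a focus-dependent algorithm."""

    def __init__(self, compute):
        self.compute = compute  # function: focused note -> ranked results
        self.frozen = False
        self.results = None

    def on_focus_change(self, note):
        if self.frozen:
            return  # keep showing results for the note that was frozen
        self.results = self.compute(note)

    def toggle_freeze(self, current_note):
        self.frozen = not self.frozen
        if not self.frozen:
            # Unfreezing refreshes against whatever note is focused now.
            self.on_focus_change(current_note)


view = AnalysisView(lambda note: f"ranking for {note}")
view.on_focus_change("language")
view.toggle_freeze("language")   # freeze on "language"
view.on_focus_change("brain")    # browsing elsewhere changes nothing
frozen_results = view.results
view.toggle_freeze("brain")      # unfreeze: recompute for "brain"
refreshed_results = view.results
```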

What extra setup is required to use NLP algorithms, and what do they measure?

NLP algorithms require installing and enabling the separate NLP plugin. A setting in the NLP plugin must be turned on so Graph Analysis can call its NLP features; this can take several seconds on vault startup. NLP algorithms then analyze note content: bag-of-words similarity counts word frequencies to compare notes, Monte Carlo and another library-provided similarity method compare content-based patterns, and sentiment analysis assigns each note a positive/negative score globally.
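
Bag-of-words similarity can be sketched as counting words and comparing the counts. The source does not say how the counts are compared, so the cosine measure here is an assumption, as are the helper names, the simple tokenizer, and the sample sentences.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a deliberately simple tokenizer."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0 to 1)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


note_a = bag_of_words("The brain processes language")
note_b = bag_of_words("Language shapes the brain")
similarity = cosine_similarity(note_a, note_b)
```

The two sample notes share three of their four words, giving a similarity of 0.75. Unlike the structural measures, this score ignores links entirely and depends only on note text.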

Review Questions

Which structural algorithms in Graph Analysis are global, and how can you tell from the interface?
When would you prefer similarity over link prediction while curating links in your vault?
How do tags and regular expressions differ when excluding notes from Graph Analysis results?

Key Points

1. Graph Analysis distinguishes structural algorithms (link-structure only) from NLP algorithms (word/content-based), and the best workflow depends on which signal you want.
2. Relative rankings across notes are more reliable than exact numeric values; use sorting and hover previews to interpret results.
3. Link prediction estimates which notes should be connected next, and icons indicate whether predicted links already exist.
4. Similarity algorithms measure neighborhood overlap, helping identify related notes even when no direct link exists.
5. HITS centrality reports hub (outgoing links) and authority (incoming links) scores and runs as a global algorithm that doesn’t change with focus.
6. Community detection includes label propagation (iterations control community size) and Louvain (local to focus and can vary on refresh).
7. Graph Analysis can expand beyond Markdown by including other file types and can exclude notes using tags or regex matched against full file paths.

Highlights

Graph Analysis is designed for exploration: exact scores are less important than how notes rank relative to one another.

HITS provides both hub and authority scores, and its “global” behavior keeps results stable as the focused note changes.

Label propagation’s iterations act like a dial for community granularity—fewer iterations yield more, smaller clusters; more iterations yield larger ones.

NLP features require a separate NLP plugin and add content-level signals like bag-of-words similarity and sentiment scoring.

Topics

Graph Analysis Basics
Structural Algorithms
Community Detection
NLP Similarity
Sentiment Analysis

Mentioned

NLP
HITS