
Plugin testing for developers

5 min read

Based on the Obsidian Community Talks video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.

TL;DR

Write tests that encode intended behavior (especially defaults and user-visible outcomes), not just that “something runs.”

Briefing

Automated testing is the fastest path to finding Obsidian plugin breakage before users notice it—especially when tests run on every change via your IDE, pre-commit hooks, and GitHub Actions. The core message is practical: treat plugin behavior as something you can lock down with repeatable checks, so accidental changes to defaults, data parsing, filtering logic, or recurring scheduling don’t silently ship.

The talk grounds that idea in four real Obsidian plugins—note refactor, dataview, link data helper, and tasks—using TypeScript examples (with the same underlying principles applied in other languages like Python). It starts with a simple unit test pattern: each test has a human-readable description plus assertions that compare expected outcomes to the code under test. A concrete example checks default settings (like a “new file location” default) so a plugin doesn’t unintentionally change behavior when no settings file exists—an issue that can directly affect users.

From there, the discussion contrasts manual testing (edit code, compile, hot reload, then click through behavior) with automated tests that encode intended behavior inside test code. Automated tests initially cost time to set up, but they pay back by reducing repeated manual effort and by making failures actionable. The talk emphasizes that tests are a skill: they take practice, and they work best when they’re run frequently.

Tooling and workflow matter as much as test design. For dataview, tests run under Jest, and the talk recommends choosing a framework and getting proficient rather than chasing “best.” It also shows how to run tests quickly inside JetBrains IDEs—both the full suite and individual test files—so developers can iterate without waiting on slow console runs. When failures happen, the IDE highlights “expected vs received” differences down to character-level granularity, making debugging faster.
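
For readers who have not set Jest up before, a minimal configuration for a TypeScript plugin project might look roughly like the sketch below; ts-jest is one common way to compile TypeScript tests and is an assumption here, not necessarily what the talk's projects use.

```typescript
// jest.config.ts -- a minimal sketch of a Jest setup for a TypeScript plugin;
// ts-jest is one common transform choice, not necessarily the one used in the talk.
import type { Config } from 'jest';

const config: Config = {
  preset: 'ts-jest',           // compile .ts test files on the fly
  testEnvironment: 'node',     // the logic under test does not need a DOM
  testMatch: ['**/*.test.ts'], // discover files like settings.test.ts
};

export default config;
```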

For link data helper, the bottleneck isn’t code—it’s data. Downloading and processing gigabytes makes manual verification painful and unstable as inputs evolve. The solution is snapshot testing (Jest snapshots), where the first run records structured output into snapshot files and later runs compare current results against those stored “golden” outputs. When the snapshot changes, the failure output shows exactly what differs, turning large, hard-to-assert datasets into maintainable regression checks.

The tasks plugin section adds two process upgrades: pre-commit/pre-push hooks that block commits when tests or formatting fail, and CI via GitHub Actions that runs linting/build/test on push. The talk also highlights data-driven (parameterized) tests for recurring task logic, where a table of scenarios feeds the same test harness. It argues for boundary-case coverage and for writing tests that fail with meaningful messages—plus a guiding rule: never trust a test until it has been seen fail.

The closing guidance is about starting small: write a trivial “smoke test” to confirm the framework runs, then test the tiniest callable unit in your codebase. The talk also addresses alternatives like logging: logs help understand behavior during debugging and can capture hard-to-reproduce user issues, but automated tests are what prevent regressions reliably and continuously across contributors and CI.
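
In that spirit, a smoke test can be a single assertion whose only job is to prove the framework itself is wired up; something as small as this sketch is enough:

```typescript
// smoke.test.ts -- the tiniest possible check: if this fails,
// the test framework or its configuration is broken, not the plugin.
test('the test framework runs', () => {
  expect(1 + 1).toBe(2);
});
```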

Cornell Notes

Automated tests let Obsidian plugin developers catch regressions before users do, and they become most valuable when wired into daily workflows (IDE runs, pre-commit hooks, and GitHub Actions). The talk demonstrates unit tests for default settings, Jest-based tests for dataview, snapshot testing for link data helper to handle large evolving datasets, and data-driven tests for tasks’ recurring scheduling logic. It stresses that tests are a learnable skill, not instant productivity, and that tests must be run often to provide fast feedback. Finally, it offers a practical on-ramp: start with a smoke test, then add the smallest meaningful testable unit, and grow coverage around likely failure points and boundary cases.

Why do default-value tests matter for plugins, and what does a basic unit test look like?

Default-value tests guard against accidental changes to behavior when users have no settings file. In the note refactor example, a test constructs a settings object and asserts that a specific default property (e.g., “new file location”) equals an expected value (like the “vault folder”). The test includes a description of intent (“testing the default values of the settings”), then uses assertions comparing the code’s output (left side) to a fixed expected value (right side). This kind of test prevents silent user-facing changes to plugin behavior.
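
A minimal sketch of that pattern is shown below; DEFAULT_SETTINGS, newFileLocation, and loadSettings are hypothetical names standing in for the plugin's real settings code.

```typescript
// A hedged sketch: DEFAULT_SETTINGS, newFileLocation, and loadSettings are
// hypothetical names, not the plugin's actual identifiers.
interface PluginSettings {
  newFileLocation: string;
}

const DEFAULT_SETTINGS: PluginSettings = {
  newFileLocation: 'vault folder',
};

// Merge whatever was saved (possibly nothing) over the defaults.
function loadSettings(saved: Partial<PluginSettings> | null): PluginSettings {
  return { ...DEFAULT_SETTINGS, ...(saved ?? {}) };
}

describe('testing the default values of the settings', () => {
  test('new file location defaults to the vault folder', () => {
    // Left side: what the code produces; right side: the fixed expected value.
    expect(DEFAULT_SETTINGS.newFileLocation).toEqual('vault folder');
  });

  test('defaults apply when no settings file exists', () => {
    expect(loadSettings(null).newFileLocation).toEqual('vault folder');
  });
});
```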

How does snapshot testing help when test data is huge or changes over time?

Snapshot testing (Jest snapshots) records structured output the first time a test runs, saving it into a snapshot file alongside the test (e.g., a snapshot named to match the source test file). Later runs compare current output to the stored snapshot. In link data helper, manual testing is slow because the plugin downloads and processes gigabytes of data, and inputs can evolve. Snapshots turn that into a regression check: when output differs, Jest reports which snapshot failed and shows “received vs expected” differences, making it practical to cover cases that would be too expensive to write explicit expect/assert chains for.
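
A minimal sketch of the snapshot pattern, assuming a hypothetical processLinks function in place of the plugin's real data processing:

```typescript
// A hedged sketch of Jest snapshot testing; processLinks and its output
// shape are hypothetical stand-ins for the plugin's real processing code.
interface LinkRecord {
  source: string;
  targets: string[];
}

function processLinks(rawRows: string[]): LinkRecord[] {
  // Imagine this doing the expensive parsing/aggregation the plugin performs.
  return rawRows.map((row) => {
    const [source, ...targets] = row.split(',');
    return { source, targets };
  });
}

test('processed link data matches the stored snapshot', () => {
  const result = processLinks(['a,b,c', 'b,c']);
  // The first run writes __snapshots__/<this test file>.snap next to the test;
  // later runs diff current output against that stored "golden" result.
  expect(result).toMatchSnapshot();
});
```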

What’s the practical difference between manual testing and automated testing in day-to-day development?

Manual testing requires compiling/building the plugin and then clicking through behavior each time code changes, which forces developers to decide how much manual verification to do per change. Automated tests encode intended behavior in code and can be run repeatedly and quickly. The talk notes that automated testing initially slows development while learning and setting up, but proficiency reduces turnaround time and catches breakage earlier—especially when tests run in CI and block merges or pushes.

Why are data-driven (parameterized) tests useful for complex logic like recurring tasks?

Parameterized tests separate scenario data from test logic. In tasks, a bucket of test cases defines inputs such as recurrence interval (e.g., every 7 days) and expected next due dates for specific calendar dates. The test harness then runs the same logic for each scenario (using Jest’s test.each), sometimes setting a fake completion date because it affects results. This approach makes it easy to add new scenarios without duplicating test code, and it encourages boundary-case thinking (e.g., how “before” filters behave on exact boundary dates).
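
A hedged sketch of that shape, with a hypothetical nextDueDate function standing in for the tasks plugin's real recurrence logic:

```typescript
// A hedged sketch of a data-driven test with Jest's test.each; nextDueDate
// and the scenario shape are hypothetical, not the tasks plugin's real API.
function nextDueDate(completedOn: string, everyDays: number): string {
  const date = new Date(completedOn + 'T00:00:00Z');
  date.setUTCDate(date.getUTCDate() + everyDays);
  return date.toISOString().slice(0, 10);
}

// Scenario data lives apart from the test logic, so new cases are one line each.
const scenarios: Array<[completedOn: string, everyDays: number, expected: string]> = [
  ['2024-01-01', 7, '2024-01-08'],
  ['2024-02-28', 7, '2024-03-06'], // boundary case: crosses the end of February
  ['2024-12-31', 7, '2025-01-07'], // boundary case: crosses a year boundary
];

test.each(scenarios)(
  'task completed on %s recurring every %i days is next due on %s',
  (completedOn, everyDays, expected) => {
    expect(nextDueDate(completedOn, everyDays)).toEqual(expected);
  },
);
```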

What workflow mechanisms make tests stick—so they actually prevent regressions?

The talk highlights multiple layers: run tests inside the IDE for fast local feedback; use pre-commit/pre-push hooks so commits fail when tests or formatting fail; and rely on GitHub Actions to run build/test/lint on push. This ensures contributors get immediate feedback and that regressions are caught even when developers don’t remember to run tests manually.

Review Questions

  1. What kinds of plugin failures are best caught by unit tests of default settings versus snapshot tests of large computed outputs?
  2. How would you decide what scenario data to include in a parameterized test for scheduling or filtering logic?
  3. Why does the talk emphasize running tests inside the IDE and via CI rather than relying on manual testing or logging alone?

Key Points

  1. Write tests that encode intended behavior (especially defaults and user-visible outcomes), not just that “something runs.”
  2. Choose a test framework (e.g., Jest or Mocha) and get proficient; the principles of testing stay consistent even if tooling differs.
  3. Use IDE integration to run full suites and individual tests quickly, so debugging stays fast when failures occur.
  4. Apply snapshot testing when explicit assertions would be too time-consuming or when outputs depend on large, evolving datasets.
  5. Wire tests into your workflow with pre-commit/pre-push hooks and GitHub Actions so regressions are blocked automatically.
  6. Prefer data-driven tests for complex logic: keep scenario data separate from the test harness and include boundary cases.
  7. Never trust a test until it has been seen fail for the right reason; when failures happen, decide whether they indicate a bug, a feature change, or an incorrect test expectation.

Highlights

  • A default-setting unit test can prevent a plugin from silently changing behavior for users who never create a settings file.
  • Snapshot testing turns gigabytes of computed output into manageable regression checks by comparing “received vs expected” snapshots.
  • Pre-commit/pre-push hooks and GitHub Actions make testing enforceable, not optional.
  • Parameterized tests for recurring tasks let developers add scenarios quickly while keeping the test logic consistent.
  • A practical rule of thumb: never trust a test until it has been observed failing with meaningful output.

Topics

  • Plugin Testing
  • Jest Snapshots
  • Parameterized Tests
  • CI Hooks
  • Obsidian Plugin Development