An Introduction to Dataview - Part 2
Based on Obsidian Community Talks' video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their channel.
Briefing
Dataview’s “contains” and related functions let Obsidian users filter notes by matching substrings, list membership, and even patterns in filenames and titles—turning everyday metadata into powerful, query-driven indexes. The core takeaway is that “contains” behaves differently depending on whether it’s applied to a single value or a list: on strings it checks for substring presence, but on arrays it only returns true when an exact element/object matches. That distinction matters because it changes what queries will work when metadata fields are multi-valued (like an authors list).
The session starts with practical examples of “contains.” On strings it is a substring test: “yes” contains “e” is true, while “no” contains “e” is false. On lists it is a membership test: the list [1,2,3] contains 2, but not 4. From there, the discussion moves into real note filtering: building lists of daily notes by checking whether the filename contains a marker such as “dn”, or finding source notes by testing whether an authors field includes a specific author (e.g., notes whose authors list contains “robert lamb”). A key caveat emerges when someone tries to “vectorize” contains across arrays: it does not run a substring search inside each list element. Even if one author's name includes the search text as a substring, the query succeeds only when the exact object/value is present in the list.
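The string-versus-list distinction can be modelled in a few lines of Python. This is only a sketch of the behavior as described in the session, not Dataview's own code; the helper `contains_any_substring` and the author "Jane Doe" are hypothetical and exist purely for illustration.

```python
from typing import Any, Sequence, Union


def contains(container: Union[str, Sequence[Any]], value: Any) -> bool:
    """Model of the behavior described above (an assumption, not Dataview's code)."""
    if isinstance(container, str):
        # Single string: substring test.
        return str(value) in container
    # List: true only when some element equals the value exactly.
    return any(element == value for element in container)


def contains_any_substring(items: Sequence[str], fragment: str) -> bool:
    """Hypothetical helper: run the substring test on every list element."""
    return any(fragment in item for item in items)


# Hypothetical multi-valued authors field.
authors = ["Robert Lamb", "Jane Doe"]

string_hit = contains("yes", "e")                      # substring present
string_miss = contains("no", "e")                      # substring absent
list_hit = contains([1, 2, 3], 2)                      # exact element present
list_miss = contains([1, 2, 3], 4)                     # exact element absent
exact_author = contains(authors, "Robert Lamb")        # whole element matches
partial_author = contains(authors, "Robert")           # no element equals "Robert"
partial_workaround = contains_any_substring(authors, "Robert")
```

In this model, a partial name only matches once the substring test is applied element by element, which is the step the plain list form skips.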
To handle date-like selection in titles, the conversation pivots to regular-expression matching. A common approach is to apply regexmatch to file names or titles; the regex has to match the entire string, not just part of it. Participants note that you can still target substrings by shaping the regex, for example by allowing arbitrary text around the digits. Selecting notes whose titles include a date can therefore be done by matching a filename pattern with a four-digit year and two-digit month and day, or by using a field like file.day when available: notes without a date yield null, which naturally fails the where condition.
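The difference between whole-string matching, a widened pattern, and a date field can be sketched in Python. The YYYY-MM-DD format, the sample names, and the helper `day_from_name` (a stand-in for a field like file.day) are assumptions made for this example; Python's `re.fullmatch` is used here only to imitate whole-string matching.

```python
import re
from datetime import date
from typing import Optional

# Assumed title format: an ISO-style date (YYYY-MM-DD).
DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"


def full_match(pattern: str, text: str) -> bool:
    """Whole-string match, modelling the behavior described above."""
    return re.fullmatch(pattern, text) is not None


def day_from_name(name: str) -> Optional[date]:
    """Hypothetical stand-in for a date field: the date in the name, or None."""
    found = re.search(r"(\d{4})-(\d{2})-(\d{2})", name)
    if found is None:
        return None
    year, month, day = map(int, found.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Digits in the right shape but not a real calendar date.
        return None


# Hypothetical note names.
names = ["2021-06-14", "dn 2021-06-14 standup", "Meeting notes"]

# The bare pattern only accepts names that are nothing but a date.
exact = [n for n in names if full_match(DATE_PATTERN, n)]
# Allowing any text on both sides turns it into a substring-style match.
anywhere = [n for n in names if full_match(".*" + DATE_PATTERN + ".*", n)]
# Field-based filter: a missing date (None) simply fails the condition.
dated = [n for n in names if day_from_name(n) is not None]
```

The field-based filter needs no pattern tuning, at the cost of depending on the date being recognized in the first place.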
The session then broadens into other aggregation and filtering functions. “Length” supports queries like “notes whose filename length is at least 20 characters” and “notes whose tags list length equals zero” to find untagged notes. “Sum” can add numeric lists, enabling workflows like totaling study time across repeated entries stored as list items. Questions about counting embedded items (such as block embeds) highlight current limitations: Dataview can’t reliably access rendered embed content in live/preview contexts, and counting embedded elements may require workarounds such as link-based queries or regex detection, with some functionality only visible in edit mode.
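The length and sum filters above amount to simple list operations, sketched here in Python over made-up notes. The `Note` class, the `study_minutes` field, and the sample data are hypothetical; this illustrates the logic of the queries, not how Dataview stores metadata.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Note:
    """Hypothetical stand-in for a note and its metadata."""
    name: str
    tags: List[str] = field(default_factory=list)
    study_minutes: List[int] = field(default_factory=list)  # assumed list-valued field


notes = [
    Note("Short note", tags=["#project"], study_minutes=[30, 45]),
    Note("A considerably longer note title", study_minutes=[20]),
    Note("Untagged scratchpad"),
]

# Length on a string: names with at least 20 characters.
long_names = [n.name for n in notes if len(n.name) >= 20]

# Length on a list: notes whose tags list is empty.
untagged = [n.name for n in notes if len(n.tags) == 0]

# Sum over a numeric list: total study time per note.
totals = {n.name: sum(n.study_minutes) for n in notes}
```

An empty list sums to zero here, so notes without study entries still get a total rather than dropping out of the result.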
Beyond functions, the discussion turns to data hygiene and maintainability. Participants compare manual curation with dynamic “maps of content” generated from tags and templates. One attendee describes using a template (e.g., a Dataview MOC template) to generate an index/map without cluttering graph views with real links, sorted by most recently modified. The group repeatedly returns to the same practical reality: Dataview’s power depends on consistent YAML metadata, and templates can reduce errors and repetitive maintenance. Finally, the session closes with guidance on contributing (using GitHub issues for feature requests and bugs, forking repositories, and updating documentation) so the plugin can evolve alongside user needs.
Cornell Notes
Dataview filtering hinges on how functions behave on different data types. “Contains” checks substrings on single strings, but on lists it only returns true for exact element/object matches, so substring expectations often fail for multi-valued metadata like authors. For date-like selection in filenames or titles, regex-based matching (via regexmatch) can target strict formats, while field-based approaches like file.day filter cleanly because the field is null when no date is present. Additional functions like length and sum enable practical tasks such as finding untagged notes or totaling numeric lists. The session also emphasizes that reliable results depend on consistent YAML metadata and that templates can reduce maintenance overhead.
Why does “contains” work for substring searches on strings but not for substring searches inside list-valued metadata?
How can Dataview select notes whose titles or filenames include dates?
What are practical uses of the length function in Dataview queries?
How does sum enable aggregation, and what limitation came up when trying to count embedded items?
Why do templates and consistent YAML metadata matter for Dataview results?
What contribution paths were suggested for improving Dataview?
Review Questions
- When would you expect “contains” to fail for substring matching, and what data type difference causes that?
- Compare regex-based date filtering with field-based filtering using file.day: what makes each approach robust or fragile?
- What kinds of tasks are good fits for length and sum, and what kinds of counting tasks may run into embed-content limitations?
Key Points
1. “Contains” checks substrings on single strings, but on lists it only matches exact elements/objects; substring expectations often break for multi-valued fields like authors.
2. Filename/title date filtering can be done with regex matching, but regexes may need to match the full string format unless carefully constructed for substring targeting.
3. Field-based filtering like file.day can be cleaner than regex because null values naturally exclude notes lacking the date.
4. Length supports practical filters such as finding untagged notes (tags list length equals zero) or selecting notes by minimum filename size.
5. Sum enables numeric aggregation across list-valued metadata, such as totaling repeated study durations.
6. Counting embedded elements is constrained by Dataview’s access to rendered content in preview/live contexts, pushing users toward workarounds like link queries or regex detection.
7. Consistent YAML metadata and templates reduce query breakage over time; maintenance remains a core requirement for reliable Dataview results.