KGC 2023 Masterclass: Taxonomy-Driven Ontology Design — Heather Hedden, PoolParty
Based on The Knowledge Graph Conference's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Taxonomy-driven ontology design starts with controlled vocabularies for consistent tagging and retrieval, then adds ontology semantics for richer querying.
Briefing
Taxonomy-driven ontology design hinges on a practical idea: start with controlled, hierarchical (or faceted) taxonomies for consistent tagging and search, then add an ontology as a semantic “layer” that models relationships, attributes, and constraints so data and content can be queried in richer, multi-step ways. The payoff is fewer missed items, better normalization across synonyms and languages, and the ability to move beyond keyword search into structured discovery—especially when organizations face messy, siloed data with inconsistent naming.
The session frames both taxonomies and ontologies as knowledge organization systems, but draws a sharp line between their roles. A taxonomy is defined as a collection of controlled vocabulary terms organized into a structure (typically hierarchical, sometimes faceted) in which each concept has an unambiguous, non-redundant meaning. Control applies to concepts, not raw strings: terms are governed so that different names for the same thing (synonyms, alternative labels, and hidden labels) all map to one concept. That concept-centric approach is what improves precision and recall over keyword search, supports browsing and faceted filtering, and provides consistent metadata for tagging and indexing.
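The concept-centric labeling described above can be sketched in a few lines. This is a minimal illustration, not PoolParty's implementation: the `Concept` class, the `tag` helper, the concept IDs, and the sample labels are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One taxonomy concept (hypothetical structure, for illustration only)."""
    concept_id: str
    pref_label: str
    alt_labels: list[str] = field(default_factory=list)
    hidden_labels: list[str] = field(default_factory=list)  # e.g. common misspellings

    def all_labels(self) -> list[str]:
        return [self.pref_label, *self.alt_labels, *self.hidden_labels]


def build_label_index(concepts: list[Concept]) -> dict[str, str]:
    """Map every case-folded label to the ID of the concept it names."""
    index: dict[str, str] = {}
    for concept in concepts:
        for label in concept.all_labels():
            index[label.casefold()] = concept.concept_id
    return index


def tag(terms: list[str], index: dict[str, str]) -> set[str]:
    """Return concept IDs matched by free-text terms; unknown terms are skipped."""
    return {index[t.casefold()] for t in terms if t.casefold() in index}


# Sample data is invented for this sketch.
concepts = [
    Concept("c1", "appetizer", alt_labels=["starter", "hors d'oeuvre"],
            hidden_labels=["apetizer"]),
    Concept("c2", "dessert", alt_labels=["sweet course"]),
]
index = build_label_index(concepts)
```

Because tagging resolves to concept IDs rather than strings, a search for one label can retrieve items tagged under any of its variants, which is where the recall gain over plain keyword matching comes from.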
Ontologies, by contrast, are treated as knowledge representation: they model a domain using classes, properties, and semantic relations expressed as subject–predicate–object triples (a theme tied to RDF and semantic web standards). Ontologies add meaning beyond “broader/narrower/related” by specifying semantic relationships between classes and by attaching attributes (datatype properties) with constraints that can support reasoning and inference. The result is a model that can connect multiple vocabularies and enable complex queries, such as finding contacts through chains of relationships (e.g., employment, industry membership, and location) rather than through “aboutness” tags alone.
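A chained query of the kind mentioned above can be illustrated with plain Python tuples standing in for subject–predicate–object triples. This is a sketch under stated assumptions: the predicate names (`worksFor`, `inIndustry`, `locatedIn`) and all the data are hypothetical, and a real system would more likely run such a query against an RDF store.

```python
# Triples as (subject, predicate, object); all data is made up for illustration.
triples = {
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "globex"),
    ("carol", "worksFor", "initech"),
    ("acme", "inIndustry", "pharma"),
    ("globex", "inIndustry", "pharma"),
    ("initech", "inIndustry", "software"),
    ("acme", "locatedIn", "boston"),
    ("globex", "locatedIn", "berlin"),
    ("initech", "locatedIn", "boston"),
}


def objects(subject: str, predicate: str) -> set[str]:
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def contacts_in(industry: str, city: str) -> set[str]:
    """People whose employer is in `industry` and located in `city`."""
    result = set()
    for person, predicate, employer in triples:
        if predicate != "worksFor":
            continue
        in_industry = industry in objects(employer, "inIndustry")
        in_city = city in objects(employer, "locatedIn")
        if in_industry and in_city:
            result.add(person)
    return result
```

The query follows two hops from each person (to the employer, then to the employer's industry and location), which a flat set of "aboutness" tags on the person could not express.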
A key architectural claim is that an ontology can sit above existing taxonomies and other controlled vocabularies as a semantic layer. In the PoolParty context, the taxonomy supplies the controlled concepts used for tagging and user-facing navigation, while the ontology supplies additional classes and relations that connect those concepts across dimensions. The talk illustrates this with examples such as recipe taxonomies (concepts like “appetizer,” with multilingual labels and scope notes) and an ontology-driven “cocktail” model where relationships like “consists of,” “part of,” and “uses garnish” enable different navigation paths than faceted browsing alone.
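One way to picture the semantic layer is as a small set of relation definitions sitting over concepts that already belong to taxonomy concept schemes. The sketch below is an assumption-laden illustration: the scheme names, the relation names `consistsOf` and `usesGarnish`, and the sample cocktails are invented, and domain and range are treated here as a simple validity check rather than as formal ontology axioms.

```python
# Taxonomy layer: concept -> concept scheme (hypothetical sample data).
concept_scheme = {
    "Margarita": "Cocktails",
    "Mojito": "Cocktails",
    "Tequila": "Ingredients",
    "Rum": "Ingredients",
    "Lime wedge": "Garnishes",
    "Mint sprig": "Garnishes",
}

# Ontology layer: relation -> (domain scheme, range scheme).
relations = {
    "consistsOf": ("Cocktails", "Ingredients"),
    "usesGarnish": ("Cocktails", "Garnishes"),
}

# Statements that connect concepts across schemes.
statements = [
    ("Margarita", "consistsOf", "Tequila"),
    ("Margarita", "usesGarnish", "Lime wedge"),
    ("Mojito", "consistsOf", "Rum"),
    ("Mojito", "usesGarnish", "Mint sprig"),
]


def is_valid(subject: str, relation: str, obj: str) -> bool:
    """Check that a statement respects the relation's domain and range."""
    if relation not in relations:
        return False
    domain, range_ = relations[relation]
    return concept_scheme.get(subject) == domain and concept_scheme.get(obj) == range_


def cocktails_with(relation: str, target: str) -> list[str]:
    """Navigate from an ingredient or garnish back to the cocktails that use it."""
    return sorted(
        s for s, r, o in statements
        if r == relation and o == target and is_valid(s, r, o)
    )
```

The taxonomy dictionary is left untouched; the ontology only adds relation definitions and statements on top of it, which mirrors the idea of extending rather than replacing the existing vocabularies.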
The discussion also clarifies why taxonomies and ontologies are usually combined and extended together rather than one replacing the other. Taxonomies excel at consistent naming, multilingual synonym management, and retrieval workflows; ontologies add explicit relationships, multi-part search, data-centric modeling, and reasoning. Knowledge graphs then enter as the broader system: instance data extracted from spreadsheets, databases, and other repositories is stored in graph databases (with RDF triple stores emphasized for semantic web alignment), linked to ontology and taxonomy metadata, and used by applications for search, discovery, personalization, and AI-driven recommendations.
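The step of turning repository data into graph statements linked to taxonomy concepts can be sketched as follows. Everything specific here is an assumption for illustration: the CSV columns, the predicates `hasName` and `hasCategory`, and the concept IDs. A production pipeline would typically emit RDF into a triple store instead of a Python list.

```python
import csv
import io

# Hypothetical export from one repository; column names are assumptions.
rows_csv = """id,name,category
p1,Bruschetta,starter
p2,Tiramisu,Dessert
p3,Mystery dish,unknown
"""

# Label index taken from the taxonomy: lower-cased label -> concept ID.
label_to_concept = {"appetizer": "c1", "starter": "c1", "dessert": "c2"}


def rows_to_triples(text: str) -> list[tuple[str, str, str]]:
    """Convert tabular rows into triples, linking categories to concept IDs."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        triples.append((row["id"], "hasName", row["name"]))
        concept = label_to_concept.get(row["category"].strip().lower())
        if concept is not None:  # values with no matching label stay untagged
            triples.append((row["id"], "hasCategory", concept))
    return triples


graph = rows_to_triples(rows_csv)
```

Inconsistent source values such as "starter" and "Dessert" end up pointing at the same controlled concepts, so downstream queries do not depend on how each repository happened to spell a category.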
Finally, the session offers guidance on ontology building approaches: top-down (starting from a foundation or upper ontology and extending it) versus bottom-up (starting from existing taxonomies and controlled vocabularies, importing them as concept schemes, and modeling only what’s needed in the ontology). The practical recommendation is to begin with taxonomies already grounded in real tagging and retrieval use cases, then extend into ontologies where business needs demand richer relationships, attributes, and cross-domain querying.
Cornell Notes
The session distinguishes taxonomies from ontologies by function. Taxonomies organize controlled vocabulary concepts into structured hierarchies or facets to standardize tagging and improve search and browsing through synonym normalization and consistent metadata. Ontologies add a semantic modeling layer—classes, properties, and explicit relations—so systems can answer multi-step questions and support reasoning beyond “aboutness.” A taxonomy-driven approach starts with existing taxonomies (often already used for tagging) and extends them with an ontology that connects those concepts via semantic relations and attributes. This combination feeds enterprise knowledge graphs, where instance data from multiple repositories is stored in graph databases and linked to ontology/taxonomy metadata for richer discovery and applications.
What makes a taxonomy “controlled,” and why does it matter for search?
How do taxonomies and ontologies differ in what they model?
Why extend a taxonomy into an ontology instead of building an ontology from scratch?
What does “ontology as a semantic layer over taxonomies” mean in practice?
How do knowledge graphs fit into the taxonomy–ontology–data picture?
What are the two main approaches to building ontologies mentioned, and when would each help?
Review Questions
- Explain how synonym management in a taxonomy improves precision and recall compared with keyword search.
- Give an example of a question that requires ontology-style semantic relations rather than taxonomy-only browsing.
- Describe how instance data, ontologies, and taxonomies combine inside an enterprise knowledge graph.
Key Points
1. Taxonomy-driven ontology design starts with controlled vocabularies for consistent tagging and retrieval, then adds ontology semantics for richer querying.
2. A taxonomy’s control targets concepts (not just strings), enabling multilingual and synonym normalization through preferred/alternative/hidden labels.
3. Ontologies model classes, properties, and semantic relations using triple-based statements, supporting constraints and reasoning beyond broader/narrower navigation.
4. An ontology can act as a semantic layer that connects and enriches existing taxonomies and other controlled vocabularies rather than replacing them.
5. Knowledge graphs combine instance data stored in graph databases with ontology/taxonomy metadata to enable cross-repository discovery and applications.
6. Bottom-up ontology building is often practical when organizations already have taxonomies grounded in real business tagging and search needs.