KGC 2023 Masterclass: Build a Semantic Layer in a Knowledge Graph — Stardog
Based on The Knowledge Graph Conference's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.
Briefing
Stardog’s masterclass frames semantic knowledge graphs as the practical way to answer richer business questions across disconnected enterprise data—without forcing every system into a single database. The core idea is that a knowledge graph sits on top of existing data sources and builds a semantic layer: a machine-understandable model of entities, relationships, and meaning that enables downstream analytics, AI, and machine learning to work from consistent definitions.
The session starts with a simple limitation: with only one dataset, organizations can answer narrow questions (like contact details for a single customer). Knowledge graphs matter because they connect that data to other sources—purchases, product categories, locations, and even behavioral signals from social channels—so questions become multi-faceted. The talk connects this to three pillars of modern data management: storage, governance, and analytics. In real enterprises, data is spread across many catalogs and systems, and governance functions are fragmented. The semantic layer is positioned as the unifying “single language” that lets teams work across that sprawl while keeping data physically where it already lives.
Stardog’s approach is built around three foundational components. First is virtualization: data exists everywhere at different speeds and in different technologies, so the solution can’t rely on consolidating everything into one place. Second is the semantic graph: it creates a connected view of concepts across sources (for example, how a Postgres customer table relates to a Databricks lakehouse purchase history) without moving all the underlying data. Third is an inference engine: it encodes meaning and patterns directly in the graph so new relationships can be derived at query time. The emphasis is on making implicit knowledge explicit—so machines can interpret not just data values but what those values mean.
The masterclass then narrows to semantic graph fundamentals. Semantic graphs use nodes and edges where each relationship is expressed as a sentence-like triple (subject–predicate–object), and relationship meaning can carry time and other qualifiers. The instructor contrasts this with labeled property graphs, noting that the session focuses on semantic graphs.
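The triple idea above can be sketched in a few lines of plain Python. This is not Stardog's storage model or SPARQL, just an illustration of how subject–predicate–object triples support wildcard-style pattern queries; the entity and predicate names (`:alice`, `:purchases`, and so on) are made up for the example.

```python
# Minimal sketch of semantic-graph triples as Python tuples.
# All names (":alice", ":purchases", ":sports") are illustrative,
# not taken from the masterclass.
from typing import NamedTuple, Optional

class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str

graph = {
    Triple(":alice", ":purchases", ":bike-001"),
    Triple(":bike-001", ":inCategory", ":sports"),
    Triple(":sports", "rdf:type", ":ProductCategory"),
}

def match(graph, s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None):
    """Return triples matching a pattern; None is a wildcard,
    playing the role of a variable in a graph query."""
    return [t for t in graph
            if (s is None or t.subject == s)
            and (p is None or t.predicate == p)
            and (o is None or t.object == o)]

# "What did :alice purchase?"
print(match(graph, s=":alice", p=":purchases"))
```

Because the relationship itself is data, adding a qualifier such as a purchase date is just another triple about the purchase, rather than a schema change.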
Hands-on work in Stardog Explorer and Designer shows how a “Customer 360” knowledge graph is built iteratively. Participants start by defining concepts (Customer, Product Category, Product, Purchase) and connecting them with relationships (e.g., purchases link customers to products, products belong to categories). Designer’s Query Builder helps craft queries using the ontology/metadata so users can ask questions without learning query syntax. The workflow then moves to publishing the model to a Stardog endpoint and mapping CSV data sources into the graph.
Mapping is presented as a lightweight alternative to classic ETL: instead of moving data, users define correspondences between relational columns and graph concepts, then publish those mappings so Stardog can generate the necessary queries across systems. Auto-mapping accelerates this by suggesting identifiers and attributes based on the data model, while users can add missing properties inline.
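The mapping step can be approximated in miniature: a declarative description of how source columns correspond to graph concepts, applied to rows to yield triples. This is only a sketch of the idea, not Stardog's mapping syntax; the column names, IRI template, and predicates are all hypothetical.

```python
# Sketch of column-to-concept mapping; in Stardog the data would stay
# at the source and the engine would generate queries from the mapping.
import csv
import io

# Hypothetical CSV extract of a customer table.
customers_csv = """customer_id,name,city
c1,Alice,Austin
c2,Bob,Boston
"""

# The mapping says which column identifies the node and how the
# remaining columns become graph attributes.
mapping = {
    "iri_template": ":customer/{customer_id}",
    "concept": ":Customer",
    "columns": {"name": ":hasName", "city": ":locatedIn"},
}

def apply_mapping(csv_text, mapping):
    """Turn each CSV row into triples according to the mapping."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = mapping["iri_template"].format(**row)
        triples.append((subject, "rdf:type", mapping["concept"]))
        for column, predicate in mapping["columns"].items():
            triples.append((subject, predicate, row[column]))
    return triples

for t in apply_mapping(customers_csv, mapping):
    print(t)
```

The design point is that the mapping, not the data, is what gets published: change the mapping and the graph view changes, with no ETL pipeline to rebuild.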
Finally, the inference engine turns the graph from a static representation into a reasoning system. Rules classify patterns (like “2022 orders” or “large orders”) and infer higher-level concepts (like “sports category shopper”). Inferences are computed on the fly so results update immediately as underlying data changes. The session also highlights explainability via proof trees—answering not only “what is connected” but “why,” including which asserted facts and rules led to the inference.
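A toy version of this query-time inference can make the mechanics concrete: rules derive new triples from asserted facts, and each derivation records which rule and which supporting facts produced it, which is the essence of a proof tree. The rule names, thresholds, and class names below are invented for the sketch and are not the session's rule syntax.

```python
# Sketch of query-time rule inference with a proof trace.
# Rule names, the 500 threshold, and classes like ":LargeOrder"
# are illustrative assumptions, not the masterclass's exact rules.
facts = {
    (":alice", ":purchases", ":order-9"),
    (":order-9", ":total", "750"),
    (":order-9", ":year", "2022"),
}

def large_order(facts):
    # Classify any order whose total exceeds 500.
    for s, p, o in facts:
        if p == ":total" and float(o) > 500:
            yield (s, "rdf:type", ":LargeOrder"), [(s, p, o)]

def order_2022(facts):
    # Classify orders placed in 2022.
    for s, p, o in facts:
        if p == ":year" and o == "2022":
            yield (s, "rdf:type", ":Order2022"), [(s, p, o)]

rules = {"large-order": large_order, "2022-order": order_2022}

def infer(facts, rules):
    """Derive triples at query time, recording for each one the rule
    and asserted facts that justify it (a one-level proof tree)."""
    proofs = {}
    for name, rule in rules.items():
        for derived, support in rule(facts):
            proofs[derived] = {"rule": name, "from": support}
    return proofs

proofs = infer(facts, rules)
for triple, proof in proofs.items():
    print(triple, "<=", proof["rule"], proof["from"])
```

Because nothing is materialized, rerunning `infer` after a fact changes immediately yields the updated conclusions, which is the freshness property the session attributes to query-time reasoning.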
By the end, the masterclass positions semantic knowledge graphs as an iterative, software-like process: start small, model the essentials, map data sources, add rules, and expand connectivity. The payoff is a connected enterprise view that hides enterprise messiness behind the scenes while delivering consistent, machine-readable meaning for analytics and AI use cases.
Cornell Notes
Semantic knowledge graphs in Stardog are presented as a way to answer richer enterprise questions across disconnected systems by adding a semantic layer on top of existing data. The workflow builds a data model (concepts and relationships), maps enterprise sources into that model without moving data, and then uses an inference engine to derive new relationships from patterns encoded as rules. This reasoning happens at query time, so inferred facts behave like asserted graph data and update immediately when inputs change. Explainability is emphasized through proof trees that show which facts and rules produced an inference. The practical goal is a “Customer 360” style graph that enables downstream analytics and AI/ML to work from consistent meaning rather than isolated tables.
Why does a knowledge graph enable questions that a single dataset can’t?
What are the three pillars Stardog uses to make semantic graphs work in an enterprise?
How does Stardog’s semantic graph differ from labeled property graphs in the session’s framing?
What does the hands-on workflow look like in practice (model → publish → map → query)?
How do inference rules change what the graph can answer?
What does explainability mean in this inference setup?
Review Questions
- What limitations of isolated datasets does a semantic knowledge graph address, and how does connectivity change the types of questions that can be answered?
- Describe the end-to-end workflow for building a Customer 360 knowledge graph in Stardog, including what happens during mapping and publishing.
- How does query-time inference differ from precomputing inferred relationships, and why does that matter for data freshness and explainability?
Key Points
1. Semantic knowledge graphs add a semantic layer that connects entities and relationships across disconnected enterprise data sources without requiring full data consolidation.
2. Stardog’s approach is built on virtualization, a semantic graph, and an inference engine to handle enterprise sprawl and enable meaning-driven reasoning.
3. Semantic graphs treat relationships as first-class meaning units (triple-style subject–predicate–object), supporting richer event and relationship context.
4. Designer supports an iterative modeling workflow: define a minimal set of concepts and relationships, publish, then map additional data sources as requirements expand.
5. Mapping in Stardog focuses on correspondences between source columns and graph concepts/attributes, so data can remain in place while the semantic layer unifies access.
6. Inference rules classify patterns and infer new relationships at query time, making implicit knowledge behave like asserted graph data.
7. Explainability is delivered through proof trees that show which facts and rules produced an inference, supporting auditability and trust.