Masterclass: Knowledge Graphs & Massive Language Models — The Future of AI, RelationalAI | KGC 2023
Based on The Knowledge Graph Conference's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.
Briefing
Conversational AI is being treated as a new “computer for humans,” but the practical breakthrough isn’t that it behaves like people—it’s that it can reliably consume and generate language at scale, then be steered to perform professional tasks. The session frames today’s large language models (using GPT-4 as the exemplar) as instructable systems: they can read text and soon multimodal inputs (images, audio, video), operate within private intranets, and follow instructions well enough to become an operational layer for knowledge work in finance, law, healthcare, and entertainment.
A central theme is how this capability emerged. The talk contrasts older “computer science 1.0” approaches—where engineers specify algorithms and data models explicitly—with “computer science 2.0,” where code is effectively learned from data. Neural networks learn parameterized continuous functions (with GPT-4 described as having up to a trillion parameters), trained via stochastic gradient descent to minimize prediction errors. For language, the key training mechanism is self-supervision: predict the next token from prior tokens. Scale matters: more data, more parameters, and better architectures (especially Transformers with self-attention) enable generalization across many tasks without building a separate system for each.
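The self-supervision idea above can be sketched with a toy example. A real model learns a continuous parameterized function via gradient descent; here a simple count-based bigram table stands in for it, purely to show how raw text alone yields (context, next-token) training pairs without any human labels. All names here are illustrative, not from the talk.

```python
from collections import Counter, defaultdict

def make_training_pairs(tokens):
    """Self-supervision: every adjacent pair in raw text is a
    (context, next-token) training example -- no human labeling."""
    return [(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)]

class BigramModel:
    """Toy count-based stand-in for a neural next-token predictor."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, pairs):
        for context, nxt in pairs:
            self.counts[context][nxt] += 1

    def predict(self, context):
        # Return the most frequently observed next token for this context.
        return self.counts[context].most_common(1)[0][0]

tokens = "the cat saw the cat and the cat ran".split()
model = BigramModel()
model.train(make_training_pairs(tokens))
print(model.predict("the"))  # -> 'cat' ("the" is followed by "cat" 3 times)
```

A Transformer replaces the count table with attention over the whole prior context, but the training signal is the same: predict the next token, measure the error, adjust the parameters.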
Yet raw language modeling isn't enough for "instructable computers" that stay on task. The session highlights dialogue management and reinforcement learning from human feedback (RLHF) as a way to align outputs with human preferences. Instead of asking humans to label the "correct" answer, evaluators compare alternatives and choose which is better for the goal—simplifying feedback into scalable preference signals.
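The preference signal described above is commonly turned into a training objective with a Bradley–Terry style loss: the reward model is pushed to score the preferred output above the rejected one. A minimal sketch (the function name and example rewards are illustrative):

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry style loss used in RLHF reward modeling:
    -log(sigmoid(r_chosen - r_rejected)). The evaluator only says
    'A is better than B' -- no absolute correctness score needed."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# If the reward model already prefers the chosen output, loss is small;
# if it prefers the rejected one, loss is large and drives an update.
loss_agree = preference_loss(2.0, -1.0)
loss_disagree = preference_loss(-1.0, 2.0)
print(loss_agree < loss_disagree)  # True
```

Only the *difference* between the two rewards matters, which is exactly why pairwise comparisons scale better than absolute labels.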
From there, the talk pivots to enterprise knowledge infrastructure, especially knowledge graphs. Knowledge graphs remain valuable because they provide structured, verifiable representations with integrity constraints, and they reduce ambiguity that embeddings and free-form text can struggle with. But large language models can also generate knowledge graphs “from thin air,” producing ontologies and extracting facts from documents using instructions like “answer questions based on this ontology” or “emit a Datalog schema.” The tradeoff is control: models may hallucinate entities or relationships, so the workflow must treat the model as an unruly collaborator—iteratively asking it to revise the ontology and checking outputs against constraints.
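The "unruly collaborator" workflow above amounts to checking extracted facts against the ontology's integrity constraints before admitting them to the graph. A minimal sketch, with a hypothetical ontology and entity registry (the relation names, types, and entities are invented for illustration):

```python
# Hypothetical ontology: each relation's allowed (domain, range) types.
ONTOLOGY = {
    "works_for": ("Person", "Company"),
    "located_in": ("Company", "City"),
}
# Hypothetical registry of known entities and their types.
ENTITY_TYPES = {"alice": "Person", "acme": "Company", "berlin": "City"}

def validate(triples):
    """Check LLM-extracted triples against integrity constraints,
    flagging hallucinated entities or ill-typed relations for revision."""
    accepted, rejected = [], []
    for subj, rel, obj in triples:
        constraint = ONTOLOGY.get(rel)
        if (constraint is None
                or ENTITY_TYPES.get(subj) != constraint[0]
                or ENTITY_TYPES.get(obj) != constraint[1]):
            rejected.append((subj, rel, obj))  # send back to the model
        else:
            accepted.append((subj, rel, obj))
    return accepted, rejected

extracted = [
    ("alice", "works_for", "acme"),    # well-typed: accepted
    ("acme", "located_in", "berlin"),  # well-typed: accepted
    ("alice", "located_in", "narnia"), # unknown entity, wrong domain
]
ok, bad = validate(extracted)
```

The rejected triples become the feedback loop: the model is re-prompted to revise them, rather than the graph silently absorbing hallucinations.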
The session also lays out practical architectures for combining systems. One approach uses knowledge graphs as a cache or pre-processing layer, then lets the language model handle natural-language querying, reasoning, and explanation. Another approach relies on vector databases (embeddings) as external memory for retrieval-augmented generation, then uses multi-hop prompting to iteratively ask for missing information. The talk emphasizes that multi-step reasoning can be expensive due to multiple model calls and token costs, so retrieval and context selection matter.
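The retrieval step in that second architecture can be sketched as nearest-neighbor search over embeddings. This toy version uses hand-written 3-dimensional vectors and exact cosine similarity; a production system would use a learned embedding model and an approximate nearest-neighbor index. The document texts and vectors are made up for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical vector store: (document text, embedding) pairs.
DOCS = [
    ("Q3 revenue report", [0.9, 0.1, 0.0]),
    ("Employee handbook", [0.1, 0.8, 0.2]),
    ("Data retention policy", [0.0, 0.3, 0.9]),
]

def retrieve(query_emb, k=1):
    """Return the top-k documents most similar to the query embedding;
    these become the grounding context passed to the language model."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_emb, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

print(retrieve([1.0, 0.0, 0.1]))  # -> ['Q3 revenue report']
```

Multi-hop prompting repeats this loop: the model's intermediate answer becomes the next query, which is why each hop adds another retrieval plus another (token-metered) model call.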
The “elephant in the room” is the enterprise decision: should teams invest in knowledge graphs and logical models, or in embeddings plus LLMs? The answer offered is not either/or. Knowledge graphs can improve reliability and entity consistency over time, while LLMs reduce the friction of building and querying those structures in natural language. The future described is a hybrid stack—knowledge graphs for precision and governance, language models for instruction-following and flexible interaction, and retrieval mechanisms to keep context grounded.
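One way to read that hybrid stack as code: try the governed knowledge graph first, and fall back to retrieval plus the language model only when the graph has no answer. Everything here is a stand-in (the facts, the fake retrieval, and the `llm_answer` stub are invented for illustration, not an API from the talk):

```python
# Hypothetical curated knowledge graph: verified (entity, relation) facts.
KG = {("acme", "ceo"): "alice"}

def llm_answer(question, context):
    """Stub standing in for a real LLM call over retrieved context."""
    return f"(generated from context '{context}' for: {question})"

def answer(entity, relation, question):
    """Hybrid path: precise governed facts first, flexible LLM fallback."""
    fact = KG.get((entity, relation))
    if fact is not None:
        return fact  # exact, verifiable answer from the graph
    context = "retrieved passage about " + entity  # stand-in retrieval
    return llm_answer(question, context)

print(answer("acme", "ceo", "Who runs Acme?"))  # -> 'alice'
```

The design choice mirrors the talk's tradeoff: graph hits are cheap, consistent, and auditable; LLM fallbacks are flexible but need the retrieval step to stay grounded.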
Cornell Notes
The session argues that modern LLMs (GPT-4 as the example) function as “instructable computers” for professionals because they can read and generate language, follow instructions, and be aligned to goals via dialogue management and RLHF. Their capabilities come from learning continuous, parameterized functions from massive text using self-supervised next-token prediction and Transformer architectures, with scale as a key driver. In enterprise settings, LLMs can draft ontologies and extract facts from documents, but they can also hallucinate—so outputs must be checked and revised against integrity constraints. Knowledge graphs remain valuable for reliability and structured reasoning, and the most practical direction is hybrid: use knowledge graphs for governance and verification, and LLMs plus retrieval/vector search for natural-language interaction and multi-step problem solving.
Why does the talk treat GPT-4-like systems as a new “computer for humans,” and what makes them different from traditional software?
How did the field move from “computer science 1.0” to “computer science 2.0” in this explanation?
What training objective makes language models work without human labeling for every example?
Why are knowledge graphs still important if LLMs can generate ontologies and extract facts?
How do vector databases and multi-hop prompting fit into enterprise retrieval and reasoning?
What’s the practical “hybrid” architecture direction implied by the enterprise tradeoff?
Review Questions
- What specific training mechanism allows language models to learn from unlabeled raw text, and how does it translate into next-token prediction?
- In the hybrid KG + LLM approach, what role do integrity constraints play in controlling hallucinations or invented entities?
- Why can multi-hop retrieval and reasoning increase cost, and what strategies does the talk suggest to manage that cost?
Key Points
1. LLMs are framed as “instructable computers” because they can read language inputs and follow instructions to perform professional tasks, not because they behave like humans.
2. The shift from hand-coded algorithms to learned parameterized functions is central: neural networks approximate continuous functions trained via stochastic gradient descent.
3. Language model training relies on self-supervision—predicting the next token—so massive text can generate training signals without manual labeling for every example.
4. Alignment for instruction-following depends on dialogue management and RLHF-style preference feedback, where humans compare outputs rather than provide absolute correctness labels.
5. LLMs can generate ontologies and extract facts from documents, but they can also hallucinate; enterprise workflows must treat them as iterative collaborators and verify against constraints.
6. Knowledge graphs remain valuable for reliability and entity consistency, while vector databases and retrieval loops help ground answers in relevant enterprise documents.
7. The most practical direction is hybrid: combine knowledge graphs (precision/governance) with LLMs (natural-language interaction/reasoning) and retrieval (context selection).