Supervised vs Unsupervised vs Semi / Self Supervised vs Reinforcement Learning | Machine Learning
Based on Ciara Feely's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Machine learning is essentially applied statistics that lets systems improve from experience—either from labeled examples provided by humans or from patterns found in data without labels. The practical payoff is that computers can learn to recognize speech, filter spam, generate text, and recommend content, often by extracting “features” (input variables) and mapping them to an outcome (the label or target). That framing matters because it clarifies why different learning approaches exist: the availability and cost of labels, the structure of the data, and whether feedback arrives immediately or only after actions.
The transcript starts by defining machine learning as a branch of artificial intelligence that uses statistical methods to improve with experience from data. Deep learning is introduced as a subcategory that relies on multi-layer networks to make decisions. Features are described as the input variables—like housing attributes such as number of rooms, bathrooms, and square footage—while the output variable is the target, such as housing price. This input-output setup becomes the backbone for later distinctions among learning types.
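The feature-to-target setup from the housing example can be sketched as a tiny dataset; the specific attribute values and prices below are invented for illustration, not taken from the video.

```python
# Each example pairs input features with a known output (the label/target).
# Values are illustrative toy data.
houses = [
    # (rooms, bathrooms, square_footage) -> price
    ((3, 2, 1500), 300_000),
    ((4, 3, 2200), 450_000),
    ((2, 1, 900), 180_000),
]

for features, price in houses:
    rooms, baths, sqft = features
    print(f"{rooms} rooms, {baths} baths, {sqft} sqft -> ${price:,}")
```

Every learning type discussed below is a different answer to the question of where the second element of each pair (the label) comes from, or whether it exists at all.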
Several real-world examples tied to YouTube illustrate where machine learning shows up in everyday products. Auto-generated captions rely on speech recognition that converts spoken audio into text, with accuracy affected by accents. Spam classification in YouTube Studio filters comments into “likely spam” or “needs review” by learning from previously labeled examples. Text generation appears in suggested replies, where the system learns what responses fit based on patterns from prior interactions. Recommender systems—highlighted as the creator’s research area—power personalized video suggestions on platforms like Netflix, Amazon, and YouTube, using signals such as “people who liked this also liked that,” video similarity, and preferences from similar users.
From there, the transcript lays out the main learning categories. Supervised learning trains on a dataset with labels, meaning the “right answer” is known for each example. The goal is to learn a general mapping from features to labels so predictions can be made for new inputs without known outcomes. A key limitation is that labeled data can be expensive or impossible to exhaustively collect—speech recognition, for instance, faces an effectively infinite space of possible sentences.
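A minimal supervised learner, in the spirit of the spam-filtering example, can be a 1-nearest-neighbour classifier: memorize labeled examples, then give a new input the label of its closest neighbour. The feature names (link count, exclamation count) and data are invented for illustration.

```python
# Labeled training data: (link_count, exclamation_count) -> label.
# Toy values, invented for illustration.
train = [
    ((5, 8), "spam"),
    ((4, 6), "spam"),
    ((0, 1), "ok"),
    ((1, 0), "ok"),
]

def predict(x):
    """Return the label of the closest training example (squared distance)."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    _, label = min(train, key=lambda ex: dist2(ex[0], x))
    return label

# Generalizing to a new, unlabeled input near the spam cluster:
print(predict((4, 7)))
```

The label-collection bottleneck shows up immediately: `train` only covers inputs someone has already annotated, and real input spaces (like all possible sentences) cannot be enumerated this way.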
Unsupervised learning flips the assumption: the data has no labels, so the task is to find structure such as clusters or anomalies. The transcript uses anomaly detection as an example, where most points form clusters representing normal behavior and outliers fall outside those clusters. Because human intuition breaks down in high-dimensional spaces, algorithms help identify outliers that are not visually obvious.
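The anomaly-detection idea can be sketched without any labels: compute the centroid of the data and flag points that sit unusually far from it. The points and distance threshold below are hand-picked toy values.

```python
import math

# Unlabeled 2-D points: four form a tight cluster, one is an outlier.
points = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 1.0), (8.0, 8.0)]

# Centroid of all points (no labels needed).
cx = sum(x for x, _ in points) / len(points)
cy = sum(y for _, y in points) / len(points)

def dist(p):
    """Euclidean distance from the centroid."""
    return math.hypot(p[0] - cx, p[1] - cy)

# Flag points far from the centroid; the threshold is hand-picked
# for this toy data.
outliers = [p for p in points if dist(p) > 3.0]
```

In two dimensions a human can spot the outlier by eye; the point of the algorithmic version is that the same distance computation still works in hundreds of dimensions, where visual intuition fails.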
Self-supervised learning is presented as a way to manufacture labels from unlabeled data by masking part of the input and training the model to predict the missing piece. The same idea is applied to text (fill in a missing word or phrase) and images (reconstruct a missing region), often using deep learning due to the complexity of learning these underlying properties.
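The label-manufacturing step for text can be sketched directly: mask one word at a time and treat the masked sentence as the input and the hidden word as the target. The sentence and `[MASK]` token here are illustrative.

```python
# Self-supervision: build (input, target) training pairs from raw,
# unlabeled text by hiding one word at a time.
sentence = "machine learning is applied statistics".split()

pairs = []
for i, word in enumerate(sentence):
    masked = sentence[:i] + ["[MASK]"] + sentence[i + 1:]
    pairs.append((" ".join(masked), word))  # (masked input, target label)

# First pair: ("[MASK] learning is applied statistics", "machine")
```

Every unlabeled sentence yields several training pairs for free, which is why this approach scales where human annotation cannot; the model that consumes these pairs is typically a deep network.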
Semi-supervised learning addresses situations where labels are scarce but unlabeled data is abundant, such as web content tagging or speech tasks. It relies on assumptions like input similarity implying output similarity, and it works best when the labeling step is simpler than full human annotation.
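One simple way to exploit the "similar inputs imply similar outputs" assumption is label propagation: copy each unlabeled point's label from its nearest labeled neighbour. The coordinates and class names below are toy values.

```python
# Semi-supervised sketch: a few labeled points, many unlabeled ones.
labeled = [((0.0, 0.0), "A"), ((10.0, 10.0), "B")]
unlabeled = [(0.5, 0.2), (9.4, 10.1), (1.0, 0.8)]

def nearest_label(x):
    """Borrow the label of the closest labeled point (squared distance)."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    _, label = min(labeled, key=lambda ex: dist2(ex[0], x))
    return label

# Pseudo-label the unlabeled data under the similarity assumption.
pseudo_labels = [(x, nearest_label(x)) for x in unlabeled]
```

The pseudo-labeled points can then be fed back into an ordinary supervised learner alongside the original labeled set, which is cheaper than annotating everything by hand.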
Finally, reinforcement learning is introduced as a reward-driven approach rather than a label-driven one. An agent takes actions in an environment, receives a reward signal (often delayed), and learns through trial and error to maximize long-term success. Chess is used as the archetype: the agent explores moves repeatedly, learns how board states and opponent responses affect winning, and updates its strategy based on outcomes that may only become clear after sequences of actions. The transcript closes by emphasizing that reinforcement learning’s delayed rewards and repeated interaction are central to how it learns.
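The delayed-reward dynamic can be sketched with tabular Q-learning on a toy environment far simpler than chess: a five-state corridor where reward arrives only at the far end, so credit for early moves must propagate back through the discounted update. All hyperparameters here are hand-picked for the toy problem.

```python
import random

# Tabular Q-learning on a 1-D corridor: states 0..4, actions left/right,
# reward 1.0 only on reaching state 4. The reward is delayed, so earlier
# moves are credited indirectly via the discounted bootstrap target.
N, GOAL = 5, 4
ACTIONS = (-1, +1)  # left, right
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1  # learning rate, discount, exploration

random.seed(0)
for _ in range(200):  # repeated episodes of trial and error
    s = 0
    while s != GOAL:
        # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N - 1)
        r = 1.0 if s2 == GOAL else 0.0
        # Q-learning update: pull Q(s, a) toward r + gamma * max_a' Q(s', a').
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# After training, the greedy action in every non-goal state is "right" (+1).
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)}
```

Even though only the final transition is ever rewarded, the learned policy moves right from every state, illustrating how repeated interaction and discounting let the agent credit whole action sequences.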
Cornell Notes
Machine learning is framed as applied statistics: systems improve from experience by learning patterns in data. Supervised learning uses labeled training data to learn a mapping from features (inputs) to labels (outputs), but labels can be costly or infeasible to obtain at scale. Unsupervised learning uses unlabeled data to discover structure such as clusters and anomalies. Self-supervised learning creates labels from unlabeled data by masking part of the input and training the model to predict the missing piece, often using deep learning. Semi-supervised learning sits between the two extremes by using a small labeled set plus a large unlabeled set under assumptions like “similar inputs lead to similar outputs.” Reinforcement learning differs again: an agent interacts with an environment, receives reward feedback (often delayed), and learns via trial and error to maximize long-term reward.
How does supervised learning differ from unsupervised learning in terms of data and goals?
Why is labeled data often the bottleneck in supervised learning?
What makes self-supervised learning “self-supervised” rather than purely unsupervised?
What assumptions does semi-supervised learning rely on to work with few labels?
How does reinforcement learning’s feedback mechanism change what the agent learns?
Review Questions
- In your own words, what is the role of labels in supervised learning, and why does that create a practical limitation?
- Give one example each of a supervised, unsupervised, and self-supervised task, and explain how labels (or their absence) drive the learning objective.
- What does “delayed reward” mean in reinforcement learning, and why does it matter for learning strategies?
Key Points
1. Machine learning is described as applied statistics: models improve from data experience by learning statistical patterns.
2. Features are the input variables (e.g., rooms, bathrooms, square footage), while labels/targets are the outputs to predict (e.g., housing price).
3. Supervised learning requires labeled training data and learns a feature-to-label mapping, but label collection can be infeasible in domains with enormous output variety.
4. Unsupervised learning uses unlabeled data to discover structure such as clusters and anomalies, which becomes especially important in high-dimensional spaces.
5. Self-supervised learning turns unlabeled data into a labeled training problem by masking parts of inputs and predicting the missing content.
6. Semi-supervised learning combines a small labeled set with a large unlabeled set, relying on assumptions like similar inputs producing similar outputs.
7. Reinforcement learning trains an agent to maximize reward through trial and error, often with delayed feedback that depends on action sequences.