Python Tutorial: Sets - Set Methods and Operations to Solve Common Problems
Based on Corey Schafer's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Python sets are a fast, built-in way to work with collections where duplicates don’t matter—especially when the task involves comparing values across groups. Instead of manually filtering lists, sets provide direct operations like intersection and difference, plus efficient membership checks that can dramatically speed up common interview-style and real-world problems.
A set behaves like a list without duplicates and without a guaranteed order: it stores unique elements and prints with curly braces. A set can be created either by calling `set([...])` on an iterable or by writing the values inside curly braces. One important gotcha: `{}` creates an empty dictionary, not an empty set. An empty set must be created with `set()`.
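A minimal sketch of the two construction styles and the empty-set gotcha. The variable names are made up for illustration and do not come from the video.

```python
# Two ways to build a set; duplicate values collapse automatically.
from_list = set([1, 2, 3, 2, 1])
literal = {1, 2, 3, 2, 1}

empty_dict = {}     # curly braces alone give a dict, not a set
empty_set = set()   # an empty set has to be spelled set()

print(from_list == literal)        # True
print(type(empty_dict).__name__)   # dict
print(type(empty_set).__name__)    # set
```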
Once created, sets support mutation. A single element can be added with `add()`, and multiple elements can be merged in with `update()`, which accepts another iterable such as a list or another set. Removing elements can be done with `remove()` or `discard()`. The difference is error handling: `remove()` raises a `KeyError` if the element isn’t present, while `discard()` silently does nothing when the element is missing.
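The mutation methods described above could be exercised like this. It is a small sketch with a hypothetical `courses` set, not code taken from the video.

```python
# Hypothetical starting data, chosen only for illustration.
courses = {"History", "Math", "Physics"}

courses.add("CompSci")                 # add a single element
courses.update(["Art", "Design"])      # merge in a list
courses.update({"Math", "Education"})  # merge in another set; "Math" is already present

courses.discard("Chemistry")           # missing element: silently ignored

try:
    courses.remove("Chemistry")        # missing element: raises KeyError
except KeyError as exc:
    missing = exc.args[0]
    print(f"remove() raised KeyError for {missing!r}")

courses.remove("Art")                  # present element: removed normally
print(sorted(courses))
# ['CompSci', 'Design', 'Education', 'History', 'Math', 'Physics']
```

`sorted()` is used only to get a stable print order, since a set itself has no defined ordering.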
The real payoff comes from set operations. `intersection()` returns the elements shared across sets, which is useful for finding overlap like “employees who are both gym members and developers.” `difference()` returns the elements that are in one set but not in another, such as “employees who are neither gym members nor developers,” found by subtracting both groups from the full set of employees. When the unique elements on both sides matter, `symmetric_difference()` returns the elements that appear in exactly one of the two sets and leaves out anything they share.
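The employee scenario above can be sketched as follows. The names and group memberships are invented for illustration.

```python
# Hypothetical data: every name here is made up.
employees = {"Ana", "Ben", "Cara", "Dev", "Eli", "Fay"}
gym_members = {"Ana", "Ben", "Cara"}
developers = {"Cara", "Dev", "Ana"}

# In both groups.
both = gym_members.intersection(developers)

# In neither group: difference() accepts several sets to subtract at once.
neither = employees.difference(gym_members, developers)

# In exactly one of the two groups.
one_group_only = gym_members.symmetric_difference(developers)

print(sorted(both))            # ['Ana', 'Cara']
print(sorted(neither))         # ['Eli', 'Fay']
print(sorted(one_group_only))  # ['Ben', 'Dev']
```

The same results are available through operators on sets: `gym_members & developers`, `employees - gym_members - developers`, and `gym_members ^ developers`.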
These operations also scale better than hand-written list logic. The tutorial highlights a common workflow: removing duplicates from a list by converting it to a set, then converting back to a list if list output is needed. Note that a set does not preserve the original order, so the resulting list may come back in a different order than the input. The tutorial also emphasizes performance for membership tests. Checking whether an item exists in a list is `O(n)` in the worst case because Python may have to scan the whole list, while set membership is `O(1)` on average because sets are hash-based and optimized for fast lookups. That makes sets particularly valuable when repeatedly asking “is this value present?” across large datasets.
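The de-duplication and membership points above can be sketched as follows. The data is made up, and the `dict.fromkeys` line is an addition that is not from the tutorial: it is one common way to drop duplicates while keeping first-seen order, which a plain set round-trip does not guarantee.

```python
values = [3, 1, 3, 2, 1]  # hypothetical input with duplicates

# Round-trip through a set: duplicates are gone, but order is not guaranteed.
unique_any_order = list(set(values))

# Not from the tutorial: dicts keep insertion order (Python 3.7+),
# so this removes duplicates while preserving first-seen order.
unique_in_order = list(dict.fromkeys(values))
print(unique_in_order)  # [3, 1, 2]

# Build the set once, then reuse it for repeated membership checks.
as_list = list(range(100_000))
as_set = set(as_list)

found_in_list = 99_999 in as_list  # may scan the entire list
found_in_set = 99_999 in as_set    # hash lookup, constant time on average
print(found_in_list, found_in_set)  # True True
```

Building the set costs one pass over the data, so the conversion tends to pay off when many lookups follow rather than just one.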
Overall, sets are positioned as the go-to tool whenever the problem is fundamentally about uniqueness and comparison—overlap, exclusion, and fast existence checks—whether the data comes from user input, databases, or coding interview scenarios.
Cornell Notes
Python sets store unique values and make comparisons between collections straightforward. They support `add()`/`update()` for insertion, `remove()`/`discard()` for deletion (with `remove()` raising `KeyError` when the element is missing), and core operations like `intersection()`, `difference()`, and `symmetric_difference()` to compute overlap and differences between groups. Converting a list to a set is an efficient way to remove duplicates, although the original order is not preserved. Sets also speed up membership tests: checking `x in some_set` takes constant time on average (`O(1)`), versus scanning a list (`O(n)`). These properties make sets ideal for tasks like finding employees who belong to two categories at once, excluding certain groups, and answering “does this value exist?” quickly.
Why does `{}` not create an empty set in Python, and what should be used instead?
What’s the practical difference between `remove()` and `discard()` on a set?
How do `intersection()` and `difference()` map to real comparison tasks?
When should `symmetric_difference()` be used instead of `difference()`?
Why are sets faster for membership tests than lists?
What’s an efficient way to remove duplicates from a list using sets?
Review Questions
- Given two sets A and B, which method would you use to get elements shared by both, and what would the result represent?
- If you need to remove an element that might not exist in a set without raising an error, which method should you choose and why?
- How does set membership complexity (`O(1)`) compare to list membership (`O(n)`), and what kind of workload benefits most from converting lists to sets?
Key Points
1. Create an empty set with `set()`; `{}` creates an empty dictionary.
2. Use `add()` to insert one element and `update()` to merge multiple elements from an iterable or another set.
3. Choose `remove()` when missing elements should trigger an error; choose `discard()` when missing elements should be ignored safely.
4. Use `intersection()` to find overlap between groups and `difference()` to exclude one group from another.
5. Use `symmetric_difference()` when you need all elements that are unique to either set.
6. Remove duplicates from a list by converting it to a set, then convert back to a list if a list output is required.
7. For repeated “is this value present?” checks on large collections, sets provide faster membership tests than lists (`O(1)` vs `O(n)`).