Python Tutorial: Sets - Set Methods and Operations to Solve Common Problems

Corey Schafer · 4 min read

Based on Corey Schafer's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Create an empty set with `set()`; `{}` creates an empty dictionary.

Briefing

Python sets are a fast, built-in way to work with collections where duplicates and ordering don't matter, especially when the task involves comparing values across groups. Instead of manually filtering lists, sets provide direct operations such as intersection and difference, plus efficient membership checks that can speed up common interview-style and real-world problems.

A set behaves like a list without duplicates or a guaranteed order: it stores unique elements and prints with curly braces. A set can be created either by calling `set()` on an iterable, as in `set([...])`, or by using curly braces with values. One important gotcha: `{}` creates an empty dictionary, not an empty set. An empty set must be created with `set()`.
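
As a minimal sketch of the creation rules above (the variable names and values are illustrative assumptions, not taken from the video):

```python
# Hypothetical values, chosen only for illustration.
from_list = set([1, 2, 3, 2, 1])   # build a set from any iterable
literal = {1, 2, 3}                # curly-brace literal with values

# Duplicates collapse, so both sets hold the same three elements.
print(from_list == literal)        # True

empty_dict = {}                    # NOT a set: this is an empty dict
empty_set = set()                  # an empty set needs the constructor

print(type(empty_dict).__name__)   # dict
print(type(empty_set).__name__)    # set
```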

Once created, sets support mutation. A single element can be added with `add()`, and multiple elements can be merged in with `update()`, which accepts another iterable such as a list or another set. Removing elements can be done with `remove()` or `discard()`. The difference is error handling: `remove()` raises a `KeyError` if the element isn’t present, while `discard()` silently does nothing when the element is missing.
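
The insertion side of this can be sketched as follows. The set name and its contents are assumptions made up for the example:

```python
# Hypothetical set of numbers, used only for illustration.
numbers = {1, 2, 3}

numbers.add(4)            # add a single element
numbers.add(4)            # adding an element that already exists changes nothing

numbers.update([5, 6])    # merge in elements from a list
numbers.update({6, 7})    # ...or from another set; the duplicate 6 is ignored

print(sorted(numbers))    # [1, 2, 3, 4, 5, 6, 7]
```

`sorted()` is used for printing only because a set has no guaranteed display order.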

The real payoff comes from set operations. `intersection()` returns elements shared across sets, which is useful for finding overlap like “employees who are both gym members and developers.” `difference()` returns elements in one set but not another, such as “employees who are neither gym members nor developers” by subtracting both groups from the full employee list. For cases where both sides’ unique elements matter, `symmetric_difference()` returns the elements that appear in exactly one of the two sets.
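
A small sketch of the overlap and exclusion cases described above. The names in each group are hypothetical; only the grouping idea comes from the text:

```python
# Hypothetical employee data, assumed for illustration.
employees = {"Corey", "Jim", "Steven", "April", "Judy", "Jenn", "John", "Jane"}
gym_members = {"April", "John", "Corey"}
developers = {"Judy", "Corey", "Steven", "Jane", "April"}

# Overlap: people who are in both groups.
both = gym_members.intersection(developers)

# Exclusion: people in the full set who are in neither group.
# difference() accepts several arguments and subtracts all of them.
neither = employees.difference(gym_members, developers)

print(sorted(both))       # ['April', 'Corey']
print(sorted(neither))    # ['Jenn', 'Jim']
```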

These operations also scale better than hand-written list logic. The tutorial highlights a common workflow: removing duplicates from a list by converting it to a set, then converting back to a list if list output is needed. Because sets are unordered, this round trip does not preserve the original element order. It also emphasizes performance for membership tests. Checking whether an item exists in a list is typically `O(n)` because it may scan the whole list, while set membership is `O(1)` on average because sets are hash-based. That makes sets particularly valuable when repeatedly asking “is this value present?” across large datasets.
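
A rough sketch of both workflows, using a made-up list of readings (the names `readings` and `seen` are assumptions for this example):

```python
# Hypothetical input list with repeated values.
readings = [3, 1, 3, 2, 1, 2, 3]

# A round trip through a set drops duplicates.
# The set's order is not guaranteed, so sort when a stable order is needed.
unique_readings = sorted(set(readings))
print(unique_readings)    # [1, 2, 3]

# Build the set once, then reuse it for repeated membership checks.
seen = set(readings)
print(2 in seen)          # True
print(7 in seen)          # False
```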

Overall, sets are positioned as the go-to tool whenever the problem is fundamentally about uniqueness and comparison—overlap, exclusion, and fast existence checks—whether the data comes from user input, databases, or coding interview scenarios.

Cornell Notes

Python sets store unique values and make comparisons between collections straightforward. They support `add()`/`update()` for insertion, `remove()`/`discard()` for deletion (with `remove()` raising `KeyError` when the element is missing), and core operations like `intersection()`, `difference()`, and `symmetric_difference()` to compute overlap and differences between groups. Converting a list to a set is an efficient way to remove duplicates. Sets also speed up membership tests: checking `x in some_set` takes constant time on average (`O(1)`) versus scanning a list (`O(n)`). These properties make sets ideal for tasks like finding employees who belong to both of two categories, excluding certain groups, and answering “does this value exist?” quickly.

Why does `{}` not create an empty set in Python, and what should be used instead?

`{}` creates an empty dictionary. To create an empty set, use `set()` with no arguments. This matters because a value created with `{}` behaves like a dictionary: calling a set-only method such as `add()` on it raises an `AttributeError`.

What’s the practical difference between `remove()` and `discard()` on a set?

`remove(value)` deletes the element if it exists, but raises a `KeyError` if the value isn’t in the set. `discard(value)` also deletes if present, but does nothing if the value is missing—no exception. In the transcript, attempting to remove a non-existent element triggers a `KeyError`, while switching to `discard()` avoids the error.
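
The contrast can be sketched like this. The set contents and the `missing` variable are illustrative assumptions, not the transcript's exact code:

```python
# Hypothetical set, used only to show the two behaviours.
s = {1, 2, 3}

s.discard(10)     # 10 is absent: nothing happens, no exception
s.remove(3)       # 3 is present: removed normally

try:
    s.remove(10)  # 10 is absent: raises KeyError
except KeyError as exc:
    missing = exc.args[0]   # the element that could not be found
    print(f"remove() failed for missing element: {missing}")

print(sorted(s))  # [1, 2]
```

One way to choose between them: `remove()` suits cases where a missing element indicates a bug worth surfacing, while `discard()` suits cleanup code where absence is expected.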

How do `intersection()` and `difference()` map to real comparison tasks?

`intersection()` finds overlap—elements present in both (or multiple) sets. For example, intersecting a gym-members set with a developers set yields employees who are in both groups. `difference()` finds exclusion—elements in one set but not another. For instance, `employees - gym_members - developers` (implemented via chained `difference()` calls) returns employees who are in neither category.
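
The operator and method spellings are closely related but not identical. In this sketch (with hypothetical names, and `developers` deliberately stored as a list), the methods accept any iterable while the operators need sets on both sides:

```python
# Hypothetical data; developers is a list on purpose.
employees = {"Ana", "Ben", "Cy", "Dee", "Eli"}
gym_members = {"Ana", "Cy"}
developers = ["Cy", "Dee"]

# Method form: any iterable is accepted, so the list works directly.
neither_method = employees.difference(gym_members, developers)

# Operator form: both operands must be sets, so convert the list first.
# Writing employees - developers with the raw list would raise TypeError.
neither_operator = employees - gym_members - set(developers)

# & is the operator counterpart of intersection().
both = gym_members & set(developers)

print(sorted(neither_method))     # ['Ben', 'Eli']
print(sorted(neither_operator))   # ['Ben', 'Eli']
print(both)                       # {'Cy'}
```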

When should `symmetric_difference()` be used instead of `difference()`?

Use `symmetric_difference()` when the goal is to return all elements that are unique to either set—everything that differs between them. In the example with two sets, it returns values like `1` and `4` that appear in only one side, regardless of which set is listed first.
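
A short sketch of the contrast, assuming two small sets whose only non-shared values are `1` and `4` (the exact sets are a guess at the kind of example described, not a quote from the video):

```python
# Assumed example sets.
a = {1, 2, 3}
b = {2, 3, 4}

# difference() is one-sided and depends on which set comes first.
print(a.difference(b))    # {1}
print(b.difference(a))    # {4}

# symmetric_difference() collects both sides; argument order is irrelevant.
either_side = a.symmetric_difference(b)
print(sorted(either_side))                      # [1, 4]
print(either_side == b.symmetric_difference(a)) # True

# ^ is the operator form of symmetric_difference().
print((a ^ b) == either_side)                   # True
```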

Why are sets faster for membership tests than lists?

Membership tests on lists are typically `O(n)` because Python may scan through the list until it finds the value. Set membership is `O(1)` (constant time) on average because sets are hash-based and can jump almost directly to where a value would be stored. Building the set from a list is itself an `O(n)` step, so the conversion pays off when checks like `if name in developers:` are repeated many times.
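
One way to observe the gap is a rough timing sketch with the standard `timeit` module. The collection size, target value, and repetition count below are arbitrary assumptions, and actual timings vary by machine, so no expected numbers are shown:

```python
import timeit

# Hypothetical data: the same 100,000 integers stored both ways.
values_list = list(range(100_000))
values_set = set(values_list)

# Worst case for the list: the target is the last element scanned.
target = 99_999

# Time 100 membership checks against each container.
list_time = timeit.timeit(lambda: target in values_list, number=100)
set_time = timeit.timeit(lambda: target in values_set, number=100)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

On typical hardware the set figure should come out far smaller, but treat the result as an informal measurement rather than a benchmark.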

What’s an efficient way to remove duplicates from a list using sets?

Convert the list to a set to remove duplicates, then convert back to a list if needed. The transcript uses `L2 = list(set(L1))`-style logic: the inner conversion removes duplicates, and the outer conversion produces a list output. This approach is described as simpler and faster than writing a custom loop. Two caveats apply: the original order of the elements is not preserved, and every element must be hashable.
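
When the original order matters, one common alternative (not from the video) is `dict.fromkeys()`, which relies on dictionaries keeping insertion order in Python 3.7 and later. A sketch with a hypothetical list:

```python
# Hypothetical input list with duplicates.
l1 = ["b", "a", "b", "c", "a"]

# Approach from the text: duplicates removed, order not guaranteed.
l2 = list(set(l1))
print(sorted(l2))    # ['a', 'b', 'c']

# Alternative, swapped in here as an assumption: dict keys are unique
# and keep first-seen order (Python 3.7+).
l3 = list(dict.fromkeys(l1))
print(l3)            # ['b', 'a', 'c']
```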

Review Questions

  1. Given two sets A and B, which method would you use to get elements shared by both, and what would the result represent?
  2. If you need to remove an element that might not exist in a set without raising an error, which method should you choose and why?
  3. How does set membership complexity (`O(1)`) compare to list membership (`O(n)`), and what kind of workload benefits most from converting lists to sets?

Key Points

  1. Create an empty set with `set()`; `{}` creates an empty dictionary.
  2. Use `add()` to insert one element and `update()` to merge multiple elements from an iterable or another set.
  3. Choose `remove()` when missing elements should trigger an error; choose `discard()` when missing elements should be ignored safely.
  4. Use `intersection()` to find overlap between groups and `difference()` to exclude one group from another.
  5. Use `symmetric_difference()` when you need all elements that are unique to either set.
  6. Remove duplicates from a list by converting it to a set, then convert back to a list if a list output is required.
  7. For repeated “is this value present?” checks on large collections, sets provide faster membership tests than lists (`O(1)` vs `O(n)`).

Highlights

`{}` is an empty dictionary; `set()` is the correct way to create an empty set.
`remove()` raises `KeyError` for missing elements, while `discard()` silently does nothing.
`intersection()` and `difference()` turn multi-list comparison problems into one-liners.
Set membership checks are optimized for speed—`O(1)` lookups versus `O(n)` scans in lists.
Converting a list to a set is a fast, built-in method for duplicate removal.

Topics

  • Python Sets
  • Set Methods
  • Set Operations
  • Membership Testing
  • Duplicate Removal

Mentioned

  • O(1)
  • O(n)