Python Coding Problem: Creating Your Own Iterators

Corey Schafer · 4 min read

Based on Corey Schafer's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Implement a custom iterator by storing iteration state (like an `index`) and advancing it inside `__next__`.

Briefing

Creating a custom Python iterator for a sentence—and contrasting it with a generator that yields the same words—shows how iteration state is tracked in classes and how that complexity disappears with generators. The practical goal is simple: build something that can be used in a `for` loop to return one word at a time from a space-delimited string, stopping cleanly when the words run out.

The class-based solution centers on implementing the iterator protocol. A `Sentence` class stores the input string, splits it into a list of words using `sentence.split()` (which, with no argument, splits on any run of whitespace), and maintains an `index` attribute to remember where iteration is currently positioned. The `__iter__` method returns the iterator object itself (`return self`), which is what makes the instance both iterable and an iterator. The `__next__` method contains the core logic: it checks whether `self.index` has reached or exceeded `len(self.words)`. If so, it raises `StopIteration` to signal the end of the sequence. Otherwise, it looks up the current word at `self.words[self.index]`, increments `self.index`, and returns that word, so the next call picks up at the following position.
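The class described above could be sketched roughly as follows. This is a minimal reconstruction from the description, not the video's exact code; the attribute names `sentence`, `words`, and `index` follow the text, while the sample string and the variable `my_sentence` are illustrative assumptions.

```python
class Sentence:
    """Iterate over the words of a whitespace-delimited string."""

    def __init__(self, sentence):
        self.sentence = sentence
        self.index = 0                       # position of the next word to return
        self.words = self.sentence.split()   # list of words

    def __iter__(self):
        # The instance acts as its own iterator.
        return self

    def __next__(self):
        # No words left: signal the end of iteration.
        if self.index >= len(self.words):
            raise StopIteration
        word = self.words[self.index]        # grab the current word first
        self.index += 1                      # then advance the position
        return word


my_sentence = Sentence("This is a test")     # sample input (assumption)
for word in my_sentence:
    print(word)
```

Because the object tracks a single `index`, it can be looped over only once; a second `for` loop over the same instance finds it already exhausted.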

Testing demonstrates why `StopIteration` matters. When the object is used in a `for` loop, Python automatically catches `StopIteration` and ends the loop without showing an error. But when `next()` is called manually (e.g., printing `next(my_sentence)` repeatedly), the exception surfaces once the iterator is exhausted—exactly as expected. This difference reinforces the idea that the iterator protocol is the mechanism behind both `for` loops and manual stepping.
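The contrast can be seen in a short sketch. To keep it self-contained, it uses a built-in list iterator as a stand-in for the `Sentence` object (an assumption made here for brevity); any object that follows the iterator protocol behaves the same way.

```python
# Stand-in for a Sentence instance: an iterator over the split words.
words = iter("This is a test".split())

print(next(words))  # This
print(next(words))  # is
print(next(words))  # a
print(next(words))  # test

# A fifth call has nothing left to return, so StopIteration surfaces.
try:
    next(words)
    exhausted = False
except StopIteration:
    exhausted = True
    print("iterator exhausted")

# A for loop over a fresh iterator ends quietly, with no visible exception.
collected = []
for word in iter("This is a test".split()):
    collected.append(word)
print(collected)
```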

After completing the class, the transcript shifts to a generator function that performs the same task with less machinery. A separate function (e.g., `sentence(sentence)`) loops over `sentence.split()` and uses `yield` to emit each word one at a time. Because a generator object supplies `__iter__` and `__next__` on its own, the end-of-sequence condition is handled implicitly: when the loop over the split words finishes and the function returns, Python raises `StopIteration` on the generator's behalf.
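A generator version along these lines might look like the sketch below. The parameter is named `text` here rather than `sentence`, an assumption made only to avoid giving the function and its argument the same name.

```python
def sentence(text):
    """Yield the words of a whitespace-delimited string one at a time."""
    for word in text.split():
        yield word   # pause here and hand one word to the caller


my_sentence = sentence("This is a test")   # sample input (assumption)
for word in my_sentence:
    print(word)
```

Calling `sentence(...)` runs none of the body up front; it returns a generator object, and each `next()` call resumes the function until the following `yield`.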

The takeaway is a practical comparison: custom iterator classes require explicit state management (`index`) and explicit termination (`raise StopIteration`), while generators let developers focus on the word-yielding logic and let Python manage the iteration protocol. Both approaches produce objects that can be looped over to print words from a sentence, but generators are typically faster to write when the iteration pattern is straightforward.

Cornell Notes

The transcript builds a `Sentence` class that can be iterated to return one word at a time from a space-delimited string. The class implements the iterator protocol by storing a list of words (`sentence.split()`), tracking position with an `index`, returning itself from `__iter__`, and raising `StopIteration` in `__next__` when the words are exhausted. It then shows an equivalent generator function that loops over `sentence.split()` and `yield`s each word, letting Python handle the iterator mechanics automatically. With either version, a `for` loop absorbs `StopIteration`, while manual `next()` calls reveal it when the iterator runs out.

Why does the `Sentence` class need both `__iter__` and `__next__`?

`__iter__` makes the object usable in a `for` loop by returning an iterator. In this solution, `__iter__` returns `self`, meaning the instance is its own iterator. `__next__` performs the actual step-by-step iteration: it returns the next word or raises `StopIteration` when no words remain. Without `__next__`, Python wouldn’t know how to advance; without `__iter__`, Python wouldn’t know what iterator to use.

How does the class decide when iteration is finished?

It compares the current `index` against the length of the word list. The check is effectively: if `self.index >= len(self.words)`, then all words have already been returned, so `__next__` raises `StopIteration`. Otherwise, it returns `self.words[self.index]` and increments `self.index` so the next call moves forward.

What does `sentence.split()` contribute to the iterator?

It converts the input string into a list of words. With no delimiter argument, `split()` separates on any run of whitespace (spaces, tabs, newlines) and ignores leading and trailing whitespace, which covers the problem’s “split by spaces only” requirement. The iterator then reads from that list by index, returning one element per `__next__` call.
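The difference between the default call and an explicit space delimiter shows up when the input contains repeated spaces or tabs. The sample string below is an assumption chosen to make that visible.

```python
text = "This  is a\ttest"   # two spaces after "This", a tab before "test"

default_split = text.split()      # any run of whitespace is one separator
space_split = text.split(" ")     # every single space is a separator

print(default_split)  # ['This', 'is', 'a', 'test']
print(space_split)    # ['This', '', 'is', 'a\ttest']
```

For a sentence with exactly one space between words, both calls return the same list.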

What changes when the same behavior is implemented as a generator function?

The generator function loops over `sentence.split()` and uses `yield word` for each word. There’s no explicit `index` variable and no manual `raise StopIteration`. When the loop ends (because there are no more words), the generator terminates automatically, and Python treats it as an iterator that stops at the correct time.

Why does `StopIteration` appear when calling `next()` manually but not in a `for` loop?

A `for` loop is designed to consume iterators and stop when `StopIteration` is raised, without treating it as an error. Manual calls to `next()` don’t get that automatic loop-handling, so once the iterator is exhausted, the exception surfaces to the caller.
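What a `for` loop does with the protocol can be approximated by hand. The helper name `loop_by_hand` is hypothetical, and the sketch is a simplification of the real loop machinery rather than its actual implementation.

```python
def loop_by_hand(iterable):
    """Collect items roughly the way a for loop consumes them."""
    iterator = iter(iterable)        # asks the object for its iterator
    items = []
    while True:
        try:
            item = next(iterator)    # advance one step
        except StopIteration:
            break                    # exhaustion ends the loop quietly
        items.append(item)
    return items


print(loop_by_hand("This is a test".split()))
```

The `try`/`except` around `next()` is the part a manual caller has to supply; a `for` loop provides it implicitly.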

Review Questions

  1. In the `Sentence` class approach, what exact condition triggers `StopIteration`, and what value is returned otherwise?
  2. How do `__iter__` and `__next__` work together to make an object both iterable and an iterator?
  3. Rewrite the generator logic in words: what does the generator loop over, and what does it `yield` each iteration?

Key Points

  1. Implement a custom iterator by storing iteration state (like an `index`) and advancing it inside `__next__`.
  2. Use `__iter__` to return the iterator object; returning `self` makes the instance both iterable and an iterator.
  3. Raise `StopIteration` in `__next__` when the index reaches the end of the word list.
  4. A space-delimited sentence can be tokenized with `sentence.split()` and iterated word-by-word from the resulting list.
  5. Generators can replace iterator-class boilerplate by using `yield` inside a loop over the split words.
  6. `for` loops automatically handle `StopIteration`, while manual `next()` calls will surface the exception when the iterator is exhausted.

Highlights

The iterator protocol hinges on `__iter__` and `__next__`: `__iter__` supplies the iterator, and `__next__` returns the next item or raises `StopIteration`.
Tracking an `index` is the simplest way to remember where iteration is in a custom iterator class.
A generator function that `yield`s words from `sentence.split()` achieves the same behavior without explicit `__next__` or `StopIteration` logic.
Manual `next()` calls reveal exhaustion via `StopIteration`, while `for` loops quietly stop when that exception occurs.

Topics

  • Iterators
  • Iterables
  • Python Generators
  • Iterator Protocol
  • StopIteration

Mentioned