Python Tutorial: Itertools Module - Iterator Functions for Efficient Looping

Corey Schafer · 5 min read

Based on Corey Schafer's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

itertools.count generates an infinite sequence, so pair it with a finite iterable (e.g., via zip) or consume it carefully with next() to avoid infinite loops.

Briefing

Itertools is a set of Python standard-library tools built for working with iterators—sequential data you can consume one item at a time without loading everything into memory. The core payoff is efficiency: many itertools helpers can generate values indefinitely (or until a clear stopping condition), yet still let code pull just the next item when needed. That combination—lazy evaluation plus iterator-friendly composition—shows up repeatedly across the module’s functions.

The tutorial starts with itertools.count, which returns an infinite counter iterator. By default it begins at 0 and increments by 1 forever, making it easy to accidentally create an infinite loop if values are consumed without a stop condition. The practical workaround is to fetch items incrementally using next(), which retrieves one value at a time from the iterator. This matters because it enables patterns like pairing unknown-length data with generated indices. Using zip with itertools.count, the code pairs each data element with a running day index (0, 1, 2, …) and stops automatically when the finite data iterable ends. The counter is also configurable: count(start=5) begins at 5, step=5 jumps by fives, and a negative or fractional step supports counting backward or by decimals.
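The count patterns described above can be sketched as follows. The `daily_data` list and its values are hypothetical stand-ins, not data from the tutorial.

```python
from itertools import count

# Hypothetical sample data, made up for illustration.
daily_data = [73, 89, 74, 88]

# An infinite counter: pull values one at a time with next().
counter = count()
first = next(counter)    # 0
second = next(counter)   # 1

# zip stops when daily_data runs out, so the infinite counter is safe here.
indexed = list(zip(count(), daily_data))

# start and step are configurable, including negative steps.
stepper = count(start=5, step=5)
by_fives = [next(stepper) for _ in range(3)]     # [5, 10, 15]

countdown = count(start=3, step=-1)
backward = [next(countdown) for _ in range(3)]   # [3, 2, 1]
```

Note that calling `list(count())` directly would never finish; every use above either takes a fixed number of items or relies on a finite partner iterable.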

Next comes zip_longest, which differs from built-in zip by continuing until the longest iterable is exhausted. When the shorter iterable runs out, zip_longest fills missing positions with a placeholder (None unless fillvalue is specified). This is useful when the iterables differ in length and every item from the longer one must still appear in the output. Every input has to be finite, though: paired with an unbounded iterator such as itertools.count, zip_longest never terminates, so plain zip is the right tool in that case.
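A minimal sketch of the difference between zip and zip_longest, using two finite inputs of unequal length. The names `days` and `readings` are illustrative assumptions.

```python
from itertools import zip_longest

days = range(6)            # six indices
readings = [73, 89, 74]    # hypothetical data, only three values

# Built-in zip stops at the shorter input.
short = list(zip(days, readings))

# zip_longest runs to the end of the longer input and pads with None.
padded = list(zip_longest(days, readings))

# fillvalue replaces the default None placeholder.
custom = list(zip_longest(days, readings, fillvalue=0))
```

Here `short` holds three pairs, while `padded` and `custom` hold six, with the last three readings filled in as `None` and `0` respectively.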

The tutorial then covers two more infinite iterators: cycle, which repeats an iterable’s values in order forever, and repeat, which yields the same value indefinitely (or a fixed number of times via times). repeat commonly serves as a constant stream of arguments for functions like map. The tutorial also introduces itertools.starmap, a variant of map that expects arguments already grouped into tuples, letting a function receive multiple parameters per iteration.
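The following sketch shows cycle, repeat, and starmap side by side. The on/off switch and the use of `pow` are illustrative choices, not necessarily the exact examples from the video.

```python
from itertools import cycle, repeat, starmap

# cycle loops over its input forever, so take a fixed number of items.
switch = cycle(("On", "Off"))
states = [next(switch) for _ in range(5)]

# repeat with times= is finite.
twos = list(repeat(2, times=3))

# repeat(2) is infinite, but map stops when range(5) is exhausted.
squares = list(map(pow, range(5), repeat(2)))

# starmap unpacks each tuple into the function's positional arguments.
same_squares = list(starmap(pow, [(0, 2), (1, 2), (2, 2)]))
```

`map` takes one iterable per argument, while `starmap` takes a single iterable of already-paired argument tuples; which one fits depends on how the data arrives.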

For combinatorics, itertools.combinations generates groupings from an iterable where order doesn’t matter, and itertools.permutations generates groupings where order does matter. Neither repeats elements from the original input, so a separate tool is needed when repeats are allowed. itertools.product provides the Cartesian product, with the repeat argument setting how many times the input is combined with itself; this can model things like all possible 4-digit codes from a set of digits. For combinations with repetition but without order, itertools.combinations_with_replacement is the matching counterpart.
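A small sketch contrasting the three selection tools on a three-letter input. The input string is an arbitrary example.

```python
from itertools import combinations, combinations_with_replacement, permutations

letters = "ABC"

# Order does not matter: ('A', 'B') appears, ('B', 'A') does not.
combos = list(combinations(letters, 2))

# Order matters: both ('A', 'B') and ('B', 'A') appear.
perms = list(permutations(letters, 2))

# Order does not matter, but an element may be picked more than once.
with_repeats = list(combinations_with_replacement(letters, 2))
```

For this input, `combos` has 3 entries, `perms` has 6, and `with_repeats` has 6 (the 3 combinations plus AA, BB, and CC).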

Finally, the tutorial covers practical data-manipulation helpers: chain to concatenate multiple iterables without building a combined list; islice to slice iterators like list slicing while keeping memory usage low; compress to filter items based on a parallel boolean selector iterable; dropwhile to skip items while a predicate holds and takewhile to collect items until it first fails; accumulate to produce running totals (or running products via operator.mul); and groupby to group items by a key function. groupby requires the input to be sorted by the grouping key, otherwise items with the same key may end up in separate groups. The tutorial closes with itertools.tee, which replicates an iterator into multiple independent iterators, with the important warning that the original iterator should not be used after teeing: advancing it directly would consume items that the copies then never see.
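Several of these helpers can be sketched in a few lines each. All sample values below are made up for illustration.

```python
import operator
from itertools import accumulate, compress, dropwhile, takewhile, tee

letters = ["a", "b", "c", "d"]
selectors = [True, False, True, False]

# compress keeps items whose matching selector is truthy.
picked = list(compress(letters, selectors))

nums = [1, 2, 3, 4, 1]

# takewhile collects items until the predicate first fails.
head = list(takewhile(lambda n: n < 3, nums))

# dropwhile skips items until the predicate first fails, then yields the rest,
# including the trailing 1 even though it is less than 3.
tail = list(dropwhile(lambda n: n < 3, nums))

# accumulate yields running totals by default, or running products with mul.
totals = list(accumulate(nums))
products = list(accumulate(nums, operator.mul))

# tee splits one iterator into independent copies; after this call,
# use only the copies, not `source`.
source = iter(nums)
copy_a, copy_b = tee(source)
list_a = list(copy_a)
list_b = list(copy_b)
```

Both takewhile and dropwhile stop testing the predicate after its first failure, which is why the final `1` survives in `tail`.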

Cornell Notes

Itertools provides memory-efficient tools for iterators, including generators that can run indefinitely and helpers that terminate predictably. itertools.count creates an infinite counter that can be paired with finite data via zip to assign indices without knowing the data length. zip_longest extends pairing until the longest iterable ends, filling missing values with None by default. The tutorial also demonstrates cycle and repeat for infinite repetition, combinations/permutations/product for combinatorics (with and without repetition), and utilities like chain, islice, compress, dropwhile/takewhile, accumulate, and groupby for common data-processing patterns. groupby requires the input to be sorted by the key to form correct groups.

Why is itertools.count considered both powerful and risky?

itertools.count returns an infinite iterator: it keeps producing values (default 0, 1, 2, …) and never stops. That makes it easy to create an infinite loop if values are consumed in a for-loop without a stopping condition. The tutorial shows the safe pattern: pull only what’s needed using next(counter) to retrieve one value at a time, or pair it with a finite iterable using zip so consumption stops when the finite side ends.

How does zip differ from itertools.zip_longest, and when does that matter?

The built-in zip stops when the shortest iterable is exhausted. itertools.zip_longest continues until the longest iterable ends, pairing remaining items from the longer iterable with a fill value (None by default). This matters when one iterable is shorter and you still need aligned output for the rest, such as pairing a finite list of data with a longer (but still finite) sequence while keeping placeholders for missing elements.

What roles do cycle and repeat play among itertools’ infinite iterators?

itertools.cycle takes an iterable and loops through its values repeatedly forever, restarting from the beginning after reaching the end. itertools.repeat yields the same value indefinitely; it can also stop after a fixed number of times using times=. The tutorial highlights repeat’s usefulness as a constant stream feeding functions like map or zip, where one side needs a repeated constant.

When should you use combinations vs permutations vs product?

itertools.combinations generates groupings where order doesn’t matter (e.g., choosing 2 from ABCD yields AB but not BA). itertools.permutations generates groupings where order matters (AB and BA both appear). itertools.product allows repeats and builds Cartesian-product-style arrangements; with repeat=4 over digits 0, 1, 2, 3, it generates all 4-length codes, including repeats like 0 0 0 0.
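The 4-digit code case can be sketched directly with product. Materializing the result into a list is done here only to inspect it; iterating lazily works the same way.

```python
from itertools import product

digits = [0, 1, 2, 3]

# Every 4-length arrangement of the digits, repeats allowed.
codes = list(product(digits, repeat=4))

total = len(codes)        # 4 choices per position, 4 positions
first_code = codes[0]
last_code = codes[-1]
```

With 4 choices in each of 4 positions there are 4**4 = 256 codes, starting at `(0, 0, 0, 0)` and ending at `(3, 3, 3, 3)`.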

What’s special about itertools.groupby compared with SQL-style grouping?

itertools.groupby groups consecutive items based on a key function and expects the input iterable to be sorted by that key. If items with the same key appear in multiple separated runs, groupby will produce multiple groups for that key. The tutorial demonstrates grouping people by state and notes that sorting is required for correct aggregation.
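The effect of sorting can be seen in a short sketch. The `people` records and the `get_state` helper are hypothetical, chosen to mirror the grouping-by-state idea.

```python
from itertools import groupby

# Hypothetical records; NY appears in two separated runs.
people = [
    {"name": "Ann", "state": "NY"},
    {"name": "Bob", "state": "CA"},
    {"name": "Cid", "state": "NY"},
]

def get_state(person):
    """Key function: group people by their state."""
    return person["state"]

# Without sorting, NY shows up as two separate groups.
unsorted_keys = [key for key, _ in groupby(people, get_state)]

# Sorting by the same key first puts equal keys next to each other.
sorted_people = sorted(people, key=get_state)
grouped = {
    key: [person["name"] for person in group]
    for key, group in groupby(sorted_people, get_state)
}
```

Each group yielded by groupby is itself an iterator tied to the underlying data, so it should be consumed (as in the list comprehension above) before moving on to the next key.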

How do islice and chain help keep memory usage low?

chain concatenates multiple iterables lazily: it iterates through the first iterable, then the next, without creating a combined list in memory. islice slices an iterator like list slicing (start/stop/step) but without converting the entire iterator to a list. The tutorial uses islice to read only the first few lines of a log file by treating the file object as an iterator.
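A minimal sketch of both helpers follows. To stay self-contained it uses an in-memory `io.StringIO` object as a stand-in for a real log file; that substitution and the sample lines are assumptions for illustration only.

```python
import io
from itertools import chain, islice

# chain walks each iterable in turn without building a combined list first.
combined = list(chain([1, 2], ("a", "b"), range(2)))

# Stand-in for an open log file; a real file object iterates by line the same way.
log = io.StringIO("line 1\nline 2\nline 3\nline 4\n")

# Read only the first two lines; the rest are never touched.
first_two = [line.rstrip("\n") for line in islice(log, 2)]

# islice also accepts start, stop, and step, like list slicing.
evens = list(islice(range(10), 0, 10, 2))
```

Unlike list slicing, islice does not support negative indices, since an iterator has no known length to count back from.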

Review Questions

  1. Which itertools functions in the tutorial can produce infinite sequences, and what stopping mechanism is used to keep code safe?
  2. Explain why groupby requires sorted input and what symptom you would see if the data weren’t sorted by the key.
  3. Give one scenario where zip_longest is preferable to zip, and describe what placeholder values appear.

Key Points

  1. itertools.count generates an infinite sequence, so pair it with a finite iterable (e.g., via zip) or consume it carefully with next() to avoid infinite loops.
  2. zip_longest keeps pairing until the longest iterable ends, using None as a default fill value when the shorter iterable runs out.
  3. cycle and repeat provide infinite repetition patterns: cycle repeats an iterable’s sequence, while repeat repeats a single constant value (optionally for a fixed times).
  4. combinations and permutations differ by whether order matters, while product (and combinations_with_replacement) handle cases where repeats are allowed.
  5. chain concatenates iterables lazily, preventing the memory cost of building a combined list.
  6. islice enables iterator slicing (start/stop/step) without materializing the full iterator, which is useful for large files and logs.
  7. groupby groups by a key function but requires the input to be sorted by that key to form correct groups.

Highlights

itertools.count can run forever, but zip can safely stop consumption when the finite iterable ends—turning an infinite generator into practical indexed pairing.
zip_longest continues past the end of the shorter iterable and fills gaps with None, enabling aligned output even when lengths differ.
repeat is often used to supply a constant stream into map or zip, while starmap handles functions that expect tuple-packed arguments.
product is the go-to tool when repeats are allowed (e.g., generating all 4-digit codes from digits 0–3).
groupby only groups adjacent items and therefore depends on the input being sorted by the grouping key.

Topics

  • Itertools Module
  • Infinite Iterators
  • Combinatorics
  • Iterator Slicing
  • Grouping Iterables
