
Python Tutorial: File Objects - Reading and Writing to Files

Corey Schafer · 5 min read

Based on Corey Schafer's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Use with open(...) to guarantee files close automatically, preventing file-descriptor leaks and errors.

Briefing

Python file objects hinge on two practical choices: opening files in the right mode and managing their lifecycle safely. Using open() without cleanup can leave files open and eventually exhaust the system’s file descriptors, so the tutorial pushes the with context manager as the default pattern. Inside that with block, the file object stays usable; once the block ends, the file is automatically closed—even if the variable still exists—so attempts to read afterward raise an “I/O operation on a closed file” error.
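A minimal sketch of the pattern (the filename is illustrative; the file is created first so the example is self-contained):

```python
# Create a small sample file so the example can run on its own.
with open("test.txt", "w") as f:
    f.write("hello\nworld\n")

# Open with a context manager; the file closes automatically when
# the block exits, even if an exception is raised inside it.
with open("test.txt") as f:
    contents = f.read()

print(f.closed)  # True: the variable still exists, but the file is closed
# f.read() here would raise ValueError: I/O operation on a closed file
```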

Once a file is open for reading (mode “r”), the tutorial walks through several ways to pull data out. f.read() loads the entire contents into memory, which is fine for small files but risky for very large ones. f.readlines() returns a list of lines, preserving newline characters, while f.readline() advances one line at a time. For large files, iterating directly—“for line in f”—streams line by line without loading everything at once. When more control is needed, f.read(size) reads a fixed number of characters and advances the file pointer accordingly. The pointer position can be inspected via f.tell(), and repositioned with f.seek(). A chunked read loop uses f.read(size) repeatedly until it returns an empty string at end-of-file, avoiding both memory blowups and hardcoded assumptions about file length.
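The reading methods can be sketched as follows (filename and contents are illustrative):

```python
# Sample data so the example is self-contained.
with open("test.txt", "w") as f:
    f.write("line one\nline two\nline three\n")

with open("test.txt") as f:
    first = f.readline()     # one line per call; the pointer advances
    position = f.tell()      # current position in the file
    f.seek(0)                # jump back to the start
    everything = f.read()    # entire file in memory (fine for small files)

# Streaming alternative: iterate line by line, best for large files.
lines = []
with open("test.txt") as f:
    for line in f:
        lines.append(line)
```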

Writing follows the same mode-driven logic. Opening a file in read mode prevents writing and triggers an error (“not writable”), so writing requires mode “w” (overwrite/create) or “a” (append). After writing, the file pointer advances just like reading, so consecutive writes append at the current position. f.seek() can reposition during writing too, but it can be confusing: seeking to the start and writing again overwrites only the portion covered by the new content, leaving any remaining bytes from the earlier write intact.
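The partial-overwrite behavior of seeking during writes can be seen in a short sketch (filename illustrative):

```python
with open("test2.txt", "w") as f:   # "w" creates or truncates the file
    f.write("Test")
    f.write("Test")    # pointer has advanced: file now holds "TestTest"
    f.seek(0)          # reposition to the start
    f.write("R")       # overwrites one character only: "RestTest"

with open("test2.txt") as f:
    result = f.read()

print(result)  # RestTest
```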

To combine reading and writing, the tutorial demonstrates copying files using two context managers at once: open the source in read mode and open the destination in write mode, then write each line from the source to the target. The same approach fails for images when treated as text, producing a UnicodeDecodeError. The fix is binary mode: add “b” to the read/write modes (e.g., “rb” and “wb”) so the code transfers bytes rather than decoding text. Finally, it shows chunk-based copying for large binary files: read a fixed-size block (e.g., 4096 bytes) in a loop and write each chunk until the read returns an empty result, producing an exact copy without excessive memory use.
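The two-context-manager copy can be sketched like this for a text file (filenames are illustrative):

```python
# Sample source file so the example runs on its own.
with open("source.txt", "w") as f:
    f.write("line one\nline two\n")

# Source open for reading, destination for writing, at the same time.
with open("source.txt") as rf:
    with open("source_copy.txt", "w") as wf:
        for line in rf:
            wf.write(line)
```

For an image, the same structure works once the modes become "rb" and "wb".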

Overall, the core takeaway is that reliable file handling in Python comes down to correct modes, context-managed cleanup, and choosing the right read/write strategy—line iteration for text, binary mode for images, and chunked loops for large data.

Cornell Notes

File handling in Python depends on opening files with the correct mode and using context managers to ensure automatic cleanup. A file opened with open() must be closed; otherwise, open handles can accumulate and trigger file-descriptor errors. Inside a with open(...) block, reading methods like read(), readlines(), readline(), and line iteration work differently—read() loads everything, while iteration streams line by line. For large files, f.read(size) plus a loop reads fixed-size chunks until end-of-file (empty string). Writing requires write mode (“w”) or append mode (“a”); writing in read mode fails. For images and other non-text data, use binary mode (“rb”/“wb”) and copy in chunks to avoid decoding errors and memory issues.

Why does code that reads from a file need a context manager, even if it “works” without one?

Opening a file without closing it can leak file handles. Over time, that can exhaust the system’s allowed file descriptors and cause runtime errors. The with open(...) pattern automatically closes the file when the block ends—even if an exception occurs. Even if the variable referencing the file object still exists after the block, the file is closed, so reading afterward raises an “I/O operation on a closed file” error.
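The with block is roughly equivalent to the manual try/finally pattern it replaces (filename illustrative):

```python
# What the context manager does for you, written out by hand.
f = open("test.txt", "w")
try:
    f.write("data\n")
finally:
    f.close()      # runs even if write() raised an exception

# Equivalent, and the recommended form:
with open("test.txt", "w") as f:
    f.write("data\n")
```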

When should f.read(), f.readlines(), f.readline(), or iterating “for line in f” be used?

f.read() loads the entire file into memory, which is practical for small files but risky for large ones. f.readlines() returns a list of lines (including newline characters). f.readline() reads one line per call and advances the file position each time. Iterating with “for line in f” is typically best for large text files because it streams one line at a time without loading the whole file at once.

How does chunked reading with f.read(size) prevent memory problems?

f.read(size) reads only a specified number of characters and advances the file pointer. A loop can repeatedly read chunks until f.read(size) returns an empty string at end-of-file. This approach avoids loading the entire file into memory while still allowing controlled processing. The tutorial also uses f.tell() to show the current position and f.seek() to move the pointer when needed.
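A sketch of the chunked loop (the chunk size and filename are illustrative):

```python
# Sample data: 50 characters, so the example is self-contained.
with open("big.txt", "w") as f:
    f.write("abcdefghij" * 5)

size_to_read = 10
chunks = []
with open("big.txt") as f:
    chunk = f.read(size_to_read)
    while len(chunk) > 0:        # read() returns "" at end-of-file
        chunks.append(chunk)
        chunk = f.read(size_to_read)

print(len(chunks))  # 5 chunks of 10 characters each
```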

What changes when switching from reading to writing files?

Writing requires opening the file in a writable mode. Opening in read mode (“r”) makes f.write(...) fail with a “not writable” error. Mode “w” creates the file if it doesn’t exist and overwrites it if it does; mode “a” appends instead. The file pointer advances after each write, and f.seek() can reposition it—though seeking during writing can overwrite only part of the existing content, leaving the rest unchanged.
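The "not writable" failure and append mode can be sketched as follows (filename illustrative; the exact exception type is io.UnsupportedOperation in CPython):

```python
import io

with open("notes.txt", "w") as f:
    f.write("first\n")

with open("notes.txt") as f:       # "r" is the default mode
    try:
        f.write("oops")
    except io.UnsupportedOperation as e:
        error_message = str(e)     # "not writable"

with open("notes.txt", "a") as f:  # "a" appends rather than truncating
    f.write("second\n")
```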

Why does copying an image fail in text mode, and how is binary mode different?

Text mode tries to decode bytes into characters, which breaks for image data and can trigger UnicodeDecodeError (e.g., “can’t decode byte ...”). Binary mode (“rb” and “wb”) treats the file as raw bytes, so the copy transfers data exactly without decoding. The tutorial then copies the image either line-by-line in binary mode or, more robustly, in fixed-size byte chunks.

How does chunked binary copying work for large files?

The code reads a fixed-size byte block from the source (e.g., chunk size 4096) using rf.read(chunk_size), then writes that block to the destination. It repeats in a while loop until the read returns an empty bytes object, indicating end-of-file. This ensures an exact copy while keeping memory usage bounded.
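A self-contained sketch of the chunked binary copy (filenames and chunk size are illustrative; sample bytes stand in for an image):

```python
# Sample binary data so the example runs without a real image file.
data = bytes(range(256)) * 10
with open("source.bin", "wb") as f:
    f.write(data)

chunk_size = 4096
with open("source.bin", "rb") as rf:
    with open("source_copy.bin", "wb") as wf:
        chunk = rf.read(chunk_size)
        while len(chunk) > 0:      # rf.read returns b"" at end-of-file
            wf.write(chunk)
            chunk = rf.read(chunk_size)
```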

Review Questions

  1. What happens if you try to read from a file object after exiting its with open(...) block?
  2. Compare f.read() and iterating “for line in f” in terms of memory usage and suitability for large files.
  3. When copying a JPEG, why is binary mode required, and what modes should be used for reading and writing?

Key Points

  1. Use with open(...) to guarantee files close automatically, preventing file-descriptor leaks and errors.

  2. Open files with the correct mode: “r” for reading, “w” for overwrite/create writing, “a” for append, and “rb”/“wb” for binary data like images.

  3. Choose reading methods based on size: f.read() for small files, line iteration for large text files, and f.read(size) loops for controlled chunk processing.

  4. Track and control the file pointer with f.tell() and f.seek(); seeking changes where the next read/write occurs.

  5. Writing in read mode fails; writing advances the file pointer, so consecutive writes continue where the last one left off.

  6. Copy files by opening the source for reading and the destination for writing, then transferring data (lines for text, bytes for binary).

  7. For large binary files, copy in fixed-size chunks in a loop until the read returns empty, ensuring exact output without excessive memory use.

Highlights

After leaving a with open(...) block, the file object may still exist, but it’s closed—any read attempt triggers an “I/O operation on a closed file” error.
f.read(size) advances through the file and returns an empty string at end-of-file, enabling safe chunked loops for large text files.
Attempting to copy a JPEG in text mode causes a UnicodeDecodeError; binary mode (“rb”/“wb”) fixes it by transferring raw bytes.
Chunked binary copying (read N bytes, write N bytes until empty) produces exact copies while keeping memory usage stable.
