Python Tutorial: Pathlib - The Modern Way to Handle File Paths
Based on Corey Schafer's video on YouTube.
pathlib models file paths as objects, replacing fragile string-based os.path patterns with clearer, chainable operations.
Briefing
pathlib is positioned as the successor to os.path-style path handling because it represents file paths as objects rather than plain strings, which makes cross-platform code easier to write and harder to break. Introduced in Python 3.4, pathlib provides chainable operations (such as moving from a file to its parent directory) and attributes (such as suffix and stem). That shift matters most in real projects where path logic quickly becomes hard to read, especially when developers need absolute locations, directory traversal, or consistent behavior across operating systems.
A key contrast comes from how developers compute “base directories” in larger codebases such as Django settings. Older patterns using os.path often require reading expressions from right to left: grab the absolute path of __file__, then repeatedly take parent directories to reach the project root. The pathlib approach compresses the same intent into a left-to-right chain: resolve the absolute path of the current file, then take parent directories as needed. The result is shorter code that communicates structure more directly—an advantage that helps teams maintain and review path-heavy configuration.
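The contrast can be sketched as follows. This is a minimal illustration, not the exact code from any particular project: the function names and the example path are hypothetical, and a real settings module would typically pass `__file__` instead.

```python
import os
from pathlib import Path


def base_dir_os(file_path):
    """os.path style: read inside-out (abspath first, then dirname twice)."""
    return os.path.dirname(os.path.dirname(os.path.abspath(file_path)))


def base_dir_pathlib(file_path):
    """pathlib style: read left to right (resolve, then parent twice)."""
    return Path(file_path).resolve().parent.parent


# Hypothetical location of a settings module, relative to the working directory.
example = os.path.join("project", "config", "settings.py")

print(base_dir_os(example))       # a string ending in "project"
print(base_dir_pathlib(example))  # a Path ending in "project"
```

One difference worth keeping in mind: os.path.abspath() does not follow symlinks, while Path.resolve() does, so the two results can differ when the working directory sits behind a symlink.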
The tutorial then walks through practical pathlib usage. Creating Path objects is straightforward: Path() defaults to the current working directory, and Path('directory') or Path('file.txt') creates a relative path. Path objects expose common metadata without manual string splitting: .name returns the final component, .suffix gives the extension (an empty string when there is none, as with most directories), and .stem returns the filename without the extension. Paths can be combined using the / operator (e.g., directory_path / 'new_file.txt'), and the same behavior is available via joinpath for those who prefer method calls. Importantly, these path objects can represent files that don’t exist yet; existence checks use .exists().
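A short sketch of these basics, assuming hypothetical names (`data`, `notes.txt`) that need not exist on disk:

```python
from pathlib import Path

here = Path()                                # "." : the working directory, as a relative path
data_dir = Path("data")                      # hypothetical directory name
notes = data_dir / "notes.txt"               # join with the / operator
same_notes = data_dir.joinpath("notes.txt")  # equivalent method call

print(notes.name)            # notes.txt
print(notes.suffix)          # .txt
print(notes.stem)            # notes
print(repr(data_dir.suffix)) # '' because "data" has no extension
print(notes == same_notes)   # True
print(notes.exists())        # False unless data/notes.txt really exists
```

Building a Path never touches the filesystem, which is why `notes` can be created and inspected before the file is written.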
Absolute vs. relative handling is where pathlib’s ergonomics stand out. The .parent attribute supports chaining to move up multiple levels, while .absolute() and .resolve() differ in how they treat relative references like '..': .absolute() only prepends the working directory and leaves '..' in place, whereas .resolve() collapses such references and follows symlinks. The tutorial emphasizes that .resolve() is usually the better choice because it produces the intended absolute location. For user home directories, behavior depends on how the path is constructed: a tilde (~) inside a string is not expanded by .resolve(), so the correct approach is to call .expanduser() on a path that contains '~', or to build the path from Path.home().
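The differences can be seen in a small sketch. The path names here (`docs`, `README.md`, `projects`) are hypothetical and do not need to exist; the example also assumes a home directory can be determined for the current user.

```python
from pathlib import Path

# Hypothetical relative path that climbs back out of "docs" via "..".
relative = Path("docs") / ".." / "README.md"

as_absolute = relative.absolute()  # prepends the working directory, keeps ".."
as_resolved = relative.resolve()   # collapses ".." and follows any symlinks

print(as_absolute)
print(as_resolved)

# .parent can be chained to climb several levels.
two_up = Path("a/b/c.txt").parent.parent
print(two_up)  # a

# "~" is just an ordinary path component until it is expanded.
tilde = Path("~") / "projects"
print(tilde.resolve())           # still contains a literal "~" segment
print(tilde.expanduser())        # home directory joined with "projects"
print(Path.home() / "projects")  # same location, built from scratch
```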
For searching and discovery, pathlib offers .glob() for pattern matching within a single directory and .rglob(pattern), which is equivalent to .glob('**/pattern'), to include subdirectories. Case sensitivity can be controlled with the case_sensitive keyword argument (available from Python 3.12), so mixed-case filenames still match when desired. pathlib objects also integrate cleanly with file I/O: they can be passed directly to open(), and they provide their own .open() method.
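The search and file-opening behavior described above can be sketched with a throwaway directory. The file names are made up for illustration, and the version check is an assumption added here so the example also runs on interpreters older than Python 3.12, where the `case_sensitive` argument is not available.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for rel in ("a.txt", "B.TXT", "sub/c.txt"):
        (root / rel).write_text("hello")

    # Matches only in the top-level directory.
    top_level = sorted(p.name for p in root.glob("*.txt"))

    # Matches in the directory and all of its subdirectories.
    recursive = sorted(p.name for p in root.rglob("*.txt"))

    if sys.version_info >= (3, 12):
        # Explicit case-insensitive matching (Python 3.12+).
        any_case = sorted(p.name for p in root.rglob("*.txt", case_sensitive=False))
    else:
        # Fallback for older versions: compare lowercased suffixes by hand.
        any_case = sorted(p.name for p in root.rglob("*") if p.suffix.lower() == ".txt")

    # A Path can be handed to the built-in open() or opened via its own method.
    with open(root / "a.txt") as f:
        via_builtin = f.read()
    with (root / "a.txt").open() as f:
        via_method = f.read()

print(top_level)
print(recursive)
print(any_case)
```

Default case sensitivity follows the platform's conventions, so whether `B.TXT` shows up in the first two lists can vary between operating systems.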
Finally, pathlib supports basic filesystem operations: creating directories with .mkdir() (optionally with parents=True), removing empty directories with .rmdir(), creating files with .touch(), and deleting files with .unlink(). Renaming is handled by .rename() or .replace(); the latter overwrites an existing target on every platform, whereas .rename() raises an error on Windows if the target already exists. The tutorial closes by outlining when not to use pathlib: the os module remains appropriate for environment variables and other OS-level concerns, while shutil is better for tasks like copying files and deleting non-empty directories. Overall, the takeaway is a pragmatic migration path: use pathlib for path manipulation, keep os for system queries, and reach for shutil when operations go beyond paths into full file management.
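A minimal sketch of that division of labor, run inside a temporary directory so nothing permanent is created. All file and directory names are hypothetical, and `EDITOR` is just an example environment variable.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # pathlib: create nested directories and an empty file.
    nested = root / "a" / "b"
    nested.mkdir(parents=True, exist_ok=True)
    draft = nested / "draft.txt"
    draft.touch()

    # pathlib: rename, overwriting the target if it already exists.
    final = nested / "final.txt"
    draft.replace(final)
    renamed_ok = final.exists() and not draft.exists()

    # pathlib: delete the file, then the now-empty directory.
    final.unlink()
    nested.rmdir()
    removed_ok = not nested.exists()

    # shutil: copy a file and remove a directory that still has content.
    keep = root / "a" / "keep.txt"
    keep.write_text("data")
    shutil.copy(keep, root / "copy.txt")
    copied_text = (root / "copy.txt").read_text()
    shutil.rmtree(root / "a")
    tree_gone = not (root / "a").exists()

# os: environment and other system-level queries.
editor = os.environ.get("EDITOR", "not set")
print(renamed_ok, removed_ok, copied_text, tree_gone, editor)
```

Calling .rmdir() on a directory that still contains files raises an error, which is why the non-empty case is handed to shutil.rmtree().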
Cornell Notes
Pathlib replaces string-based os.path path handling with object-based paths that are easier to read, safer to compose, and more consistent across platforms. It provides intuitive attributes like .name, .suffix, and .stem, plus chainable navigation via .parent and absolute path generation via .resolve. The / operator (and joinpath) makes building paths concise, while .exists() lets code check whether a path actually exists even if the path object was created for a file that isn’t there yet. For searching, .glob() and recursive glob patterns find matching files and folders, and pathlib paths can be passed directly to open() without converting to strings. Basic filesystem operations like mkdir, touch, unlink, and rmdir are supported, but shutil is recommended for non-empty directory removal and richer copy/move workflows.
Why does pathlib often produce clearer code than os.path when computing directories like a project “base directory”?
What are the most useful pathlib attributes for extracting information from a path?
How do you build a path to a new file inside an existing directory using pathlib?
What’s the practical difference between .absolute() and .resolve() when dealing with '..' and symlinks?
How should code handle a user home directory when using pathlib?
When should pathlib be avoided in favor of os or shutil?
Review Questions
- How would you rewrite an os.path-based “two levels up from __file__” base directory calculation using pathlib chaining?
- When would you choose .expanduser() over Path.home(), and why?
- What combination of methods would you use to recursively find all .json files in a directory while ignoring case?
Key Points
1. pathlib models file paths as objects, replacing fragile string-based os.path patterns with clearer, chainable operations.
2. Use .name, .suffix, and .stem to extract path components without manual string splitting.
3. Build paths with the / operator (or joinpath) and use .exists() to check whether a path actually exists.
4. Prefer .resolve() over .absolute() for correct absolute paths, especially when '..' and symlinks are involved.
5. Handle home directories correctly: use .expanduser() when parsing '~' from a string, and Path.home() when constructing paths from scratch.
6. Use .glob() for pattern matching in one directory and .rglob() or a '**/' pattern to search subdirectories; control case sensitivity when needed.
7. Reach for the os module for environment and system queries, and shutil for tasks like deleting non-empty directories or copying files.