Chapter 16: Useful Standard Library Modules

Python ships with a massive standard library — "batteries included" is the official motto. You don't need to install anything for these; they're part of every Python installation.

This chapter is a practical tour of the modules you'll reach for most often. By the end you'll know what each one does, when to use it, and the key functions to remember. Think of this chapter as a toolbox reference — you don't need to memorize everything, but knowing what tools exist means you'll never reinvent something Python already built.

`os` and `os.path` — The File System

The os module lets you interact with the operating system: create and delete files and directories, read environment variables, navigate paths, and run system commands.

import os

# Current working directory
print(os.getcwd())          # C:\Users\Alice\projects

# List files in a directory
print(os.listdir("."))       # ['main.py', 'data', 'README.md']
print(os.listdir("data"))   # ['sales.csv', 'users.csv']

# Create a directory (raises FileExistsError if it exists)
os.mkdir("output")

# Create nested directories
os.makedirs("output/reports/2026", exist_ok=True)   # no error if exists

# Delete a file
os.remove("temp.txt")

# Delete an empty directory
os.rmdir("old_folder")

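# Delete a directory and all its contents (use carefully!)
# A different folder is used here because "old_folder" was already
# removed by os.rmdir() above; rmtree raises an error if the path is gone
import shutil
shutil.rmtree("old_project")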

# Rename or move
os.rename("old_name.txt", "new_name.txt")

# Check if a path exists
print(os.path.exists("data.csv"))    # True or False

# Is it a file? A directory?
print(os.path.isfile("data.csv"))    # True
print(os.path.isdir("output"))       # True

# Join paths safely (handles / vs \ automatically)
path = os.path.join("data", "2026", "sales.csv")
print(path)   # data/2026/sales.csv  (or data\2026\sales.csv on Windows)

# Split a path into directory and filename
folder, filename = os.path.split("/home/alice/data/sales.csv")
print(folder)    # /home/alice/data
print(filename)  # sales.csv

# Get just the extension
name, ext = os.path.splitext("report.pdf")
print(name)   # report
print(ext)    # .pdf

# Absolute path
print(os.path.abspath("data.csv"))   # full path from root

Environment variables

import os

# Read an environment variable
api_key = os.environ.get("API_KEY", "not_set")
print(api_key)

# Set one for the current process
os.environ["MY_VAR"] = "hello"

# All environment variables
for key, value in os.environ.items():
    print(f"{key}={value}")

Environment variables are the standard way to keep secrets (API keys, passwords) out of your code. Never hardcode credentials — read them from the environment.
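
As a rough sketch of that advice, the helper below fails fast when a required variable is missing instead of silently falling back to a placeholder. The function name `require_env` and the variable names `DEMO_API_KEY` and `DEMO_MISSING_KEY` are invented for this illustration; they are not part of the os module.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset.

    `require_env` is a hypothetical helper for this example, not an os function.
    """
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Simulate a variable that would normally be set outside the program
# (DEMO_API_KEY is an invented name for this demo).
os.environ["DEMO_API_KEY"] = "abc123"
print(require_env("DEMO_API_KEY"))   # abc123

# A variable that is not set triggers a clear error instead of a silent default.
os.environ.pop("DEMO_MISSING_KEY", None)
try:
    require_env("DEMO_MISSING_KEY")
except RuntimeError as err:
    print(err)   # Missing required environment variable: DEMO_MISSING_KEY
```

Failing loudly at startup is usually easier to debug than a program that runs with "not_set" as its API key and breaks somewhere later.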

`pathlib` — Modern File Paths

pathlib is the modern, object-oriented replacement for os.path. Use it for all new code.

from pathlib import Path

# Create a path object
p = Path("data/sales.csv")

print(p.name)       # sales.csv
print(p.stem)       # sales
print(p.suffix)     # .csv
print(p.parent)     # data
print(p.is_file())  # True/False
print(p.exists())   # True/False

# Join paths with /
base = Path("data")
full = base / "2026" / "sales.csv"
print(full)   # data/2026/sales.csv

# Read and write files
text = full.read_text(encoding="utf-8")
full.write_text("new content", encoding="utf-8")

# Create directories
Path("output/reports").mkdir(parents=True, exist_ok=True)

# List files matching a pattern
for csv_file in Path("data").glob("*.csv"):
    print(csv_file)

# Recursive glob
for py_file in Path(".").rglob("*.py"):
    print(py_file)

# Absolute path
print(p.resolve())   # full absolute path

# Home directory
print(Path.home())   # /home/alice or C:\Users\Alice

Why pathlib over os.path? You get the / operator for joining, readable attributes (.name, .stem, .suffix), and built-in methods such as read_text(), write_text(), and glob(). Most path tasks need one Path object instead of several separate function calls.
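
To make the comparison concrete, here is a small side-by-side sketch that performs the same three tasks both ways. It only builds and inspects path text, so no files need to exist. Note that `os.path.basename` was not shown earlier in this chapter; it is the os.path counterpart of `.name`.

```python
import os.path
from pathlib import Path

# The same path pieces, handled both ways.
old_style = os.path.join("data", "2026", "sales.csv")
new_style = Path("data") / "2026" / "sales.csv"

# Both produce the same path text on the current platform.
print(str(new_style) == old_style)   # True

# Getting the extension: a function call vs. an attribute.
old_ext = os.path.splitext(old_style)[1]
new_ext = new_style.suffix
print(old_ext, new_ext)              # .csv .csv

# File name without the directory part.
old_name = os.path.basename(old_style)
new_name = new_style.name
print(old_name, new_name)            # sales.csv sales.csv
```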

`sys` — The Python Interpreter

import sys

# Python version
print(sys.version)        # 3.12.4 (main, Jun 6 2024, ...)
print(sys.version_info)   # sys.version_info(major=3, minor=12, ...)

# Minimum version check
if sys.version_info < (3, 10):
    print("This script requires Python 3.10 or higher.")
    sys.exit(1)

# Command-line arguments
# If you run: python script.py hello world
print(sys.argv)        # ['script.py', 'hello', 'world']
print(sys.argv[0])     # script.py  (the script name)
print(sys.argv[1:])    # ['hello', 'world']  (the arguments)

# The module search path
print(sys.path)

# Exit the program with a status code
# 0 = success, anything else = error
# (commented out here, because nothing after sys.exit() would run)
# sys.exit(0)

# Standard streams
sys.stdout.write("Hello\n")   # same as print()
sys.stderr.write("Error!\n")  # writes to stderr

# Size of an object in bytes
# (typical values for 64-bit CPython; they vary by version and platform)
print(sys.getsizeof([]))        # 56
print(sys.getsizeof([1,2,3]))   # 88

# Platform
print(sys.platform)   # 'win32', 'darwin', 'linux'

A simple command-line argument example:

import sys

if len(sys.argv) != 3:
    print(f"Usage: python {sys.argv[0]} <name> <age>")
    sys.exit(1)

name = sys.argv[1]
age = int(sys.argv[2])
print(f"Hello, {name}! You are {age} years old.")

Run it: python greet.py Alice 30 -> Hello, Alice! You are 30 years old.
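
The script above crashes with a traceback if the age is not a number. One possible way to tighten that up is sketched below: the checks move into a function that takes the argument list as a parameter, which also makes it easy to try out without a real command line. The name `parse_greet_args` is a hypothetical helper for this example.

```python
def parse_greet_args(argv):
    """Validate [script, name, age] and return (name, age).

    Hypothetical helper for this example. Raises ValueError with a
    readable message when the arguments are unusable.
    """
    if len(argv) != 3:
        raise ValueError(f"Usage: python {argv[0]} <name> <age>")
    name, age_text = argv[1], argv[2]
    try:
        age = int(age_text)
    except ValueError:
        raise ValueError(f"Age must be a whole number, got {age_text!r}") from None
    return name, age


# Plain lists stand in for real command lines here.
# In a script you would call parse_greet_args(sys.argv) instead.
print(parse_greet_args(["greet.py", "Alice", "30"]))   # ('Alice', 30)

try:
    parse_greet_args(["greet.py", "Alice", "thirty"])
except ValueError as err:
    print(err)   # Age must be a whole number, got 'thirty'
```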

`datetime` — Dates and Times

from datetime import datetime, date, time, timedelta

# Current date and time
now = datetime.now()
print(now)                          # 2026-03-09 14:30:45.123456
print(now.year)                     # 2026
print(now.month)                    # 3
print(now.day)                      # 9
print(now.hour)                     # 14
print(now.minute)                   # 30
print(now.second)                   # 45

# Create a specific datetime
birthday = datetime(1990, 7, 15, 8, 30)
print(birthday)   # 1990-07-15 08:30:00

# Just a date
today = date.today()
print(today)         # 2026-03-09
print(today.weekday())  # 0=Monday ... 6=Sunday

# Formatting — strftime (string from time)
print(now.strftime("%Y-%m-%d"))             # 2026-03-09
print(now.strftime("%d/%m/%Y %H:%M"))       # 09/03/2026 14:30
print(now.strftime("%B %d, %Y"))            # March 09, 2026
print(now.strftime("%A, %B %d"))            # Monday, March 09

# Parsing — strptime (string parse time)
date_str = "2026-03-09"
parsed = datetime.strptime(date_str, "%Y-%m-%d")
print(parsed)   # 2026-03-09 00:00:00

# Arithmetic with timedelta
tomorrow  = today + timedelta(days=1)
last_week = today - timedelta(weeks=1)
in_90_days = today + timedelta(days=90)

print(f"Tomorrow:    {tomorrow}")
print(f"Last week:   {last_week}")
print(f"In 90 days:  {in_90_days}")

# Difference between two dates
start = date(2026, 1, 1)
end   = date(2026, 12, 31)
diff  = end - start
print(f"Days from Jan 1 to Dec 31: {diff.days}")   # 364 (2026 has 365 days; the start day itself is not counted)

# Age calculator
def calculate_age(birthdate):
    today = date.today()
    years = today.year - birthdate.year
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years

print(calculate_age(date(1990, 7, 15)))   # e.g. 35

Common `strftime` format codes

Code	Meaning	Example
`%Y`	4-digit year	2026
`%m`	Zero-padded month	03
`%d`	Zero-padded day	09
`%H`	Hour (24h)	14
`%M`	Minute	30
`%S`	Second	45
`%A`	Full weekday name	Monday
`%B`	Full month name	March
`%I`	Hour (12h)	02
`%p`	AM/PM	PM
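
The codes in this table combine freely, and the same format string works in both directions. The sketch below uses a fixed datetime so the output is predictable. One assumption to keep in mind: `%p`, `%A`, and `%B` follow the current locale, so the "PM" shown here is what the default English/C locale produces.

```python
from datetime import datetime

moment = datetime(2026, 3, 9, 14, 30, 45)

# 24-hour and 12-hour renderings of the same moment.
print(moment.strftime("%H:%M:%S"))   # 14:30:45
print(moment.strftime("%I:%M %p"))   # 02:30 PM (in the default English/C locale)

# Round trip: format to text, then parse back with the same format string.
fmt = "%Y-%m-%d %H:%M:%S"
text = moment.strftime(fmt)
restored = datetime.strptime(text, fmt)
print(text)                 # 2026-03-09 14:30:45
print(restored == moment)   # True
```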

`random` — Randomness

import random

# Random float in the range 0.0 <= x < 1.0 (1.0 itself is never returned)
print(random.random())           # 0.7342...

# Random float in a range
print(random.uniform(1.0, 10.0)) # 7.234...

# Random integer (inclusive on both ends)
print(random.randint(1, 6))      # die roll: 1, 2, 3, 4, 5, or 6

# Random integer from range (exclusive end, like range())
print(random.randrange(0, 100, 5))  # 0, 5, 10, ..., 95

# Choose one random item from a sequence
fruits = ["apple", "banana", "cherry", "date"]
print(random.choice(fruits))     # one random fruit

# Choose multiple items (without replacement)
print(random.sample(fruits, 2))  # 2 unique random fruits

# Choose multiple items (with replacement)
print(random.choices(fruits, k=3))  # 3 items, may repeat

# Shuffle a list in place
deck = list(range(1, 53))
random.shuffle(deck)
print(deck[:5])   # first 5 cards after shuffle

# Reproducible randomness — set a seed
random.seed(42)
print(random.randint(1, 100))   # always the same number when seed=42

Seeding is essential for testing — same seed, same sequence, reproducible results every time.
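
A quick way to convince yourself is sketched below. It uses `random.Random`, a class from the same module that creates an independent generator, so the demo does not disturb the shared generator used by `random.randint()` and friends. The seed values 42 and 7 are arbitrary.

```python
import random

# Two independent generators created with the same seed.
rng_a = random.Random(42)
rng_b = random.Random(42)

rolls_a = [rng_a.randint(1, 6) for _ in range(5)]
rolls_b = [rng_b.randint(1, 6) for _ in range(5)]

# Same seed, same sequence.
print(rolls_a == rolls_b)   # True

# A generator with a different seed produces its own sequence.
rng_c = random.Random(7)
rolls_c = [rng_c.randint(1, 6) for _ in range(5)]
print(rolls_c)
```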

`json` — Reading and Writing JSON

JSON (JavaScript Object Notation) is the most common data format for APIs and configuration files.

import json

# Python dict -> JSON string
data = {
    "name": "Alice",
    "age": 30,
    "scores": [88, 92, 76],
    "active": True,
    "address": None
}

json_string = json.dumps(data)
print(json_string)
# {"name": "Alice", "age": 30, "scores": [88, 92, 76], "active": true, "address": null}

# Pretty-printed
pretty = json.dumps(data, indent=2)
print(pretty)

# JSON string -> Python dict
back = json.loads(json_string)
print(back["name"])   # Alice
print(type(back))     # <class 'dict'>

# Write JSON to a file (explicit encoding keeps results the same on every platform)
with open("data.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)

# Read JSON from a file
with open("data.json", "r", encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded["scores"])   # [88, 92, 76]

JSON ↔ Python type mapping

JSON	Python
object `{}`	`dict`
array `[]`	`list`
string `""`	`str`
number	`int` or `float`
`true` / `false`	`True` / `False`
`null`	`None`
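
The table reads from JSON to Python. Going the other way and back again is not always lossless, because JSON has fewer types than Python. The short sketch below shows two common surprises: tuples come back as lists, and non-string dictionary keys come back as strings. The sample dictionary is made up for the demo.

```python
import json

original = {
    "point": (3, 4),       # a tuple
    1: "integer key",      # a non-string key
    "ratio": 0.5,
    "count": 7,
}

# Encode to a JSON string, then decode straight back.
restored = json.loads(json.dumps(original))

print(restored["point"])   # [3, 4]  (the tuple came back as a list)
print("1" in restored)     # True    (the integer key became the string "1")
print(1 in restored)       # False
print(type(restored["count"]).__name__, type(restored["ratio"]).__name__)   # int float
```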

`re` — Regular Expressions

Regular expressions let you search, match, and transform text using patterns. They're concise but powerful — one pattern can match thousands of variations.

import re

text = "Contact us at support@example.com or sales@company.org"

# Search for one match
match = re.search(r"\b\w+@\w+\.\w+\b", text)
if match:
    print(match.group())   # support@example.com

# Find all matches
emails = re.findall(r"\b\w+@\w+\.\w+\b", text)
print(emails)   # ['support@example.com', 'sales@company.org']

# Replace matches
cleaned = re.sub(r"\b\w+@\w+\.\w+\b", "[EMAIL]", text)
print(cleaned)   # Contact us at [EMAIL] or [EMAIL]

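# Match from the start of a string
# (re.match only checks the beginning, so "555-1234abc" would also pass;
#  use re.fullmatch when the whole string must fit the pattern)
phone = "555-1234"
if re.match(r"\d{3}-\d{4}", phone):
    print("Starts with a valid phone format")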

# Split on a pattern
parts = re.split(r"[,;\s]+", "one, two;  three    four")
print(parts)   # ['one', 'two', 'three', 'four']

# Groups — capture parts of a match
date_text = "Today is 2026-03-09."
match = re.search(r"(\d{4})-(\d{2})-(\d{2})", date_text)
if match:
    year, month, day = match.groups()
    print(f"Year: {year}, Month: {month}, Day: {day}")

Compile for repeated use

# Compile a pattern once, use it many times (faster in loops)
email_pattern = re.compile(r"\b[\w.+-]+@[\w-]+\.[a-z]{2,}\b", re.IGNORECASE)

addresses = [
    "alice@example.com",
    "not-an-email",
    "Bob@Company.ORG",
    "bad@",
]

for addr in addresses:
    if email_pattern.match(addr):
        print(f"  Valid: {addr}")
    else:
        print(f"Invalid: {addr}")

Regular expressions are covered in full depth in Chapter 28. This section gives you enough to be productive.
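
One detail is worth a quick sketch before moving on, because it causes real validation bugs: `re.match`, `re.fullmatch`, and `re.search` differ only in where the pattern is allowed to sit in the string. The sample strings below are invented for the comparison.

```python
import re

pattern = r"\d{3}-\d{4}"
candidates = ["555-1234", "555-1234 ext. 9", "call 555-1234"]

results = {}
for text in candidates:
    results[text] = (
        bool(re.match(pattern, text)),      # pattern must begin at index 0
        bool(re.fullmatch(pattern, text)),  # pattern must cover the entire string
        bool(re.search(pattern, text)),     # pattern may appear anywhere
    )
    print(text, results[text])

# 555-1234 (True, True, True)
# 555-1234 ext. 9 (True, False, True)
# call 555-1234 (False, False, True)
```

For validating user input, `fullmatch` is usually the one you want; `search` fits extraction tasks like pulling phone numbers out of a longer text.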

`itertools` — Powerful Iteration Tools

itertools provides fast, memory-efficient tools for working with iterators. All of them return iterators (lazy) — wrap in list() to see the values.

import itertools

# chain — combine multiple iterables into one
combined = list(itertools.chain([1, 2, 3], [4, 5], [6]))
print(combined)   # [1, 2, 3, 4, 5, 6]

# chain.from_iterable — flatten one level
nested = [[1, 2], [3, 4], [5, 6]]
flat = list(itertools.chain.from_iterable(nested))
print(flat)   # [1, 2, 3, 4, 5, 6]

# product — Cartesian product (all combinations)
sizes = ["S", "M", "L"]
colors = ["red", "blue"]
variants = list(itertools.product(sizes, colors))
print(variants)
# [('S','red'), ('S','blue'), ('M','red'), ('M','blue'), ('L','red'), ('L','blue')]

# permutations — all orderings
print(list(itertools.permutations("ABC", 2)))
# [('A','B'), ('A','C'), ('B','A'), ('B','C'), ('C','A'), ('C','B')]

# combinations — all subsets (order doesn't matter)
print(list(itertools.combinations("ABCD", 2)))
# [('A','B'), ('A','C'), ('A','D'), ('B','C'), ('B','D'), ('C','D')]

# combinations_with_replacement — same but allows repeats
print(list(itertools.combinations_with_replacement("ABC", 2)))
# [('A','A'), ('A','B'), ('A','C'), ('B','B'), ('B','C'), ('C','C')]

# groupby — group consecutive items by a key
data = [
    {"name": "Alice",  "dept": "Engineering"},
    {"name": "Bob",    "dept": "Engineering"},
    {"name": "Carlos", "dept": "Marketing"},
    {"name": "Diana",  "dept": "Marketing"},
    {"name": "Eve",    "dept": "Engineering"},
]
# Note: groupby only groups CONSECUTIVE equal keys — sort first
data.sort(key=lambda x: x["dept"])
for dept, members in itertools.groupby(data, key=lambda x: x["dept"]):
    names = [m["name"] for m in members]
    print(f"{dept}: {names}")
# Engineering: ['Alice', 'Bob', 'Eve']
# Marketing: ['Carlos', 'Diana']

# islice — slice an iterator (no index needed)
gen = (n**2 for n in range(1, 1000))
first_five = list(itertools.islice(gen, 5))
print(first_five)   # [1, 4, 9, 16, 25]

# cycle — repeat a sequence forever (always use with islice or break)
colors_cycle = itertools.cycle(["red", "green", "blue"])
first_seven = list(itertools.islice(colors_cycle, 7))
print(first_seven)   # ['red', 'green', 'blue', 'red', 'green', 'blue', 'red']

# count — infinite counter
counter = itertools.count(start=10, step=5)
print(list(itertools.islice(counter, 5)))   # [10, 15, 20, 25, 30]
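
The "sort first" note on groupby is easy to forget, so here is a minimal sketch of what happens when you skip it. The department list is a simplified stand-in for the records used above.

```python
import itertools

depts = ["Engineering", "Engineering", "Marketing", "Engineering"]

# Unsorted input: the trailing "Engineering" forms a second, separate group.
unsorted_groups = [(key, len(list(group))) for key, group in itertools.groupby(depts)]
print(unsorted_groups)
# [('Engineering', 2), ('Marketing', 1), ('Engineering', 1)]

# Sorted input: each key appears exactly once.
sorted_groups = [(key, len(list(group))) for key, group in itertools.groupby(sorted(depts))]
print(sorted_groups)
# [('Engineering', 3), ('Marketing', 1)]
```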

`functools` — Higher-Order Functions

from functools import reduce

# reduce — apply a function cumulatively
numbers = [1, 2, 3, 4, 5]
product = reduce(lambda a, b: a * b, numbers)
print(product)   # 120  (1x2x3x4x5)

total = reduce(lambda a, b: a + b, numbers)
print(total)     # 15

`lru_cache` — Memoization (cache function results)

from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    """Calculate the nth Fibonacci number with caching."""
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(10))   # 55
print(fibonacci(50))   # 12586269025  (instant — cached)

# See cache statistics (totals after the two calls above)
print(fibonacci.cache_info())
# CacheInfo(hits=49, misses=51, maxsize=None, currsize=51)

Without lru_cache, fibonacci(50) would make roughly 40 billion recursive calls. With it, each value is computed once and reused. This technique is called memoization, and it pays off whenever a function is called repeatedly with the same arguments.
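
You can measure the difference yourself on a smaller input. The sketch below counts how many times the function body actually runs, with and without the cache. The names `fib_plain`, `fib_cached`, and the `calls` dictionary are made up for this experiment, and n=20 is chosen so the uncached version still finishes instantly.

```python
from functools import lru_cache

calls = {"plain": 0, "cached": 0}


def fib_plain(n):
    """Fibonacci with no caching; counts every call."""
    calls["plain"] += 1
    if n < 2:
        return n
    return fib_plain(n - 1) + fib_plain(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n):
    """Same recursion, but repeated arguments are served from the cache."""
    calls["cached"] += 1   # this line only runs on a cache miss
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)


print(fib_plain(20), fib_cached(20))   # 6765 6765
print(calls["plain"])                  # 21891
print(calls["cached"])                 # 21
```

The cached version runs its body once per distinct argument (0 through 20), while the plain version repeats the same subproblems over and over.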

`partial` — Pre-fill function arguments

from functools import partial

def power(base, exponent):
    return base ** exponent

# Create a new function with exponent pre-filled
square = partial(power, exponent=2)
cube   = partial(power, exponent=3)

print(square(5))   # 25
print(cube(3))     # 27

# Useful for passing to map/filter
numbers = [1, 2, 3, 4, 5]
squares = list(map(square, numbers))
print(squares)   # [1, 4, 9, 16, 25]

`wraps` — Preserve function metadata in decorators

from functools import wraps

def my_decorator(func):
    @wraps(func)   # preserves func's name, docstring, etc.
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

You'll use this in Chapter 22 (Decorators).
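
Until then, this small sketch shows what goes missing without `wraps`. The two decorators are identical except for that one line; all four function names are invented for the comparison.

```python
from functools import wraps


def without_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def with_wraps(func):
    @wraps(func)   # copy func's name, docstring, etc. onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@without_wraps
def greet_plain():
    """Say hello."""
    return "hello"


@with_wraps
def greet_wrapped():
    """Say hello."""
    return "hello"


print(greet_plain.__name__, greet_plain.__doc__)       # wrapper None
print(greet_wrapped.__name__, greet_wrapped.__doc__)   # greet_wrapped Say hello.
```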

Putting It All Together: A File Report Tool

import sys
import json
from pathlib import Path
from datetime import datetime
from collections import Counter


def analyze_directory(directory):
    """
    Analyze all files in a directory and return a report dict.
    """
    path = Path(directory)

    if not path.exists():
        print(f"Error: '{directory}' does not exist.")
        sys.exit(1)

    if not path.is_dir():
        print(f"Error: '{directory}' is not a directory.")
        sys.exit(1)

    # Collect every regular file under the directory, recursively
    files = [f for f in path.rglob("*") if f.is_file()]

    total_size = sum(f.stat().st_size for f in files)
    # Count files per extension; files without one are grouped as "(no ext)"
    extensions = Counter(f.suffix.lower() or "(no ext)" for f in files)

    # Find newest and oldest files
    by_mtime = sorted(files, key=lambda f: f.stat().st_mtime)
    oldest = by_mtime[0]  if files else None
    newest = by_mtime[-1] if files else None

    return {
        "directory":   str(path.resolve()),
        "generated":   datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "total_files": len(files),
        "total_size_kb": round(total_size / 1024, 1),
        "extensions":  dict(extensions.most_common()),
        "oldest_file": oldest.name if oldest else None,
        "newest_file": newest.name if newest else None,
    }


def print_report(report):
    print(f"\n{'='*50}")
    print(f"Directory Report: {report['directory']}")
    print(f"Generated:        {report['generated']}")
    print(f"{'='*50}")
    print(f"Total files:    {report['total_files']}")
    print(f"Total size:     {report['total_size_kb']} KB")
    print(f"\nFile types:")
    for ext, count in report["extensions"].items():
        print(f"  {ext:<15} {count:>4} file(s)")
    print(f"\nOldest file:  {report['oldest_file']}")
    print(f"Newest file:  {report['newest_file']}")


# Entry point
if len(sys.argv) > 1:
    target = sys.argv[1]
else:
    target = "."

report = analyze_directory(target)
print_report(report)

# Save report to JSON
output_path = Path("directory_report.json")
with open(output_path, "w") as f:
    json.dump(report, f, indent=2)
print(f"\nReport saved to: {output_path.resolve()}")

Run it with: python report.py C:\Users\Alice\Documents

Uses: pathlib, sys, json, and datetime from this chapter, plus the collections module, in one useful tool.

What You Learned in This Chapter

os — file system operations, environment variables, path manipulation.
pathlib — modern path objects: Path / "subdir", .name, .stem, .suffix, .glob(), .read_text().
sys — Python version, command-line args (sys.argv), exit codes (sys.exit()), platform.
datetime — create, format (strftime), parse (strptime), and do math with dates/times using timedelta.
random — randint, choice, sample, shuffle, seed for reproducibility.
json — dumps/loads for strings, dump/load for files, Python↔JSON type mapping.
re — search, findall, sub, match, split, groups, compiled patterns.
itertools — chain, product, permutations, combinations, groupby, islice, cycle, count.
functools — reduce, lru_cache (memoization), partial (pre-filled functions), wraps.

What's Next?

Part 2 is complete. You've gone deep into Python's essential tools — strings, tuples, sets, dictionaries, error handling, comprehensions, modules, and the standard library.

Part 3 is where programming gets really interesting. In Chapter 17 you'll learn object-oriented programming — how to design your own types, bundle data and behavior together into classes, and think about code the way professional software engineers do. It's one of the most powerful ideas in programming, and Python makes it approachable.

Your turn: Build a command-line tool called rename_files.py that takes a directory path as sys.argv[1] and a pattern as sys.argv[2] (e.g. "*.txt"). List all matching files using Path.glob(), show the user how many were found, and ask for a prefix to add to each filename. Rename them all using os.rename() and save a log of all renames (old name -> new name, with timestamp) to a JSON file.
