Chapter 35: Professional Python Practices
There is a gap between code that works and code that is professional. Both produce correct output. But professional code is readable by others, maintainable six months later, testable, consistently styled, and documented clearly. This chapter closes that gap.
These aren't rules for their own sake. Every practice here exists because it makes real projects easier to work on — alone or in a team.
Code Style — PEP 8 and Why It Matters
PEP 8 is Python's official style guide. The reason for following it isn't aesthetics — it's that consistent style lets your brain stop parsing formatting and focus on logic.
The key rules:
# ── Naming ────────────────────────────────────────────────────────────────────
# Variables and functions: snake_case
user_name = "Alice"
total_price = 99.99
def calculate_tax(price, rate):
    return price * rate
# Classes: PascalCase
class BankAccount:
    pass
# Constants: UPPER_SNAKE_CASE
MAX_RETRIES = 3
DEFAULT_TIMEOUT = 30.0
# Private (by convention): leading underscore
_internal_helper = "not for public use"
# Double leading underscore: name-mangled, but only inside a class body
__name_mangled = "class-private"
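A quick sketch of how these conventions behave at runtime (the `Account` class is just for illustration):

```python
class Account:
    def __init__(self) -> None:
        self._balance = 100       # "private" by convention only
        self.__token = "secret"   # stored as _Account__token

acct = Account()
print(acct._balance)             # 100 — nothing enforces privacy
print(acct._Account__token)      # secret — the mangled name still works
# acct.__token would raise AttributeError outside the class body
```

The single underscore is a social contract; the double underscore is mechanical renaming meant to avoid clashes in subclasses.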
# ── Spacing ───────────────────────────────────────────────────────────────────
# Two blank lines between top-level definitions
def first_function():
    pass


def second_function():
    pass


class MyClass:
    # One blank line between methods
    def method_one(self):
        pass

    def method_two(self):
        pass
# Spaces around operators
x = 1 + 2
y = x * 3
# No spaces around = for default arguments
def connect(host, port=8080, timeout=30):
    pass
# Spaces after commas
items = [1, 2, 3]
result = function(a, b, c)
# ── Line length ───────────────────────────────────────────────────────────────
# Keep lines under 88 characters (Black's default, slightly more than PEP 8's 79)
# Break long lines with implicit continuation inside brackets
result = some_function(
    argument_one,
    argument_two,
    argument_three,
)
# Or wrap the expression in parentheses (preferred over backslash continuation)
total = (
    first_value
    + second_value
    + third_value
)
# ── Imports ───────────────────────────────────────────────────────────────────
# Standard library first, then third-party, then local — blank line between groups
import os
import sys
from pathlib import Path

import requests
from pydantic import BaseModel

from mypackage import utils
from mypackage.models import User
Let tools enforce style — don't do it manually
Don't spend mental energy on formatting. Let ruff format (or black) handle it automatically:
ruff format . # format all files
ruff format --check . # check without changing (for CI)
Configure your editor to format on save. After that, style stops being a decision you make.
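If the project uses `pyproject.toml`, a minimal ruff configuration might look like this (a sketch; adjust the rule selection to your project's needs):

```toml
[tool.ruff]
line-length = 88

[tool.ruff.lint]
select = ["E", "F", "I"]  # pycodestyle errors, pyflakes, import sorting

[tool.ruff.format]
quote-style = "double"
```

With `I` selected, `ruff check --fix` also keeps the import groups from the previous section sorted for you.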
Writing Good Docstrings
Docstrings are the official documentation for your code. They're accessible via help(), shown in editors as you type, and used by documentation generators.
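Under the hood, a docstring is just the function's `__doc__` attribute, which is what `help()` reads (the `greet` function is an illustrative example):

```python
def greet(name: str) -> str:
    """Return a friendly greeting for name."""
    return f"Hello, {name}!"

print(greet.__doc__)  # Return a friendly greeting for name.
# help(greet) renders the same text, nicely formatted
```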
The Google style (most common in modern Python)
def calculate_shipping(
    weight_kg: float,
    distance_km: float,
    express: bool = False,
) -> float:
    """
    Calculate shipping cost based on weight, distance, and service level.

    Args:
        weight_kg: Weight of the package in kilograms.
        distance_km: Shipping distance in kilometers.
        express: If True, apply the express multiplier (2x).

    Returns:
        Shipping cost in USD, rounded to two decimal places.

    Raises:
        ValueError: If weight_kg or distance_km is negative.

    Examples:
        >>> calculate_shipping(2.5, 100)
        12.5
        >>> calculate_shipping(2.5, 100, express=True)
        25.0
    """
    if weight_kg < 0 or distance_km < 0:
        raise ValueError("Weight and distance must be non-negative.")

    base_cost = weight_kg * 0.05 * distance_km
    if express:
        base_cost *= 2
    return round(base_cost, 2)
Class docstrings
class PaymentProcessor:
    """
    Process payments through multiple payment gateways.

    Supports Stripe, PayPal, and Square. Automatically retries
    on transient failures and logs all transactions.

    Attributes:
        gateway: The active payment gateway name.
        currency: ISO 4217 currency code (default: "USD").

    Examples:
        >>> processor = PaymentProcessor("stripe", currency="EUR")
        >>> processor.charge(amount=99.99, card_token="tok_visa")
        Transaction(id="ch_abc", status="succeeded", amount=99.99)
    """

    def __init__(self, gateway: str, currency: str = "USD") -> None:
        self.gateway = gateway
        self.currency = currency
Module docstrings
"""
payments.py — Payment processing utilities.
This module provides a unified interface for multiple payment gateways.
All monetary values are in the smallest currency unit (cents for USD).
Typical usage:
from payments import PaymentProcessor
processor = PaymentProcessor("stripe")
result = processor.charge(amount=9999, card_token=token)
"""
doctest — runnable examples in docstrings
The Examples: section above isn't just documentation — it can run as a test:
def add(a: int, b: int) -> int:
    """
    Add two integers.

    >>> add(2, 3)
    5
    >>> add(-1, 1)
    0
    >>> add(0, 0)
    0
    """
    return a + b
python -m doctest my_module.py -v # run all docstring examples as tests
pytest --doctest-modules # run doctests with pytest
Doctests serve as both documentation and light test coverage. Use them for simple, illustrative examples.
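If you run tests with pytest, you can make doctests part of every run via configuration rather than remembering the flag (a `pyproject.toml` sketch):

```toml
[tool.pytest.ini_options]
addopts = "--doctest-modules"
```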
The Zen of Python Applied
import this
The Zen of Python has 19 aphorisms. Here are the seven that matter most in practice:
"Beautiful is better than ugly." Code is read far more than it's written. Spend time making it readable.
"Explicit is better than implicit."
# Implicit — what does True mean here?
connect(host, True)
# Explicit
connect(host, secure=True)
"Simple is better than complex."
# Complex
def get_items(data):
    return list(filter(lambda x: x is not None,
                       map(lambda x: x.get("value"), data)))

# Simple
def get_items(data):
    return [item["value"] for item in data if item.get("value") is not None]
"Flat is better than nested."
# Nested — hard to follow
def process(data):
    if data:
        if isinstance(data, list):
            if len(data) > 0:
                return data[0]

# Flat — early return pattern
def process(data):
    if not data:
        return None
    if not isinstance(data, list):
        return None
    return data[0]
"Errors should never pass silently."
# Bad — swallows all exceptions
try:
    result = risky()
except:
    pass

# Good — catch specific exceptions, log them
try:
    result = risky()
except ValueError as e:
    logger.warning(f"Invalid value: {e}")
    result = default_value
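When you catch a low-level exception and raise your own, chain them with `raise ... from` so the original cause isn't silently discarded either (`ConfigError` and `read_port` are hypothetical names for this sketch):

```python
class ConfigError(Exception):
    """Hypothetical application-level error for this example."""

def read_port(config: dict) -> int:
    try:
        return int(config["port"])
    except KeyError as e:
        # "from e" keeps the original KeyError attached as __cause__
        raise ConfigError("missing required key: 'port'") from e

try:
    read_port({})
except ConfigError as err:
    print(err)                           # missing required key: 'port'
    print(type(err.__cause__).__name__)  # KeyError
```

The full traceback then shows both errors, linked by "The above exception was the direct cause of…", which is exactly what you want when debugging.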
"If the implementation is hard to explain, it's a bad idea." If you struggle to explain what your code does, refactor until you can.
"There should be one obvious way to do it." When choosing between several approaches, the simplest one that clearly expresses intent is usually right.
Code Review Practices
Code review is how teams catch bugs, share knowledge, and maintain standards. Whether you're reviewing others' code or preparing your own for review, these habits matter.
Writing reviewable code
- Keep pull requests small. A PR with 50 lines is reviewed carefully. One with 500 lines is rubber-stamped.
- One logical change per PR. Mixing a refactor with a bug fix with a new feature makes review impossible.
- Write a clear PR description. Explain why, not just what. Link to the issue. Show before/after if the behavior changed.
- Test everything. Reviewers shouldn't have to run your code to verify correctness.
## What this does
Adds retry logic to the payment processor for transient 5xx errors.
## Why
We're seeing intermittent 503s from Stripe during peak hours.
Retrying with exponential backoff reduces payment failures by ~40%.
## Changes
- `payments.py`: Added `@retry(times=3, delay=1.0)` to `charge()`
- `test_payments.py`: Added tests for retry behavior with mocked failures
## Notes
The retry count (3) and initial delay (1.0s) are configurable via env vars.
Reviewing others' code
Ask questions rather than making demands:
# Less collaborative
"This is wrong. Use a dict instead."
# More collaborative
"Would a dict here give us O(1) lookup? The current list search
might be slow if users_to_check grows large."
Look for:
- Correctness (does it do what the description says?)
- Edge cases (what happens with empty input? None? Very large input?)
- Tests (are they testing behavior, not implementation?)
- Readability (can you follow the logic without reading the commit message?)
- Security (any SQL injection risk? Hardcoded credentials? Unvalidated user input?)
Git Workflow for Python Projects
.gitignore
# Python
__pycache__/
*.py[cod]
*.pyo
*.pyd
.Python
*.egg-info/
dist/
build/
*.egg
# Virtual environments
.venv/
venv/
env/
# Testing
.pytest_cache/
.coverage
htmlcov/
.mypy_cache/
.ruff_cache/
# IDE
.vscode/settings.json
.idea/
*.swp
# Secrets — NEVER commit these
.env
*.pem
*.key
secrets.json
Commit message convention
Follow the Conventional Commits format — makes changelogs automatic and history readable:
<type>(<scope>): <short description>
[optional body]
[optional footer]
Types: feat, fix, docs, style, refactor, test, chore, perf, ci
feat(auth): add JWT token refresh
Tokens now auto-refresh 5 minutes before expiry.
Fixes intermittent 401 errors for long-running sessions.
Closes #142
fix(payments): retry on transient 503 errors
docs(readme): update installation instructions for Python 3.12
test(auth): add tests for token expiry edge cases
refactor(database): replace raw SQL with SQLAlchemy ORM
Branch strategy
main <- production-ready, tagged releases
└── develop <- integration branch (optional)
├── feature/add-payment-retry
├── fix/auth-token-expiry
└── chore/update-dependencies
Simple projects can skip develop and branch directly off main.
# Start a feature
git checkout -b feature/add-payment-retry
# Work, commit often
git add -p # stage interactively — review each change
git commit -m "feat(payments): add retry logic for 503 errors"
# Keep branch up to date
git fetch origin
git rebase origin/main
# Push and open a PR
git push -u origin feature/add-payment-retry
Use git add -p (patch mode) to review each change before staging. It forces you to think about what you're committing.
Environment Variables and Configuration
Never hardcode secrets, URLs, or environment-specific values in your code. Read them from environment variables:
# config.py
import os


class Config:
    # Database
    DATABASE_URL: str = os.environ.get("DATABASE_URL", "sqlite:///dev.db")
    DATABASE_POOL_SIZE: int = int(os.environ.get("DATABASE_POOL_SIZE", "5"))

    # Auth
    SECRET_KEY: str = os.environ.get("SECRET_KEY", "")
    TOKEN_EXPIRE_MIN: int = int(os.environ.get("TOKEN_EXPIRE_MIN", "30"))

    # External APIs
    STRIPE_API_KEY: str = os.environ.get("STRIPE_API_KEY", "")
    SENDGRID_API_KEY: str = os.environ.get("SENDGRID_API_KEY", "")

    # App
    DEBUG: bool = os.environ.get("DEBUG", "false").lower() == "true"
    LOG_LEVEL: str = os.environ.get("LOG_LEVEL", "INFO")

    def validate(self) -> None:
        """Raise on startup if required secrets are missing."""
        required = {"SECRET_KEY", "STRIPE_API_KEY"}
        missing = [k for k in required if not getattr(self, k)]
        if missing:
            raise RuntimeError(
                f"Missing required environment variables: {', '.join(missing)}"
            )


config = Config()
Use a .env file in development:
# .env — never commit this file
DATABASE_URL=postgresql://user:password@localhost/myapp
SECRET_KEY=dev-secret-key-change-in-production
STRIPE_API_KEY=sk_test_...
DEBUG=true
LOG_LEVEL=DEBUG
Load it with python-dotenv:
from dotenv import load_dotenv
load_dotenv() # reads .env into os.environ
from config import config
config.validate()
Or with pydantic-settings (cleaner, with type coercion):
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8")

    database_url: str = "sqlite:///dev.db"
    secret_key: str = ""
    stripe_api_key: str = ""
    debug: bool = False
    log_level: str = "INFO"
    token_expire_min: int = 30


settings = Settings()
Logging in Production
Good production logging is structured, levelled, and goes to the right place.
import logging
import logging.handlers  # the handlers submodule must be imported explicitly
import sys
from pathlib import Path


def setup_logging(level: str = "INFO", log_file: Path | None = None) -> None:
    """Configure structured logging for production."""
    fmt = "%(asctime)s %(levelname)-8s %(name)-30s %(message)s"

    handlers: list[logging.Handler] = [
        logging.StreamHandler(sys.stdout),
    ]
    if log_file:
        handlers.append(
            logging.handlers.RotatingFileHandler(
                log_file,
                maxBytes=10 * 1024 * 1024,  # 10 MB
                backupCount=5,
            )
        )

    logging.basicConfig(
        level=getattr(logging, level.upper()),
        format=fmt,
        datefmt="%Y-%m-%d %H:%M:%S",
        handlers=handlers,
        force=True,
    )

    # Silence noisy third-party loggers
    logging.getLogger("urllib3").setLevel(logging.WARNING)
    logging.getLogger("sqlalchemy.engine").setLevel(logging.WARNING)
# Get a logger per module — never use the root logger directly
log = logging.getLogger(__name__)
# Use structured context with extra
log.info("User logged in", extra={"user_id": 42, "ip": "192.168.1.1"})
log.error("Payment failed", extra={"order_id": 99, "amount": 49.99})
For high-volume production logging, consider structlog for JSON-formatted structured logs that work well with log aggregators like Datadog, Splunk, or Elasticsearch.
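If you want JSON logs without adding a dependency, a minimal stdlib-only formatter can approximate what structlog produces (a sketch, not a full replacement; `JsonFormatter` is an illustrative name):

```python
import json
import logging

# Attributes every LogRecord has by default; anything else came from extra={}
_STANDARD_ATTRS = set(logging.makeLogRecord({}).__dict__) | {"message", "asctime"}

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={} show up as non-standard record attributes
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("payments")
log.addHandler(handler)
log.warning("Payment failed", extra={"order_id": 99})
```

Each call emits a single JSON line that aggregators can index by field, e.g. `{"level": "WARNING", "logger": "payments", "message": "Payment failed", "order_id": 99}`.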
Writing Clean Code — Practical Rules
Functions should do one thing
# Bad — three responsibilities in one function
def process_user(user_id):
    user = db.query(f"SELECT * FROM users WHERE id = {user_id}")  # SQL injection!
    if user:
        user["email"] = user["email"].lower()
        send_email(user["email"], "Welcome!")
        db.query(f"UPDATE users SET processed=1 WHERE id = {user_id}")

# Good — one responsibility each
def get_user(db, user_id: int) -> dict | None:
    return db.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchone()

def normalize_user(user: dict) -> dict:
    return {**user, "email": user["email"].lower()}

def send_welcome_email(email: str) -> None:
    send_email(email, "Welcome!")

def mark_user_processed(db, user_id: int) -> None:
    db.execute("UPDATE users SET processed=1 WHERE id = ?", (user_id,))
    db.commit()
Avoid magic numbers
# Bad — what are these numbers?
if score > 0.85:
    tier = "gold"
elif score > 0.65:
    tier = "silver"
else:
    tier = "bronze"

# Good — named constants
GOLD_THRESHOLD = 0.85
SILVER_THRESHOLD = 0.65

if score > GOLD_THRESHOLD:
    tier = "gold"
elif score > SILVER_THRESHOLD:
    tier = "silver"
else:
    tier = "bronze"
Return early, don't nest deeply
# Bad — deep nesting
def process_order(order):
    if order is not None:
        if order["status"] == "pending":
            if order["total"] > 0:
                if has_inventory(order):
                    return fulfill(order)

# Good — guard clauses
def process_order(order):
    if order is None:
        raise ValueError("Order cannot be None")
    if order["status"] != "pending":
        raise ValueError(f"Cannot process order in status {order['status']!r}")
    if order["total"] <= 0:
        raise ValueError("Order total must be positive")
    if not has_inventory(order):
        raise InsufficientInventoryError(order["id"])
    return fulfill(order)
Name things clearly
# Bad names
def f(d, n):
    r = []
    for i in d:
        if i > n:
            r.append(i)
    return r

# Good names
def filter_above_threshold(values: list[float], threshold: float) -> list[float]:
    return [v for v in values if v > threshold]
The function with good names doesn't need a comment. The code IS the documentation.
Avoid comments that explain what — write code that shows what
# Bad — comment restates the code
# Increment counter by 1
counter += 1

# Bad — comment explains code that should be clearer
# Check if user is an adult
if user.age >= 18:
    ...

# Good — comment explains WHY, not WHAT
# Stripe requires amounts in the smallest currency unit (cents)
amount_cents = int(amount_dollars * 100)

# RFC 3339 format required by the external API
timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
The Habits of Professional Python Developers
Beyond specific practices, professionals have habits:
They read error messages fully. Every line, not just the last one.
They write tests as they write code. Not after. The test is how you know you're done.
They commit small and often. A commit is a checkpoint. Frequent checkpoints mean less work lost.
They know their tools. Shell shortcuts, editor keybindings, debugger commands. Time saved per day adds up to weeks per year.
They ask "what could go wrong?" Every function that accepts input — what if it's None? Empty? Malformed? Extremely large?
They document decisions, not mechanics. "We use Redis here because PostgreSQL locks caused 200ms latency spikes under this query pattern" is valuable. "# connect to database" is noise.
They favor boring solutions. The newest, most clever approach is usually not the right one. Battle-tested libraries, simple data structures, and readable code win over novelty.
They read others' code. The Python standard library is beautifully written. So is requests, flask, pydantic. Reading excellent code is the fastest way to improve your own.
They refactor without fear. Because they have tests. Without tests, every refactor is a gamble. With tests, it's just work.
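A behavior-focused test is what makes that fearlessness possible: it pins down what the code does, not how. A pytest-style sketch (the function names are illustrative):

```python
def normalize_email(raw: str) -> str:
    """Lowercase and strip surrounding whitespace from an email address."""
    return raw.strip().lower()

# Behavior tests — these survive any internal rewrite of normalize_email
def test_lowercases_mixed_case():
    assert normalize_email("Alice@Example.COM") == "alice@example.com"

def test_strips_whitespace():
    assert normalize_email("  bob@example.com \n") == "bob@example.com"
```

Because the tests assert on inputs and outputs only, you can replace the implementation entirely and know in seconds whether behavior changed.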
Project: A Professional Python Module
Let's put everything together in one module that demonstrates every practice from this chapter:
"""
ratelimiter.py — Thread-safe rate limiter with configurable strategies.
Provides a token bucket rate limiter for controlling how frequently
a resource can be accessed. Thread-safe for concurrent use.
Typical usage::
limiter = RateLimiter(rate=10, per=1.0) # 10 requests per second
@limiter.limit
def call_api(endpoint):
return requests.get(endpoint)
# Or as a context manager
with limiter:
call_api("/users")
"""
from __future__ import annotations
import logging
import threading
import time
from dataclasses import dataclass, field
from functools import wraps
from typing import Callable, TypeVar
log = logging.getLogger(__name__)
F = TypeVar("F", bound=Callable)
# The minimum wait time we bother sleeping for (below this, just spin)
_MIN_SLEEP_SECONDS = 0.001
@dataclass
class RateLimiterStats:
    """Runtime statistics for a RateLimiter instance."""

    total_calls: int = field(default=0)
    total_waited: float = field(default=0.0)
    total_rejected: int = field(default=0)

    @property
    def avg_wait_ms(self) -> float:
        """Average wait time per call in milliseconds."""
        if self.total_calls == 0:
            return 0.0
        return (self.total_waited / self.total_calls) * 1000
class RateLimitExceeded(Exception):
    """Raised when a non-blocking rate limiter cannot acquire a token."""
class RateLimiter:
    """
    Token bucket rate limiter — thread-safe.

    Allows up to `rate` operations per `per` seconds. When the bucket
    is empty, callers either wait (blocking mode) or raise
    `RateLimitExceeded` (non-blocking mode).

    Args:
        rate: Maximum number of operations allowed per interval.
        per: Interval in seconds (default: 1.0 second).
        blocking: If True (default), callers wait when the bucket is empty.
            If False, raises RateLimitExceeded immediately.

    Examples:
        >>> limiter = RateLimiter(rate=5, per=1.0)
        >>> with limiter:
        ...     print("allowed")
        allowed
        >>> @limiter.limit
        ... def my_function():
        ...     return "result"
        >>> my_function()
        'result'
    """

    def __init__(
        self,
        rate: float,
        per: float = 1.0,
        blocking: bool = True,
    ) -> None:
        if rate <= 0:
            raise ValueError(f"rate must be positive, got {rate}")
        if per <= 0:
            raise ValueError(f"per must be positive, got {per}")

        self._rate = rate
        self._per = per
        self._blocking = blocking
        self._tokens = rate  # start full
        self._last = time.monotonic()
        self._lock = threading.Lock()
        self._stats = RateLimiterStats()

        log.debug(
            "RateLimiter created: rate=%.1f per=%.1fs blocking=%s",
            rate, per, blocking,
        )
    # ── Public interface ──────────────────────────────────────────────────────

    def acquire(self) -> float:
        """
        Acquire one token, waiting if necessary (blocking mode).

        Returns:
            The number of seconds waited (0.0 if no wait was needed).

        Raises:
            RateLimitExceeded: If non-blocking and no token is available.
        """
        with self._lock:
            return self._acquire_locked()

    def limit(self, func: F) -> F:
        """Decorator: apply rate limiting to a function."""
        @wraps(func)
        def wrapper(*args, **kwargs):
            self.acquire()
            return func(*args, **kwargs)
        return wrapper  # type: ignore

    @property
    def stats(self) -> RateLimiterStats:
        """Runtime statistics for this limiter (updated in place)."""
        return self._stats

    def reset(self) -> None:
        """Reset the bucket to full capacity and clear statistics."""
        with self._lock:
            self._tokens = self._rate
            self._last = time.monotonic()
            self._stats = RateLimiterStats()
        log.debug("RateLimiter reset")
    # ── Context manager ───────────────────────────────────────────────────────

    def __enter__(self) -> "RateLimiter":
        self.acquire()
        return self

    def __exit__(self, *args: object) -> None:
        pass
    # ── Internal helpers ──────────────────────────────────────────────────────

    def _acquire_locked(self) -> float:
        """Must be called with self._lock held."""
        now = time.monotonic()
        elapsed = now - self._last
        self._tokens = min(
            self._rate, self._tokens + elapsed * (self._rate / self._per)
        )
        self._last = now

        if self._tokens >= 1:
            self._tokens -= 1
            self._stats.total_calls += 1
            return 0.0

        if not self._blocking:
            self._stats.total_rejected += 1
            raise RateLimitExceeded(
                f"Rate limit exceeded: {self._rate:.0f} per {self._per:.1f}s"
            )

        # Calculate the wait time, then sleep with the lock released so
        # other threads aren't blocked while we wait
        wait = (1 - self._tokens) * (self._per / self._rate)
        self._lock.release()
        try:
            if wait > _MIN_SLEEP_SECONDS:
                time.sleep(wait)
        finally:
            self._lock.acquire()

        # Now take the token
        self._tokens = max(0.0, self._tokens - 1)
        self._stats.total_calls += 1
        self._stats.total_waited += wait
        log.debug("Rate limiter waited %.3fs", wait)
        return wait
    def __repr__(self) -> str:
        return (
            f"RateLimiter(rate={self._rate}, per={self._per}, "
            f"blocking={self._blocking})"
        )
# ── Demo ──────────────────────────────────────────────────────────────────────
if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")

    limiter = RateLimiter(rate=3, per=1.0)

    @limiter.limit
    def simulate_api_call(n: int) -> str:
        return f"Response #{n}"

    print("Making 9 API calls at 3/second...")
    start = time.perf_counter()
    for i in range(9):
        result = simulate_api_call(i + 1)
        print(f"  {result} at {time.perf_counter() - start:.2f}s")

    s = limiter.stats
    print(f"\nStats: {s.total_calls} calls, "
          f"avg wait {s.avg_wait_ms:.1f}ms, "
          f"{s.total_rejected} rejected")
This module demonstrates every professional practice from this chapter:
- Google-style docstrings on the module, class, and every public method
- Named constants (`_MIN_SLEEP_SECONDS`)
- Single-responsibility methods
- Guard clauses with clear error messages
- Type hints throughout
- Structured logging with `log.debug`
- A custom exception class (`RateLimitExceeded`)
- `__repr__` for easy debugging
- A `stats` property for introspection
- Context manager protocol (`__enter__`/`__exit__`)
- Decorator support (`.limit`)
- Thread safety with a `Lock`
- Early returns where appropriate
What You Learned in This Chapter
- PEP 8 defines naming (`snake_case`, `PascalCase`, `UPPER_SNAKE_CASE`), spacing, and line length. Let `ruff format` enforce it automatically.
- Docstrings document what a function does, its args, return value, exceptions, and examples. Use Google style. Make examples runnable with `doctest`.
- The Zen of Python in practice: explicit over implicit, simple over complex, flat over nested, errors never silent, one obvious way.
- Code review: small PRs, one logical change, clear descriptions. Ask questions rather than make demands when reviewing.
- `.gitignore` excludes `__pycache__`, virtual environments, `.env`, dist, and coverage files.
- Conventional Commits (`feat:`, `fix:`, `docs:`, `refactor:`) make history readable and changelogs automatic.
- Environment variables for all secrets and environment-specific config. Use `pydantic-settings` for typed, validated settings.
- Production logging: levelled, named loggers per module, `RotatingFileHandler`, silence noisy third-party loggers.
- Clean code rules: one responsibility per function, named constants instead of magic numbers, early returns instead of deep nesting, names that explain intent.
- Professional habits: read error messages fully, write tests while writing code, commit small and often, document decisions not mechanics, favor boring solutions, refactor fearlessly with test coverage.
What's Next?
Chapter 36 covers Advanced Python Internals — how Python actually works under the hood: the CPython interpreter, bytecode, the GIL in depth, memory management, reference counting, and the descriptor protocol. Understanding internals makes you a more effective Python developer — you'll know why things work the way they do, not just that they do.