Python: Zero to Hero

Chapter 35: Professional Python Practices

There is a gap between code that works and code that is professional. Both produce correct output. But professional code is readable by others, maintainable six months later, testable, consistently styled, and documented clearly. This chapter closes that gap.

These aren't rules for their own sake. Every practice here exists because it makes real projects easier to work on — alone or in a team.

Code Style — PEP 8 and Why It Matters

PEP 8 is Python's official style guide. The reason for following it isn't aesthetics — it's that consistent style lets your brain stop parsing formatting and focus on logic.

The key rules:

# ── Naming ────────────────────────────────────────────────────────────────────

# Variables and functions: snake_case
user_name = "Alice"
total_price = 99.99

def calculate_tax(price, rate):
    return price * rate

# Classes: PascalCase
class BankAccount:
    pass

# Constants: UPPER_SNAKE_CASE
MAX_RETRIES = 3
DEFAULT_TIMEOUT = 30.0

# Private (by convention): leading underscore
_internal_helper = "not for public use"
__name_mangled = "class-private"  # name mangling only applies inside a class body

# ── Spacing ───────────────────────────────────────────────────────────────────

# Two blank lines between top-level definitions
def first_function():
    pass


def second_function():
    pass


class MyClass:
    # One blank line between methods
    def method_one(self):
        pass

    def method_two(self):
        pass

# Spaces around operators
x = 1 + 2
y = x * 3

# No spaces around = for default arguments
def connect(host, port=8080, timeout=30):
    pass

# Spaces after commas
items = [1, 2, 3]
result = function(a, b, c)

# ── Line length ───────────────────────────────────────────────────────────────

# Keep lines under 88 characters (Black's default, slightly more than PEP 8's 79)
# Break long lines with implicit continuation inside brackets
result = some_function(
    argument_one,
    argument_two,
    argument_three,
)

# Or wrap the expression in parentheses (preferred over backslash continuation)
total = (first_value
         + second_value
         + third_value)

# ── Imports ───────────────────────────────────────────────────────────────────

# Standard library first, then third-party, then local — blank line between groups
import os
import sys
from pathlib import Path

import requests
from pydantic import BaseModel

from mypackage import utils
from mypackage.models import User

Let tools enforce style — don't do it manually

Don't spend mental energy on formatting. Let ruff format (or black) handle it automatically:

ruff format .          # format all files
ruff format --check .  # check without changing (for CI)

Configure your editor to format on save. After that, style stops being a decision you make.
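Formatter and linter behavior can be pinned in pyproject.toml so every contributor and every CI run agrees. A minimal sketch (the keys follow Ruff's documented `[tool.ruff]` tables; adjust the rule selection to your project):

```toml
[tool.ruff]
line-length = 88
target-version = "py312"

[tool.ruff.lint]
# E/W = pycodestyle, F = pyflakes, I = import sorting (isort rules)
select = ["E", "W", "F", "I"]

[tool.ruff.format]
quote-style = "double"
```

With this checked in, `ruff format` and `ruff check` behave identically on every machine.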

Writing Good Docstrings

Docstrings are the official documentation for your code. They're accessible via help(), shown in editors as you type, and used by documentation generators.

The Google style (most common in modern Python)

def calculate_shipping(
    weight_kg: float,
    distance_km: float,
    express: bool = False,
) -> float:
    """
    Calculate shipping cost based on weight, distance, and service level.

    Args:
        weight_kg:   Weight of the package in kilograms.
        distance_km: Shipping distance in kilometers.
        express:     If True, apply the express multiplier (2x).

    Returns:
        Shipping cost in USD, rounded to two decimal places.

    Raises:
        ValueError: If weight_kg or distance_km is negative.

    Examples:
        >>> calculate_shipping(2.5, 100)
        12.5
        >>> calculate_shipping(2.5, 100, express=True)
        25.0
    """
    if weight_kg < 0 or distance_km < 0:
        raise ValueError("Weight and distance must be non-negative.")
    base_cost = weight_kg * 0.05 * distance_km
    if express:
        base_cost *= 2
    return round(base_cost, 2)

Class docstrings

class PaymentProcessor:
    """
    Process payments through multiple payment gateways.

    Supports Stripe, PayPal, and Square. Automatically retries
    on transient failures and logs all transactions.

    Attributes:
        gateway:  The active payment gateway name.
        currency: ISO 4217 currency code (default: "USD").

    Examples:
        >>> processor = PaymentProcessor("stripe", currency="EUR")
        >>> processor.charge(amount=99.99, card_token="tok_visa")
        Transaction(id="ch_abc", status="succeeded", amount=99.99)
    """

    def __init__(self, gateway: str, currency: str = "USD") -> None:
        self.gateway  = gateway
        self.currency = currency

Module docstrings

"""
payments.py — Payment processing utilities.

This module provides a unified interface for multiple payment gateways.
All monetary values are in the smallest currency unit (cents for USD).

Typical usage:
    from payments import PaymentProcessor
    processor = PaymentProcessor("stripe")
    result = processor.charge(amount=9999, card_token=token)
"""

doctest — runnable examples in docstrings

The Examples: section above isn't just documentation — it can run as a test:

def add(a: int, b: int) -> int:
    """
    Add two integers.

    >>> add(2, 3)
    5
    >>> add(-1, 1)
    0
    >>> add(0, 0)
    0
    """
    return a + b

python -m doctest my_module.py -v   # run all docstring examples as tests
pytest --doctest-modules            # run doctests with pytest

Doctests serve as both documentation and light test coverage. Use them for simple, illustrative examples.

The Zen of Python Applied

import this

The Zen of Python has 19 aphorisms. Here are the seven that matter most in practice:

"Beautiful is better than ugly." Code is read far more than it's written. Spend time making it readable.

"Explicit is better than implicit."

# Implicit — what does True mean here?
connect(host, True)

# Explicit
connect(host, secure=True)
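Python can enforce this explicitness at the definition site: parameters after a bare `*` must be passed by keyword. A small sketch (this `connect` is a stand-in for illustration, not a real API):

```python
def connect(host: str, *, secure: bool = False, timeout: float = 30.0) -> str:
    """Parameters after the bare * can only be passed by keyword."""
    scheme = "https" if secure else "http"
    return f"{scheme}://{host} (timeout={timeout}s)"


print(connect("example.com", secure=True))  # explicit and self-documenting
# connect("example.com", True)  # TypeError: too many positional arguments
```

Callers are now forced to write `secure=True`, so the mystery boolean can never appear.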

"Simple is better than complex."

# Complex
def get_items(data):
    return list(filter(lambda x: x is not None,
                       map(lambda x: x.get("value"), data)))

# Simple
def get_items(data):
    return [item["value"] for item in data if item.get("value") is not None]

"Flat is better than nested."

# Nested — hard to follow
def process(data):
    if data:
        if isinstance(data, list):
            if len(data) > 0:
                return data[0]

# Flat — early return pattern
def process(data):
    if not data:
        return None
    if not isinstance(data, list):
        return None
    return data[0]

"Errors should never pass silently."

# Bad — swallows all exceptions
try:
    result = risky()
except:
    pass

# Good — catch specific exceptions, log them
try:
    result = risky()
except ValueError as e:
    logger.warning(f"Invalid value: {e}")
    result = default_value

"If the implementation is hard to explain, it's a bad idea." If you struggle to explain what your code does, refactor until you can.

"There should be one obvious way to do it." When choosing between several approaches, the simplest one that clearly expresses intent is usually right.

Code Review Practices

Code review is how teams catch bugs, share knowledge, and maintain standards. Whether you're reviewing others' code or preparing your own for review, these habits matter.

Writing reviewable code

  • Keep pull requests small. A PR with 50 lines is reviewed carefully. One with 500 lines is rubber-stamped.
  • One logical change per PR. Mixing a refactor with a bug fix with a new feature makes review impossible.
  • Write a clear PR description. Explain why, not just what. Link to the issue. Show before/after if the behavior changed.
  • Test everything. Reviewers shouldn't have to run your code to verify correctness.
A good PR description looks like this:

## What this does
Adds retry logic to the payment processor for transient 5xx errors.

## Why
We're seeing intermittent 503s from Stripe during peak hours.
Retrying with exponential backoff reduces payment failures by ~40%.

## Changes
- `payments.py`: Added `@retry(times=3, delay=1.0)` to `charge()`
- `test_payments.py`: Added tests for retry behavior with mocked failures

## Notes
The retry count (3) and initial delay (1.0s) are configurable via env vars.

Reviewing others' code

Ask questions rather than making demands:

# Less collaborative
"This is wrong. Use a dict instead."

# More collaborative
"Would a dict here give us O(1) lookup? The current list search
 might be slow if users_to_check grows large."

Look for:

  • Correctness (does it do what the description says?)
  • Edge cases (what happens with empty input? None? Very large input?)
  • Tests (are they testing behavior, not implementation?)
  • Readability (can you follow the logic without reading the commit message?)
  • Security (any SQL injection risk? Hardcoded credentials? Unvalidated user input?)

Git Workflow for Python Projects

.gitignore

# Python
__pycache__/
*.py[cod]
*.pyo
*.pyd
.Python
*.egg-info/
dist/
build/
*.egg

# Virtual environments
.venv/
venv/
env/

# Testing
.pytest_cache/
.coverage
htmlcov/
.mypy_cache/
.ruff_cache/

# IDE
.vscode/settings.json
.idea/
*.swp

# Secrets — NEVER commit these
.env
*.pem
*.key
secrets.json

Commit message convention

Follow the Conventional Commits format — makes changelogs automatic and history readable:

<type>(<scope>): <short description>

[optional body]

[optional footer]

Types: feat, fix, docs, style, refactor, test, chore, perf, ci

feat(auth): add JWT token refresh

Tokens now auto-refresh 5 minutes before expiry.
Fixes intermittent 401 errors for long-running sessions.

Closes #142


fix(payments): retry on transient 503 errors


docs(readme): update installation instructions for Python 3.12


test(auth): add tests for token expiry edge cases


refactor(database): replace raw SQL with SQLAlchemy ORM

Branch strategy

main          <- production-ready, tagged releases
└── develop   <- integration branch (optional)
    ├── feature/add-payment-retry
    ├── fix/auth-token-expiry
    └── chore/update-dependencies

Simple projects can skip develop and branch directly off main.

# Start a feature
git checkout -b feature/add-payment-retry

# Work, commit often
git add -p                              # stage interactively — review each change
git commit -m "feat(payments): add retry logic for 503 errors"

# Keep branch up to date
git fetch origin
git rebase origin/main

# Push and open a PR
git push -u origin feature/add-payment-retry

Use git add -p (patch mode) to review each change before staging. It forces you to think about what you're committing.
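Formatting and linting can also run automatically before every commit via the pre-commit tool. A sketch of .pre-commit-config.yaml using the official ruff hook (pin `rev` to a current release tag; the one below is only an example):

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9          # example — pin to a real release
    hooks:
      - id: ruff         # lint
      - id: ruff-format  # format
```

Install once with `pip install pre-commit` and `pre-commit install`; from then on, badly formatted code never reaches a commit.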

Environment Variables and Configuration

Never hardcode secrets, URLs, or environment-specific values in your code. Read them from environment variables:

# config.py
import os
from pathlib import Path


class Config:
    # Database
    DATABASE_URL: str = os.environ.get("DATABASE_URL", "sqlite:///dev.db")
    DATABASE_POOL_SIZE: int = int(os.environ.get("DATABASE_POOL_SIZE", "5"))

    # Auth
    SECRET_KEY: str = os.environ.get("SECRET_KEY", "")
    TOKEN_EXPIRE_MIN: int = int(os.environ.get("TOKEN_EXPIRE_MIN", "30"))

    # External APIs
    STRIPE_API_KEY: str = os.environ.get("STRIPE_API_KEY", "")
    SENDGRID_API_KEY: str = os.environ.get("SENDGRID_API_KEY", "")

    # App
    DEBUG: bool = os.environ.get("DEBUG", "false").lower() == "true"
    LOG_LEVEL: str = os.environ.get("LOG_LEVEL", "INFO")

    def validate(self) -> None:
        """Raise on startup if required secrets are missing."""
        required = {"SECRET_KEY", "STRIPE_API_KEY"}
        missing  = [k for k in required if not getattr(self, k)]
        if missing:
            raise RuntimeError(
                f"Missing required environment variables: {', '.join(missing)}"
            )


config = Config()

Use a .env file in development:

# .env — never commit this file
DATABASE_URL=postgresql://user:password@localhost/myapp
SECRET_KEY=dev-secret-key-change-in-production
STRIPE_API_KEY=sk_test_...
DEBUG=true
LOG_LEVEL=DEBUG

Load it with python-dotenv:

from dotenv import load_dotenv
load_dotenv()   # reads .env into os.environ

from config import config
config.validate()

Or with pydantic-settings (cleaner, with type coercion):

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8")

    database_url:     str   = "sqlite:///dev.db"
    secret_key:       str   = ""
    stripe_api_key:   str   = ""
    debug:            bool  = False
    log_level:        str   = "INFO"
    token_expire_min: int   = 30

settings = Settings()

Logging in Production

Good production logging is structured, levelled, and goes to the right place.

import logging
import logging.handlers
import sys
from pathlib import Path


def setup_logging(level: str = "INFO", log_file: Path | None = None) -> None:
    """Configure structured logging for production."""
    fmt = "%(asctime)s %(levelname)-8s %(name)-30s %(message)s"

    handlers: list[logging.Handler] = [
        logging.StreamHandler(sys.stdout),
    ]
    if log_file:
        handlers.append(
            logging.handlers.RotatingFileHandler(
                log_file,
                maxBytes=10 * 1024 * 1024,  # 10 MB
                backupCount=5,
            )
        )

    logging.basicConfig(
        level=getattr(logging, level.upper()),
        format=fmt,
        datefmt="%Y-%m-%d %H:%M:%S",
        handlers=handlers,
        force=True,
    )

    # Silence noisy third-party loggers
    logging.getLogger("urllib3").setLevel(logging.WARNING)
    logging.getLogger("sqlalchemy.engine").setLevel(logging.WARNING)


# Get a logger per module — never use the root logger directly
log = logging.getLogger(__name__)

# Use structured context with extra
log.info("User logged in", extra={"user_id": 42, "ip": "192.168.1.1"})
log.error("Payment failed", extra={"order_id": 99, "amount": 49.99})

For high-volume production logging, consider structlog for JSON-formatted structured logs that work well with log aggregators like Datadog, Splunk, or Elasticsearch.
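If you want JSON-formatted logs without a third-party dependency, a custom Formatter is enough. A minimal stdlib-only sketch (the payload field names and promoted context keys here are our own choices, not a standard):

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    # Context keys we promote from `extra` into the JSON payload
    CONTEXT_KEYS = ("user_id", "ip", "order_id", "amount")

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for key in self.CONTEXT_KEYS:
            if hasattr(record, key):  # attributes set by `extra`
                payload[key] = getattr(record, key)
        return json.dumps(payload)


# Demo: log into an in-memory buffer and print the resulting JSON line
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("payments")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("Payment failed", extra={"order_id": 99, "amount": 49.99})
print(buffer.getvalue().strip())
```

Each line is one parseable JSON object, which is exactly the shape log aggregators expect; structlog adds processors, context binding, and performance on top of the same idea.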

Writing Clean Code — Practical Rules

Functions should do one thing

# Bad — three responsibilities in one function
def process_user(user_id):
    user = db.query(f"SELECT * FROM users WHERE id = {user_id}")  # SQL injection!
    if user:
        user["email"] = user["email"].lower()
        send_email(user["email"], "Welcome!")
        db.query(f"UPDATE users SET processed=1 WHERE id = {user_id}")

# Good — one responsibility each
def get_user(db, user_id: int) -> dict | None:
    return db.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchone()

def normalize_user(user: dict) -> dict:
    return {**user, "email": user["email"].lower()}

def send_welcome_email(email: str) -> None:
    send_email(email, "Welcome!")

def mark_user_processed(db, user_id: int) -> None:
    db.execute("UPDATE users SET processed=1 WHERE id = ?", (user_id,))
    db.commit()

Avoid magic numbers

# Bad — what are these numbers?
if score > 0.85:
    tier = "gold"
elif score > 0.65:
    tier = "silver"
else:
    tier = "bronze"

# Good — named constants
GOLD_THRESHOLD   = 0.85
SILVER_THRESHOLD = 0.65

if score > GOLD_THRESHOLD:
    tier = "gold"
elif score > SILVER_THRESHOLD:
    tier = "silver"
else:
    tier = "bronze"
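When there are many tiers, the thresholds can become data instead of branches. A sketch using the stdlib bisect module (bisect_left reproduces the strict > comparisons above, including at the boundaries):

```python
import bisect

# Ascending thresholds; TIER_NAMES[i] covers scores up to and including TIER_THRESHOLDS[i]
TIER_THRESHOLDS = [0.65, 0.85]
TIER_NAMES = ["bronze", "silver", "gold"]


def tier_for(score: float) -> str:
    """Map a score to a tier name using a sorted threshold table."""
    return TIER_NAMES[bisect.bisect_left(TIER_THRESHOLDS, score)]


print(tier_for(0.9))   # gold
print(tier_for(0.7))   # silver
print(tier_for(0.3))   # bronze
```

Adding a tier is now a one-line data change instead of another elif branch.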

Return early, don't nest deeply

# Bad — deep nesting
def process_order(order):
    if order is not None:
        if order["status"] == "pending":
            if order["total"] > 0:
                if has_inventory(order):
                    return fulfill(order)

# Good — guard clauses
def process_order(order):
    if order is None:
        raise ValueError("Order cannot be None")
    if order["status"] != "pending":
        raise ValueError(f"Cannot process order in status {order['status']!r}")
    if order["total"] <= 0:
        raise ValueError("Order total must be positive")
    if not has_inventory(order):
        raise InsufficientInventoryError(order["id"])
    return fulfill(order)

Name things clearly

# Bad names
def f(d, n):
    r = []
    for i in d:
        if i > n:
            r.append(i)
    return r

# Good names
def filter_above_threshold(values: list[float], threshold: float) -> list[float]:
    return [v for v in values if v > threshold]

The function with good names doesn't need a comment. The code IS the documentation.

Avoid comments that explain what — write code that shows what

# Bad — comment restates the code
# Increment counter by 1
counter += 1

# Bad — comment explains code that should be clearer
# Check if user is an adult
if user.age >= 18:

# Good — comment explains WHY, not WHAT
# Stripe requires amounts in the smallest currency unit (cents)
amount_cents = int(amount_dollars * 100)

# RFC 3339 format required by the external API
timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

The Habits of Professional Python Developers

Beyond specific practices, professionals have habits:

They read error messages fully. Every line, not just the last one.

They write tests as they write code. Not after. The test is how you know you're done.

They commit small and often. A commit is a checkpoint. Frequent checkpoints mean less work lost.

They know their tools. Shell shortcuts, editor keybindings, debugger commands. Time saved per day adds up to weeks per year.

They ask "what could go wrong?" Every function that accepts input — what if it's None? Empty? Malformed? Extremely large?

They document decisions, not mechanics. "We use Redis here because PostgreSQL locks caused 200ms latency spikes under this query pattern" is valuable. "# connect to database" is noise.

They favor boring solutions. The newest, most clever approach is usually not the right one. Battle-tested libraries, simple data structures, and readable code win over novelty.

They read others' code. The Python standard library is beautifully written. So is requests, flask, pydantic. Reading excellent code is the fastest way to improve your own.

They refactor without fear. Because they have tests. Without tests, every refactor is a gamble. With tests, it's just work.

Project: A Professional Python Module

Let's put everything together in one module that demonstrates every practice from this chapter:

"""
ratelimiter.py — Thread-safe rate limiter with configurable strategies.

Provides a token bucket rate limiter for controlling how frequently
a resource can be accessed. Thread-safe for concurrent use.

Typical usage::

    limiter = RateLimiter(rate=10, per=1.0)  # 10 requests per second

    @limiter.limit
    def call_api(endpoint):
        return requests.get(endpoint)

    # Or as a context manager
    with limiter:
        call_api("/users")
"""
from __future__ import annotations

import logging
import threading
import time
from dataclasses import dataclass, field
from functools import wraps
from typing import Callable, TypeVar

log = logging.getLogger(__name__)

F = TypeVar("F", bound=Callable)

# Waits shorter than this aren't worth sleeping for
_MIN_SLEEP_SECONDS = 0.001


@dataclass
class RateLimiterStats:
    """Runtime statistics for a RateLimiter instance."""
    total_calls: int = 0
    total_waited: float = 0.0
    total_rejected: int = 0

    @property
    def avg_wait_ms(self) -> float:
        """Average wait time per call in milliseconds."""
        if self.total_calls == 0:
            return 0.0
        return (self.total_waited / self.total_calls) * 1000


class RateLimitExceeded(Exception):
    """Raised when a non-blocking rate limiter cannot acquire a token."""


class RateLimiter:
    """
    Token bucket rate limiter — thread-safe.

    Allows up to `rate` operations per `per` seconds. When the bucket
    is empty, callers either wait (blocking mode) or raise
    `RateLimitExceeded` (non-blocking mode).

    Args:
        rate:     Maximum number of operations allowed per interval.
        per:      Interval in seconds (default: 1.0 second).
        blocking: If True (default), callers wait when the bucket is empty.
                  If False, raises RateLimitExceeded immediately.

    Examples:
        >>> limiter = RateLimiter(rate=5, per=1.0)
        >>> with limiter:
        ...     print("allowed")
        allowed

        >>> @limiter.limit
        ... def my_function():
        ...     return "result"
        >>> my_function()
        'result'
    """

    def __init__(
        self,
        rate: float,
        per: float = 1.0,
        blocking: bool = True,
    ) -> None:
        if rate <= 0:
            raise ValueError(f"rate must be positive, got {rate}")
        if per <= 0:
            raise ValueError(f"per must be positive, got {per}")

        self._rate = rate
        self._per = per
        self._blocking = blocking
        self._tokens = rate  # start full
        self._last = time.monotonic()
        self._lock = threading.Lock()
        self._stats = RateLimiterStats()

        log.debug(
            "RateLimiter created: rate=%.1f per=%.1fs blocking=%s",
            rate, per, blocking,
        )

    # ── Public interface ──────────────────────────────────────────────────────

    def acquire(self) -> float:
        """
        Acquire one token, waiting if necessary (blocking mode).

        Returns:
            The number of seconds waited (0.0 if no wait was needed).

        Raises:
            RateLimitExceeded: If non-blocking and no token is available.
        """
        with self._lock:
            return self._acquire_locked()

    def limit(self, func: F) -> F:
        """Decorator: apply rate limiting to a function."""
        @wraps(func)
        def wrapper(*args, **kwargs):
            self.acquire()
            return func(*args, **kwargs)
        return wrapper  # type: ignore

    @property
    def stats(self) -> RateLimiterStats:
        """Return a snapshot of runtime statistics."""
        return self._stats

    def reset(self) -> None:
        """Reset the bucket to full capacity and clear statistics."""
        with self._lock:
            self._tokens = self._rate
            self._last   = time.monotonic()
            self._stats  = RateLimiterStats()
        log.debug("RateLimiter reset")

    # ── Context manager ───────────────────────────────────────────────────────

    def __enter__(self) -> "RateLimiter":
        self.acquire()
        return self

    def __exit__(self, *args: object) -> None:
        pass

    # ── Internal helpers ──────────────────────────────────────────────────────

    def _acquire_locked(self) -> float:
        """Must be called with self._lock held."""
        now = time.monotonic()
        elapsed = now - self._last
        self._tokens = min(
            self._rate, self._tokens + elapsed * (self._rate / self._per)
        )
        self._last = now

        if self._tokens >= 1:
            self._tokens -= 1
            self._stats.total_calls += 1
            return 0.0

        if not self._blocking:
            self._stats.total_rejected += 1
            raise RateLimitExceeded(
                f"Rate limit exceeded: {self._rate:.0f} per {self._per:.1f}s"
            )

        # Calculate wait time and sleep outside the lock
        wait = (1 - self._tokens) * (self._per / self._rate)
        self._lock.release()
        try:
            if wait > _MIN_SLEEP_SECONDS:
                time.sleep(wait)
        finally:
            self._lock.acquire()

        # Take the token; reset the clock so the sleep time isn't credited twice
        self._last = time.monotonic()
        self._tokens = max(0.0, self._tokens - 1)
        self._stats.total_calls += 1
        self._stats.total_waited += wait
        log.debug("Rate limiter waited %.3fs", wait)
        return wait

    def __repr__(self) -> str:
        return (
            f"RateLimiter(rate={self._rate}, per={self._per}, "
            f"blocking={self._blocking})"
        )


# ── Demo ──────────────────────────────────────────────────────────────────────

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")

    limiter = RateLimiter(rate=3, per=1.0)

    @limiter.limit
    def simulate_api_call(n: int) -> str:
        return f"Response #{n}"

    print("Making 9 API calls at 3/second...")
    start = time.perf_counter()
    for i in range(9):
        result = simulate_api_call(i + 1)
        print(f"  {result} at {time.perf_counter() - start:.2f}s")

    s = limiter.stats
    print(f"\nStats: {s.total_calls} calls, "
          f"avg wait {s.avg_wait_ms:.1f}ms, "
          f"{s.total_rejected} rejected")

This module demonstrates every professional practice from this chapter:

  • Google-style docstrings on the module, class, and every public method
  • Named constants (_MIN_SLEEP_SECONDS)
  • Single-responsibility methods
  • Guard clauses with clear error messages
  • Type hints throughout
  • A module-level logger with lazy %-style formatting (log.debug)
  • A custom exception class (RateLimitExceeded)
  • __repr__ for easy debugging
  • A stats property for introspection
  • Context manager protocol (__enter__/__exit__)
  • Decorator support (.limit)
  • Thread safety with a Lock
  • Early returns where appropriate

What You Learned in This Chapter

  • PEP 8 defines naming (snake_case, PascalCase, UPPER_SNAKE_CASE), spacing, and line length. Let ruff format enforce it automatically.
  • Docstrings document what a function does, its args, return value, exceptions, and examples. Use Google style. Make examples runnable with doctest.
  • The Zen of Python in practice: explicit over implicit, simple over complex, flat over nested, errors never silent, one obvious way.
  • Code review: small PRs, one logical change, clear descriptions. Ask questions rather than make demands when reviewing.
  • .gitignore excludes __pycache__, virtual environments, .env, dist, and coverage files.
  • Conventional Commits (feat:, fix:, docs:, refactor:) make history readable and changelogs automatic.
  • Environment variables for all secrets and environment-specific config. Use pydantic-settings for typed, validated settings.
  • Production logging: levelled, named loggers per module, RotatingFileHandler, silence noisy third-party loggers.
  • Clean code rules: one responsibility per function, named constants instead of magic numbers, early returns instead of deep nesting, names that explain intent.
  • Professional habits: read error messages fully, write tests while writing code, commit small and often, document decisions not mechanics, favor boring solutions, refactor fearlessly with test coverage.

What's Next?

Chapter 36 covers Advanced Python Internals — how Python actually works under the hood: the CPython interpreter, bytecode, the GIL in depth, memory management, reference counting, and the descriptor protocol. Understanding internals makes you a more effective Python developer — you'll know why things work the way they do, not just that they do.

© 2026 Abhilash Sahoo. Python: Zero to Hero.