Chapter 51: Git and Version Control for Python Developers
Nearly every professional developer uses Git, and most use it every day, on every project.
Git tracks changes to your code over time. It lets you return to any previous version, work on several features at once without them interfering with each other, collaborate with a team, and understand exactly what changed, when, and why.
This chapter teaches you Git the way Python developers actually use it.
Why Git Matters
Imagine you spend a week adding a new feature. It breaks everything. Without Git, you're stuck. With Git, you type one command and your code is back to where it was — instantly.
Or imagine working with three other developers on the same codebase. Without Git, you'd email files back and forth and constantly overwrite each other's work. With Git, everyone works on their own branch, and Git merges the changes together.
Git is non-negotiable. Let's learn it.
Setup
# Check if Git is installed
git --version
# Install on Windows: https://git-scm.com
# Install on Mac (with Homebrew): brew install git
# Install on Debian/Ubuntu Linux: sudo apt install git
# Configure your identity (required before first commit)
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
# Set VS Code as your default editor
git config --global core.editor "code --wait"
# Make 'main' the default branch name (modern standard)
git config --global init.defaultBranch main
The Core Concepts
Working Directory  ->  Staging Area  ->  Repository
  (your files)          (git add)        (git commit)
- Working directory — the files on your disk
- Staging area — a holding area where you prepare what goes into the next commit
- Repository — the permanent history of all commits
Think of it like packing a box (staging area) before sealing and labelling it (committing).
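The three areas can be modelled in a few lines of Python. This is only a toy sketch: the `ToyRepo` class and its attribute names are invented for illustration, and Git stores its data quite differently. What it shows is that a commit records what was staged, not what is currently on disk.

```python
class ToyRepo:
    """Toy model of Git's three areas. Illustration only, not Git's real storage."""

    def __init__(self):
        self.working = {}   # working directory: filename -> current content
        self.staging = {}   # staging area: content captured by add()
        self.history = []   # repository: list of (message, snapshot) commits

    def add(self, filename):
        # Like `git add`: copy the file's current content into the staging area.
        self.staging[filename] = self.working[filename]

    def commit(self, message):
        # Like `git commit`: record a snapshot of the staging area.
        self.history.append((message, dict(self.staging)))


repo = ToyRepo()
repo.working["app.py"] = "print('v1')"
repo.add("app.py")
repo.working["app.py"] = "print('v2')"   # edited again after staging
repo.commit("feat: add app")

message, snapshot = repo.history[-1]
print(snapshot["app.py"])   # the staged version, not the later edit
```

The later edit stays in the working directory until it is staged and committed in turn.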
Starting a Repository
# Create a new project
mkdir my-project
cd my-project
git init # creates a hidden .git/ folder
# OR clone an existing repository from GitHub
git clone https://github.com/username/repo-name.git
cd repo-name
The Basic Workflow
# 1. See what's changed
git status
# 2. Stage changes (prepare for commit)
git add filename.py # add one file
git add src/ # add a directory
git add . # add everything (use carefully)
git add -p # interactive: stage chunks, not whole files
# 3. Commit the staged changes
git commit -m "feat: add user authentication"
# 4. See the history
git log # full log
git log --oneline # compact, one line per commit
git log --oneline --graph # visual branch graph
.gitignore — What NOT to Track
Some files should never go into Git: virtual environments, compiled files, secrets, logs.
Create a .gitignore in your project root:
# Python
__pycache__/
*.py[cod]
*.pyo
*.pyd
.Python
*.egg
*.egg-info/
dist/
build/
.eggs/
# Virtual environments
.venv/
venv/
env/
ENV/
# Testing
.pytest_cache/
.coverage
htmlcov/
.tox/
# Type checking
.mypy_cache/
.dmypy.json
# IDEs
.vscode/settings.json
.idea/
*.swp
*.swo
# Secrets and environment files
.env
.env.local
*.key
secrets.json
# macOS
.DS_Store
# Database files
*.db
*.sqlite3
# Logs
*.log
logs/
# Build artifacts
*.whl
*.tar.gz
GitHub provides a standard Python .gitignore at github.com/github/gitignore.
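To get a feel for how these patterns behave, here is a rough Python approximation. The function name `roughly_ignored` is hypothetical, and the matching is deliberately simplified: it uses `fnmatch` on each path component and ignores Git's negation (`!`), anchoring (`/`), and `**` rules.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A few patterns taken from the .gitignore above.
PATTERNS = ["__pycache__/", "*.py[cod]", ".venv/", ".env", "*.log"]


def roughly_ignored(path, patterns=PATTERNS):
    """Approximate check: does any pattern match a component of `path`?

    Simplification: real Git also handles negation (!), anchoring (/),
    and ** wildcards, none of which are modelled here.
    """
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: match any parent directory name.
            if any(fnmatch(part, pattern[:-1]) for part in parts[:-1]):
                return True
        elif any(fnmatch(part, pattern) for part in parts):
            # File-or-directory pattern: match any path component.
            return True
    return False


for candidate in ["src/main.py", "src/__pycache__/main.cpython-312.pyc", ".env"]:
    print(candidate, roughly_ignored(candidate))
```

For the authoritative answer on a real repository, `git check-ignore -v <path>` reports which rule, if any, matches a path.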
Viewing Changes
# What changed in the working directory (not yet staged)?
git diff
# What changed in the staging area (staged but not committed)?
git diff --staged
# What changed in a specific file?
git diff filename.py
# See a specific commit
git show abc1234
# See what a file looked like in a previous commit
git show HEAD~3:src/module.py
# Compare two commits
git diff abc1234 def5678
Undoing Mistakes
# Unstage a file (keep the changes in working dir)
git restore --staged filename.py
# Discard changes in working directory (PERMANENT — cannot undo)
git restore filename.py
# Undo the last commit but keep the changes staged
git reset --soft HEAD~1
# Undo the last commit and unstage the changes (they stay in your working directory)
git reset HEAD~1
# Create a new commit that reverses a past commit (safe for shared history)
git revert abc1234
# Stash changes temporarily (save without committing)
git stash
git stash pop # restore the stashed changes
git stash list # see all stashes
git stash drop stash@{0} # delete a stash
The golden rule: never rewrite history that has been pushed to a shared repository. Use git revert instead of git reset when working with others.
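The difference behind the golden rule can be pictured with plain Python lists. This is a toy model, not Git itself: the function names and commit descriptions are made up, and each commit is represented by just its message.

```python
def reset_last(commits):
    """Model of `git reset HEAD~1`: the newest commit drops out of the branch."""
    return commits[:-1]


def revert_last(commits):
    """Model of `git revert HEAD`: history is kept; an undoing commit is added."""
    return commits + [f'Revert "{commits[-1]}"']


shared = ["add login", "add search", "break search"]

print(reset_last(shared))    # history rewritten: teammates still hold 3 commits
print(revert_last(shared))   # history extended: teammates pull 1 new commit
```

After a reset, your branch no longer agrees with what teammates already pulled. After a revert, their history is a prefix of yours, so a normal pull brings them up to date.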
Branches — Work in Isolation
Branches let you work on a feature without touching the main codebase.
# See all branches
git branch # local branches
git branch -a # local and remote
# Create a new branch
git branch feature/user-auth
# Switch to it
git checkout feature/user-auth
# Or create AND switch in one command (modern way, Git 2.23+; replaces the two steps above)
git switch -c feature/user-auth
# Go back to main
git switch main
# Merge a branch into main
git switch main
git merge feature/user-auth
# Delete a branch after merging
git branch -d feature/user-auth
# Force delete (unmerged branch)
git branch -D feature/user-auth
Merge Conflicts — How to Resolve Them
When two branches change the same line, Git can't merge automatically. You get a conflict:
<<<<<<< HEAD
def greet(name):
    return f"Hello, {name}!"
=======
def greet(name: str) -> str:
    return f"Hi there, {name}!"
>>>>>>> feature/greetings
The section between <<<<<<< and ======= is the version on your current branch. The section between ======= and >>>>>>> is the version from the incoming branch.
Resolve it by editing the file to the correct final version:
def greet(name: str) -> str:
    return f"Hello, {name}!"
Then:
git add filename.py
git commit -m "resolve: merge conflict in greet function"
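Before staging a resolved file, it can help to confirm that no markers were left behind. The following is a minimal sketch with a hypothetical helper name, `find_conflict_markers`. It assumes markers always start at the beginning of a line, and it can report false positives in files that legitimately contain a line starting with "=======".

```python
def find_conflict_markers(text):
    """Return (line_number, marker) pairs for leftover conflict markers."""
    markers = ("<<<<<<<", "=======", ">>>>>>>")
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        for marker in markers:
            if line.startswith(marker):
                found.append((number, marker))
    return found


conflicted = '''<<<<<<< HEAD
def greet(name):
    return f"Hello, {name}!"
=======
def greet(name: str) -> str:
    return f"Hi there, {name}!"
>>>>>>> feature/greetings
'''

resolved = '''def greet(name: str) -> str:
    return f"Hello, {name}!"
'''

print(find_conflict_markers(conflicted))
print(find_conflict_markers(resolved))
```

An empty list for the resolved text means the file is ready for `git add`.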
Working with GitHub
Push your code
# Connect your local repo to a GitHub remote
git remote add origin https://github.com/username/repo-name.git
# Push and set up tracking
git push -u origin main
# After that, just:
git push
Pull changes from GitHub
# Fetch + merge in one command
git pull
# Fetch only (don't merge yet)
git fetch origin
git diff origin/main # see what's different
git merge origin/main # merge when ready
The fetch/pull difference
git fetch downloads changes but doesn't apply them. git pull downloads and applies. When in doubt, fetch first, check the diff, then merge.
The Professional Git Workflow
Real teams use a consistent branching strategy. Here's a common one:
main                          <- stable, production code only
└── develop                   <- integration branch (optional for larger teams)
    ├── feature/add-search    <- new features
    ├── fix/login-bug         <- bug fixes
    └── hotfix/security-patch <- urgent production fixes
The day-to-day workflow:
# 1. Always start from an up-to-date main
git switch main
git pull
# 2. Create a feature branch
git switch -c feature/add-search
# 3. Work, commit often
git add -p
git commit -m "feat: add basic search endpoint"
git add -p
git commit -m "test: add tests for search endpoint"
# 4. Push your branch
git push -u origin feature/add-search
# 5. Open a Pull Request on GitHub
# 6. Code review
# 7. Merge to main (on GitHub)
# 8. Delete your branch
git switch main
git pull
git branch -d feature/add-search
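Teams often enforce the branch naming shown in the diagram with a small check. Here is one possible sketch. The prefixes come from the diagram above, but the exact rule (lowercase words joined by hyphens) and the name `is_valid_branch_name` are assumptions rather than any standard.

```python
import re

# Prefixes taken from the branch diagram above; the exact rule is an assumption.
BRANCH_PATTERN = re.compile(r"^(feature|fix|hotfix)/[a-z0-9]+(-[a-z0-9]+)*$")


def is_valid_branch_name(name):
    """Return True for long-lived branches and names like 'feature/add-search'."""
    return name in ("main", "develop") or bool(BRANCH_PATTERN.match(name))


for name in ["feature/add-search", "fix/login-bug", "AddSearch", "main"]:
    print(name, is_valid_branch_name(name))
```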
Conventional Commits
Use a consistent format for commit messages. This makes the history readable and enables automated changelog generation:
<type>(<scope>): <short description>

[optional body]

[optional footer]
Types:
| Type | When to use |
|---|---|
| feat | A new feature |
| fix | A bug fix |
| docs | Documentation only |
| test | Adding or changing tests |
| refactor | Code change that is neither a feature nor a fix |
| perf | Performance improvement |
| chore | Build process, dependency updates |
| ci | CI configuration |
Examples:
feat(auth): add JWT token refresh endpoint
fix(search): handle empty query string correctly
docs: update README with installation instructions
test(products): add tests for price filter
refactor(cleaner): extract price parsing to separate function
perf(db): add index on products.created_at column
chore: upgrade dependencies to latest versions
Good commit messages:
- Describe what and why, not how (the code shows how)
- Use imperative mood: "add" not "added", "fix" not "fixes"
- Keep the first line under 72 characters
- Separate subject from body with a blank line
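Because the format is regular, it can be checked mechanically. Below is a minimal sketch of a subject-line checker. The function name `check_subject` and the allowed scope characters are assumptions, and it covers only the types listed in the table above, not the whole Conventional Commits specification.

```python
import re

TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore", "ci")

# <type>(<scope>): <short description>, with the scope optional.
SUBJECT_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"
    r"(\((?P<scope>[a-z0-9_-]+)\))?"
    r": (?P<description>\S.*)$"
)


def check_subject(subject, max_length=72):
    """Return a list of problems found in a commit subject line."""
    problems = []
    if len(subject) >= max_length:
        problems.append(f"subject should be under {max_length} characters")
    if not SUBJECT_PATTERN.match(subject):
        problems.append("subject should look like 'type(scope): description'")
    return problems


print(check_subject("feat(auth): add JWT token refresh endpoint"))
print(check_subject("added stuff"))
```

An empty list means the subject passed both checks. A function like this could be called from a commit-msg hook, though wiring that up is outside the scope of this sketch.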
Useful Git Commands for Python Developers
# Find which commit introduced a bug (binary search)
git bisect start
git bisect bad # current commit is broken
git bisect good v1.0.0 # this version was good
# Git checks out middle commits — you test and mark good/bad
git bisect good # or: git bisect bad
git bisect reset # when done
# See who changed a line (blame)
git blame src/module.py
git blame -L 10,20 src/module.py # only lines 10-20
# Search commit messages
git log --grep="fix"
git log --grep="authentication" --oneline
# Search code changes across all history
git log -S "def authenticate" --oneline # commits that added/removed this string
# See all files changed in a commit
git show --stat abc1234
# Cherry-pick a single commit from another branch
git cherry-pick abc1234
# Rebase (replay your commits on top of another branch)
git rebase main
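The binary search behind the `git bisect` commands above can be sketched in Python. This is an illustration of the idea, not Git's implementation. It assumes a linear history where the first commit is good, the last is bad, and every commit after the first bad one is also bad. The names `first_bad_commit` and `is_bad` are hypothetical.

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        middle = (good + bad) // 2
        steps += 1
        if is_bad(commits[middle]):
            bad = middle     # like `git bisect bad`
        else:
            good = middle    # like `git bisect good`
    return commits[bad], steps


# Hypothetical history of 100 commits where the bug appeared in commit 63.
commits = list(range(1, 101))
culprit, steps = first_bad_commit(commits, lambda commit: commit >= 63)
print(f"first bad commit: {culprit}, found after {steps} tests")
```

Each test halves the remaining range, so a history of 100 commits needs at most about seven tests rather than up to a hundred.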
git add -p — Stage Like a Pro
This is one of the most underused and most useful Git commands. It shows you each change and lets you decide whether to stage it:
git add -p
For each change, you'll see a diff and a prompt. Options:
- y: yes, stage this chunk
- n: no, skip it
- s: split into smaller chunks
- e: edit the chunk manually
- q: quit
This lets you make many changes at once but commit them as separate, logical units — keeping your history clean.
Git Tags — Mark Releases
# Create a lightweight tag
git tag v1.0.0
# Create an annotated tag (recommended for releases)
git tag -a v1.0.0 -m "Release version 1.0.0"
# List all tags
git tag
# Push tags to GitHub
git push origin --tags
# Push a specific tag
git push origin v1.0.0
# Delete a tag locally
git tag -d v1.0.0
# Delete a remote tag
git push origin --delete v1.0.0
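When a script needs to work with release tags, note that plain string sorting puts v1.10.0 before v1.9.0. A small sketch of numeric ordering follows. It assumes every tag has the form vMAJOR.MINOR.PATCH as in the examples above, and the tag names and the `version_key` helper are made up for illustration.

```python
def version_key(tag):
    """Turn 'v1.10.0' into (1, 10, 0) so tags sort numerically."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


tags = ["v1.9.0", "v1.10.0", "v1.2.3"]   # hypothetical tag names

print(sorted(tags))                    # plain string order
print(sorted(tags, key=version_key))   # release order
```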
A Clean Python Project on GitHub
Here's what a well-configured Python project on GitHub looks like:
my-project/
├── .github/
│   ├── workflows/
│   │   ├── ci.yml            <- test on every push
│   │   └── publish.yml       <- publish on version tags
│   └── PULL_REQUEST_TEMPLATE.md
├── src/
│   └── mypackage/
├── tests/
├── docs/
├── .gitignore                <- Python gitignore
├── pyproject.toml            <- package config + tool config
├── README.md
└── LICENSE
PULL_REQUEST_TEMPLATE.md — fills the PR description automatically:
## What does this PR do?
## Type of change
- [ ] Bug fix
- [ ] New feature
- [ ] Refactor
- [ ] Documentation
## Testing
- [ ] I added tests for this change
- [ ] All existing tests pass
## Checklist
- [ ] My code follows the project style guide
- [ ] I've updated the documentation if needed
What You Learned in This Chapter
- Git tracks changes to your code. git init starts a repo. git clone copies one.
- The three areas: working directory -> staging area (git add) -> repository (git commit).
- .gitignore keeps generated files, secrets, and virtual environments out of the repo.
- git diff shows unstaged changes, and git diff --staged shows staged ones. git log --oneline shows history compactly.
- git restore --staged unstages. git restore discards working-dir changes. git reset --soft HEAD~1 undoes the last commit.
- Branches isolate work. git switch -c feature/name creates and switches. git merge combines. Conflicts are resolved by editing the marked sections.
- git push -u origin main pushes your branch to GitHub and sets up tracking. git pull fetches and merges remote changes.
- Conventional commits (feat:, fix:, docs:, test:) make history readable and automatable.
- git add -p stages changes chunk by chunk, keeping commits logical and focused.
- git tag -a v1.0.0 marks a release. Push tags with git push origin --tags.
What's Next?
Chapter 52 covers Environment Management and Project Structure — virtual environments in depth, choosing between pip, poetry, and uv, professional project layouts, and managing secrets safely.