Chapter 14: Comprehensions — Elegant, Fast Code
You've been writing loops to build lists and dictionaries. They work — but Python has a more elegant way that experienced programmers reach for constantly: comprehensions.
A comprehension lets you create a collection in a single, readable line. What takes four lines of loop code becomes one line that reads almost like English. Comprehensions are shorter, usually a little faster, and, when written well, clearer than the equivalent loop.
By the end of this chapter you'll use list comprehensions, dictionary comprehensions, set comprehensions, and generator expressions fluently — and you'll know when to use each one.
List Comprehensions — Deep Dive
You've seen the basic form. Let's go deep.
The pattern
[expression for item in iterable]
Read it as: "Give me expression for each item in iterable."
# The loop way
squares = []
for n in range(1, 6):
    squares.append(n ** 2)
print(squares) # [1, 4, 9, 16, 25]
# The comprehension way
squares = [n ** 2 for n in range(1, 6)]
print(squares) # [1, 4, 9, 16, 25]
Same result. Half the lines. And once you're used to reading comprehensions, the single line is actually clearer — because the intent is stated in one place.
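The speed claim is easy to check for yourself. The sketch below is an illustration, not a rigorous benchmark: it uses the standard-library timeit module to time both versions. The function names build_with_loop and build_with_comprehension are made up for this example, and the numbers you see will depend on your machine and Python version.

```python
import timeit

def build_with_loop(limit):
    """Build a list of squares with an explicit loop."""
    squares = []
    for n in range(limit):
        squares.append(n ** 2)
    return squares

def build_with_comprehension(limit):
    """Build the same list with a comprehension."""
    return [n ** 2 for n in range(limit)]

# Both approaches must produce identical results
assert build_with_loop(1000) == build_with_comprehension(1000)

# Time 200 runs of each; results vary by machine
loop_time = timeit.timeit(lambda: build_with_loop(1000), number=200)
comp_time = timeit.timeit(lambda: build_with_comprehension(1000), number=200)
print(f"loop:          {loop_time:.4f} s")
print(f"comprehension: {comp_time:.4f} s")
```

On many CPython versions the comprehension comes out somewhat ahead, but treat the difference as a bonus. Clarity is the better reason to reach for one.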
The expression can be anything
names = ["alice", "bob", "carlos", "diana"]
# Capitalize each name
capitalized = [name.title() for name in names]
print(capitalized) # ['Alice', 'Bob', 'Carlos', 'Diana']
# Length of each name
lengths = [len(name) for name in names]
print(lengths) # [5, 3, 6, 5]
# Tuple of (name, length) for each
pairs = [(name.title(), len(name)) for name in names]
print(pairs) # [('Alice', 5), ('Bob', 3), ('Carlos', 6), ('Diana', 5)]
Filtering with a condition
Add an if clause at the end to filter which items get included:
[expression for item in iterable if condition]
numbers = [1, -3, 7, -2, 4, -8, 5, 0]
# Only positive numbers
positives = [n for n in numbers if n > 0]
print(positives) # [1, 7, 4, 5]
# Squares of even numbers
even_squares = [n ** 2 for n in range(1, 11) if n % 2 == 0]
print(even_squares) # [4, 16, 36, 64, 100]
# Words longer than 4 characters
words = ["cat", "elephant", "ant", "hippopotamus", "bee", "giraffe"]
long_words = [w for w in words if len(w) > 4]
print(long_words) # ['elephant', 'hippopotamus', 'giraffe']
if/else inside the expression (not the filter)
You can also use if/else inside the expression to transform values conditionally. Notice the position — this goes before for, not after.
[value_if_true if condition else value_if_false for item in iterable]
numbers = [1, -3, 7, -2, 4, -8, 5, 0]
# Replace negatives with 0
clamped = [n if n > 0 else 0 for n in numbers]
print(clamped) # [1, 0, 7, 0, 4, 0, 5, 0]
# Label each number
labels = ["positive" if n > 0 else "negative" if n < 0 else "zero"
          for n in numbers]
print(labels)
# ['positive', 'negative', 'positive', 'negative', 'positive', 'negative', 'positive', 'zero']
This is the inline conditional (also called a ternary expression). The rule:
- if condition at the end = filter (exclude items)
- if/else before for = transform (include all items, change some)
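The two positions can also be combined in one comprehension. The short sketch below, using the same numbers list as above, shows a filter alone, a transform alone, and then both together; the variable names are chosen just for this illustration.

```python
numbers = [1, -3, 7, -2, 4, -8, 5, 0]

# Filter only: drop zero, keep everything else unchanged
non_zero = [n for n in numbers if n != 0]
print(non_zero)  # [1, -3, 7, -2, 4, -8, 5]

# Transform only: keep every item, replace negatives with their absolute value
magnitudes = [-n if n < 0 else n for n in numbers]
print(magnitudes)  # [1, 3, 7, 2, 4, 8, 5, 0]

# Both at once: drop zero (filter), then label what is left (transform)
signs = ["+" if n > 0 else "-" for n in numbers if n != 0]
print(signs)  # ['+', '-', '+', '-', '+', '-', '+']
```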
Nested comprehensions
You can nest loops inside a comprehension — useful for flattening or combining sequences.
# Flatten a nested list
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [n for row in matrix for n in row]
print(flat) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
Read the nested comprehension left to right, same as nested loops: "for each row in matrix, for each n in row, give me n."
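To make the reading order concrete, here is the same flattening written both ways. It is a small sketch for comparison only; flat_loop and flat_comp are names invented for this example.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Nested loops: outer loop first, inner loop second
flat_loop = []
for row in matrix:
    for n in row:
        flat_loop.append(n)

# Comprehension: the for clauses appear in the same order as the loops
flat_comp = [n for row in matrix for n in row]

print(flat_loop == flat_comp)  # True
```

The order matters: the outer loop's for clause has to come first, because the inner clause uses the variable that the outer one defines.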
# All pairs (i, j) where i != j, from 1 to 4
pairs = [(i, j) for i in range(1, 5) for j in range(1, 5) if i != j]
print(pairs)
# [(1, 2), (1, 3), (1, 4), (2, 1), (2, 3), ...]
# Transpose a matrix (rows become columns)
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
transposed = [[row[i] for row in matrix] for i in range(3)]
print(transposed)
# [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
A word of caution: nested comprehensions get hard to read quickly. If it takes more than a moment to understand, use nested loops instead. Readability beats cleverness.
Your turn: Given a list of sentences (strings), write a comprehension that produces a list of all words across all sentences that are longer than 3 characters. Use a nested comprehension: one loop over the sentences, one over the words in each sentence.
Dictionary Comprehensions
The pattern:
{key_expr: value_expr for item in iterable if condition}
# Word -> length mapping
words = ["apple", "banana", "cherry", "date"]
word_lengths = {word: len(word) for word in words}
print(word_lengths)
# {'apple': 5, 'banana': 6, 'cherry': 6, 'date': 4}
# Squares dictionary
squares = {n: n**2 for n in range(1, 11)}
print(squares)
# {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100}
# Invert a dictionary
original = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in original.items()}
print(inverted) # {1: 'a', 2: 'b', 3: 'c'}
# Filter: only passing grades
scores = {"Alice": 88, "Bob": 45, "Carlos": 92, "Diana": 38, "Eve": 76}
passing = {name: score for name, score in scores.items() if score >= 50}
print(passing) # {'Alice': 88, 'Carlos': 92, 'Eve': 76}
# Transform values
prices_usd = {"apple": 1.20, "banana": 0.50, "cherry": 3.00}
exchange_rate = 0.92
prices_eur = {item: round(price * exchange_rate, 2)
              for item, price in prices_usd.items()}
print(prices_eur) # {'apple': 1.1, 'banana': 0.46, 'cherry': 2.76}
# From two parallel lists
names = ["Alice", "Bob", "Carlos"]
grades = [88, 72, 95]
grade_book = {name: grade for name, grade in zip(names, grades)}
print(grade_book) # {'Alice': 88, 'Bob': 72, 'Carlos': 95}
zip() pairs up two iterables element by element — here it pairs each name with its grade. It's one of the most useful built-in functions for comprehensions.
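One behavior worth knowing: when the inputs have different lengths, zip() stops at the shortest one without any warning. The sketch below shows that, along with itertools.zip_longest from the standard library as one possible alternative. The shortened grades list is made up for the illustration.

```python
from itertools import zip_longest

names = ["Alice", "Bob", "Carlos"]
grades = [88, 72]  # one grade is missing (hypothetical data)

# zip() stops at the shortest input, so Carlos is silently dropped
grade_book = {name: grade for name, grade in zip(names, grades)}
print(grade_book)  # {'Alice': 88, 'Bob': 72}

# zip_longest() pads the shorter input with a fill value instead
full_book = {name: grade
             for name, grade in zip_longest(names, grades, fillvalue=None)}
print(full_book)  # {'Alice': 88, 'Bob': 72, 'Carlos': None}
```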
Your turn: You have a list of file names: ["report.pdf", "data.csv", "image.jpg", "notes.txt", "backup.csv"]. Create a dictionary where each filename maps to its extension (without the dot). Use a comprehension.
Set Comprehensions
The pattern:
{expression for item in iterable if condition}
It looks exactly like a dict comprehension but has no : — just a single expression.
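Because sets and dicts share the curly braces, there is one small trap: empty braces always mean an empty dict. A quick sketch, with variable names invented for the example:

```python
empty_braces = {}
print(type(empty_braces).__name__)  # dict

empty_set = set()
print(type(empty_set).__name__)  # set

# With contents, the colon decides which one you get
as_set = {n % 3 for n in range(10)}
as_dict = {n: n % 3 for n in range(4)}
print(type(as_set).__name__, type(as_dict).__name__)  # set dict
```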
# Unique word lengths
words = ["cat", "elephant", "ant", "hippopotamus", "bee", "giraffe", "dog"]
unique_lengths = {len(w) for w in words}
print(unique_lengths) # {8, 3, 12, 7} (order not guaranteed)
# Unique first letters
names = ["Alice", "Bob", "Anna", "Carlos", "Beth", "Alex"]
first_letters = {name[0] for name in names}
print(first_letters) # {'A', 'B', 'C'}
# Squares of odd numbers as a set
odd_squares = {n**2 for n in range(1, 11) if n % 2 != 0}
print(odd_squares) # {1, 9, 25, 49, 81}
Use a set comprehension when you want unique values and don't care about order. It's like set(list_comprehension) but cleaner.
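As a quick check of that equivalence, the sketch below builds the same set both ways and then sorts it, which is one simple option when you do need a predictable order for display. The names via_list and direct are just for this example.

```python
words = ["cat", "elephant", "ant", "hippopotamus", "bee", "giraffe", "dog"]

# Two-step version: build a full list, then convert it
via_list = set([len(w) for w in words])

# Set comprehension: builds the set directly
direct = {len(w) for w in words}

print(via_list == direct)  # True

# Sets are unordered; sort when you need a predictable order
print(sorted(direct))  # [3, 7, 8, 12]
```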
Generator Expressions — Lazy Evaluation
A generator expression looks like a list comprehension but with parentheses instead of brackets. The crucial difference: it doesn't create the entire list in memory. Instead it produces values one at a time, on demand.
# List comprehension — creates the entire list in memory NOW
squares_list = [n**2 for n in range(1, 1_000_001)] # 1 million items in RAM
# Generator expression — creates nothing yet; produces values as needed
squares_gen = (n**2 for n in range(1, 1_000_001)) # barely any memory used
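You can get a rough feel for the difference with sys.getsizeof from the standard library. This is only a sketch: getsizeof measures the container object itself and not the integers it refers to, and the exact byte counts vary between Python versions, so no specific numbers are shown here.

```python
import sys

squares_list = [n ** 2 for n in range(100_000)]
squares_gen = (n ** 2 for n in range(100_000))

# getsizeof reports the container itself, not the integers it refers to
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)

print(f"list:      {list_bytes:,} bytes")
print(f"generator: {gen_bytes:,} bytes")
print(gen_bytes < list_bytes)  # True
```

The generator stays the same small size no matter how large the range is, because it only stores where it is in the sequence.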
You iterate over a generator with for, next(), or by passing it to a function:
gen = (n**2 for n in range(1, 6))
print(next(gen)) # 1
print(next(gen)) # 4
print(next(gen)) # 9
# generator remembers where it left off
# Sum without building the list
total = sum(n**2 for n in range(1, 1_000_001))
print(total) # 333333833333500000
Notice that sum(n**2 for n in range(...)) needs no second pair of parentheses: when a generator expression is the only argument to a function, Python accepts it directly.
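The shortcut applies only when the generator expression is the sole argument. As soon as the call takes anything else, the generator needs its own parentheses, as this small sketch shows (the values are chosen just for illustration):

```python
# Sole argument: the function call's parentheses are enough
total = sum(n ** 2 for n in range(1, 6))
print(total)  # 55

# With a second argument, the generator needs its own parentheses
total_from_100 = sum((n ** 2 for n in range(1, 6)), 100)
print(total_from_100)  # 155

# Same rule for keyword arguments such as max()'s default
largest = max((n for n in [] if n > 0), default=0)
print(largest)  # 0
```

Leaving out the inner parentheses in the last two calls is a SyntaxError.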
# Find the first even square over 50
gen = (n**2 for n in range(1, 100))
result = next(x for x in gen if x > 50 and x % 2 == 0)
print(result) # 64
When to use generators vs list comprehensions
| Situation | Use |
|---|---|
| You need to iterate once and move on | Generator expression |
| You'll iterate multiple times | List comprehension |
| You'll index into the result (result[3]) | List comprehension |
| You need len() | List comprehension |
| Processing millions of items | Generator expression |
| Passing to sum(), max(), min(), any(), all() | Generator expression |
The golden rule: if you only need to go through the values once, use a generator. If you need to use them multiple times or by index, use a list.
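The reason behind the rule is that a generator is used up after one pass. The sketch below shows what happens on a second pass, and what a list offers in return; the names are reused from earlier examples purely for illustration.

```python
squares_gen = (n ** 2 for n in range(1, 6))
print(list(squares_gen))  # [1, 4, 9, 16, 25]
print(list(squares_gen))  # [] because the generator is already used up

squares_list = [n ** 2 for n in range(1, 6)]
print(sum(squares_list))  # 55
print(squares_list[3])    # 16
print(len(squares_list))  # 5
```

Calling len() or indexing on the generator would raise a TypeError, since it has no stored items to count or look up.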
# any() and all() with generators — short-circuit evaluation
numbers = [2, 4, 6, 7, 8, 10]
# any() stops as soon as it finds one True
has_odd = any(n % 2 != 0 for n in numbers)
print(has_odd) # True (stops at 7)
# all() stops as soon as it finds one False
all_even = all(n % 2 == 0 for n in numbers)
print(all_even) # False (stops at 7)
any() and all() with generator expressions are incredibly efficient — they stop the moment the answer is clear, without processing the rest.
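To see the early stop happen, you can record which items actually get tested. This is a sketch for demonstration only: is_odd and the checked list are hypothetical helpers, and recording calls like this is not something you would do in real code.

```python
checked = []

def is_odd(n):
    """Record each number that gets tested, then test it."""
    checked.append(n)
    return n % 2 != 0

numbers = [2, 4, 6, 7, 8, 10]

# Generator expression: any() pulls values one at a time and stops at 7
has_odd = any(is_odd(n) for n in numbers)
print(has_odd)   # True
print(checked)   # [2, 4, 6, 7]

# List comprehension: the whole list is built before any() sees it
checked.clear()
has_odd = any([is_odd(n) for n in numbers])
print(checked)   # [2, 4, 6, 7, 8, 10]
```

With square brackets inside the call, every item is tested even though the answer was known at 7.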
Comprehensions vs Loops — When to Use Which
Comprehensions are not always the right tool. Here's how to decide.
Use a comprehension when:
- You're creating a new list/dict/set by transforming or filtering an existing iterable.
- The logic fits comfortably on one or two lines.
- The expression is clear without needing a comment.
Use a loop when:
- You have complex multi-step logic per item.
- You need break or continue.
- The result involves more than one collection being updated.
- You're doing something for its side effects (printing, writing to a file) — not building a value.
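As an illustration of the "more than one collection" case, here is a loop that sorts names into two lists in a single pass. The scores data is borrowed from the dictionary examples earlier in the chapter; the passed and failed names are invented for this sketch.

```python
scores = {"Alice": 88, "Bob": 45, "Carlos": 92, "Diana": 38, "Eve": 76}

passed = []
failed = []
for name, score in scores.items():
    # One pass updates two collections, which is awkward in a comprehension
    if score >= 50:
        passed.append(name)
    else:
        failed.append(name)

print(passed)  # ['Alice', 'Carlos', 'Eve']
print(failed)  # ['Bob', 'Diana']
```

Two separate comprehensions would also work here, at the cost of walking the data twice.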
# Good use of comprehension — simple transformation
emails = [user["email"].lower() for user in users if user["active"]]
# Bad use — too complex, hard to read
results = [
    (item["id"], item["price"] * (1 - item.get("discount", 0)) * tax_rate)
    for item in inventory
    if item["in_stock"] and item["price"] * (1 - item.get("discount", 0)) > 10
]
# Better as a loop — readable, debuggable
results = []
for item in inventory:
    if not item["in_stock"]:
        continue
    discount = item.get("discount", 0)
    price = item["price"] * (1 - discount)
    if price > 10:
        results.append((item["id"], price * tax_rate))
The second comprehension is technically valid but takes real effort to parse. The loop is longer but instantly understandable. Choose readability.
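There is also a middle ground: move the repeated arithmetic into a small named function and keep the comprehension. The sketch below is one way to do it, not the only one. The inventory data, the tax_rate value, and the helper name discounted_price are all assumptions made up for this example, since the chapter does not define them.

```python
# Hypothetical sample data; the chapter's inventory and tax_rate are not shown
inventory = [
    {"id": 1, "price": 20.0, "discount": 0.25, "in_stock": True},
    {"id": 2, "price": 8.0, "in_stock": True},
    {"id": 3, "price": 50.0, "in_stock": False},
    {"id": 4, "price": 12.0, "in_stock": True},
]
tax_rate = 1.25  # assumed multiplier, chosen to keep the arithmetic exact

def discounted_price(item):
    """Return the item's price after its optional discount."""
    return item["price"] * (1 - item.get("discount", 0))

results = [
    (item["id"], discounted_price(item) * tax_rate)
    for item in inventory
    if item["in_stock"] and discounted_price(item) > 10
]
print(results)  # [(1, 18.75), (4, 15.0)]
```

The helper is still called twice for each in-stock item, so the loop remains the better choice if that calculation is expensive or if you want to step through it in a debugger.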
zip(), enumerate(), and map() in Comprehensions
These built-ins pair perfectly with comprehensions.
zip() — Pair up two or more iterables
names = ["Alice", "Bob", "Carlos"]
scores = [88, 72, 95]
cities = ["London", "Paris", "Madrid"]
# Three parallel lists -> list of tuples
combined = [(name, score, city) for name, score, city in zip(names, scores, cities)]
print(combined)
# [('Alice', 88, 'London'), ('Bob', 72, 'Paris'), ('Carlos', 95, 'Madrid')]
# Directly into a dict
grade_book = {name: score for name, score in zip(names, scores)}
print(grade_book) # {'Alice': 88, 'Bob': 72, 'Carlos': 95}
enumerate() — Index + value
fruits = ["apple", "banana", "cherry"]
indexed = {i: fruit for i, fruit in enumerate(fruits, start=1)}
print(indexed) # {1: 'apple', 2: 'banana', 3: 'cherry'}
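enumerate() is also handy with a filter, when you want the positions of matching items instead of the items themselves. A short sketch, with a slightly longer fruits list made up for the example:

```python
fruits = ["apple", "banana", "cherry", "avocado"]

# Positions (counting from 0) of every fruit that starts with "a"
a_positions = [i for i, fruit in enumerate(fruits) if fruit.startswith("a")]
print(a_positions)  # [0, 3]

# Numbered labels, counting from 1
labels = [f"{i}. {fruit}" for i, fruit in enumerate(fruits, start=1)]
print(labels)  # ['1. apple', '2. banana', '3. cherry', '4. avocado']
```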
map() — Apply a function to every item
words = ["hello", "world", "python"]
upper = list(map(str.upper, words))
print(upper) # ['HELLO', 'WORLD', 'PYTHON']
# Equivalent comprehension (usually preferred for readability)
upper = [w.upper() for w in words]
map() returns an iterator (like a generator). Wrap it in list() to get a list. List comprehensions are usually clearer — but map() is worth knowing because you'll see it in real code.
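Since the result of map() is an iterator, it can be consumed only once, just like a generator expression. The sketch below shows that, plus map() walking two iterables in parallel, next to the comprehension you would more likely write instead. The numbers are arbitrary illustration values.

```python
words = ["hello", "world", "python"]

upper = map(str.upper, words)
print(next(upper))   # HELLO
print(list(upper))   # ['WORLD', 'PYTHON'] (the rest of the iterator)
print(list(upper))   # [] (nothing left)

# map() can also walk several iterables in parallel
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))
print(sums)  # [11, 22, 33]

# The comprehension equivalent uses zip()
sums = [a + b for a, b in zip([1, 2, 3], [10, 20, 30])]
print(sums)  # [11, 22, 33]
```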
Putting It All Together: A Data Pipeline
Here's a realistic data processing pipeline using comprehensions throughout:
# Raw data — imagine this came from a CSV or API
raw_students = [
    {"name": " Alice ", "grade": "88", "city": "London", "active": True},
    {"name": "Bob", "grade": "45", "city": "Paris", "active": True},
    {"name": "Carlos", "grade": "92", "city": "Madrid", "active": False},
    {"name": " Diana ", "grade": "N/A", "city": "Tokyo", "active": True},
    {"name": "Eve", "grade": "76", "city": "Sydney", "active": True},
]
def parse_grade(value):
    """Parse a grade string to float, return None if invalid."""
    try:
        return float(value)
    except ValueError:
        return None
# Step 1: Clean and parse — only active students with valid grades
students = [
    {
        "name": s["name"].strip().title(),
        "grade": parse_grade(s["grade"]),
        "city": s["city"],
    }
    for s in raw_students
    if s["active"] and parse_grade(s["grade"]) is not None
]
print("Active students with valid grades:")
for s in students:
    print(f" {s['name']}: {s['grade']}")
# Step 2: Build a grade lookup dictionary
grade_lookup = {s["name"]: s["grade"] for s in students}
# Step 3: Classify grades
classifications = {
    s["name"]: "Pass" if s["grade"] >= 50 else "Fail"
    for s in students
}
# Step 4: Get unique cities
cities = {s["city"] for s in students}
# Step 5: Calculate stats (a list comprehension plus a generator expression)
grades = [s["grade"] for s in students]
average = sum(grades) / len(grades)
passing_count = sum(1 for g in grades if g >= 50)
print(f"\nGrade lookup: {grade_lookup}")
print(f"Classifications: {classifications}")
print(f"Cities represented: {cities}")
print(f"Average grade: {average:.1f}")
print(f"Passing: {passing_count}/{len(students)}")
Output:
Active students with valid grades:
Alice: 88.0
Bob: 45.0
Eve: 76.0
Grade lookup: {'Alice': 88.0, 'Bob': 45.0, 'Eve': 76.0}
Classifications: {'Alice': 'Pass', 'Bob': 'Fail', 'Eve': 'Pass'}
Cities represented: {'London', 'Paris', 'Sydney'}
Average grade: 69.7
Passing: 2/3
List comprehension, dict comprehension, set comprehension, and generator expression — all in a twenty-five line pipeline that cleans, filters, transforms, and summarizes real messy data.
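One detail in Step 1: parse_grade() runs twice for each active student, once in the filter and once in the expression. That is harmless here, but if parsing were expensive you could chain a generator expression into the list comprehension so each grade is parsed only once. The following is a sketch of that variation on a trimmed-down copy of the data; the two-stage split and the name parsed are not part of the pipeline above.

```python
raw_students = [
    {"name": " Alice ", "grade": "88", "active": True},
    {"name": "Bob", "grade": "45", "active": True},
    {"name": " Diana ", "grade": "N/A", "active": True},
]

def parse_grade(value):
    """Parse a grade string to float, return None if invalid."""
    try:
        return float(value)
    except ValueError:
        return None

# Stage 1 (lazy): pair each active student with a grade parsed exactly once
parsed = ((s, parse_grade(s["grade"])) for s in raw_students if s["active"])

# Stage 2: keep only the pairs whose grade parsed successfully
students = [
    {"name": s["name"].strip().title(), "grade": grade}
    for s, grade in parsed
    if grade is not None
]
print(students)
# [{'name': 'Alice', 'grade': 88.0}, {'name': 'Bob', 'grade': 45.0}]
```

Because the first stage is a generator, no intermediate list is built; each pair flows straight into the second stage.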
Quick Reference
# List comprehension
result = [expr for item in iterable]
result = [expr for item in iterable if condition]
result = [a if cond else b for item in iterable]
# Dict comprehension
result = {key: value for item in iterable}
result = {key: value for item in iterable if condition}
result = {k: v for k, v in some_dict.items()}
# Set comprehension
result = {expr for item in iterable}
result = {expr for item in iterable if condition}
# Generator expression
result = (expr for item in iterable)
result = (expr for item in iterable if condition)
# Useful combinations
sum(expr for item in iterable)
max(expr for item in iterable)
any(condition for item in iterable)
all(condition for item in iterable)
list(map(func, iterable))
{k: v for k, v in zip(keys, values)}
What You Learned in This Chapter
- A list comprehension creates a list: [expr for item in iterable if condition].
- Filter with if condition at the end; transform conditionally with if/else before for.
- Nested comprehensions handle nested loops, but readability has limits.
- A dictionary comprehension creates a dict: {key: value for item in iterable}.
- A set comprehension creates a set: {expr for item in iterable}.
- A generator expression uses () and produces values lazily, so it never holds the whole collection in memory.
- Use generators with sum(), any(), all(), max(), min(); any() and all() also stop early once the answer is clear.
- zip() pairs up iterables; perfect for building dicts from parallel lists.
- Comprehensions are for building values. Loops are for side effects and complex logic.
- If the comprehension takes a moment to parse, use a loop.
What's Next?
Your code is becoming more expressive and efficient. Now it's time to learn how to organize it.
In Chapter 15 you'll learn about modules and packages — how Python's import system works, how to use the standard library, how to install third-party packages with pip, and how to organize your own code into modules. This is the foundation for writing programs that scale beyond a single file.
Your turn: You have a list of dictionaries, each representing a product: {"name": "...", "price": ..., "category": "..."}. Using only comprehensions:
- Build a list of product names where price > 20.
- Build a dict mapping name -> price for all products.
- Build a set of all unique categories.
- Calculate the average price using a generator expression.
- Find the most expensive product name using max() with a generator.