Chapter 10: Strings in Depth

You've been using strings since Chapter 1. Every name, message, and label in your programs has been a string. But so far you've only scratched the surface.

Strings in Python are extraordinarily powerful. Python gives you dozens of built-in methods to search, split, transform, format, and validate text — and you'll use them constantly. Almost every real program deals with text: reading user input, processing files, building messages, parsing data. Master strings and a huge part of Python becomes easy.

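This chapter covers the string tools you'll reach for most often: indexing and slicing, the core methods, formatting, content checks, multiline and raw strings, and Unicode.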

Strings Are Sequences

A string is not just a blob of text — it's a sequence of characters. That means everything you learned about sequences in Chapter 6 (lists) applies here too: indexing, slicing, and looping.

name = "Python"

print(name[0])    # P
print(name[1])    # y
print(name[-1])   # n  (last character)
print(name[-2])   # o  (second to last)
print(len(name))  # 6

Slicing strings

language = "Python 3.12"

print(language[0:6])    # Python
print(language[7:])     # 3.12
print(language[:6])     # Python
print(language[::2])    # Pto .2  (every other character)
print(language[::-1])   # 21.3 nohtyP  (reversed)
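
A slice has the general form text[start:stop:step], and the character at stop is not included. As a small sketch (the variable names are only illustrative), note that an out-of-range slice is quietly clamped to the string's bounds, while an out-of-range index raises IndexError:

```python
language = "Python 3.12"

# Slices clamp to the string's bounds instead of failing.
clamped = language[7:100]
print(clamped)            # 3.12

# A single index past the end raises IndexError.
try:
    language[100]
    index_failed = False
except IndexError:
    index_failed = True
print(index_failed)       # True

# Negative positions work in slices too: the last four characters.
version = language[-4:]
print(version)            # 3.12
```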

Looping over a string

for char in "Hello":
    print(char)

Output:

H
e
l
l
o

Strings are immutable

Unlike lists, you cannot change a character in a string by assigning to an index. Strings are immutable — once created, they cannot be modified.

name = "Python"
name[0] = "J"   # TypeError: 'str' object does not support item assignment

To "change" a string, you create a new one:

name = "Python"
name = "J" + name[1:]
print(name)   # Jython

Your turn: Take the string "Hello, World!". Using slicing, extract and print just "World".

String Methods

A method is a function that belongs to an object. You call it with a dot: string.method(). String methods never modify the original string. They return a new value instead, usually a new string.
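
A quick sketch to confirm this: after calling a method, the original variable still holds the old text unless you reassign it. The variable names are illustrative.

```python
greeting = "hello"

shouted = greeting.upper()   # builds a new string
print(shouted)               # HELLO
print(greeting)              # hello  (unchanged)
unchanged = greeting

# To keep the result under the same name, reassign it.
greeting = greeting.upper()
print(greeting)              # HELLO
```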

Case methods

message = "Hello, World!"

print(message.upper())      # HELLO, WORLD!
print(message.lower())      # hello, world!
print(message.title())      # Hello, World!
print(message.capitalize()) # Hello, world!
print(message.swapcase())   # hELLO, wORLD!

Real use: normalizing user input before comparison.

answer = input("Continue? (yes/no): ").strip().lower()

if answer == "yes":
    print("Continuing...")
elif answer == "no":
    print("Stopping.")
else:
    print("Please type yes or no.")

.lower() means "Yes", "YES", "yes" all match "yes". Without it, the user has to type the exact case you expect.

Stripping whitespace

text = "   hello world   "

print(text.strip())    # "hello world"   (both ends)
print(text.lstrip())   # "hello world   " (left only)
print(text.rstrip())   # "   hello world"  (right only)

Always .strip() user input. People accidentally add spaces all the time.

You can also strip specific characters:

url = "https://example.com/"
print(url.strip("/"))   # https://example.com
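
One thing to watch: the argument to strip(), lstrip(), and rstrip() is a set of characters to remove, not a prefix or suffix. The sketch below shows the surprise this can cause. It then uses removesuffix() and removeprefix(), which assume Python 3.9 or later. The filename is made up for illustration.

```python
filename = "access.csv"

# rstrip(".csv") removes any trailing '.', 'c', 's', or 'v' characters,
# so it keeps eating into the name itself.
stripped = filename.rstrip(".csv")
print(stripped)     # acce

# removesuffix() removes the exact suffix, once (Python 3.9+).
base = filename.removesuffix(".csv")
print(base)         # access

# removeprefix() is the matching tool for the start of a string.
host = "https://example.com".removeprefix("https://")
print(host)         # example.com
```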

`split()` — Break a string into a list

sentence = "Python is easy to learn"
words = sentence.split()
print(words)   # ['Python', 'is', 'easy', 'to', 'learn']

Without arguments, split() splits on any whitespace. Pass a delimiter to split on something specific:

csv_line = "Alice,88,London"
parts = csv_line.split(",")
print(parts)   # ['Alice', '88', 'London']

date = "2026-03-09"
year, month, day = date.split("-")
print(year, month, day)   # 2026 03 09

split() with unpacking is one of the most useful patterns in Python for parsing text data.
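
The two forms of split() treat repeated separators differently, which matters when you parse messy data. A small sketch with made-up input; the second argument shown at the end is maxsplit, which limits the number of splits:

```python
# No argument: runs of whitespace count as one separator,
# and leading/trailing whitespace is ignored.
spaced = "  one   two  three ".split()
print(spaced)      # ['one', 'two', 'three']

# Explicit delimiter: every delimiter splits, so empty fields appear.
fields = "Alice,,London".split(",")
print(fields)      # ['Alice', '', 'London']

# A maxsplit of 1 splits only at the first delimiter.
key, value = "title=Strings=in=Depth".split("=", 1)
print(key)         # title
print(value)       # Strings=in=Depth
```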

`join()` — Combine a list into a string

join() is the opposite of split(). It takes a list of strings and joins them with a separator.

words = ["Python", "is", "fun"]
sentence = " ".join(words)
print(sentence)   # Python is fun

items = ["bread", "milk", "eggs"]
print(", ".join(items))   # bread, milk, eggs
print(" | ".join(items))  # bread | milk | eggs
print("\n".join(items))   # each item on its own line

The pattern to remember: separator goes before .join(), list goes inside the parentheses.
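
join() only accepts strings. If the list holds numbers, calling join() directly raises a TypeError, so convert each item first. A minimal sketch with illustrative data:

```python
scores = [88, 92, 75]

# Joining non-strings fails with TypeError.
try:
    ", ".join(scores)
    join_failed = False
except TypeError:
    join_failed = True
print(join_failed)   # True

# Convert each item to a string first.
line = ", ".join(str(score) for score in scores)
print(line)          # 88, 92, 75
```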

`replace()` — Swap text

text = "I love Java. Java is great."
new_text = text.replace("Java", "Python")
print(new_text)   # I love Python. Python is great.

Replace only the first occurrence:

print(text.replace("Java", "Python", 1))   # I love Python. Java is great.

`find()` and `index()` — Search for a substring

find() returns the index of the first occurrence of a substring. Returns -1 if not found.

email = "alice@example.com"
at_position = email.find("@")
print(at_position)   # 5

index() does the same but raises a ValueError if not found (instead of returning -1). Use find() when "not found" is a normal situation; use index() when not finding it would be a bug.

text = "Hello, World!"
print(text.find("World"))    # 7
print(text.find("Python"))   # -1  (not found)
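
For comparison, here is a sketch of both styles side by side. The helper name domain_of is hypothetical, chosen only for this example:

```python
def domain_of(email):
    """Return the part after '@', or None if there is no '@'."""
    at_position = email.find("@")
    if at_position == -1:       # "not found" is an expected case here
        return None
    return email[at_position + 1:]

print(domain_of("alice@example.com"))   # example.com
print(domain_of("not-an-email"))        # None

# index() raises ValueError instead of returning -1.
try:
    "Hello, World!".index("Python")
    found = True
except ValueError:
    found = False
print(found)                            # False
```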

`startswith()` and `endswith()`

filename = "report_2026.csv"

print(filename.startswith("report"))   # True
print(filename.endswith(".csv"))       # True
print(filename.endswith(".pdf"))       # False

Pass a tuple to check multiple options at once:

print(filename.endswith((".csv", ".xlsx", ".txt")))   # True

Real use: filtering files by extension.

files = ["data.csv", "notes.txt", "image.png", "sales.csv", "readme.md"]
csv_files = [f for f in files if f.endswith(".csv")]
print(csv_files)   # ['data.csv', 'sales.csv']

`count()` — How many times does a substring appear?

text = "banana"
print(text.count("a"))    # 3
print(text.count("an"))   # 2
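
count() counts non-overlapping occurrences, scanning from left to right, and it is case-sensitive. A short sketch with illustrative strings:

```python
# "aa" matches at positions 0 and 2; the overlapping match at 1 is skipped.
pairs = "aaaa".count("aa")
print(pairs)       # 2

# Normalize the case first if you want a case-insensitive count.
text = "Python, python, PYTHON"
exact = text.count("python")
any_case = text.lower().count("python")
print(exact)       # 1
print(any_case)    # 3
```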

`in` — Check if a substring exists

sentence = "The quick brown fox"

print("fox" in sentence)      # True
print("cat" in sentence)      # False
print("quick" in sentence)    # True

This is often cleaner than find() when you just need a yes/no answer:

email = input("Enter your email: ")

if "@" not in email or "." not in email:
    print("That doesn't look like a valid email.")

Your turn: Take this string: " the Quick Brown FOX jumps OVER the lazy DOG ". In one chain of method calls, strip the whitespace, convert to title case, and print the result.

String Formatting

Python gives you three ways to format strings. You met f-strings in Chapter 2 — now let's see all three and understand why f-strings are the right choice for almost everything.

1. `%` formatting (old style — avoid in new code)

name = "Alice"
age = 30
print("My name is %s and I am %d years old." % (name, age))

%s for strings, %d for integers, %f for floats. You'll see this in older Python code. Don't write it in new code.

2. `.format()` (better — still common)

name = "Alice"
age = 30
print("My name is {} and I am {} years old.".format(name, age))

# Named placeholders
print("My name is {name} and I am {age} years old.".format(name=name, age=age))

More readable than %, but more verbose than f-strings.

3. F-strings (best — use these)

name = "Alice"
age = 30
print(f"My name is {name} and I am {age} years old.")

F-strings are:

Faster: the string is parsed once when your code is compiled, with no .format() call at runtime
Cleaner: the expression goes right in the string
More powerful: any Python expression works inside {}

price = 49.99
quantity = 3

print(f"Total: ${price * quantity:.2f}")   # Total: $149.97

The :.2f inside the {} formats the number to 2 decimal places.
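
To illustrate how much can go inside the braces, the sketch below uses a method call, a conditional expression, and a function call. The = specifier on the last lines assumes Python 3.8 or later. The data is made up.

```python
name = "alice"
scores = [88, 92, 75]

greeting = f"Hello, {name.title()}!"
print(greeting)     # Hello, Alice!

average = sum(scores) / len(scores)
summary = f"Average: {average:.1f} ({'pass' if average >= 60 else 'fail'})"
print(summary)      # Average: 85.0 (pass)

count_line = f"{len(scores)} scores recorded"
print(count_line)   # 3 scores recorded

# {name=} shows the expression text followed by its repr (Python 3.8+).
debug = f"{name=}"
print(debug)        # name='alice'
```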

F-string formatting cheat sheet

pi = 3.14159265

print(f"{pi:.2f}")       # 3.14     (2 decimal places)
print(f"{pi:.4f}")       # 3.1416   (4 decimal places)
print(f"{pi:10.2f}")     # '      3.14'  (width 10, right-aligned)
print(f"{pi:<10.2f}")    # '3.14      '  (left-aligned)
print(f"{pi:^10.2f}")    # '   3.14   '  (centered)

n = 1000000
print(f"{n:,}")          # 1,000,000  (thousands separator)
print(f"{n:_}")          # 1_000_000  (underscore separator)

pct = 0.87654
print(f"{pct:.1%}")      # 87.7%   (percentage)

# Padding and alignment with strings
name = "Alice"
print(f"|{name:10}|")    # |Alice     |  (left-aligned, width 10)
print(f"|{name:>10}|")   # |     Alice|  (right-aligned)
print(f"|{name:^10}|")   # |  Alice   |  (centered)
print(f"|{name:*^10}|")  # |**Alice***|  (centered, fill with *)
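
The width and precision can themselves come from variables by nesting braces inside the format spec. A minimal sketch with illustrative names and values:

```python
width = 12
precision = 3
value = 2.718281828

# Right-align in `width` columns with `precision` decimal places.
cell = f"|{value:>{width}.{precision}f}|"
print(cell)        # |       2.718|

# The same idea pads labels of different lengths to a common width.
labels = ["x", "total", "mean"]
longest = max(len(label) for label in labels)
rows = [f"{label:<{longest}} :" for label in labels]
print(rows)        # ['x     :', 'total :', 'mean  :']
```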

Your turn: Create a simple receipt printer. You have three items with prices. Print a formatted receipt where item names are left-aligned in a 20-character column and prices are right-aligned in an 8-character column, showing 2 decimal places. Add a total at the bottom.

Checking String Content

Python has a set of methods that check what kind of characters a string contains. They all return True or False.

print("12345".isdigit())     # True  — all digits
print("12.5".isdigit())      # False — period is not a digit
print("hello".isalpha())     # True  — all letters
print("hello1".isalpha())    # False — has a digit
print("hello1".isalnum())    # True  — letters and digits only
print("   ".isspace())       # True  — all whitespace
print("Hello World".istitle()) # True — title case
print("HELLO".isupper())     # True  — all uppercase
print("hello".islower())     # True  — all lowercase

Real use: validating user input.

def get_positive_integer(prompt):
    """Keep asking until the user enters a positive integer."""
    while True:
        value = input(prompt).strip()
        if value.isdigit() and int(value) > 0:
            return int(value)
        print("Please enter a whole number greater than zero.")

age = get_positive_integer("Enter your age: ")
print(f"You are {age} years old.")

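isdigit() is a clean way to validate simple numeric input without using try/except. It has limits, though: it rejects negatives and decimals, and it accepts a few characters, such as the superscript "²", that int() cannot convert. try/except handles those cases better, and you'll pick the right tool for each situation.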
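
As a sketch of the try/except alternative mentioned above, the hypothetical helper parse_positive_int (not part of this chapter's code) returns the number, or None when the text is not a positive whole number:

```python
def parse_positive_int(text):
    """Return text as a positive int, or None if it is not one."""
    try:
        number = int(text.strip())
    except ValueError:          # not a whole number at all
        return None
    return number if number > 0 else None

print(parse_positive_int(" 42 "))   # 42
print(parse_positive_int("-5"))     # None
print(parse_positive_int("12.5"))   # None
print(parse_positive_int("abc"))    # None
print(parse_positive_int("0"))      # None
```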

Multiline Strings and Raw Strings

You saw these in Chapter 2. Here's the full picture.

Triple-quoted strings

poem = """
Roses are red,
Violets are blue,
Python is great,
And so are you.
"""
print(poem)

Triple quotes preserve all whitespace and newlines exactly as written. Useful for long messages, email templates, SQL queries, and HTML snippets embedded in Python.
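
"Exactly as written" includes the newline right after the opening quotes and any indentation inside the string. The sketch below shows this, along with one way to tidy it using strip() and textwrap.dedent() from the standard library. The function name build_message and the message text are illustrative.

```python
import textwrap

poem = """
Roses are red,
Violets are blue.
"""

# The string starts and ends with a newline character.
print(repr(poem))           # '\nRoses are red,\nViolets are blue.\n'
print(repr(poem.strip()))   # 'Roses are red,\nViolets are blue.'

def build_message():
    # Indented to match the code, so each line carries leading spaces.
    text = """
        Dear customer,
        Your order has shipped.
    """
    # dedent() removes the common leading whitespace from every line.
    return textwrap.dedent(text).strip()

message = build_message()
print(message)
```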

Raw strings

A raw string treats backslashes as literal characters — no escape processing.

# Normal string — \n is a newline, \t is a tab
print("First line\nSecond line")

# Raw string — \n is literally backslash-n
print(r"First line\nSecond line")

Output:

First line
Second line
First line\nSecond line

Essential for Windows file paths and regular expressions:

# Windows path — without raw string you'd need "C:\\Users\\Alice\\file.txt"
path = r"C:\Users\Alice\Documents\file.txt"

# Regular expression pattern — backslashes are common
import re
pattern = r"\d{3}-\d{4}"   # matches phone number format like 555-1234
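
To show that pattern in use, here is a small sketch with re.findall(); the sample text is made up. It also shows why the r prefix matters for paths: in a normal string, a sequence such as \t becomes a single tab character.

```python
import re

pattern = r"\d{3}-\d{4}"
text = "Call 555-1234 or 555-9876 after 5pm."
numbers = re.findall(pattern, text)
print(numbers)            # ['555-1234', '555-9876']

# In a normal string, \t is an escape sequence (one tab character).
normal = "C:\temp"
raw = r"C:\temp"
print(len(normal))        # 6
print(len(raw))           # 7
```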

Unicode and Encoding

Every character in Python is a Unicode character. Unicode is a standard that assigns a unique number to every character in every writing system on earth — Latin letters, Arabic, Chinese, emoji, everything.

print("Hello")         # English
print("こんにちは")     # Japanese
print("Привет")        # Russian
print("مرحبا")         # Arabic
print("🐍")            # Snake emoji

All of these work in Python 3 strings.

`ord()` and `chr()`

ord() gives you the Unicode code point (number) of a character. chr() gives you the character for a code point.

print(ord("A"))    # 65
print(ord("a"))    # 97
print(ord("🐍"))   # 128013

print(chr(65))     # A
print(chr(9829))   # ♥
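
Because the code points for a to z are consecutive, ord() and chr() can do simple character arithmetic. A sketch using a hypothetical helper, shift_letter, which exists only for this illustration and handles lowercase ASCII letters only:

```python
def shift_letter(letter, amount):
    """Shift a lowercase ASCII letter forward, wrapping from z back to a."""
    offset = ord(letter) - ord("a")          # 0 for 'a', 25 for 'z'
    return chr(ord("a") + (offset + amount) % 26)

print(shift_letter("a", 1))    # b
print(shift_letter("z", 1))    # a

# Build the first five letters from their code points.
alphabet_start = [chr(code) for code in range(ord("a"), ord("a") + 5)]
print(alphabet_start)          # ['a', 'b', 'c', 'd', 'e']
```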

Encoding

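When Python writes strings to a file or sends them over a network, it must convert them to bytes. The most common encoding is UTF-8, and it's the default for encode() and decode(). Be aware that open() is different: unless you pass encoding=, it may use your platform's default encoding, which on Windows is often not UTF-8.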

text = "Hello, 世界"
encoded = text.encode("utf-8")
print(encoded)   # b'Hello, \xe4\xb8\x96\xe7\x95\x8c'

decoded = encoded.decode("utf-8")
print(decoded)   # Hello, 世界

In practice, you almost never need to think about this. Python handles it for you. The one time it matters: if you're opening a file and see strange characters, specify the encoding explicitly:

with open("data.txt", "r", encoding="utf-8") as file:
    content = file.read()

If you're working with older files from Windows, they may have been saved as latin-1 or cp1252, and you'll need to pass that encoding instead (for example, encoding="cp1252"). For new files, UTF-8 is almost always the right choice.
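
The "strange characters" usually come from decoding bytes with a different encoding than the one used to write them. A small sketch of that mismatch, and of how character count and byte count differ; the sample word is illustrative:

```python
word = "caf\u00e9"            # "café", written with an escape for the é
data = word.encode("utf-8")

# One character can take more than one byte in UTF-8.
print(len(word))    # 4
print(len(data))    # 5

# Decoding with the wrong encoding produces garbled text.
garbled = data.decode("latin-1")
print(garbled)      # cafÃ©

# Decoding with the encoding that was used to write restores the text.
restored = data.decode("utf-8")
print(restored)     # café
```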

String Methods Reference

Here's a quick reference of the main methods covered in this chapter:

Method	What it does	Example
`.upper()`	All uppercase	`"hi".upper()` -> `"HI"`
`.lower()`	All lowercase	`"HI".lower()` -> `"hi"`
`.title()`	Title Case	`"hello world".title()` -> `"Hello World"`
`.capitalize()`	First character uppercase, the rest lowercase	`"hello".capitalize()` -> `"Hello"`
`.strip()`	Remove leading/trailing whitespace	`" hi ".strip()` -> `"hi"`
`.lstrip()`	Remove leading whitespace	`" hi ".lstrip()` -> `"hi "`
`.rstrip()`	Remove trailing whitespace	`" hi ".rstrip()` -> `" hi"`
`.split(sep)`	Split into list	`"a,b,c".split(",")` -> `["a","b","c"]`
`.join(list)`	Join list into string	`",".join(["a","b"])` -> `"a,b"`
`.replace(old, new)`	Replace substring	`"hi".replace("h","H")` -> `"Hi"`
`.find(sub)`	First index of substring, -1 if not found	`"hello".find("l")` -> `2`
`.index(sub)`	First index, raises ValueError if not found	`"hello".index("l")` -> `2`
`.startswith(prefix)`	Starts with?	`"hello".startswith("he")` -> `True`
`.endswith(suffix)`	Ends with?	`"hello".endswith("lo")` -> `True`
`.count(sub)`	Count occurrences	`"banana".count("a")` -> `3`
`.isdigit()`	All digits?	`"123".isdigit()` -> `True`
`.isalpha()`	All letters?	`"abc".isalpha()` -> `True`
`.isalnum()`	Letters and digits only?	`"abc1".isalnum()` -> `True`
`.isspace()`	All whitespace?	`" ".isspace()` -> `True`
`.isupper()`	All uppercase?	`"HI".isupper()` -> `True`
`.islower()`	All lowercase?	`"hi".islower()` -> `True`

Putting It All Together: A Text Analyzer

def analyze_text(text):
    """Print a detailed analysis of a text string."""
    words = text.split()
    sentences = text.replace("!", ".").replace("?", ".").split(".")
    sentences = [s.strip() for s in sentences if s.strip()]

    print("--- Text Analysis ---")
    print(f"Characters (with spaces):    {len(text)}")
    print(f"Characters (no spaces):      {len(text.replace(' ', ''))}")
    print(f"Words:                       {len(words)}")
    print(f"Sentences:                   {len(sentences)}")
    print(f"Unique words:                {len(set(w.lower().strip('.,!?') for w in words))}")
    print(f"Average word length:         {sum(len(w) for w in words) / len(words):.1f}")

    most_common = max(set(words), key=words.count)
    print(f"Most common word:            '{most_common}' ({words.count(most_common)}x)")

    print(f"\nFirst 50 chars:  '{text[:50]}'")
    print(f"Uppercase:       '{text.upper()[:50]}'")

sample = (
    "Python is a great language. Python is easy to learn. "
    "Many people use Python for data science and web development. "
    "Python is growing fast!"
)

analyze_text(sample)

Output:

--- Text Analysis ---
Characters (with spaces):    137
Characters (no spaces):      114
Words:                       24
Sentences:                   4
Unique words:                19
Average word length:         4.8
Most common word:            'Python' (4x)

First 50 chars:  'Python is a great language. Python is easy to lear'
Uppercase:       'PYTHON IS A GREAT LANGUAGE. PYTHON IS EASY TO LEAR'

Many of the techniques from this chapter, working together on a real task.

What You Learned in This Chapter

Strings are immutable sequences — you can index, slice, and loop over them, but not change them in place.
Key methods: upper(), lower(), title(), strip(), split(), join(), replace(), find(), startswith(), endswith(), count()
Content checks: isdigit(), isalpha(), isalnum(), isspace(), isupper(), islower()
Three formatting styles: % (old), .format() (fine), f-strings (best — use these)
F-string format spec: :.2f (floats), :, (thousands), :.1% (percent), :<, :>, :^ (alignment)
Triple-quoted strings preserve newlines and whitespace.
Raw strings (r"") treat backslashes literally — essential for paths and regex.
Python strings are Unicode. ord() and chr() convert between characters and code points.
Specify encoding="utf-8" when opening files to avoid encoding issues.

What's Next?

Strings are sequences of characters. But Python has two more collection types you haven't met yet, and they behave quite differently.

In Chapter 11 you'll learn about tuples (like lists, but unchangeable) and sets (unordered collections with no duplicates). They sound simple, but knowing when to use each one — and why — is what separates programmers who just write code from programmers who write good code.

Your turn: Write a function called clean_name(name) that takes a raw name string and returns a cleaned version: stripped of whitespace, converted to title case, and with any double spaces replaced with single spaces. Test it with inputs like " alice smith ", "BOB JONES", and " carlos ".
