Chapter 36: Advanced Python Internals
Most developers use Python for years without understanding how it works internally. That's fine — the abstraction holds. But understanding what happens under the hood makes you a better Python developer. You'll know why some things are fast and others slow, why the GIL exists, how memory is managed, and why certain patterns work the way they do.
This chapter goes deep. Take it slowly.
How Python Runs Your Code
When you run python script.py, several things happen:
1. Lexing — the source text is broken into tokens
2. Parsing — tokens are turned into an Abstract Syntax Tree (AST)
3. Compiling — the AST is compiled to bytecode (.pyc files)
4. Executing — the CPython interpreter executes the bytecode
You can inspect every stage:
# Stage 1: Tokens
import tokenize
import io
source = "x = 1 + 2"
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
for tok in tokens:
    print(tok)
# TokenInfo(type=1 (NAME), string='x', ...)
# TokenInfo(type=54 (OP), string='=', ...)
# TokenInfo(type=2 (NUMBER), string='1', ...)
# TokenInfo(type=54 (OP), string='+', ...)
# TokenInfo(type=2 (NUMBER), string='2', ...)
# Stage 2: Abstract Syntax Tree
import ast
source = "x = 1 + 2"
tree = ast.parse(source)
print(ast.dump(tree, indent=2))
Output (simplified):
Module(
  body=[
    Assign(
      targets=[Name(id='x')],
      value=BinOp(
        left=Constant(value=1),
        op=Add(),
        right=Constant(value=2)
      )
    )
  ]
)
# Stage 3: Bytecode
import dis
def add(a, b):
    return a + b
dis.dis(add)
Output:
  2           0 RESUME                   0
  3           2 LOAD_FAST                0 (a)
              4 LOAD_FAST                1 (b)
              6 BINARY_OP                0 (+)
             10 RETURN_VALUE
Each line is a bytecode instruction — a simple operation the Python virtual machine executes. LOAD_FAST pushes a local variable onto the stack. BINARY_OP pops two values, adds them, pushes the result. RETURN_VALUE pops the top of the stack and returns it.
Python is a stack machine — instructions operate on a stack of values.
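You can see the stack-machine model directly in the numbers: dis.stack_effect reports the net change each instruction makes to the stack depth.

```python
import dis

# dis.stack_effect(opcode, oparg) -> net change in stack depth.
# POP_TOP removes one value; LOAD_CONST (which takes an oparg) pushes one.
print(dis.stack_effect(dis.opmap["POP_TOP"]))        # -1
print(dis.stack_effect(dis.opmap["LOAD_CONST"], 0))  # 1
```

This is the same bookkeeping the compiler uses to compute co_stacksize for a code object.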
Inspecting bytecode of any function
import dis
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)
dis.dis(fibonacci)
print()
print(f"Constants: {fibonacci.__code__.co_consts}")
print(f"Names: {fibonacci.__code__.co_names}")
print(f"Varnames: {fibonacci.__code__.co_varnames}")
print(f"Stack size:{fibonacci.__code__.co_stacksize}")
The __code__ object contains everything about the compiled function: its bytecode (co_code), local variable names, constants, free variables, and more.
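Because a code object is the complete compiled behavior, you can even construct a new function from one with types.FunctionType. This is a sketch of the mechanism, not something you need in everyday code:

```python
import types

def add(a, b):
    return a + b

# A function is essentially a code object plus a globals dict
# (plus optional defaults and a closure).
clone = types.FunctionType(add.__code__, globals(), "add_clone")
print(clone(2, 3))     # 5
print(clone.__name__)  # add_clone
```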
The CPython Virtual Machine
CPython is the reference implementation of Python — the one you download from python.org. It's written in C.
The core execution loop is in ceval.c — a giant switch statement that dispatches on bytecode opcodes. For every LOAD_FAST, BINARY_OP, or CALL, there's a case in that switch that executes the corresponding C code.
This is why Python is slower than compiled languages like C or Go — each Python operation involves multiple C operations, plus the overhead of the dispatch loop. The GIL (covered next) adds more overhead for multi-threaded programs.
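To get a feel for that dispatch loop, here is a toy sketch of one in Python itself. The opcode names and program format are invented for illustration; the real loop in ceval.c is vastly more involved:

```python
def run(program, env):
    """A miniature stack machine: dispatch on each opcode, mutate a value stack."""
    stack = []
    for op, arg in program:
        if op == "LOAD_CONST":
            stack.append(arg)                  # push a literal
        elif op == "LOAD_NAME":
            stack.append(env[arg])             # push a variable's value
        elif op == "BINARY_ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)         # pop two, push the sum
        elif op == "RETURN_VALUE":
            return stack.pop()                 # pop the result and return it

program = [
    ("LOAD_NAME", "x"),
    ("LOAD_CONST", 2),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
print(run(program, {"x": 40}))  # 42
```

Every iteration pays for the tuple unpack, the string comparisons, and the list operations — the same kind of per-instruction overhead (in C, but still overhead) that makes interpreted bytecode slower than native code.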
Code objects
Every function, class, and module has a code object (types.CodeType):
import types
def outer():
    x = 10
    def inner():
        return x
    return inner
print(type(outer.__code__)) # <class 'code'>
# Code object attributes
code = outer.__code__
print(f"co_name: {code.co_name}") # outer
print(f"co_filename: {code.co_filename}") # script.py
print(f"co_firstlineno:{code.co_firstlineno}")# 1
print(f"co_varnames: {code.co_varnames}") # ('x', 'inner')
print(f"co_freevars: {code.co_freevars}") # () — outer has no free vars
print(f"co_cellvars: {code.co_cellvars}") # ('x',) — x is captured by inner
inner_code = outer.__code__.co_consts[1] # the inner function's code object (index may vary by Python version)
print(f"inner freevars: {inner_code.co_freevars}") # ('x',) — captured from outer
Frame objects
When a function is called, Python creates a frame object — a runtime snapshot of the execution state:
import sys
def show_frame():
    frame = sys._getframe()  # the current frame
    print(f"Function: {frame.f_code.co_name}")
    print(f"File: {frame.f_code.co_filename}")
    print(f"Line: {frame.f_lineno}")
    print(f"Locals: {frame.f_locals}")
    print(f"Globals: {list(frame.f_globals.keys())[:5]}")
    caller = frame.f_back  # the frame that called us
    if caller:
        print(f"Called by: {caller.f_code.co_name}")
def main():
    x = 42
    show_frame()
main()
The call stack is a linked list of frame objects — each frame has f_back pointing to its caller.
# Walk the entire call stack
import traceback
def deep():
    for frame_info in traceback.extract_stack():
        print(f"  {frame_info.filename}:{frame_info.lineno} in {frame_info.name}")
def middle():
    deep()
def top():
    middle()
top()
The GIL — Global Interpreter Lock in Depth
The GIL is a mutex in CPython that ensures only one thread executes Python bytecode at a time. It's the most controversial aspect of CPython.
Why the GIL exists
CPython's memory management is not thread-safe by design. It uses reference counting — every object has a counter of how many references point to it. When the counter reaches zero, the object is freed. Without the GIL, two threads could simultaneously modify an object's reference count, causing corruption.
The GIL is a coarse solution: instead of fine-grained locks on every object, one global lock protects everything. This is simpler and faster for single-threaded programs — and most programs are mostly single-threaded.
The GIL releases during I/O
The GIL is released around blocking I/O operations — file reads, socket calls, time.sleep — and re-acquired afterwards. This is why threading helps for I/O-bound work:
import threading
import time
def io_bound():
    time.sleep(1)  # GIL is released during sleep
def cpu_bound():
    sum(range(10_000_000))  # GIL is held throughout
# I/O — threads help (GIL releases during sleep)
start = time.perf_counter()
threads = [threading.Thread(target=io_bound) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(f"I/O 4 threads: {time.perf_counter() - start:.2f}s") # ~1.0s
# CPU — threads don't help (GIL prevents true parallelism)
start = time.perf_counter()
threads = [threading.Thread(target=cpu_bound) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(f"CPU 4 threads: {time.perf_counter() - start:.2f}s") # ~4.0s (no speedup)
The GIL check interval
No thread holds the GIL forever. Every sys.getswitchinterval() seconds (default: 5 ms), the running thread is asked to release the GIL so that another waiting thread can acquire it.
import sys
print(sys.getswitchinterval()) # 0.005 (5 milliseconds)
sys.setswitchinterval(0.01) # change to 10ms
Python 3.13 — the no-GIL build
Python 3.13 introduced an experimental no-GIL build (--disable-gil). True multi-threading for CPU-bound work is coming to CPython — but the default build still has the GIL for backwards compatibility. Watch this space.
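You can check at runtime which build you're on. Py_GIL_DISABLED is the build-config flag for free-threaded builds, and sys._is_gil_enabled() was added in 3.13, so this sketch guards against older versions:

```python
import sys
import sysconfig

# 1 on free-threaded builds, 0 on regular 3.13+ builds, None on older versions
print(sysconfig.get_config_var("Py_GIL_DISABLED"))

# sys._is_gil_enabled() is new in 3.13 — guard so this also runs on older Pythons
checker = getattr(sys, "_is_gil_enabled", None)
print(checker() if checker is not None else "GIL status API not available (pre-3.13)")
```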
Memory Management
Reference counting
Every Python object has a reference count — an integer tracking how many names, containers, or other objects hold a reference to it.
import sys
x = [1, 2, 3]
print(sys.getrefcount(x)) # 2 (x itself, and the argument to getrefcount)
y = x
print(sys.getrefcount(x)) # 3 (x, y, and getrefcount's argument)
del y
print(sys.getrefcount(x)) # 2
When the reference count hits zero, CPython immediately deallocates the object — no waiting for a garbage collection cycle.
class MyObject:
    def __init__(self, name):
        self.name = name
        print(f"Created {name}")
    def __del__(self):
        print(f"Destroyed {self.name}")
obj = MyObject("A")
print("About to delete")
del obj # reference count -> 0, __del__ called immediately
print("After delete")
Output:
Created A
About to delete
Destroyed A <- immediate
After delete
Cyclic garbage collector
Reference counting has one weakness: reference cycles. Two objects that reference each other will never reach zero, even when no outside code references either.
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
# Create a cycle
a = Node(1)
b = Node(2)
a.next = b
b.next = a # cycle: a -> b -> a
del a
del b
# Reference counts are now 1 (each object references the other)
# Not zero — they're never freed by reference counting alone
Python's cyclic garbage collector (in the gc module) detects and breaks these cycles:
import gc
gc.enable() # on by default
gc.collect() # force a collection
print(gc.get_count()) # (generation 0, 1, 2 counts)
print(gc.get_threshold()) # (700, 10, 10) — collection thresholds
# Inspect what's tracked
gc.set_debug(gc.DEBUG_LEAK) # print objects that can't be freed
The cyclic GC runs automatically in three generations. Most objects are young (generation 0) and die young. Objects that survive multiple collections move to older generations and are collected less frequently.
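You can prove the cyclic collector does its job with a weakref: once the cycle is unreachable, a manual gc.collect() frees it. Automatic collection is disabled during the demo so it stays deterministic:

```python
import gc
import weakref

class Node:
    pass

gc.disable()  # prevent an automatic collection mid-demo
try:
    a, b = Node(), Node()
    a.partner, b.partner = b, a  # reference cycle: a <-> b
    probe = weakref.ref(a)
    del a, b
    print(probe() is not None)   # True — refcounting alone can't free the cycle
    gc.collect()                 # the cycle detector breaks and frees it
    print(probe() is None)       # True — object is gone
finally:
    gc.enable()
```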
Object interning and small integer caching
CPython caches small integers (-5 to 256) and interns some short strings. This is why is comparisons appear to work for small integers but not for larger ones:
a = 256
b = 256
print(a is b) # True — same cached object
a = 257
b = 257
print(a is b) # False (may vary) — different objects
# String interning
a = "hello"
b = "hello"
print(a is b) # True — interned (short strings matching identifier rules)
a = "hello world"
b = "hello world"
print(a is b) # False (may vary) — not interned
This is why you must always use == for value comparison, never is.
Memory pools — pymalloc
CPython doesn't call malloc/free directly for small objects. It uses its own memory allocator (pymalloc) that maintains pools of fixed-size blocks. This dramatically reduces memory fragmentation and allocation overhead for the many small, short-lived objects Python creates.
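One rough way to watch this allocator at work is sys.getallocatedblocks(), which reports how many memory blocks the interpreter currently has allocated:

```python
import sys

before = sys.getallocatedblocks()
objects = [object() for _ in range(10_000)]  # allocate a burst of small objects
after = sys.getallocatedblocks()
print(after - before)  # roughly 10,000+ new blocks
del objects
```

The exact count includes interpreter-internal allocations, so treat it as a trend indicator rather than a precise meter.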
# See memory usage
import tracemalloc
tracemalloc.start()
# Your code here
data = [i ** 2 for i in range(100_000)]
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics("lineno")
print("Top 5 memory allocations:")
for stat in top_stats[:5]:
print(f" {stat}")
tracemalloc.stop()
The Descriptor Protocol
Descriptors are one of Python's most powerful mechanisms. They're how @property, @classmethod, @staticmethod, and super() work internally.
A descriptor is any object that defines __get__, __set__, or __delete__. When you access an attribute on an object, Python checks whether the class has a descriptor with that name.
class Descriptor:
    def __set_name__(self, owner, name):
        self.name = name  # called when the owning class is defined
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed from the class, not an instance
        return obj.__dict__.get(self.name)
    def __set__(self, obj, value):
        obj.__dict__[self.name] = value
    def __delete__(self, obj):
        del obj.__dict__[self.name]

class MyClass:
    x = Descriptor()
    y = Descriptor()
obj = MyClass()
obj.x = 42
print(obj.x) # 42 — goes through Descriptor.__get__
del obj.x # goes through Descriptor.__delete__
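The lookup rule has a precedence twist worth knowing: a data descriptor (one that defines __set__ or __delete__) beats the instance __dict__, while a non-data descriptor (only __get__) loses to it. A minimal demonstration:

```python
class NonData:
    def __get__(self, obj, objtype=None):
        return "from descriptor"

class Data(NonData):
    def __set__(self, obj, value):   # defining __set__ makes it a data descriptor
        raise AttributeError("read-only")

class C:
    nd = NonData()
    d = Data()

c = C()
# Bypass the descriptors and write straight into the instance dict
c.__dict__["nd"] = "from instance dict"
c.__dict__["d"] = "from instance dict"
print(c.nd)  # from instance dict — non-data descriptor loses to __dict__
print(c.d)   # from descriptor — data descriptor wins over __dict__
```

This precedence is exactly why the Descriptor example above can store values in obj.__dict__ under the same name: because it defines __set__, its __get__ still intercepts every read.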
Building a typed attribute descriptor
class TypedAttribute:
    """Descriptor that enforces a type constraint."""
    def __init__(self, expected_type, default=None):
        self.expected_type = expected_type
        self.default = default
    def __set_name__(self, owner, name):
        self.public_name = name
        self.private_name = f"_{name}"
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name, self.default)
    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"{self.public_name} must be {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        setattr(obj, self.private_name, value)

class Person:
    name = TypedAttribute(str)
    age = TypedAttribute(int, default=0)
    score = TypedAttribute(float, default=0.0)
p = Person()
p.name = "Alice"
p.age = 30
p.score = 95.5
print(p.name, p.age, p.score) # Alice 30 95.5
try:
    p.age = "thirty"  # TypeError: age must be int, got str
except TypeError as e:
    print(e)
How @property is implemented
property is a built-in descriptor. Here's a simplified version:
class property_:
    """A simplified version of the built-in property descriptor."""
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        self.__doc__ = doc or (fget.__doc__ if fget else None)
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)
    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)
    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)
    def getter(self, fget):
        return type(self)(fget, self.fset, self.fdel, self.__doc__)
    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)
    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)
class Temperature:
    def __init__(self, celsius=0.0):
        self._celsius = celsius
    @property_
    def celsius(self):
        return self._celsius
    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("Temperature below absolute zero")
        self._celsius = value
    @property_
    def fahrenheit(self):
        return self._celsius * 9 / 5 + 32
t = Temperature(100)
print(t.celsius) # 100
print(t.fahrenheit) # 212.0
t.celsius = -5
print(t.celsius) # -5
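The same machinery underlies functools.cached_property. It is a non-data descriptor (no __set__), so the value it stores in the instance __dict__ shadows the descriptor on every later access. A minimal sketch of the idea (simplified from the real implementation):

```python
class cached_property_:
    """Non-data descriptor: compute once, then let __dict__ shadow us."""
    def __init__(self, func):
        self.func = func
    def __set_name__(self, owner, name):
        self.name = name
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        obj.__dict__[self.name] = value  # next access hits __dict__, not __get__
        return value

class Circle:
    def __init__(self, r):
        self.r = r
    @cached_property_
    def area(self):
        print("computing")
        return 3.14159 * self.r ** 2

c = Circle(2)
print(c.area)  # prints "computing", then the value
print(c.area)  # cached — no "computing" this time
```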
Metaclasses
A metaclass is a class whose instances are classes. The default metaclass for all Python classes is type. When you write class Foo:, Python calls type("Foo", (object,), {...}) under the hood.
# These are equivalent
class MyClass:
    x = 42

MyClass = type("MyClass", (object,), {"x": 42})
print(type(MyClass)) # <class 'type'>
print(MyClass.x) # 42
Writing a metaclass
class SingletonMeta(type):
    """Metaclass that makes a class a Singleton."""
    _instances: dict = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Database(metaclass=SingletonMeta):
    def __init__(self, url: str):
        self.url = url
        print(f"Connected to {url}")
db1 = Database("postgresql://localhost/myapp") # Connected to ...
db2 = Database("postgresql://localhost/other") # NOT printed — returns existing
print(db1 is db2) # True — same object
print(db2.url) # postgresql://localhost/myapp
Metaclass __new__ and __init__
class ValidatedMeta(type):
    """Metaclass that validates class definitions at creation time."""
    def __new__(mcs, name, bases, namespace):
        # Check that every public method has type hints.
        # Note: every function has an __annotations__ dict, so we must
        # check that it's non-empty, not merely that it exists.
        for attr_name, value in namespace.items():
            if callable(value) and not attr_name.startswith("_"):
                if not getattr(value, "__annotations__", None):
                    raise TypeError(
                        f"{name}.{attr_name} must have type annotations"
                    )
        return super().__new__(mcs, name, bases, namespace)

class StrictAPI(metaclass=ValidatedMeta):
    def get_user(self, user_id: int) -> dict:  # OK — has annotations
        return {}

# class BadAPI(metaclass=ValidatedMeta):
#     def fetch(self):  # TypeError — no annotations
#         pass
__init_subclass__ — the modern alternative to metaclasses
Metaclasses are powerful but complex. For most use cases, __init_subclass__ is simpler:
class Plugin:
    _registry: dict[str, type] = {}
    def __init_subclass__(cls, name: str = "", **kwargs):
        super().__init_subclass__(**kwargs)
        if name:
            Plugin._registry[name] = cls
            print(f"Registered plugin: {name!r}")

class JSONPlugin(Plugin, name="json"):
    def serialize(self, data):
        import json
        return json.dumps(data)

class CSVPlugin(Plugin, name="csv"):
    def serialize(self, data):
        return ",".join(str(v) for v in data)
print(Plugin._registry)
# {'json': <class 'JSONPlugin'>, 'csv': <class 'CSVPlugin'>}
plugin = Plugin._registry["json"]()
print(plugin.serialize({"hello": "world"}))
__init_subclass__ is called automatically when a class inherits from Plugin. It's the preferred pattern for plugin systems, registries, and framework hooks.
__slots__ Internals
We used __slots__ in Chapter 30 for memory reduction. Now let's understand why it works.
Without __slots__, every instance carries a __dict__ — a hash table that can hold arbitrary attributes. That dict alone costs a couple of hundred bytes (232 bytes in the measurement below; exact sizes vary by Python version).
With __slots__, Python creates slot descriptors in the class — fixed memory offsets in the instance structure. No __dict__ is created. The instance is a small C struct with named slots.
import sys
class WithDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class WithSlots:
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y
d = WithDict(1, 2)
s = WithSlots(1, 2)
print(f"WithDict: {sys.getsizeof(d)} + {sys.getsizeof(d.__dict__)} = "
f"{sys.getsizeof(d) + sys.getsizeof(d.__dict__)} bytes")
print(f"WithSlots: {sys.getsizeof(s)} bytes")
# WithDict: 48 + 232 = 280 bytes
# WithSlots: 56 bytes — 5x smaller
# Verify: WithSlots has no __dict__
print(hasattr(d, "__dict__")) # True
print(hasattr(s, "__dict__")) # False
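Under the hood, each slot is itself a descriptor — __slots__ works through the same descriptor protocol covered earlier in this chapter:

```python
import types

class Point:
    __slots__ = ("x", "y")

# Each slot name becomes a member_descriptor in the class dict,
# reading and writing a fixed offset in the instance's C struct.
slot = Point.__dict__["x"]
print(type(slot))                                        # <class 'member_descriptor'>
print(isinstance(slot, types.MemberDescriptorType))      # True
print(hasattr(slot, "__get__"), hasattr(slot, "__set__"))  # True True — a data descriptor
```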
__getattr__ and __getattribute__
These two methods control attribute access:
- __getattribute__ — called for every attribute access, even if the attribute exists
- __getattr__ — called only when normal lookup fails
class TrackedAccess:
    """Log every attribute access."""
    def __init__(self):
        object.__setattr__(self, "_accessed", set())
    def __getattribute__(self, name):
        if not name.startswith("_"):
            accessed = object.__getattribute__(self, "_accessed")
            accessed.add(name)
        return object.__getattribute__(self, name)
    def accessed_attrs(self):
        return self._accessed

class LazyLoader:
    """Load attributes from a dict only when accessed."""
    def __init__(self, data: dict):
        self._data = data
    def __getattr__(self, name):
        # Only called when normal lookup fails
        if name in self._data:
            value = self._data[name]
            setattr(self, name, value)  # cache it so __getattr__ isn't called again
            return value
        raise AttributeError(f"{type(self).__name__!r} has no attribute {name!r}")
config = LazyLoader({
    "host": "localhost",
    "port": 5432,
    "database": "myapp",
})
print(config.host) # localhost — loaded from _data, cached on instance
print(config.port) # 5432
print(config.database) # myapp
AST Manipulation
You can parse, inspect, and even modify Python's AST — the internal representation of your code before compilation:
import ast
source = """
def greet(name):
return f"Hello, {name}!"
"""
tree = ast.parse(source)
# Walk all nodes
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print(f"Function: {node.name}, line {node.lineno}")
    elif isinstance(node, ast.Return):
        print(f"Return statement at line {node.lineno}")
# Transform the AST — add a print to every function
class AddLogging(ast.NodeTransformer):
    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # process children first
        # Add: print("Calling <name>") as the function's first statement
        log_stmt = ast.Expr(
            value=ast.Call(
                func=ast.Name(id="print", ctx=ast.Load()),
                args=[ast.Constant(value=f"Calling {node.name}")],
                keywords=[],
            )
        )
        node.body.insert(0, log_stmt)
        return node
transformed = AddLogging().visit(tree)
ast.fix_missing_locations(transformed)
code = compile(transformed, "<ast>", "exec")
exec(code)
greet("Alice")
# Calling greet <- injected by AST transformation
# Hello, Alice! <- original behavior
This is how several Python tools work: pytest, for example, rewrites assert statements by transforming the AST before compilation so failures can show rich diffs. (Other libraries, like dataclasses and attrs, generate their __init__ methods by synthesizing source strings and exec-ing them — a related but distinct code-generation technique.)
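Going the other way, ast.unparse (Python 3.9+) turns a tree, including one you've transformed, back into source text. It is invaluable for debugging a NodeTransformer:

```python
import ast

tree = ast.parse("x = 1 + 2")

# Rename every Name node 'x' to 'y', then render the result as source
class Rename(ast.NodeTransformer):
    def visit_Name(self, node):
        if node.id == "x":
            node.id = "y"
        return node

new_tree = ast.fix_missing_locations(Rename().visit(tree))
print(ast.unparse(new_tree))  # y = 1 + 2
```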
sys Module Internals
import sys
# Python implementation details
print(sys.implementation.name) # cpython
print(sys.version) # 3.12.0 (...)
print(sys.version_info) # sys.version_info(major=3, minor=12, ...)
# Memory
print(sys.getsizeof([])) # 56 bytes (empty list)
print(sys.getsizeof({})) # 64 bytes (empty dict)
print(sys.getsizeof("")) # 49 bytes (empty string)
print(sys.getsizeof(0)) # 28 bytes (int)
# Recursion
print(sys.getrecursionlimit()) # 1000
sys.setrecursionlimit(5000) # increase for deep recursion
# Reference counting
x = [1, 2, 3]
print(sys.getrefcount(x)) # 2
# All loaded modules
print(list(sys.modules.keys())[:10])
# Platform
print(sys.platform) # 'win32', 'darwin', 'linux'
# Intern strings (force string to be shared)
a = sys.intern("hello world")
b = sys.intern("hello world")
print(a is b) # True — both refer to the same object
Project: A Simple Bytecode Analyzer
"""
bytecode_analyzer.py — Inspect and report on Python bytecode.
"""
import dis
import sys
import types
from collections import Counter
def analyze(func) -> dict:
    """
    Analyze a function's bytecode and return a report.

    Args:
        func: Any Python function or method.

    Returns:
        A dict with keys: name, file, line, instruction_count,
        opcode_counts, constants, local_variables, free_variables,
        stack_size, complexity_estimate.
    """
    code = func.__code__
    instrs = list(dis.get_instructions(code))
    opcode_counts = Counter(i.opname for i in instrs)
    # Estimate cyclomatic complexity: 1 + number of branches
    branch_ops = {"POP_JUMP_IF_TRUE", "POP_JUMP_IF_FALSE",
                  "JUMP_IF_TRUE_OR_POP", "JUMP_IF_FALSE_OR_POP",
                  "FOR_ITER"}
    complexity = 1 + sum(1 for i in instrs if i.opname in branch_ops)
    return {
        "name": code.co_name,
        "file": code.co_filename,
        "line": code.co_firstlineno,
        "instruction_count": len(instrs),
        "opcode_counts": dict(opcode_counts.most_common(5)),
        "constants": [c for c in code.co_consts if c is not None],
        "local_variables": list(code.co_varnames),
        "free_variables": list(code.co_freevars),
        "stack_size": code.co_stacksize,
        "complexity_estimate": complexity,
    }
def print_report(func) -> None:
    """Print a human-readable bytecode analysis report."""
    report = analyze(func)
    print(f"\n{'━' * 50}")
    print(f" Function: {report['name']}")
    print(f" File:     {report['file']}:{report['line']}")
    print(f"{'━' * 50}")
    print(f" Instructions:    {report['instruction_count']}")
    print(f" Stack depth:     {report['stack_size']}")
    print(f" Complexity est.: {report['complexity_estimate']}")
    if report["constants"]:
        print(f" Constants: {report['constants']}")
    if report["local_variables"]:
        print(f" Locals:    {report['local_variables']}")
    if report["free_variables"]:
        print(f" Free vars: {report['free_variables']}")
    print("\n Top opcodes:")
    for opname, count in report["opcode_counts"].items():
        bar = "█" * count
        print(f"   {opname:<30} {count:>3} {bar}")
    print()
    print(" Disassembly:")
    dis.dis(func)
# ── Demo ──────────────────────────────────────────────────────────────────────
def simple(x, y):
    return x + y

def with_branch(x):
    if x > 0:
        return "positive"
    elif x < 0:
        return "negative"
    else:
        return "zero"

def with_loop(items):
    total = 0
    for item in items:
        if item > 0:
            total += item
    return total

def with_closure():
    count = 0
    def counter():
        nonlocal count
        count += 1
        return count
    return counter
if __name__ == "__main__":
    # with_closure() returns the inner counter, whose free variables show up in the report
    for func in [simple, with_branch, with_loop, with_closure()]:
        print_report(func)
What You Learned in This Chapter
- Python source goes through lexing -> parsing -> compilation -> execution. tokenize, ast, and dis let you inspect each stage.
- CPython is a stack machine — bytecode instructions push and pop values on a stack.
- Every function has a code object (__code__) containing bytecode, constants, variable names, and free variables.
- Every function call creates a frame object (f_locals, f_globals, f_back, f_lineno) — the call stack is a linked list of frames.
- The GIL is a coarse mutex protecting CPython's non-thread-safe reference counting. It is released during I/O but prevents true CPU parallelism in threads.
- Reference counting frees objects immediately when their count reaches zero. The cyclic GC handles reference cycles in three generations.
- Small integers (-5 to 256) and short strings are interned — always the same object. Use ==, never is, for value comparison.
- tracemalloc measures memory allocation per line.
- A descriptor defines __get__, __set__, or __delete__. property, classmethod, and staticmethod are all descriptors.
- A metaclass is a class whose instances are classes. type is the default metaclass. Use __init_subclass__ for most plugin/registry patterns instead.
- __slots__ replaces __dict__ with fixed C struct slots — 4-5x smaller per instance.
- __getattribute__ intercepts every access; __getattr__ only fires when normal lookup fails.
- Python's AST is manipulable — parse, inspect, transform, compile, and execute.
What's Next?
Chapter 37 covers Metaprogramming — class factories, dynamic class creation, __class_getitem__, runtime code generation, and exec/eval. You've seen the internals; now you'll use them to write code that writes code.