Debugging & Profiling
📖 Concept
Effective debugging separates productive developers from those who spend hours guessing. Python provides excellent built-in debugging tools, and understanding when and how to use each tool is critical for diagnosing issues in production systems.
Python's debugging toolkit:
| Tool | Purpose | Use Case |
|---|---|---|
| `pdb` | Interactive debugger (stdlib) | Step through code, inspect state |
| `breakpoint()` | Built-in function (3.7+) | Drop into debugger anywhere |
| `pdb++` / `ipdb` | Enhanced debuggers | Syntax highlighting, better UX |
| `logging` | Structured log output | Production debugging, audit trails |
| `traceback` | Exception formatting | Custom error reporting |
pdb commands (the essential ones):
- `n` (next) — execute the current line, stepping over function calls
- `s` (step) — step into function calls
- `c` (continue) — run until the next breakpoint
- `r` (return) — run until the current function returns
- `l` (list) — show source code around the current position
- `p expr` — print an expression's value
- `pp expr` — pretty-print an expression
- `w` (where) — show the call stack
- `b lineno` — set a breakpoint at a line number
- `cl` (clear) — remove breakpoints
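These commands can also be fed to `pdb` programmatically, which is a convenient way to experiment with them outside an interactive terminal. A minimal sketch — the `average` function and the scripted command stream are illustrative, not part of the example above:

```python
import io
import pdb


def average(values):
    total = sum(values)
    count = len(values)
    return total / count


# Feed pdb commands from a string instead of typing them interactively.
commands = io.StringIO("p values\np len(values)\nc\n")
output = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=output, readrc=False)
debugger.use_rawinput = False  # read commands from our stream, not the terminal

# runcall() stops at the first line of average(), runs our 'p' commands,
# then 'c' continues to completion.
result = debugger.runcall(average, [10, 20, 30])

print(output.getvalue())  # shows what 'p values' and 'p len(values)' printed
print(result)             # 20.0
```

The same `stdin`/`stdout` trick is occasionally useful for regression-testing debugging helpers, but for day-to-day work you would simply type these commands at the `(Pdb)` prompt.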
breakpoint() (Python 3.7+) is the modern way to invoke the debugger. It respects the PYTHONBREAKPOINT environment variable, allowing you to switch debuggers or disable breakpoints entirely without changing code:
- `PYTHONBREAKPOINT=0` — disable all breakpoints (production)
- `PYTHONBREAKPOINT=ipdb.set_trace` — use ipdb instead of pdb
- `PYTHONBREAKPOINT=pudb.set_trace` — use pudb (visual debugger)
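The default `sys.breakpointhook()` consults `PYTHONBREAKPOINT` on every call, so the switch even works when flipped at runtime. A small sketch (the `risky_calculation` function is illustrative):

```python
import os


def risky_calculation(x):
    breakpoint()  # would normally drop into pdb here
    return x * 2


# Disable all breakpoints, as PYTHONBREAKPOINT=0 does in production.
# The default breakpoint hook re-reads the environment variable on each
# breakpoint() call, so this takes effect immediately.
os.environ["PYTHONBREAKPOINT"] = "0"

print(risky_calculation(21))  # runs straight through and prints 42
```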
Profiling identifies performance bottlenecks. Python offers multiple profilers:
- `cProfile` — deterministic function-level profiler, built-in, low overhead
- `profile` — pure-Python profiler (slower but extensible)
- `line_profiler` — line-by-line execution time (third-party, essential for optimization)
- `memory_profiler` — track memory allocation per line
- `py-spy` — sampling profiler that attaches to running processes without code changes
Debugging strategy: Start with logging for context, use breakpoint() for interactive investigation, and profile only when you have confirmed a performance issue. Premature optimization guided by intuition rather than profiling data wastes time.
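The "measure first" step can be as simple as a wall-clock check with `time.perf_counter()` before reaching for a full profiler. A minimal sketch, where `candidate()` stands in for whatever code path you suspect is slow:

```python
import time


def candidate():
    # Stand-in for the code path you suspect is slow.
    return sum(i * i for i in range(200_000))


start = time.perf_counter()
result = candidate()
elapsed = time.perf_counter() - start

print(f"candidate() took {elapsed:.4f}s")
# Only if this confirms a real problem is it worth profiling with
# cProfile or line_profiler to find out *where* the time goes.
```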
💻 Code Example
```python
# ============================================================
# 1. pdb / breakpoint() — Interactive Debugging
# ============================================================
def calculate_discount(items, membership_level):
    """
    Production function with a subtle bug to debug.
    Items: list of dicts with 'name', 'price', 'quantity'.
    """
    subtotal = sum(
        item["price"] * item["quantity"] for item in items
    )

    # Drop into debugger to inspect state:
    # breakpoint()  # Uncomment to debug interactively

    discount_rates = {
        "bronze": 0.05,
        "silver": 0.10,
        "gold": 0.15,
        "platinum": 0.20,
    }

    rate = discount_rates.get(membership_level, 0)
    discount = subtotal * rate
    total = subtotal - discount

    return {
        "subtotal": round(subtotal, 2),
        "discount": round(discount, 2),
        "total": round(total, 2),
        "rate": rate,
    }


# ============================================================
# 2. Conditional breakpoints and post-mortem debugging
# ============================================================
def find_anomalies(data):
    """Process data with conditional debugging."""
    results = []
    for i, value in enumerate(data):
        processed = value ** 0.5 if value >= 0 else None

        # Conditional breakpoint — only pause on suspicious values
        # if processed is not None and processed > 100:
        #     breakpoint()

        results.append({"index": i, "original": value, "processed": processed})
    return results


def debug_with_post_mortem():
    """
    Post-mortem debugging: inspect state AFTER a crash.
    Run with: python -m pdb script.py
    When it crashes, pdb drops you into the frame where the
    exception occurred.
    """
    data = [100, 200, -1, 400, 0]
    try:
        results = [1 / x for x in data]
    except ZeroDivisionError:
        import pdb
        # pdb.post_mortem()  # Uncomment to debug at crash site
        print("ZeroDivisionError caught — would enter post-mortem debugger")


# ============================================================
# 3. Structured Logging (production debugging)
# ============================================================
import logging

# Configure logging with structured format
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s [%(levelname)s] %(name)s:%(funcName)s:%(lineno)d — %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
logger = logging.getLogger(__name__)


def process_order(order_id, items):
    """Production code with proper logging levels."""
    logger.info("Processing order %s with %d items", order_id, len(items))

    for item in items:
        logger.debug(
            "Item: %s, price=%.2f, qty=%d",
            item["name"],
            item["price"],
            item["quantity"],
        )

    try:
        result = calculate_discount(items, "gold")
        logger.info(
            "Order %s total: $%.2f (discount: $%.2f)",
            order_id,
            result["total"],
            result["discount"],
        )
        return result
    except Exception:
        logger.exception("Failed to process order %s", order_id)
        raise


# ============================================================
# 4. traceback — custom error collection and formatting
# ============================================================
import traceback


def robust_processor(data_batch):
    """Collect errors without stopping the entire batch."""
    results = []
    errors = []

    for i, item in enumerate(data_batch):
        try:
            processed = 100 / item["value"]
            results.append({"index": i, "result": processed})
        except (ZeroDivisionError, KeyError, TypeError) as e:
            error_info = {
                "index": i,
                "item": item,
                "error": str(e),
                "traceback": traceback.format_exc(),
            }
            errors.append(error_info)
            logger.warning("Error at index %d: %s", i, e)

    if errors:
        logger.warning(
            "Batch completed with %d errors out of %d items",
            len(errors),
            len(data_batch),
        )

    return results, errors


# ============================================================
# 5. cProfile — Function-level profiling
# ============================================================
import cProfile
import io
import pstats


def fibonacci(n):
    """Deliberately unoptimized for profiling demonstration."""
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


def profile_fibonacci():
    """Profile with cProfile and display sorted results."""
    profiler = cProfile.Profile()
    profiler.enable()

    result = fibonacci(30)

    profiler.disable()

    # Capture profile output
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative")
    stats.print_stats(10)  # top 10 functions
    print(stream.getvalue())
    print(f"Result: {result}")


# Alternative: profile from command line
# python -m cProfile -s cumulative my_script.py
# python -m cProfile -o profile_output.prof my_script.py


# ============================================================
# 6. Timing utilities for targeted profiling
# ============================================================
import time
from functools import wraps
from contextlib import contextmanager


def timer(func):
    """Decorator to measure function execution time."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        logger.info("%s executed in %.4f seconds", func.__name__, elapsed)
        return result
    return wrapper


@contextmanager
def timed_block(label="block"):
    """Context manager for timing arbitrary code blocks."""
    start = time.perf_counter()
    yield
    elapsed = time.perf_counter() - start
    logger.info("%s completed in %.4f seconds", label, elapsed)


@timer
def sort_large_list():
    """Example function to profile."""
    import random
    data = [random.randint(0, 1_000_000) for _ in range(500_000)]
    return sorted(data)


def demo_timed_block():
    """Demonstrate context manager timing."""
    with timed_block("list comprehension"):
        squares = [x ** 2 for x in range(1_000_000)]

    with timed_block("generator sum"):
        total = sum(x ** 2 for x in range(1_000_000))


# ============================================================
# 7. tracemalloc — Memory profiling (stdlib)
# ============================================================
import tracemalloc


def memory_profile_demo():
    """Track memory allocations to find leaks."""
    tracemalloc.start()

    # Allocate some memory
    data = [list(range(1000)) for _ in range(1000)]

    snapshot = tracemalloc.take_snapshot()
    top_stats = snapshot.statistics("lineno")

    print("\nTop 5 memory allocations:")
    for stat in top_stats[:5]:
        print(f"  {stat}")

    current, peak = tracemalloc.get_traced_memory()
    print(f"\nCurrent memory: {current / 1024:.1f} KB")
    print(f"Peak memory: {peak / 1024:.1f} KB")

    tracemalloc.stop()


# ============================================================
# Usage
# ============================================================
if __name__ == "__main__":
    # Debugging demo
    items = [
        {"name": "Widget", "price": 25.99, "quantity": 3},
        {"name": "Gadget", "price": 49.99, "quantity": 1},
        {"name": "Doohickey", "price": 12.50, "quantity": 5},
    ]
    process_order("ORD-001", items)

    # Profiling demo
    profile_fibonacci()

    # Timing demo
    sort_large_list()
    demo_timed_block()

    # Memory demo
    memory_profile_demo()
```
🏋️ Practice Exercise
Exercises:
1. Write a function with a deliberate bug (e.g., an off-by-one error in a loop). Use `breakpoint()` to step through execution with the `n`, `s`, `p`, and `l` commands. Document each pdb command you used and what it revealed.
2. Create a `@timer` decorator and a `timed_block` context manager. Apply them to three different algorithms for the same task (e.g., three sorting approaches) and compare their performance with formatted output.
3. Use `cProfile` to profile a recursive Fibonacci function vs. a memoized version. Generate a sorted stats report and identify the hotspot. Then use `functools.lru_cache` and re-profile to show the improvement.
4. Set up structured logging with different levels (DEBUG, INFO, WARNING, ERROR) in a multi-module application. Configure separate handlers: console for INFO+, file for DEBUG+. Demonstrate how to use logging for production debugging.
5. Use `tracemalloc` to find a simulated memory leak: a function that appends to a module-level list on each call. Show the top memory allocations and explain how to identify and fix the leak.
6. Configure `PYTHONBREAKPOINT` to use `ipdb` (install it first), then set it to `0` to disable all breakpoints. Explain how this mechanism lets you leave breakpoints in code without affecting production.
⚠️ Common Mistakes
- Leaving `breakpoint()` or `pdb.set_trace()` calls in committed code. Use `PYTHONBREAKPOINT=0` in production as a safety net, and add a pre-commit hook or linter rule to catch stray debugger statements.
- Using `print()` statements instead of the `logging` module. Print statements are not configurable (no levels, no formatting, no routing), cannot be disabled in production, and clutter stdout. The `logging` module is designed for exactly this purpose.
- Profiling before confirming there is actually a performance problem. Premature optimization wastes time. First measure with wall-clock timing, then profile only the slow paths. cProfile has overhead that can skew results for micro-benchmarks.
- Ignoring the difference between `time.time()` and `time.perf_counter()` for benchmarking. `perf_counter()` uses the highest-resolution clock available and is not affected by system clock adjustments. Always use `perf_counter()` for measuring code execution time.
- Not using post-mortem debugging (`python -m pdb script.py` or `pdb.post_mortem()`) for crashes. It drops you into the exact frame where the exception occurred, with all local variables intact — far more useful than reading a traceback.