Threading & the GIL
📖 Concept
Python's threading module provides OS-level threads for concurrent execution, but understanding the Global Interpreter Lock (GIL) is essential before writing any threaded Python code.
The GIL explained:
The GIL is a mutex in CPython that allows only one thread to execute Python bytecode at a time. It exists because CPython's memory management (reference counting) is not thread-safe. The GIL is released during I/O operations (file reads, network calls, time.sleep), which is why threading is still effective for I/O-bound workloads. For CPU-bound tasks, threads in CPython cannot achieve true parallelism — use multiprocessing or concurrent.futures.ProcessPoolExecutor instead.
| Scenario | Threading effective? | Why |
|---|---|---|
| HTTP requests | Yes | GIL released during socket I/O |
| File I/O | Yes | GIL released during OS read/write |
| CPU computation | No | GIL prevents parallel bytecode execution |
| C extensions (NumPy) | Yes | Well-written C extensions release the GIL |
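The I/O rows in the table can be verified directly. Here is a minimal sketch using `time.sleep` as a stand-in for network I/O (it also releases the GIL, as noted above): four threads each waiting 0.2 s finish in roughly 0.2 s total rather than 0.8 s, because the waits overlap.

```python
import threading
import time


def io_task() -> None:
    time.sleep(0.2)  # GIL is released while sleeping, so threads overlap


threads = [threading.Thread(target=io_task) for _ in range(4)]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"4 overlapping 0.2s waits took {elapsed:.2f}s")  # ~0.2s, not 0.8s
```

Swapping `io_task` for a pure-Python computation would show the opposite: total time near the sequential sum, since only one thread runs bytecode at a time.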
Key synchronization primitives:
- `Lock` — Mutual exclusion. Only one thread can `acquire()` at a time. Always use `with lock:` to guarantee release.
- `RLock` (Reentrant Lock) — The same thread can `acquire()` multiple times without deadlocking. Must `release()` the same number of times.
- `Semaphore` — Allows up to N threads to enter a section concurrently. Useful for rate-limiting or connection pooling.
- `Event` — One thread signals, others wait. `set()` / `clear()` / `wait()`.
- `Condition` — Threads wait for a condition to become true. Supports `notify()` / `notify_all()` / `wait()`. Used in producer-consumer patterns.
- `Barrier` — N threads block until all N arrive, then all proceed together.
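`Barrier` is the least familiar primitive on this list, so here is a minimal sketch (the worker names and the `log` list are illustrative): three workers each do independent setup, then all block at the barrier until the last one arrives, so no worker starts its main phase before every setup has finished.

```python
import threading

barrier = threading.Barrier(3)  # All 3 threads must arrive before any proceeds
log_lock = threading.Lock()     # Protect the shared log list
log: list[str] = []


def worker(name: str) -> None:
    with log_lock:
        log.append(f"{name}:setup")
    barrier.wait()  # Block here until all 3 workers have finished setup
    with log_lock:
        log.append(f"{name}:run")


threads = [threading.Thread(target=worker, args=(f"W{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(log)  # Every ":setup" entry comes before every ":run" entry
```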
Thread safety rule: Any mutable shared state accessed by multiple threads must be protected by a lock. Even simple operations like counter += 1 are not atomic in Python — they compile to LOAD, ADD, STORE bytecodes, and a context switch can happen between them.
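You can see those separate bytecodes yourself with the `dis` module. A small sketch (exact opcode names vary by Python version, e.g. `BINARY_OP` replaced `INPLACE_ADD` in 3.11, but the load and store are always distinct instructions):

```python
import dis

counter = 0


def unsafe_increment() -> None:
    global counter
    counter += 1  # Looks atomic, but compiles to separate load / add / store


# List the opcode names that make up the increment
ops = [ins.opname for ins in dis.get_instructions(unsafe_increment)]
print(ops)  # Includes 'LOAD_GLOBAL', an add op, and 'STORE_GLOBAL'
```

A context switch between the load and the store lets another thread read the stale value, which is exactly how increments get lost without a lock.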
💻 Code Example
```python
# ============================================================
# Basic threading with Lock for shared state
# ============================================================
import threading
import time
import logging
from typing import Optional

logging.basicConfig(level=logging.INFO, format="%(threadName)s: %(message)s")
logger = logging.getLogger(__name__)


class ThreadSafeCounter:
    """A counter safe for concurrent access from multiple threads."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self, amount: int = 1) -> None:
        with self._lock:  # Acquire and release automatically
            self._value += amount

    def decrement(self, amount: int = 1) -> None:
        with self._lock:
            self._value -= amount

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


counter = ThreadSafeCounter()


def worker(n: int) -> None:
    """Each worker increments the counter n times."""
    for _ in range(n):
        counter.increment()


threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(10)]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"Counter value: {counter.value}")  # Always 1_000_000
print(f"Elapsed: {elapsed:.3f}s")


# ============================================================
# RLock for reentrant (nested) locking
# ============================================================
class CachedRepository:
    """Repository that uses RLock so public methods can call each other."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._cache: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        with self._lock:
            return self._cache.get(key)

    def set(self, key: str, value: str) -> None:
        with self._lock:
            self._cache[key] = value

    def get_or_set(self, key: str, default: str) -> str:
        """Calls get() and set() internally -- RLock prevents deadlock."""
        with self._lock:
            existing = self.get(key)  # Acquires _lock again (RLock OK)
            if existing is None:
                self.set(key, default)  # Acquires _lock again (RLock OK)
                return default
            return existing


# ============================================================
# Semaphore for connection pool / rate limiting
# ============================================================
import random

MAX_CONCURRENT_CONNECTIONS = 3
pool_semaphore = threading.Semaphore(MAX_CONCURRENT_CONNECTIONS)


def fetch_url(url: str) -> str:
    """Simulate fetching a URL with limited concurrency."""
    with pool_semaphore:  # At most 3 threads here concurrently
        logger.info(f"Fetching {url}")
        time.sleep(random.uniform(0.1, 0.5))  # Simulate network I/O
        logger.info(f"Done fetching {url}")
        return f"Response from {url}"


urls = [f"https://api.example.com/item/{i}" for i in range(10)]
threads = [threading.Thread(target=fetch_url, args=(url,)) for url in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()


# ============================================================
# Event for signaling between threads
# ============================================================
data_ready = threading.Event()
shared_data: dict = {}


def producer() -> None:
    """Produce data and signal consumers."""
    logger.info("Producing data...")
    time.sleep(1)  # Simulate work
    shared_data["result"] = [1, 2, 3, 4, 5]
    data_ready.set()  # Signal consumers
    logger.info("Data is ready")


def consumer(name: str) -> None:
    """Wait for data, then consume it."""
    logger.info(f"{name} waiting for data...")
    data_ready.wait()  # Blocks until set()
    logger.info(f"{name} got data: {shared_data['result']}")


prod = threading.Thread(target=producer)
cons1 = threading.Thread(target=consumer, args=("Consumer-1",))
cons2 = threading.Thread(target=consumer, args=("Consumer-2",))

for t in [cons1, cons2, prod]:
    t.start()
for t in [prod, cons1, cons2]:
    t.join()


# ============================================================
# Condition variable for producer-consumer queue
# ============================================================
class BoundedBuffer:
    """Thread-safe bounded buffer using Condition variables."""

    def __init__(self, capacity: int = 10) -> None:
        self._buffer: list = []
        self._capacity = capacity
        self._condition = threading.Condition()

    def put(self, item) -> None:
        with self._condition:
            while len(self._buffer) >= self._capacity:
                self._condition.wait()  # Wait until space available
            self._buffer.append(item)
            self._condition.notify()  # Notify waiting consumers

    def get(self):
        with self._condition:
            while len(self._buffer) == 0:
                self._condition.wait()  # Wait until item available
            item = self._buffer.pop(0)
            self._condition.notify()  # Notify waiting producers
            return item


buffer = BoundedBuffer(capacity=5)


def buffer_producer(n: int) -> None:
    for i in range(n):
        buffer.put(i)
        logger.info(f"Produced {i}")
        time.sleep(0.05)


def buffer_consumer(n: int) -> None:
    for _ in range(n):
        item = buffer.get()
        logger.info(f"Consumed {item}")
        time.sleep(0.1)


p = threading.Thread(target=buffer_producer, args=(20,))
c = threading.Thread(target=buffer_consumer, args=(20,))
p.start()
c.start()
p.join()
c.join()


# ============================================================
# Daemon threads and graceful shutdown
# ============================================================
shutdown_event = threading.Event()


def background_monitor(interval: float = 2.0) -> None:
    """Background daemon that runs until shutdown is signaled."""
    while not shutdown_event.is_set():
        logger.info("Monitor heartbeat")
        shutdown_event.wait(timeout=interval)  # Sleep but wake on shutdown
    logger.info("Monitor shutting down")


monitor = threading.Thread(target=background_monitor, daemon=True)
monitor.start()

# ... do work ...

shutdown_event.set()  # Signal graceful shutdown
monitor.join(timeout=5)
```
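In practice you rarely need to hand-roll a bounded buffer like `BoundedBuffer` above: the standard library's `queue.Queue` implements the same Condition-based pattern, with blocking `put()`/`get()` and a `maxsize` bound. A minimal sketch (the `-1` sentinel is an illustrative convention, not part of the API):

```python
import queue
import threading

q: "queue.Queue[int]" = queue.Queue(maxsize=5)  # Bounded, like BoundedBuffer
squares: list[int] = []


def producer() -> None:
    for i in range(10):
        q.put(i)  # Blocks if the queue is full
    q.put(-1)     # Sentinel: tell the consumer to stop


def consumer() -> None:
    while True:
        item = q.get()  # Blocks if the queue is empty
        if item == -1:
            break
        squares.append(item * item)  # Only this thread touches `squares`


p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start(); c.start()
p.join(); c.join()

print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Prefer `queue.Queue` when the problem fits it; reach for raw `Condition` only when you need a policy `Queue` does not offer (priorities beyond `PriorityQueue`, batching, and so on).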
🏋️ Practice Exercise
Exercises:
1. Write a program that spawns 5 threads, each incrementing a shared counter 1,000,000 times. First run it without a lock and observe the race condition. Then add a `Lock` and confirm the final value is always 5,000,000.
2. Implement a thread-safe `LRUCache` class using `threading.Lock` that supports `get(key)` and `put(key, value)` with a configurable max size. Write a stress test with 10 threads doing random reads/writes.
3. Build a producer-consumer pipeline using `threading.Condition`: 3 producer threads generate random numbers, 2 consumer threads compute their squares. Use a bounded buffer of size 10. Print the throughput (items/sec) at the end.
4. Create a `ConnectionPool` class using `Semaphore(max_size)`. Threads call `pool.acquire()` to get a connection and `pool.release(conn)` to return it. Add a `timeout` parameter that raises `TimeoutError` if no connection is available within the limit.
5. Write a benchmark that compares threading vs sequential execution for (a) downloading 20 web pages (I/O-bound) and (b) computing SHA-256 hashes of 20 large strings (CPU-bound). Measure and explain the results in terms of the GIL.
6. Implement a `ReadWriteLock` from scratch that allows unlimited concurrent readers but exclusive writer access. Test it with 10 reader threads and 2 writer threads accessing a shared dictionary.
⚠️ Common Mistakes
- Using threading for CPU-bound work and expecting a speedup. The GIL prevents parallel bytecode execution in CPython, so CPU-bound threads often run slower than sequential code due to lock contention and context-switch overhead. Use `multiprocessing` for CPU parallelism.
- Forgetting to call `thread.join()` before the main thread exits. If non-daemon threads are still running, the interpreter waits for them, so the process appears to hang. If daemon threads are running, they are killed abruptly without cleanup. Always `join()` threads you care about.
- Assuming simple operations like `list.append()` or `dict[key] = value` are thread-safe because they are "atomic." While CPython's GIL makes some bytecode operations accidentally safe, this is an implementation detail, not a language guarantee. Always use explicit locks for shared mutable state.
- Acquiring multiple locks in inconsistent order across threads, causing deadlocks. Thread A holds Lock1 and waits for Lock2, while Thread B holds Lock2 and waits for Lock1. Always acquire locks in a globally consistent order. Note that `threading.RLock` only helps with the separate problem of the *same* thread re-acquiring its own lock; it does not prevent cross-thread deadlocks.
- Not using `with lock:` context-manager syntax and forgetting to release the lock on an exception path. Manual `lock.acquire()`/`lock.release()` without try/finally will leak the lock if the critical section raises.
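To make the last pitfall concrete, here is a sketch of why manual acquire/release leaks the lock when the critical section raises, alongside the two safe patterns (the `ValueError` is just a simulated failure):

```python
import threading

lock = threading.Lock()

# BROKEN: if the work between acquire() and release() raises, release()
# is never reached, the lock stays held forever, and every later
# acquire() on it deadlocks:
#
#     lock.acquire()
#     do_work()        # raises -> lock leaked
#     lock.release()


def safe_with_finally() -> None:
    lock.acquire()
    try:
        raise ValueError("boom")  # Simulated failure in the critical section
    finally:
        lock.release()            # Always runs, even on the exception path


def safe_with_context_manager() -> None:
    with lock:                    # Equivalent to acquire / try / finally / release
        raise ValueError("boom")


for fn in (safe_with_finally, safe_with_context_manager):
    try:
        fn()
    except ValueError:
        pass
    print(f"{fn.__name__}: lock released -> {not lock.locked()}")
```

Both safe variants leave `lock.locked()` false after the exception; the commented-out broken variant would not.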