Databases & SQLAlchemy
📖 Concept
SQLAlchemy is the most widely used database toolkit for Python. Version 2.0 introduced a modernized API with a cleaner, more Pythonic style that aligns with type hints and async patterns. Understanding SQLAlchemy is essential for any Python backend role.
SQLAlchemy Architecture:
- Engine — manages the connection pool and dialect (PostgreSQL, MySQL, SQLite). Created once per application.
- Session — a unit of work that tracks changes to objects and flushes them to the database in a transaction. Created per request.
- Declarative Base — base class for ORM models that maps Python classes to database tables.
- Mapped Columns — SQLAlchemy 2.0 uses `Mapped[type]` and `mapped_column()` for type-safe column definitions (replacing the old `Column()` style).
Core ORM Concepts:
- Models define your schema as Python classes. Each class maps to a table, each attribute to a column.
- Relationships link models together: `relationship()` defines the Python-side access pattern, `ForeignKey` defines the database constraint.
- CRUD Operations use the Session: `session.add()` for insert, query with `select()`, modify attributes directly, `session.delete()` for removal, and `session.commit()` to persist.
Eager vs Lazy Loading:
- Lazy loading (default) — related objects are loaded when you first access them. Causes the N+1 query problem: loading 100 users and accessing each user's posts triggers 101 queries.
- Eager loading — loads related objects in the initial query. Three strategies: `joinedload()` (JOIN), `selectinload()` (separate SELECT IN), `subqueryload()` (subquery).
Alembic Migrations: Alembic is SQLAlchemy's migration tool (like Django's `makemigrations`/`migrate`). It generates migration scripts by comparing your models to the database state. Key commands: `alembic init` (setup), `alembic revision --autogenerate -m "message"` (create migration), `alembic upgrade head` (apply), `alembic downgrade -1` (rollback).
In interviews, focus on the N+1 problem, transaction management, and the difference between `flush()` (write to DB within transaction) and `commit()` (finalize transaction).
💻 Code Example
1# ============================================================2# SQLAlchemy 2.0 ORM — Production Patterns3# ============================================================4# from sqlalchemy import (5# create_engine, ForeignKey, String, Text, DateTime, select, func6# )7# from sqlalchemy.orm import (8# DeclarativeBase, Mapped, mapped_column, relationship,9# Session, sessionmaker, joinedload, selectinload10# )11# from datetime import datetime, timezone12# from typing import Optional13#14#15# # ============================================================16# # Engine & Base Setup17# # ============================================================18# engine = create_engine(19# "sqlite:///app.db",20# echo=False, # Set True for SQL logging during development21# pool_size=5, # Connection pool size (default 5)22# max_overflow=10, # Extra connections beyond pool_size23# pool_recycle=3600, # Recycle connections after 1 hour24# )25#26# SessionLocal = sessionmaker(bind=engine, expire_on_commit=False)27#28#29# class Base(DeclarativeBase):30# """SQLAlchemy 2.0 declarative base."""31# pass32#33#34# # ============================================================35# # Models — SQLAlchemy 2.0 Mapped style36# # ============================================================37# class User(Base):38# __tablename__ = "users"39#40# # Mapped[type] + mapped_column() = type-safe columns41# id: Mapped[int] = mapped_column(primary_key=True)42# username: Mapped[str] = mapped_column(String(50), unique=True, index=True)43# email: Mapped[str] = mapped_column(String(255), unique=True)44# full_name: Mapped[Optional[str]] = mapped_column(String(100), nullable=True)45# created_at: Mapped[datetime] = mapped_column(46# DateTime(timezone=True),47# default=lambda: datetime.now(timezone.utc),48# )49#50# # Relationship: one-to-many (User -> Posts)51# posts: Mapped[list["Post"]] = relationship(52# back_populates="author",53# cascade="all, delete-orphan", # delete posts when user is deleted54# lazy="selectin", # default eager loading strategy55# )56#57# def __repr__(self) -> str:58# return f"User(id={self.id}, username={self.username!r})"59#60#61# class Post(Base):62# __tablename__ = "posts"63#64# id: Mapped[int] = mapped_column(primary_key=True)65# title: Mapped[str] = mapped_column(String(200))66# body: Mapped[str] = mapped_column(Text)67# published: Mapped[bool] = mapped_column(default=False)68# created_at: Mapped[datetime] = mapped_column(69# DateTime(timezone=True),70# default=lambda: datetime.now(timezone.utc),71# )72#73# # Foreign key + relationship back-reference74# author_id: Mapped[int] = mapped_column(ForeignKey("users.id"))75# author: Mapped["User"] = relationship(back_populates="posts")76#77# # Many-to-many through association table78# tags: Mapped[list["Tag"]] = relationship(79# secondary="post_tags", back_populates="posts"80# )81#82#83# class Tag(Base):84# __tablename__ = "tags"85#86# id: Mapped[int] = mapped_column(primary_key=True)87# name: Mapped[str] = mapped_column(String(50), unique=True)88# posts: Mapped[list["Post"]] = relationship(89# secondary="post_tags", back_populates="tags"90# )91#92#93# # Association table for many-to-many94# from sqlalchemy import Table, Column, Integer95# post_tags = Table(96# "post_tags",97# Base.metadata,98# Column("post_id", Integer, ForeignKey("posts.id"), primary_key=True),99# Column("tag_id", Integer, ForeignKey("tags.id"), primary_key=True),100# )101#102#103# # ============================================================104# # CRUD Operations105# # ============================================================106#107# # BAD: Not using a session context manager108# # session = SessionLocal()109# # session.add(user)110# # session.commit() # If this fails, session is left in broken state111#112# # GOOD: Session as context manager with automatic rollback113# def create_user(username: str, email: str) -> User:114# with SessionLocal() as session:115# user = User(username=username, email=email)116# session.add(user)117# session.commit()118# session.refresh(user) # reload from DB to get generated ID119# return user120#121#122# def get_user_by_id(user_id: int) -> Optional[User]:123# with SessionLocal() as session:124# stmt = select(User).where(User.id == user_id)125# return session.execute(stmt).scalar_one_or_none()126#127#128# # BAD: N+1 query problem129# # def list_users_with_posts_bad():130# # with SessionLocal() as session:131# # users = session.execute(select(User)).scalars().all()132# # for user in users:133# # print(user.posts) # Each access triggers a separate query!134#135# # GOOD: Eager loading to avoid N+1136# def list_users_with_posts() -> list[User]:137# with SessionLocal() as session:138# stmt = (139# select(User)140# .options(joinedload(User.posts)) # JOIN in single query141# .order_by(User.created_at.desc())142# )143# return session.execute(stmt).unique().scalars().all()144#145#146# def update_user_email(user_id: int, new_email: str) -> Optional[User]:147# with SessionLocal() as session:148# user = session.get(User, user_id)149# if user:150# user.email = new_email # SQLAlchemy tracks the change151# session.commit()152# session.refresh(user)153# return user154#155#156# def delete_user(user_id: int) -> bool:157# with SessionLocal() as session:158# user = session.get(User, user_id)159# if user:160# session.delete(user) # cascade deletes posts too161# session.commit()162# return True163# return False164#165#166# # ============================================================167# # Advanced Queries168# # ============================================================169# def search_posts(keyword: str, limit: int = 20) -> list[Post]:170# with SessionLocal() as session:171# stmt = (172# select(Post)173# .options(joinedload(Post.author), selectinload(Post.tags))174# .where(Post.title.ilike(f"%{keyword}%"))175# .where(Post.published == True)176# .order_by(Post.created_at.desc())177# .limit(limit)178# )179# return session.execute(stmt).unique().scalars().all()180#181#182# def get_user_post_counts() -> list[tuple]:183# """Aggregate query: count posts per user."""184# with SessionLocal() as session:185# stmt = (186# select(User.username, func.count(Post.id).label("post_count"))187# .join(Post, isouter=True)188# .group_by(User.username)189# .order_by(func.count(Post.id).desc())190# )191# return session.execute(stmt).all()192#193#194# # ============================================================195# # Alembic Migration Commands (run in terminal)196# # ============================================================197# # alembic init migrations # initialize Alembic198# # alembic revision --autogenerate -m "create users and posts"199# # alembic upgrade head # apply all pending migrations200# # alembic downgrade -1 # rollback last migration201# # alembic history # show migration history202# # alembic current # show current revision
🏋️ Practice Exercise
Exercises:
Define SQLAlchemy 2.0 models for a blog system: `User`, `Post`, `Comment`, and `Tag` (many-to-many with Post). Use `Mapped[]` type annotations, `mapped_column()`, and proper relationships with cascade options. Create the tables with `Base.metadata.create_all(engine)`.
Write CRUD functions for the `Post` model: `create_post()`, `get_post()`, `update_post()`, `delete_post()`, and `list_posts(page, per_page)` with pagination. Use session context managers and handle the case where a post is not found.
Demonstrate the N+1 query problem: write a function that lists all users and their post titles using lazy loading. Enable SQL echo (`echo=True`) to see the query count. Then fix it using `joinedload()` and `selectinload()` and compare the query counts.
Set up Alembic for your project. Create an initial migration from your models, apply it, then add a `bio` column to the `User` model, generate a new migration, and apply it. Practice rolling back and re-applying migrations.
Write a function that performs a bulk insert of 10,000 records using `session.add_all()` and compare its performance to `session.execute(insert(Model).values(records))` using the core API. Time both approaches and explain the difference.
Implement a soft-delete pattern: add an `is_deleted` column and a `deleted_at` timestamp. Override the default query to exclude soft-deleted records using a custom `Session` class or event hooks.
⚠️ Common Mistakes
Not closing or properly scoping database sessions. A session that is never closed leaks connections from the pool. Always use
with SessionLocal() as session:or FastAPI'sDepends(get_db)with yield to ensure cleanup.Ignoring the N+1 query problem. Loading 100 users and then accessing
user.postsfor each triggers 101 queries. Usejoinedload()(single JOIN query) orselectinload()(two queries with IN clause) to eagerly load relationships.Confusing
session.flush()withsession.commit().flush()sends SQL to the database but stays inside the current transaction — a subsequent rollback undoes it.commit()finalizes the transaction permanently. Useflush()when you need generated IDs before committing.Using the old SQLAlchemy 1.x
Column()style instead of 2.0'sMapped[]+mapped_column(). The 2.0 style provides type safety, better IDE support, and aligns with Python's type hint ecosystem. Interviewers will notice if you use the legacy API.Hardcoding database URLs instead of reading from environment variables. Use
os.getenv('DATABASE_URL')or a settings management library likepydantic-settings. This prevents credentials from leaking into version control and allows different configs per environment.
💼 Interview Questions
🎤 Mock Interview
Practice a live interview for Databases & SQLAlchemy