Dataclasses & NamedTuples


📖 Concept

Dataclasses (Python 3.7+) reduce boilerplate for classes that mainly store data. The @dataclass decorator auto-generates __init__, __repr__, __eq__, and optionally __hash__, __lt__, etc.

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

This auto-generates __init__(self, x, y), __repr__, and __eq__.

Dataclass options:

Option | Default | Effect
frozen=True | False | Immutable (like a tuple)
order=True | False | Generate <, >, <=, >=
slots=True | False | Use __slots__ for memory efficiency (3.10+)
kw_only=True | False | All fields keyword-only (3.10+)

field() customizes individual fields:

  • default / default_factory — default values
  • repr=False — exclude from repr
  • compare=False — exclude from comparison
  • init=False — exclude from constructor
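As a rough illustration of how these field() options change the generated methods, here is a minimal sketch. The Account class and its field names are invented for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Account:  # hypothetical example class
    username: str
    password_hash: str = field(repr=False)              # hidden from repr
    login_count: int = field(default=0, compare=False)  # ignored by ==
    tags: list = field(default_factory=list)            # fresh list per instance
    display: str = field(init=False)                    # not an __init__ parameter

    def __post_init__(self):
        # Fields declared with init=False are usually filled in here.
        self.display = self.username.title()

a = Account("alice", "x1")
b = Account("alice", "x1", login_count=7)

print(a)                  # password_hash does not appear in the output
print(a == b)             # True: login_count is excluded from comparison
print(a.tags is b.tags)   # False: default_factory ran once per instance
print(a.display)          # Alice
```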

NamedTuple (from typing module) creates lightweight, immutable record types. They're tuples with named fields — great for function return values and simple data records.
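The function-return use case might look like the following sketch. MinMax and min_max are hypothetical names chosen for illustration; _replace() and _asdict() are standard methods on named tuples.

```python
from typing import NamedTuple

class MinMax(NamedTuple):  # hypothetical record type
    lowest: float
    highest: float

def min_max(values):
    """Return both extremes as a named record instead of a bare tuple."""
    return MinMax(min(values), max(values))

result = min_max([3.5, 1.0, 7.25])
print(result.lowest, result.highest)  # 1.0 7.25

# The result still behaves like a plain tuple.
lo, hi = result

# _replace() builds a modified copy; the original is unchanged.
widened = result._replace(highest=10.0)
print(widened)  # MinMax(lowest=1.0, highest=10.0)

# _asdict() converts the record to a mapping of field names to values.
as_mapping = result._asdict()
```

Callers can use result.lowest instead of remembering that index 0 is the minimum, while older code that unpacks the tuple keeps working.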

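__slots__ restricts an object to a fixed set of attributes. Instances don't have a per-instance __dict__, which lowers memory use for classes with many instances. Savings of 40-50% are commonly quoted, but the actual figure depends on the Python version and the number of fields. The trade-off is that you can't add arbitrary attributes.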
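A hand-written __slots__ class, without the dataclass decorator, could look like this minimal sketch. Vec2 is a hypothetical class used only for illustration.

```python
class Vec2:  # hypothetical example class
    # Only these attribute names are allowed on instances.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

v = Vec2(1.0, 2.0)
print(hasattr(v, "__dict__"))  # False: no per-instance dictionary

# Assigning an attribute that is not listed in __slots__ fails.
slot_error = None
try:
    v.z = 3.0
except AttributeError as exc:
    slot_error = exc
    print("AttributeError:", exc)
```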

Choosing between options:

Need | Use
Simple mutable data class | @dataclass
Immutable record | @dataclass(frozen=True) or NamedTuple
Dict key / set member | @dataclass(frozen=True) or NamedTuple
Memory-efficient | @dataclass(slots=True) or __slots__
Tuple compatibility | NamedTuple
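The "dict key / set member" row follows from how @dataclass handles hashing: with the default eq=True and frozen=False, the generated class is unhashable. A minimal sketch, using the hypothetical classes MutablePoint and FrozenPoint:

```python
from dataclasses import dataclass

@dataclass
class MutablePoint:  # hypothetical: default options
    x: int
    y: int

@dataclass(frozen=True)
class FrozenPoint:  # hypothetical: frozen, therefore hashable
    x: int
    y: int

# A regular dataclass defines __eq__ and sets __hash__ to None.
try:
    hash(MutablePoint(1, 2))
    mutable_hashable = True
except TypeError:
    mutable_hashable = False

# Equal frozen instances hash the same, so the set removes the duplicate.
seen = {FrozenPoint(1, 2), FrozenPoint(1, 2), FrozenPoint(3, 4)}

print(mutable_hashable)  # False
print(len(seen))         # 2
```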

💻 Code Example

# ============================================================
# Basic dataclass
# ============================================================
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    email: str
    age: int
    active: bool = True  # Default value

# Auto-generated __init__, __repr__, __eq__
user = User("Alice", "alice@email.com", 30)
print(user)  # User(name='Alice', email='alice@email.com', age=30, active=True)
print(user == User("Alice", "alice@email.com", 30))  # True

# ============================================================
# field() for customization
# ============================================================
@dataclass
class Order:
    order_id: str
    items: list = field(default_factory=list)  # Safe mutable default: new list per instance
    total: float = 0.0
    _internal_state: str = field(default="pending", repr=False, compare=False)

    def add_item(self, item, price):
        self.items.append(item)
        self.total += price

order = Order("ORD-001")
order.add_item("Widget", 9.99)
order.add_item("Gadget", 19.99)
print(order)
# Order(order_id='ORD-001', items=['Widget', 'Gadget'], total=...)
# total is approximately 29.98; binary floating point may print it as
# 29.979999999999997. Use decimal.Decimal for exact currency arithmetic.

# Each instance gets its own list (unlike the mutable default bug)
order2 = Order("ORD-002")
print(order2.items)  # [] (independent of order.items)

# ============================================================
# Frozen dataclass (immutable)
# ============================================================
@dataclass(frozen=True)
class Point:
    x: float
    y: float

    @property
    def distance_from_origin(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

p = Point(3, 4)
print(p.distance_from_origin)  # 5.0
# p.x = 10  # FrozenInstanceError: cannot assign to field 'x'

# Frozen dataclasses are hashable, so they can be dict keys and set members
points = {Point(0, 0): "origin", Point(1, 0): "unit x"}

# ============================================================
# Ordered dataclass
# ============================================================
@dataclass(order=True)
class Version:
    major: int
    minor: int
    patch: int

    def __str__(self):
        return f"{self.major}.{self.minor}.{self.patch}"

versions = [Version(2, 1, 0), Version(1, 9, 5), Version(2, 0, 1)]
# Printing a list uses each element's __repr__, so convert with str() explicitly.
print([str(v) for v in sorted(versions)])  # ['1.9.5', '2.0.1', '2.1.0']

# ============================================================
# Post-init processing
# ============================================================
@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)  # Computed, not passed to __init__

    def __post_init__(self):
        """Called after __init__. Use for validation and computed fields."""
        if self.width <= 0 or self.height <= 0:
            raise ValueError("Dimensions must be positive")
        self.area = self.width * self.height

r = Rectangle(4, 5)
print(r)  # Rectangle(width=4, height=5, area=20)

# ============================================================
# Dataclass with slots (Python 3.10+)
# ============================================================
@dataclass(slots=True)
class Particle:
    x: float
    y: float
    mass: float = 1.0

p = Particle(0, 0)
# p.color = "red"  # AttributeError: no __dict__, can't add attributes
# Typically uses noticeably less memory per instance than a regular
# dataclass; the exact saving depends on the Python version and field count.

# ============================================================
# NamedTuple: immutable record type
# ============================================================
from typing import NamedTuple

class Coordinate(NamedTuple):
    latitude: float
    longitude: float
    altitude: float = 0.0

loc = Coordinate(40.7128, -74.0060)
print(loc)  # Coordinate(latitude=40.7128, longitude=-74.006, altitude=0.0)
print(loc.latitude)  # 40.7128
print(loc[0])  # 40.7128 (tuple indexing)

# Unpacking
lat, lng, alt = loc
print(f"{lat}, {lng}")

# Immutable
# loc.latitude = 0  # AttributeError

# Can be dict key (hashable)
locations = {loc: "New York City"}

# ============================================================
# Comparison: dataclass vs NamedTuple vs dict
# ============================================================
import sys

@dataclass
class DataPoint:
    x: float
    y: float

@dataclass(slots=True)
class DataPointSlots:
    x: float
    y: float

class DataPointNT(NamedTuple):
    x: float
    y: float

# Memory comparison
# sys.getsizeof is shallow: it does not follow references. A regular
# dataclass instance keeps its fields in a separate __dict__, so that is
# added below for a fairer (still approximate) comparison.
dc = DataPoint(1.0, 2.0)
dcs = DataPointSlots(1.0, 2.0)
nt = DataPointNT(1.0, 2.0)
d = {"x": 1.0, "y": 2.0}

print(f"dataclass: {sys.getsizeof(dc) + sys.getsizeof(dc.__dict__)} bytes (instance + __dict__)")
print(f"dataclass+slots: {sys.getsizeof(dcs)} bytes")
print(f"namedtuple: {sys.getsizeof(nt)} bytes")
print(f"dict: {sys.getsizeof(d)} bytes")

🏋️ Practice Exercise

Exercises:

  1. Create a @dataclass for Employee with name, department, salary, and hire_date. Add a computed years_of_service property and implement ordering by salary.

  2. Build an immutable Color dataclass (frozen) with r, g, b fields (0-255). Add validation in __post_init__, a hex property, and a @classmethod factory from_hex("#FF0000").

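  3. Compare memory usage: create 100,000 instances of the same data using a regular class, @dataclass, @dataclass(slots=True), NamedTuple, and dict. Measure with sys.getsizeof, remembering that it is shallow (add the size of each instance's __dict__ where one exists), or use tracemalloc to measure total allocations.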

  4. Create a Config NamedTuple with sensible defaults. Show how to create modified copies with _replace().

  5. Build a dataclass inheritance hierarchy: Shape → Rectangle → Square. Handle the field ordering issue (fields with defaults must come after fields without).

  6. Implement a simple @dataclass-like decorator from scratch that auto-generates __init__ and __repr__ from class annotations.

⚠️ Common Mistakes

  • Using mutable defaults in dataclass fields: items: list = []. Use field(default_factory=list) instead. The dataclass decorator catches this at class-definition time and raises a ValueError.

  • Forgetting that @dataclass(frozen=True) prevents ALL attribute changes, not just on constructor fields. You can't add new attributes or modify computed ones. Use object.__setattr__(self, 'attr', value) in __post_init__ if needed.

  • Inheriting from a dataclass with defaults when the child has fields without defaults. This raises a TypeError, because fields without defaults must come before fields with defaults in MRO order. On Python 3.10+, declaring the fields keyword-only with kw_only=True avoids the problem.

  • Using NamedTuple when you need mutability. NamedTuples are immutable (tuple subclass). Use @dataclass for mutable records.

  • Not knowing about asdict() and astuple() from dataclasses module — they convert dataclass instances to dicts/tuples, which is useful for serialization.
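Several of the mistakes above can be reproduced in a few lines. The following is a minimal sketch; BadCart, Shape, Sized, and Circle are hypothetical classes made up for illustration.

```python
import math
from dataclasses import dataclass, field, asdict, astuple

# 1. A mutable default is rejected when the class is created.
mutable_default_error = None
try:
    @dataclass
    class BadCart:
        items: list = []
except ValueError as exc:
    mutable_default_error = exc
    print("ValueError:", exc)

# 2. A child field without a default after an inherited default fails.
@dataclass
class Shape:
    name: str = "shape"

ordering_error = None
try:
    @dataclass
    class Sized(Shape):
        width: float  # no default, but Shape.name has one
except TypeError as exc:
    ordering_error = exc
    print("TypeError:", exc)

# 3. A frozen dataclass can still fill a computed field in __post_init__
#    by calling object.__setattr__ directly.
@dataclass(frozen=True)
class Circle:
    radius: float
    area: float = field(init=False)

    def __post_init__(self):
        object.__setattr__(self, "area", math.pi * self.radius ** 2)

c = Circle(2.0)

# 4. asdict() and astuple() convert an instance for serialization.
circle_dict = asdict(c)
circle_tuple = astuple(c)
print(circle_dict)
print(circle_tuple)
```

Both conversion helpers recurse into nested dataclasses, lists, and dicts and copy the values, so they can be slow on large object graphs.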
