A Python data class is a regular Python class that has the @dataclass decorator. It is specifically designed to hold data. Since version 3.7, Python offers data classes through a built-in module called dataclasses, from which you import the decorator. Data classes have several advantages over regular Python classes, which we’ll explore in this article. We’ll also look at example code and a couple of common operations you might want to perform with data classes.
The advantage of using data classes
Why should you use a data class instead of a regular Python class? First, let’s look at some of the advantages a Python data class has to offer.
Requires a minimal amount of code
The @dataclass decorator adds a lot of functionality to a class without adding any visible code. This allows your data class to be very compact while still offering many useful features. All you need to do is define the fields that hold your data. You don’t need to write any methods yourself.
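To get a feel for how much code the decorator saves, here is a rough sketch of a hand-written class that behaves much like a two-field data class. The name ManualCard is made up for illustration, and the methods that @dataclass really generates differ in detail.

```python
from dataclasses import dataclass


class ManualCard:
    """Hand-written approximation of what @dataclass generates."""

    def __init__(self, rank: str, suit: str):
        self.rank = rank
        self.suit = suit

    def __repr__(self):
        return f"ManualCard(rank={self.rank!r}, suit={self.suit!r})"

    def __eq__(self, other):
        # Only compare with instances of exactly the same class.
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.rank, self.suit) == (other.rank, other.suit)


@dataclass
class Card:
    # The decorator generates __init__, __repr__ and __eq__ from these fields.
    rank: str
    suit: str


print(ManualCard("Q", "hearts"))  # ManualCard(rank='Q', suit='hearts')
print(Card("Q", "hearts"))        # Card(rank='Q', suit='hearts')
```

Both classes can be constructed, printed, and compared in the same way, but the data class needs only the two field declarations.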
Comparison
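Two instances of a Python data class can be compared with == because the so-called dunder method __eq__ is implemented automatically. The generated method compares the fields of both objects, in the order in which they were defined, and only treats objects of the same class as equal. In general, any Python object that implements this special method can be compared to other objects with ==.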
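A small sketch of how the generated comparison behaves. The Tile class is a hypothetical second class, added only to show that two different data classes with identical fields do not compare as equal.

```python
from dataclasses import dataclass


@dataclass
class Card:
    rank: str
    suit: str


@dataclass
class Tile:  # hypothetical class with the same fields as Card
    rank: str
    suit: str


print(Card("Q", "hearts") == Card("Q", "hearts"))  # True: all fields match
print(Card("Q", "hearts") == Card("K", "hearts"))  # False: rank differs
print(Card("Q", "hearts") == Tile("Q", "hearts"))  # False: different classes
```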
Printing a data class
Similarly, because __repr__ is implemented, you can print data class instances and get a readable representation that shows the class name and the value of each field. This is especially useful for debugging.
Data classes require type hints
Data classes are built around the new(ish) type system Python offers. Every field is declared with a type hint, so you essentially state the type of data that should be stored in each variable. Combined with a static type checker or an IDE, type hints reduce the chance of bugs and unexpected behavior in your code.
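One caveat is worth knowing: the type hints are not checked when your program runs. The sketch below, which reuses a Card class like the one in the next section, shows that a value of the wrong type is accepted without an error. Catching such mistakes is the job of an external type checker, such as mypy.

```python
from dataclasses import dataclass, fields


@dataclass
class Card:
    rank: str
    suit: str


# rank is annotated as str, but Python does not enforce this at runtime.
card = Card(10, "hearts")
print(card)  # Card(rank=10, suit='hearts')

# The declared fields can be inspected with dataclasses.fields().
print([f.name for f in fields(Card)])  # ['rank', 'suit']
```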
Python data class example
Here’s an example of a data class at work:
    from dataclasses import dataclass

    @dataclass
    class Card:
        rank: str
        suit: str

    card1 = Card("Q", "hearts")
    card2 = Card("Q", "hearts")

    print(card1 == card2)  # True
    print(card1.rank)      # Q
    print(card1)           # Card(rank='Q', suit='hearts')
Default values
A data class can have default values. Assigning default values is as simple as assigning a value to a variable. For example, to make our Card class have a default value of Queen of hearts, we can do as follows:
    from dataclasses import dataclass

    @dataclass
    class Card:
        rank: str = 'Q'
        suit: str = 'hearts'
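As a short illustration of how these defaults behave, the sketch below creates cards with no arguments, with one positional argument, and with a keyword argument.

```python
from dataclasses import dataclass


@dataclass
class Card:
    rank: str = 'Q'
    suit: str = 'hearts'


print(Card())               # Card(rank='Q', suit='hearts')
print(Card('K'))            # Card(rank='K', suit='hearts')
print(Card(suit='spades'))  # Card(rank='Q', suit='spades')
```

Keep in mind that fields without a default value must be defined before fields that have one. Otherwise, Python raises a TypeError when the class is defined.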
Converting a data class to JSON
A common use case is to convert your nicely structured data class to JSON, for example if you want to export the data to a database or send it to the browser. The bad news here: there’s no built-in, one-step way to convert a data class to JSON. For simple fields, you can combine dataclasses.asdict() with json.dumps(), but that approach fails as soon as your class contains data types the json module can’t serialize by itself (like date objects).
The good news is that there is a Python package called dataclasses-json that simplifies the task. However, it requires an extra decorator. You’ll need to install the package with the pip install command or something like Pipenv, preferably inside a virtual environment. For example:
$ pip install dataclasses-json
Here’s an example of how you can use the package:
    from dataclasses import dataclass
    from dataclasses_json import dataclass_json

    @dataclass_json
    @dataclass
    class Card:
        rank: str = 'Q'
        suit: str = 'hearts'

    card = Card()
    print(card.to_json())
This results in the following output:
{"rank": "Q", "suit": "hearts"}
Another method is to use Python inheritance and inherit from the JSONEncoder class to create your own custom encoder. The advantage here is that you don’t need to install an external package. You can learn how to do this in this blog post.
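A minimal sketch of such a custom encoder could look as follows. It uses only the standard library. The class name DataclassEncoder and the printed field are made up for this example, and the encoder only knows about data classes and date objects, so treat it as a starting point and not as a complete solution.

```python
import dataclasses
import json
from dataclasses import dataclass
from datetime import date


class DataclassEncoder(json.JSONEncoder):
    """Hypothetical encoder that handles data classes and date objects."""

    def default(self, o):
        # default() is only called for objects json can't serialize itself.
        if dataclasses.is_dataclass(o) and not isinstance(o, type):
            return dataclasses.asdict(o)
        if isinstance(o, date):
            return o.isoformat()
        return super().default(o)


@dataclass
class Card:
    rank: str = 'Q'
    suit: str = 'hearts'
    printed: date = date(2024, 1, 31)  # hypothetical field for this example


print(json.dumps(Card(), cls=DataclassEncoder))
# {"rank": "Q", "suit": "hearts", "printed": "2024-01-31"}
```

The encoder first turns the data class into a dictionary, after which the json module calls default() again for the date it finds inside that dictionary.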
Keep learning
- The Python attrs package has an advanced version of the native Python data class
- The official documentation on Python.org
- How to return multiple values in Python