Python CSV: Read And Write CSV Files

CSV is short for comma-separated values, a common format for storing all kinds of data. Many tools offer an option to export data to CSV. Python’s built-in csv module lets us read and write CSV files without installing anything, and in this article, you’ll learn how to use it for both. In addition, we’ll look at how to write CSV files with NumPy and Pandas, since many people use these tools as well.

Table of Contents

1 Python’s CSV module
2 Python CSV vs. NumPy or Pandas
3 Reading and writing CSV with NumPy
4 Reading and writing CSV with Pandas

Python’s CSV module

Python has a CSV module built-in, so there’s no need to install anything. We’ll first look at Python’s built-in CSV module before we dive into using alternatives like NumPy and Pandas. In many cases, Python’s module will offer everything you need without requiring extra dependencies for your script.

It is good to know that if you plan to use the data with NumPy, you can use NumPy’s functionality instead. Similarly, Pandas has its read_csv function to read CSV directly into a DataFrame. Both libraries are covered later in this article, and, spoiler alert, they offer some advantages compared to Python’s built-in module, especially if you were planning on using NumPy or Pandas anyway.

Import the Python CSV module

For starters, let’s import the csv module. This couldn’t be simpler:

import csv

Read CSV files with Python

Now that we know how to import the CSV module, let’s see how we can use Python to open a CSV file and read the data from it. In the following example, we read a CSV file with names, ages, and countries and use print() to display each parsed line:
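A minimal sketch of such a script is shown below. The file name people.csv and its contents are assumptions made for illustration, and the snippet creates the file first so that it can run on its own:

```python
import csv

# Hypothetical sample data, written first so the example is self-contained
sample = "Name,Age,Country\nJohn Doe,30,United States\nJane Doe,28,Canada\n"
with open("people.csv", "w", newline="") as sample_file:
    sample_file.write(sample)

# Open the file with newline="" and let csv.reader parse it line by line
rows = []
with open("people.csv", newline="") as csv_file:
    csv_reader = csv.reader(csv_file)
    for row in csv_reader:
        rows.append(row)
        print(row)
```

Each row arrives as a list of strings, so the ages are the strings "30" and "28" until you convert them yourself.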

Let’s walk through this step-by-step and see what’s going on.

After importing the CSV module, we open the CSV file with Python’s open() function. There’s one peculiarity that might catch your eye: the newline='' argument to the open() function. This ensures that open() won’t try to convert newlines but instead returns them as-is. The csv module then handles the newlines itself. Without this argument, line breaks inside quoted fields may not be interpreted correctly when reading, and on platforms that use \r\n line endings an extra \r can end up in files you write (more on CSV dialects later!).

Once we have an open file, we use csv.reader() to parse the CSV file. It’s good to know that this reader will not read the entire file at once. It accepts any iterable object that yields strings and requests them one at a time. As you may know, a file object does not load a whole file into memory either; it reads the file in buffered chunks and hands out one line at a time. So this CSV reader can process large files without causing memory issues.

The call to csv.reader() itself returns an iterator as well; hence we can use a simple for-loop to iterate over the CSV file from here on.
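To make the iterator behavior concrete, here is a small sketch that feeds csv.reader() a plain list of strings instead of a file. The sample lines are made up for illustration:

```python
import csv

# Any iterable of strings works as input, not just an open file
lines = [
    "Name,Age,Country",
    "John Doe,30,United States",
    "Jane Doe,28,Canada",
]
csv_reader = csv.reader(lines)

# The reader is an iterator: next() pulls exactly one parsed row
header = next(csv_reader)
print(header)

# Iterating further continues where next() left off
data_rows = [row for row in csv_reader]
print(data_rows)
```

Calling next() once before the loop is a common way to separate the header row from the data rows.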

Write CSV with Python

Now that we know how to read CSV, let’s see how to write CSV in Python. As you may have guessed, there’s also a csv.writer() function that we can use to write to a file:

import csv

# Open the file in write mode
with open("output.csv", "w", newline="") as csv_file:
    # Create a writer object
    csv_writer = csv.writer(csv_file)

    # Write the data to the file
    csv_writer.writerow(["Name", "Age", "Country"])
    csv_writer.writerow(["John Doe", 30, "United States"])
    csv_writer.writerow(["Jane Doe", 28, "Canada"])

We open the output.csv file in write mode and create a writer object. Next, we use the writerow() method to write new rows. The writerow() method takes a list of values and writes them to a single row in the CSV file.
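If you already have all rows in a list, the writer also has a writerows() method that writes them in one call. The sketch below uses an in-memory io.StringIO buffer instead of a real file (an assumption made so the result can be printed directly) and includes a made-up name containing a comma to show how the writer quotes such fields:

```python
import csv
import io

# An in-memory text buffer stands in for a real file here
buffer = io.StringIO()
csv_writer = csv.writer(buffer)

# writerows() writes several rows in one call
csv_writer.writerows([
    ["Name", "Age", "Country"],
    ["Doe, John", 30, "United States"],  # this name contains a comma
])

output = buffer.getvalue()
print(output)
```

The field with the comma is wrapped in double quotes, so a reader can still tell it apart from the delimiter.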

Add to CSV

Adding extra data to an existing CSV file is similar to writing a new one. We just need to open the file in another mode: append mode. You can learn more about file modes in the article on Python files. As an example, we will append some more lines to the output.csv file from above:

import csv

# Open the file in append mode
with open("output.csv", "a", newline="") as csv_file:
    # Create a writer object
    csv_writer = csv.writer(csv_file)

    # Write the new data to the file
    csv_writer.writerow(["Joe Smith", 35, "United Kingdom"])
    csv_writer.writerow(["Mary Smith", 32, "France"])

Choosing a CSV dialect

You can choose a CSV dialect when working with the csv module in Python. A CSV dialect is a set of parameters that defines the specific format of a CSV file. This includes the character used to delimit fields, the character used to quote fields, and other formatting details.

The csv module provides some pre-defined dialects that you can use, such as excel, excel-tab, and unix. You can specify the dialect that you want to use when creating a writer object, like this:

import csv

with open("output.csv", "w", newline="") as csv_file:
    # Create a writer object, using the `excel` dialect
    csv_writer = csv.writer(csv_file, dialect="excel")
    ...

In this example, we create a writer object using the excel dialect. This tells the csv module to use the formatting conventions of the excel dialect when writing the data to the file.
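To see what the pre-defined dialects actually change, the following sketch writes the same row with each of them into an in-memory buffer and prints the raw result. The helper function render() is a hypothetical name introduced only for this illustration:

```python
import csv
import io

# List the dialects that are currently registered
print(csv.list_dialects())

def render(dialect_name):
    """Write one sample row with the given dialect and return the raw text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, dialect=dialect_name)
    writer.writerow(["John Doe", 30, "United States"])
    return buffer.getvalue()

excel_line = render("excel")
excel_tab_line = render("excel-tab")
unix_line = render("unix")

# repr() makes the delimiters and line endings visible
print(repr(excel_line))
print(repr(excel_tab_line))
print(repr(unix_line))
```

The excel dialect separates fields with commas and ends lines with \r\n, excel-tab uses tabs instead of commas, and unix quotes every field and ends lines with \n.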

Custom dialects

You can create your own custom dialect by defining the dialect parameters yourself. This can be useful if you need to write a CSV file in a specific format that is not supported by the pre-defined dialects.

To create a custom dialect, you can use the csv.register_dialect() function, like this:

import csv

# Register the custom dialect under the name "my_dialect"
# (register_dialect() returns None, so there is nothing to assign)
csv.register_dialect("my_dialect",
    delimiter=";",
    quotechar='"',
    quoting=csv.QUOTE_MINIMAL
)

# Open the file in write mode
with open("output.csv", "w", newline="") as csv_file:
    # Create a writer object, using the custom dialect
    csv_writer = csv.writer(csv_file, dialect="my_dialect")

    # Write the data to the file
    csv_writer.writerow(["Name", "Age", "Country"])
    csv_writer.writerow(["John Doe", 30, "United States"])
    csv_writer.writerow(["Jane Doe", 28, "Canada"])

In this example, we define a custom dialect called my_dialect, which uses a semicolon as the delimiter character and a double quote as the quote character. We then use this custom dialect when creating the writer object, which tells the csv module to use the formatting conventions of the my_dialect dialect when writing the data to the file.
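As a rough illustration of what these settings produce, the sketch below registers a dialect with the same parameters under the hypothetical name semicolon_demo and writes to an in-memory buffer, so the result can be inspected directly. One made-up field contains a semicolon to show when QUOTE_MINIMAL adds quotes:

```python
import csv
import io

# Same parameters as above, registered under a hypothetical name
csv.register_dialect(
    "semicolon_demo",
    delimiter=";",
    quotechar='"',
    quoting=csv.QUOTE_MINIMAL,
)

buffer = io.StringIO()
csv_writer = csv.writer(buffer, dialect="semicolon_demo")
csv_writer.writerow(["Name", "Age", "Country"])
# Only the field that contains the delimiter gets quoted
csv_writer.writerow(["Doe; John", 30, "United States"])

semicolon_output = buffer.getvalue()
print(semicolon_output)
```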

In addition to writing, you can also use a custom dialect when reading a CSV file. Here is an example of how you can use a custom dialect to read the previously created output.csv file:

import csv

# Register the custom dialect under the name "my_dialect"
# (register_dialect() returns None, so there is nothing to assign)
csv.register_dialect("my_dialect",
    delimiter=";",
    quotechar='"',
    quoting=csv.QUOTE_MINIMAL
)

# Open the file in read mode
with open("output.csv", "r", newline="") as csv_file:
    # Create a reader object, using the custom dialect
    csv_reader = csv.reader(csv_file, dialect="my_dialect")

    # Read the data from the file
    for row in csv_reader:
        # Process the data in the row
        print(row)
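For a one-off script, registering a dialect may be more than you need: csv.reader() and csv.writer() also accept formatting parameters such as delimiter directly as keyword arguments. A minimal sketch, using a made-up list of semicolon-separated lines in place of a file:

```python
import csv

# Made-up semicolon-separated input; an open file would work the same way
lines = [
    "Name;Age;Country",
    "John Doe;30;United States",
]

# Pass the delimiter directly instead of registering a dialect
csv_reader = csv.reader(lines, delimiter=";")
parsed = list(csv_reader)
print(parsed)
```

A registered dialect is mostly worth it when the same format is used in several places.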

Python CSV vs. NumPy or Pandas

In the following sections, we’ll look at how to read and write CSV files with NumPy and Pandas. These packages both have great CSV support, but for projects that are not built around these (large) packages, I recommend using the built-in Python CSV module for a few reasons:

Both NumPy and Pandas are extensive tools that can do a lot. The downside is they both need to be installed before you can use them.
Both packages will add ‘weight’ to your project. Sometimes it’s better to create lightweight scripts without dependencies since they are much easier to share and use.

However, if you are working with one of these libraries anyway, you’re better off using their CSV readers and writers since they tie in nicely with their specific data structures!

Reading and writing CSV with NumPy

Let’s start with NumPy. I’ve written a separate introduction to NumPy if you’re interested.

To write a CSV file with NumPy, you can use the numpy.savetxt() function, which allows you to save a NumPy array to a CSV file.

Here is an example of how you can use the numpy.savetxt() function to write a CSV file:

import numpy as np

# Create a NumPy array
data = np.array([["Name", "Age", "Country"],
                 ["John Doe", 30, "United States"],
                 ["Jane Doe", 28, "Canada"]])

# Save the array to a CSV file
np.savetxt("output.csv", data, delimiter=",", fmt="%s")

In this example, we first create a NumPy array called data that contains the data that we want to write to the CSV file. Since NumPy arrays can only hold one type of element, all elements will be converted to strings. We then use the numpy.savetxt() function to save the array to a CSV file. The numpy.savetxt() function takes the following arguments:

The name of the file to save the array to
The NumPy array to save
The delimiter character to use in the CSV file (in this case, a comma)
The format to use when writing the data (in this case, a string format, indicated by the %s format specifier)

Appending data with NumPy

NumPy will overwrite an existing file and thus remove any data already present in that file. If you need to append data to an existing CSV file, you can first open the existing file in append mode and then use savetxt to write to that file:

import numpy as np

# Open the file in append mode
with open("output.csv", "a") as csv_file:
    # Create a NumPy array with the new data
    new_data = np.array([["Joe Smith", 35, "United Kingdom"],
                         ["Mary Smith", 32, "France"]])

    # Append the data to the file
    np.savetxt(csv_file, new_data, delimiter=",", fmt="%s")
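Reading works in the opposite direction. One option is numpy.loadtxt(), sketched below under a few assumptions: the file name numpy_demo.csv is hypothetical, the snippet writes the file itself so it can run on its own, and all values are loaded as strings because the columns mix text and numbers:

```python
import numpy as np

# Write a small sample file first so the example is self-contained
data = np.array([["Name", "Age", "Country"],
                 ["John Doe", "30", "United States"],
                 ["Jane Doe", "28", "Canada"]], dtype=str)
np.savetxt("numpy_demo.csv", data, delimiter=",", fmt="%s")

# Read the file back as strings, skipping the header row
loaded = np.loadtxt("numpy_demo.csv", delimiter=",", dtype=str, skiprows=1)
print(loaded)

# Convert a single column to numbers once it is loaded
ages = loaded[:, 1].astype(int)
print(ages)
```

Note that loadtxt() splits on the delimiter without any CSV quoting rules in this form, so it is best suited to simple files where no field contains a comma.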

Reading and writing CSV with Pandas

To write CSV data using Pandas, you can use the pandas.DataFrame.to_csv() method to save a Pandas DataFrame to a CSV file.

Here is an example of how you can use the to_csv() method to write CSV data using Pandas:

import pandas as pd

# Create a Pandas DataFrame
data = pd.DataFrame([["John Doe", 30, "United States"],
                     ["Jane Doe", 28, "Canada"]])

# Save the DataFrame to a CSV file
data.to_csv("output.csv", index=False, header=False)

For a complete description of the method, I recommend the official Pandas documentation for DataFrame.to_csv().
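Reading a CSV file back into a DataFrame is done with the read_csv function mentioned earlier. The following is a minimal sketch: the file name pandas_demo.csv and the column names are assumptions for illustration, and the file is written with a header row this time so that read_csv can pick up the column names:

```python
import pandas as pd

# Write a small sample file first so the example is self-contained
df = pd.DataFrame({
    "Name": ["John Doe", "Jane Doe"],
    "Age": [30, 28],
    "Country": ["United States", "Canada"],
})
df.to_csv("pandas_demo.csv", index=False)

# Read the file back; the first line is used as the header by default
loaded = pd.read_csv("pandas_demo.csv")
print(loaded)

# Unlike the csv module, Pandas infers column types while reading
print(loaded.dtypes)
```

Because the Age column is parsed as numbers, you can do arithmetic on it right away without converting strings first.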
