How to Work with CSV Files in Python: A Beginner’s Guide

A complete guide on working with CSV files in Python including read, write and basic manipulation operations

CSV, short for Comma-Separated Values, is a plain text file format used to store structured tabular data. In a CSV file, each line represents a row in a table. Each line consists of comma-separated values, and each value represents a column in that row. A CSV file is saved with a .csv extension.

In this tutorial, we’ll cover how to read, write, and manipulate CSV files using Python.

What are CSV Files Used For?

CSV files are extensively used across various applications and platforms. Some of the most common uses of CSV files are listed below:

Data Analysis: CSV files are used to store datasets so they can be loaded, manipulated, and analyzed.

Spreadsheet Applications: CSV files allow easy data manipulation and sharing using various spreadsheet applications, such as Microsoft Excel and Google Sheets.

Customer Relationship Management (CRM): CSV files are used to manage and analyze customer data, such as contact lists and campaign data.

E-commerce and Inventory Management: CSV files help manage product listings and update inventory in bulk.

Imagine you are a Data Analyst at a startup. You regularly receive data from many sources. To manage the data, you start by storing it in a CSV file and processing it using a programming language such as Python.

Getting Started

Before you start working with CSV files in Python, it’s crucial to set up your environment properly.

If you don’t have Python installed on your system, download and install it from the official Python website. To work with CSV files in Python, we will use the csv module, which is part of the standard library (no need to install it). We import the csv module like so:

import csv

Now we have the environment ready to use the csv module. Let’s learn what operations we can perform.
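As a quick check that the module works, here is a minimal sketch that parses a small CSV string held in memory instead of a file on disk. The sample text and the variable names are made up for illustration; io.StringIO simply lets the string behave like an open text file.

```python
import csv
import io

# Hypothetical sample data: one header line and two data lines
sample_text = "Name,Age,Department\nAlice,30,HR\nBob,24,IT\n"

# csv.reader accepts any iterable of lines, including an in-memory text buffer
rows = list(csv.reader(io.StringIO(sample_text)))

# Each line becomes one row; each comma-separated value becomes one column
for row in rows:
    print(row)
```

Each printed row is a list of strings, one string per column.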

Overview of Basic File Handling Operations Supported by Python for CSV Files

We can perform operations like adding a new column or filtering out the data using a Python program.

First, let’s learn the basic operations like opening, reading, and writing the CSV files.

How to Open CSV Files?

Let’s see how we can open a CSV file using Python. A CSV file can be opened using the built-in open() function with the appropriate mode (‘r’ for reading, ‘w’ for writing, or ‘a’ for appending). It’s best to use the with statement, which automatically closes the file even if an error occurs.

Let’s look at an example to understand how we can open CSV files with Python. In a CSV file named example.csv, we have the following data:

Name,Age,Department
Alice,30,HR
Bob,24,IT
Charlie,28,Finance

The following code opens the example.csv file in read mode using the open() function.

Then we use the next() function to skip the header row and a for loop to iterate over each row in the CSV file:

import csv
# Open the CSV file in read mode (newline='' is recommended for csv files)
with open('example.csv', mode='r', newline='') as file:
    # Create a CSV reader object
    csv_reader = csv.reader(file)
    # Skip the header row (if there is one)
    next(csv_reader, None)
    # Iterate over each row in the CSV file
    for row in csv_reader:
        print(row)
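To see that the with statement really does close the file, we can inspect the file object’s closed attribute. The sketch below is only an illustration: it first creates a small throwaway file in a temporary folder (the contents and the variable names are assumptions) so that it runs without an existing example.csv.

```python
import os
import tempfile

# Create a small throwaway CSV file (hypothetical contents) to open
folder = tempfile.mkdtemp()
path = os.path.join(folder, "example.csv")
with open(path, mode="w", newline="") as file:
    file.write("Name,Age,Department\nAlice,30,HR\n")

# Inside the with block the file is still open
with open(path, mode="r", newline="") as file:
    open_inside = not file.closed
    first_line = file.readline()

# After the block ends, Python has already closed the file for us
closed_after = file.closed
print(open_inside, closed_after)
```

Both values are True: the file is open inside the block and closed once the block ends, with no explicit call to close().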

How to Read CSV Files?

To see how we can read CSV files with Python, let’s look at an example. In a CSV file named example.csv, we have the following information:

Name,Age,Department
Alice,30,HR
Bob,24,IT
Charlie,28,Finance

We write a Python script to read this CSV file and print its contents like so:

import csv
# Open the CSV file (newline='' is recommended for csv files)
with open('example.csv', mode='r', newline='') as file:
    # Create a CSV reader object
    csv_reader = csv.reader(file)
    # Read the header
    header = next(csv_reader)
    print(f"Header: {header}")
    # Read each row of the CSV file
    for row in csv_reader:
        print(f"Row: {row}")

In this code, we create a CSV reader object using csv.reader(file), which reads a CSV file and returns each row as a list of strings.

Next, we read the header of the CSV file using next(csv_reader). The built-in next() function retrieves the next item from an iterator, which here is the first row. A header in a CSV file is the first row that contains the names of the columns, providing a label for each column’s data.

Then, we loop through the remaining rows in the CSV file, printing each row as a list of strings. This approach allows us to easily read and process CSV data in Python.
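Because csv.reader returns every value as a string, numeric columns need to be converted before we can do arithmetic with them. The following is a minimal sketch of that step, assuming the same rows as example.csv but held in an in-memory string so that it runs on its own. The names age_index, ages, and average_age are chosen for illustration.

```python
import csv
import io

# Same rows as example.csv, held in memory so the sketch is self-contained
sample_text = "Name,Age,Department\nAlice,30,HR\nBob,24,IT\nCharlie,28,Finance\n"

csv_reader = csv.reader(io.StringIO(sample_text))
header = next(csv_reader)

# Find the position of the Age column from the header
age_index = header.index("Age")

# Every value arrives as a string, so convert it before doing arithmetic
ages = [int(row[age_index]) for row in csv_reader]
average_age = sum(ages) / len(ages)
print(ages, average_age)
```

Looking up the column position from the header avoids hard-coding an index that would break if the columns were reordered.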

How to Write to a CSV File in Python?

To write data to a CSV file, we first open the file in write mode using the with statement and then create a writer object with csv.writer(file), which lets us write rows to the file.

To make this concrete, let’s look at an example. We want a CSV file named output.csv to contain the following information:

Name,Age,Department
Alice,30,HR
Bob,24,IT
Charlie,28,Finance

Here’s the code snippet that defines a list containing all the data to be written and opens the output.csv file in write mode.

Then, the writer object created by csv.writer(file) writes the data into output.csv. We pass the list containing the header row and all subsequent data rows to the writerows() method:

import csv
# Data to be written to CSV
data = [
    ["Name", "Age", "Department"],
    ["Alice", 30, "HR"],
    ["Bob", 24, "IT"],
    ["Charlie", 28, "Finance"]
]
# Open the CSV file in write mode
with open('output.csv', mode='w', newline='') as file:
    # Create a CSV writer object
    csv_writer = csv.writer(file)
    # Write the rows to the CSV file
    csv_writer.writerows(data)
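Write mode replaces whatever the file already contained. To add rows to an existing file instead, we can use the ‘a’ (append) mode mentioned earlier. The sketch below is an illustration under a few assumptions: it works in a temporary folder so that no real file is touched, and the extra row for "Dana" is made-up data.

```python
import csv
import os
import tempfile

# Work in a temporary folder so no real file is touched
folder = tempfile.mkdtemp()
path = os.path.join(folder, "output.csv")

# 'w' creates the file (or overwrites it) with the starting rows
with open(path, mode="w", newline="") as file:
    csv.writer(file).writerows([["Name", "Age", "Department"], ["Alice", 30, "HR"]])

# 'a' keeps the existing rows and adds the new one at the end
with open(path, mode="a", newline="") as file:
    csv.writer(file).writerow(["Dana", 35, "Sales"])  # hypothetical new employee

# 'r' reads everything back so we can confirm what was saved
with open(path, mode="r", newline="") as file:
    rows = list(csv.reader(file))

print(rows)
```

Note that the numbers we wrote come back as strings, because a CSV file stores only text.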

How to Manipulate and Analyze Data using CSV Files?

Data manipulation and analysis are important skills to have for any data analyst or scientist. In this section, we will take a look at basic data manipulation in Python.

The following are some basic data manipulation tasks we can perform on CSV data in Python:

Filtering Data

Filtering data involves selecting rows that meet certain criteria. For example, we might want to filter rows where the age of the employees is greater than 25.

Here’s how we can filter the data for that:

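import csv
# Load each row as a dictionary keyed by the column names
with open('example.csv', newline='') as file:
    data = list(csv.DictReader(file))
# Keep only the rows where Age is greater than 25
filtered_data = [row for row in data if int(row['Age']) > 25]
for row in filtered_data:
    print(row)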
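For a version that runs without a file on disk, the sketch below applies the same filter to rows parsed from an in-memory string with csv.DictReader, which yields each row as a dictionary keyed by the header names. The sample text mirrors example.csv, and the names variable is added only for illustration.

```python
import csv
import io

# Same rows as example.csv, held in memory so the sketch is self-contained
sample_text = "Name,Age,Department\nAlice,30,HR\nBob,24,IT\nCharlie,28,Finance\n"

# DictReader uses the first line as the keys for every following row
data = list(csv.DictReader(io.StringIO(sample_text)))
print(data[0])

# Age is read as a string, so convert it before comparing
filtered_data = [row for row in data if int(row["Age"]) > 25]
names = [row["Name"] for row in filtered_data]
print(names)
```

Only Alice and Charlie pass the filter, since Bob’s age of 24 is not greater than 25.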

Adding a New Column

Adding a new column involves creating additional data based on the existing columns or with new data. For instance, we might want to add a column that calculates the age of each employee next year.

Here’s how you can add a new column:

# Add a new column 'age_next_year'
# (data is a list of dictionaries, such as rows loaded with csv.DictReader)
for row in data:
    row['age_next_year'] = int(row['Age']) + 1
# Display the data with the new column
for row in data:
    print(row)

This loop iterates over each row in the data list. In each row, it adds a new key-value pair where the key is ‘age_next_year’ and the value is the current age incremented by one. This creates a new column in the dataset.
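The new column exists only in memory until it is written out. One possible way to save it is csv.DictWriter, sketched below under a few assumptions: the rows are a small hand-written list of dictionaries, and the output goes to an in-memory buffer rather than a real file, so the example leaves nothing behind. The lineterminator argument is set to "\n" only to keep the resulting text easy to read.

```python
import csv
import io

# Hypothetical rows, shaped like the dictionaries used in the loop above
data = [
    {"Name": "Alice", "Age": "30", "Department": "HR"},
    {"Name": "Bob", "Age": "24", "Department": "IT"},
]
for row in data:
    row["age_next_year"] = int(row["Age"]) + 1

# fieldnames fixes the column order, including the new column
fieldnames = ["Name", "Age", "Department", "age_next_year"]
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(data)

output_text = buffer.getvalue()
print(output_text)
```

To write to a real file instead, the buffer could be replaced with a file opened in write mode with newline=''.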

We’ve successfully used the csv module to perform some basic operations on the files.

Conclusion

We explored CSV files and the csv module in this tutorial. Let’s recap what we’ve discussed:

  • CSV files provide a standardized format for storing tabular data, ensuring compatibility across various platforms and applications.
  • Python’s csv module offers efficient tools for reading, writing, and manipulating CSV files, simplifying data management tasks.
  • Data manipulation operations such as filtering rows and adding new columns can be easily performed using Python’s csv module.
  • Python’s simplicity and versatility make it a powerful tool for handling CSV files, facilitating seamless integration into data analysis workflows.

These were some basics of working with CSV files using Python. For more advanced data analysis tasks, libraries like Pandas can be integrated with Python to enhance CSV file processing capabilities.

Author

Codecademy Team

'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'
