How to Work with CSV Files in Python: A Beginner’s Guide
CSV (Comma-Separated Values) is a plain-text file format used to store structured tabular data. In a CSV file, each line represents a row in a table. Each line consists of comma-separated values, and each value corresponds to a column in that row. A CSV file is saved with a .csv extension.
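To make the row-and-column idea concrete, here is a minimal sketch that parses a small piece of CSV text held in memory. The sample text is made up for illustration. It also shows one detail worth knowing early: a value that itself contains a comma is usually wrapped in double quotes so that it still counts as a single column.

```python
import csv
import io

# Illustrative CSV text: a header line and two data lines.
# The second data line has a quoted value that contains a comma.
text = 'Name,Age,Department\nAlice,30,HR\n"Smith, Bob",24,IT\n'

# csv.reader accepts any iterable of lines, including an in-memory text buffer
rows = list(csv.reader(io.StringIO(text)))

for row in rows:
    print(row)
```

Each line becomes a list with one string per column, and the quoted value stays together as a single item.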
In this tutorial, we’ll cover how to read, write, and manipulate CSV files using Python.
What are CSV Files Used For?
CSV files are used extensively across applications and platforms. Some common uses are listed below:
Data Analysis: CSV files are a common way to store datasets and prepare them for analysis.
Spreadsheet Applications: CSV files allow easy data manipulation and sharing using various spreadsheet applications, such as Microsoft Excel and Google Sheets.
Customer Relationship Management (CRM): CSV files are used to manage and analyze data such as contact lists, campaign data, and customer records.
E-commerce and Inventory Management: CSV files help manage product listings and update inventory in bulk.
Imagine you are a Data Analyst at a startup. You regularly receive data from many sources. To manage the data, you start by storing it in a CSV file and processing it using a programming language such as Python.
Getting Started
Before you start working with CSV files in Python, it’s crucial to set up your environment properly.
If you don’t have Python installed on your system, install it before continuing. To work with CSV files in Python, we will use the csv module, which is part of the standard library (no need to install it). We import the csv module like so:
import csv
Now we have the environment ready to use the csv module. Let’s learn what operations we can perform.
Overview of Basic File Handling Operations Supported by Python for CSV Files
We can perform operations like adding a new column or filtering out the data using a Python program.
First, let’s learn the basic operations like opening, reading, and writing the CSV files.
How to Open CSV Files?
Let’s see how we can open a CSV file using Python. A CSV file can be opened using the built-in function open() with the appropriate mode (‘r’ for reading, ‘w’ for writing, or ‘a’ for appending). It’s better to use the with statement, which automatically closes the file even if an error occurs.
Let’s look at an example to understand how we can open CSV files with Python. In a CSV file named example.csv, we have the following data:
Name,Age,Department
Alice,30,HR
Bob,24,IT
Charlie,28,Finance
The following code opens the example.csv file in read mode using the open() function. Then we use the next() function to skip the header row and a for loop to iterate over each row in the CSV file:
import csv

# Open the CSV file in read mode
with open('example.csv', mode='r') as file:
    # Create a CSV reader object
    csv_reader = csv.reader(file)

    # Skip the header row (if there is one)
    next(csv_reader, None)

    # Iterate over each row in the CSV file
    for row in csv_reader:
        print(row)
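The three modes mentioned above can be compared side by side. The following is a minimal sketch, not the only way to do it: it works in a temporary directory so that no real files are touched, and the extra row for "Dana" is invented for illustration.

```python
import csv
import os
import tempfile

# Use a temporary directory so this sketch does not overwrite real files
path = os.path.join(tempfile.mkdtemp(), 'example.csv')

# 'w' creates the file, or replaces its contents if it already exists
with open(path, mode='w', newline='') as file:
    csv.writer(file).writerows([['Name', 'Age', 'Department'],
                                ['Alice', 30, 'HR']])

# 'a' adds to the end of the file and keeps what is already there
with open(path, mode='a', newline='') as file:
    csv.writer(file).writerow(['Dana', 35, 'Sales'])

# 'r' reads the file back
with open(path, mode='r', newline='') as file:
    rows = list(csv.reader(file))

print(rows)
```

Passing newline='' to open() is what the csv module documentation recommends for files used with csv readers and writers. Note that the numbers come back as strings, because a CSV file stores only text.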
How to Read CSV Files?
To see how we can read CSV files with Python, let’s look at an example. In a CSV file named example.csv, we have the following information:
Name,Age,Department
Alice,30,HR
Bob,24,IT
Charlie,28,Finance
We write a Python script to read this CSV file and print its contents like so:
import csv

# Open the CSV file
with open('example.csv', mode='r') as file:
    # Create a CSV reader object
    csv_reader = csv.reader(file)

    # Read the header
    header = next(csv_reader)
    print(f"Header: {header}")

    # Read each row of the CSV file
    for row in csv_reader:
        print(f"Row: {row}")

In this code, we create a CSV reader object using csv.reader(file), which reads a CSV file and returns each row as a list of strings.
Next, we read the header of the CSV file using next(csv_reader). The built-in next() function retrieves the next item from an iterator. A header in a CSV file is the first row that contains the names of the columns, providing a label for each column’s data.
Then, we loop through the remaining rows in the CSV file, printing each row as a list of strings. This approach allows us to easily read and process CSV data in Python.
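When the file has a header row, the csv module also provides csv.DictReader, which uses that header to return each row as a dictionary instead of a list. Here is a minimal sketch; it reads from an in-memory copy of the example data (an assumption made so the snippet runs without a file on disk).

```python
import csv
import io

# In-memory stand-in for example.csv
sample = io.StringIO(
    'Name,Age,Department\n'
    'Alice,30,HR\n'
    'Bob,24,IT\n'
    'Charlie,28,Finance\n'
)

# DictReader takes the first row as the column names
dict_reader = csv.DictReader(sample)
records = list(dict_reader)

# Values can now be looked up by column name rather than by position
for record in records:
    print(record['Name'], record['Department'])
```

Looking values up by name tends to be easier to read than row[0] or row[2], and it keeps working if the column order in the file changes.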
How to Write to a CSV File in Python?
To write data into a CSV file, we first open the file in write mode using the with statement and then create a writer object using csv.writer(file), which lets us write the data into the file.
Let’s look at an example. Suppose we want a CSV file named output.csv to contain the following information:
Name,Age,Department
Alice,30,HR
Bob,24,IT
Charlie,28,Finance
Here’s the code snippet that defines a list containing all the data to be written and opens the output.csv file.
Then, the csv.writer(file) object writes the data into the output.csv file: the list containing the header row and all subsequent data rows is passed as the argument to the writerows() method:
import csv

# Data to be written to CSV
data = [
    ["Name", "Age", "Department"],
    ["Alice", 30, "HR"],
    ["Bob", 24, "IT"],
    ["Charlie", 28, "Finance"]
]

# Open the CSV file in write mode
with open('output.csv', mode='w', newline='') as file:
    # Create a CSV writer object
    csv_writer = csv.writer(file)

    # Write the rows to the CSV file
    csv_writer.writerows(data)
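If the data is held as dictionaries rather than lists, csv.DictWriter is the matching tool. The sketch below is one possible way to use it. It writes to an in-memory buffer instead of output.csv (an assumption made so the result can be inspected directly), and the employees list is illustrative.

```python
import csv
import io

# Illustrative data: one dictionary per row
employees = [
    {'Name': 'Alice', 'Age': 30, 'Department': 'HR'},
    {'Name': 'Bob', 'Age': 24, 'Department': 'IT'},
]

# In-memory buffer standing in for a file opened with mode='w', newline=''
buffer = io.StringIO(newline='')

# fieldnames sets both the header text and the column order
dict_writer = csv.DictWriter(buffer, fieldnames=['Name', 'Age', 'Department'])
dict_writer.writeheader()
dict_writer.writerows(employees)

output = buffer.getvalue()
print(output)
```

To write to a real file, the buffer can be swapped for a file object from open('output.csv', mode='w', newline='').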
How to Manipulate and Analyze Data using CSV Files?
Data manipulation and analysis are important skills to have for any data analyst or scientist. In this section, we will take a look at basic data manipulation in Python.
The following are some basic data manipulation tasks we can perform on CSV data in Python:
Filtering Data
Filtering data involves selecting rows that meet certain criteria. For example, we might want to filter rows where the age of the employees is greater than 25.
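Here’s how we can filter the data for that. The rows are first loaded with csv.DictReader so that each row is a dictionary keyed by the column names:
import csv

# Load the rows of example.csv as dictionaries keyed by the header names
with open('example.csv', mode='r', newline='') as file:
    data = list(csv.DictReader(file))

filtered_data = [row for row in data if int(row['Age']) > 25]

# Display the filtered data
for row in filtered_data:
    print(row)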
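The same filter can also be applied when the rows are plain lists, as returned by csv.reader. In that case the column is selected by position, and the position can be looked up from the header. This is a minimal sketch using an in-memory copy of the example data; the variable names are illustrative.

```python
import csv
import io

# In-memory stand-in for example.csv
sample = io.StringIO(
    'Name,Age,Department\n'
    'Alice,30,HR\n'
    'Bob,24,IT\n'
    'Charlie,28,Finance\n'
)

reader = csv.reader(sample)

# Read the header and find which position holds the 'Age' column
header = next(reader)
age_index = header.index('Age')

# Keep only the rows where the age is greater than 25
older_than_25 = [row for row in reader if int(row[age_index]) > 25]

print(older_than_25)
```

The int() conversion matters here: values read from a CSV file are strings, and comparing a string with a number raises a TypeError.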
Adding a New Column
Adding a new column involves creating additional data based on the existing columns or with new data. For instance, we might want to add a column that calculates the age of each employee next year.
Here’s how you can add a new column, assuming data holds the rows as dictionaries (for example, as loaded with csv.DictReader):
# data is a list of row dictionaries, such as list(csv.DictReader(file))
# Add a new column 'age_next_year'
for row in data:
    row['age_next_year'] = int(row['Age']) + 1

# Display the data with the new column
for row in data:
    print(row)

This loop iterates over each row in the data list. For each row, it adds a new key-value pair where the key is ‘age_next_year’ and the value is the current age incremented by one. This creates a new column in the dataset.
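A new column added this way exists only in memory until the rows are written out again. The following sketch shows one way to do the full round trip: read the rows, add the column, and write the result with the extended header. It uses in-memory buffers in place of real files, which is an assumption made to keep the snippet self-contained.

```python
import csv
import io

# In-memory stand-in for the input file
source = io.StringIO('Name,Age,Department\nAlice,30,HR\nBob,24,IT\n')

reader = csv.DictReader(source)
rows = list(reader)

# Add the new column to every row
for row in rows:
    row['age_next_year'] = int(row['Age']) + 1

# The output header is the original header plus the new column name
fieldnames = reader.fieldnames + ['age_next_year']

# In-memory stand-in for the output file
result = io.StringIO(newline='')
writer = csv.DictWriter(result, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)

lines = result.getvalue().splitlines()
for line in lines:
    print(line)
```

The fieldnames list has to include the new key. By default, csv.DictWriter raises a ValueError when a row contains a key that is not listed in fieldnames.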
We’ve successfully used the csv module to perform some basic operations on the files.
Conclusion
We explored CSV files and the csv module in this tutorial. Let’s recap what we’ve discussed:
- CSV files provide a standardized format for storing tabular data, ensuring compatibility across various platforms and applications.
- Python’s csv module offers efficient tools for reading, writing, and manipulating CSV files, simplifying data management tasks.
- Data manipulation operations such as filtering rows and adding new columns can be easily performed using Python’s csv module.
- Python’s simplicity and versatility make it a powerful tool for handling CSV files, facilitating seamless integration into data analysis workflows.
These were some basics of working with CSV files using Python. For more advanced data analysis tasks, libraries like Pandas can be used alongside the csv module to extend what you can do with CSV data.
Author
'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'