Pivot Tables
Pivot tables in Pandas are a data summarization tool that perform operations such as aggregation, grouping, and reshaping of tabular data based on specific criteria, making it easier to derive meaningful insights.
Syntax
The general syntax for creating a pivot table in Pandas is:
pandas.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')
data
: The input DataFrame.values
: The column(s) to aggregate.index
: Column(s) to group data by (rows).columns
: Column(s) to group data by (columns).aggfunc
: The aggregation function(s) (default ismean
).fill_value
: Value to replaceNaN
s.margins
: Add subtotals (row/column totals).margins_name
: Name for row/column totals.dropna
: Whether to drop columns withNaN
values.
Example
Consider the following DataFrame:
import pandas as pddata = {'Region': ['North', 'South', 'East', 'West', 'North', 'South'],'Category': ['A', 'A', 'B', 'B', 'A', 'B'],'Sales': [100, 150, 200, 130, 120, 180]}df = pd.DataFrame(data)print(df)
The output for the above code will be as follows:
Region | Category | Sales |
---|---|---|
North | A | 100 |
South | A | 150 |
East | B | 200 |
West | B | 130 |
North | A | 120 |
South | B | 180 |
Pivoting Data
To summarize total sales by Region and Category:
pivot = pd.pivot_table(df, values='Sales', index='Region', columns='Category', aggfunc='sum', fill_value=0)print(pivot)
The output of the above code will be as follows:
Category | A | B |
---|---|---|
East | 0 | 200 |
North | 220 | 0 |
South | 150 | 180 |
West | 0 | 130 |
Adding Margins for Totals
To include row and column totals in the pivot table, set margins=True
:
pivot = pd.pivot_table(df, values='Sales', index='Region', columns='Category', aggfunc='sum', fill_value=0, margins=True)print(pivot)
The output of the above code will be as follows:
Category | A | B | All |
---|---|---|---|
East | 0 | 200 | 200 |
North | 220 | 0 | 220 |
South | 150 | 180 | 330 |
West | 0 | 130 | 130 |
All | 370 | 510 | 880 |
Aggregating with Multiple Functions
Multiple aggregation functions can be applied using a list:
pivot = pd.pivot_table(df, values='Sales', index='Region', columns='Category', aggfunc=['sum', 'mean'], fill_value=0)print(pivot)
The output of the above code will be as follows:
sum | mean | |||
---|---|---|---|---|
Region | A | B | A | B |
East | 0 | 200 | 0.0 | 200.0 |
North | 220 | 0 | 110.0 | 0.0 |
South | 150 | 180 | 150.0 | 180.0 |
West | 0 | 130 | 0.0 | 130.0 |
Codebyte Example
Run the following example to see how pivot tables can be used to aggregate Quantity
by Category
and Region
:
Contribute to Docs
- Learn more about how to get involved.
- Edit this page on GitHub to fix an error or make an improvement.
- Submit feedback to let us know how we can improve Docs.
Learn Python:Pandas on Codecademy
- Career path
Data Scientist: Machine Learning Specialist
Machine Learning Data Scientists solve problems at scale, make predictions, find patterns, and more! They use Python, SQL, and algorithms.Includes 27 CoursesWith Professional CertificationBeginner Friendly90 hours - Course
Learn Python 3
Learn the basics of Python 3.12, one of the most powerful, versatile, and in-demand programming languages today.With CertificateBeginner Friendly23 hours