Python:Pandas size()
The size() method in pandas returns the number of rows or elements in each group created by the groupby() function. It provides a quick way to determine group sizes without selecting a column or writing a custom aggregation.
Syntax
DataFrameGroupBy.size()
Parameters:
The size() method doesn’t take any parameters.
Return value:
The size() method returns a Series containing the size (row count) of each group created by groupby().
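The shape of the result can be sketched with a small made-up DataFrame (the column names and the label `n_employees` below are illustrative assumptions, not part of the API). By default the result is a Series indexed by group key. In recent pandas versions (1.1 or later), passing `as_index=False` to groupby() yields a DataFrame with a `size` column instead:

```python
import pandas as pd

# Hypothetical data, used only to illustrate the return type
df = pd.DataFrame({
    'Department': ['HR', 'IT', 'HR'],
    'Employee': ['John', 'Sara', 'Mike'],
})

# Default: a Series indexed by the group key
sizes = df.groupby('Department').size()

# as_index=False: a DataFrame with the key column and a 'size' column
sizes_df = df.groupby('Department', as_index=False).size()

# Alternative: turn the Series into a DataFrame with a custom column name
named = sizes.reset_index(name='n_employees')

print(type(sizes).__name__)
print(sizes_df)
print(named)
```

Using `reset_index(name=...)` is convenient when the count column needs a more descriptive name than `size`.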
Example 1: Counting Rows by Group
In this example, a DataFrame of employees is grouped by their department, and size() counts how many employees belong to each department:
```python
import pandas as pd

data = {
    'Department': ['HR', 'IT', 'HR', 'Finance', 'IT', 'Finance'],
    'Employee': ['John', 'Sara', 'Mike', 'Anna', 'Tom', 'Chris']
}
df = pd.DataFrame(data)

group_sizes = df.groupby('Department').size()
print(group_sizes)
```
The output of this code is:
```
Department
Finance    2
HR         2
IT         2
dtype: int64
```
Example 2: Using Multiple Grouping Columns
In this example, size() counts the number of members in each combination of team and shift within a dataset:
```python
import pandas as pd

data = {
    'Team': ['A', 'A', 'B', 'B', 'B', 'C'],
    'Shift': ['Day', 'Night', 'Day', 'Night', 'Day', 'Day'],
    'Name': ['John', 'Sara', 'Mike', 'Anna', 'Tom', 'Chris']
}
df = pd.DataFrame(data)

group_sizes = df.groupby(['Team', 'Shift']).size()
print(group_sizes)
```
The output of this code is:
```
Team  Shift
A     Day      1
      Night    1
B     Day      2
      Night    1
C     Day      1
dtype: int64
```
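Grouping by two columns yields a Series with a two-level index, so the counts can be pivoted into a table if that reads better. The sketch below reuses the data from this example and assumes that showing combinations that never occur (such as team C on the night shift) as zero is acceptable, which is what `fill_value=0` does:

```python
import pandas as pd

data = {
    'Team': ['A', 'A', 'B', 'B', 'B', 'C'],
    'Shift': ['Day', 'Night', 'Day', 'Night', 'Day', 'Day'],
    'Name': ['John', 'Sara', 'Mike', 'Anna', 'Tom', 'Chris']
}
df = pd.DataFrame(data)

group_sizes = df.groupby(['Team', 'Shift']).size()

# Move the inner index level (Shift) into columns;
# missing Team/Shift combinations become 0 instead of NaN
table = group_sizes.unstack(fill_value=0)
print(table)
```

Without `fill_value`, the missing combinations would appear as NaN and the counts would be converted to floats.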
Codebyte Example: Counting Transactions Per Product
In this example, size() is used to count how many sales transactions occurred for each product in a store dataset:
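A minimal sketch of such a count follows. The store dataset is hypothetical: the `Product` and `Amount` columns and all their values are made up for illustration.

```python
import pandas as pd

# Hypothetical sales data: one row per transaction
sales = pd.DataFrame({
    'Product': ['Laptop', 'Mouse', 'Laptop', 'Keyboard', 'Mouse', 'Laptop'],
    'Amount': [1200, 25, 1150, 75, 30, 1300],
})

# Count the rows (transactions) recorded for each product
transactions_per_product = sales.groupby('Product').size()

# Show the most frequently sold products first
print(transactions_per_product.sort_values(ascending=False))
```

Because size() counts rows, the `Amount` values play no role in the result; only the number of rows per product matters.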
Frequently Asked Questions
1. What is the pandas groupby().size() method?
groupby().size() returns the number of rows in each group created by groupby().
2. What is the purpose of groupby() in pandas?
groupby() splits data into groups based on selected column values to enable aggregation and summarization.
3. What does NaN stand for in pandas?
NaN stands for Not a Number and indicates missing or undefined data.
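Missing values matter for size() because it counts rows rather than non-null values. The sketch below contrasts size() with count() on a made-up `Score` column that contains one NaN; the data and column names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: team A has one missing score
df = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B'],
    'Score': [10.0, np.nan, 7.5, 8.0],
})

grouped = df.groupby('Team')['Score']

row_counts = grouped.size()        # counts every row, NaN included
non_null_counts = grouped.count()  # counts only non-missing values

print(row_counts)
print(non_null_counts)
```

For team A, size() reports both rows while count() reports only the one row with a score.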