Multi-Indexing
Multi-indexing in Pandas refers to using multiple index levels on rows and/or columns to organize data hierarchically. It supports operations such as grouping, slicing, and reshaping, which is especially useful for multi-dimensional datasets.
Creating a Multi-Index
A multi-index can be created directly when constructing a DataFrame or applied to an existing DataFrame.
From a List of Tuples
The syntax is as follows:
pd.MultiIndex.from_tuples(tuples, names=None)
- tuples: List of tuples, where each tuple represents a multi-index entry.
- names: List of strings representing the names of the index levels (default is None).
In this example, a DataFrame is created with a multi-index built from a list of tuples, where each tuple is one hierarchical key:
    import pandas as pd

    data = {
        'Sales': [200, 300, 400, 500],
        'Profit': [50, 80, 120, 150]
    }

    # Each tuple is one (Store, Quarter) key in the row index
    index = pd.MultiIndex.from_tuples(
        [('Store A', 'Q1'), ('Store A', 'Q2'), ('Store B', 'Q1'), ('Store B', 'Q2')],
        names=['Store', 'Quarter']
    )

    df = pd.DataFrame(data, index=index)
    print(df)
The output will be as follows:
                     Sales  Profit
    Store   Quarter
    Store A Q1         200      50
            Q2         300      80
    Store B Q1         400     120
            Q2         500     150
From a Groupby Operation
The syntax will be as follows:
DataFrame.groupby([columns]).aggregate(func)
- columns: List of columns to group by.
- func: Aggregation to apply to each group, such as 'sum' or a callable.
In this example, a DataFrame is grouped by two columns (Category and Subcategory), which produces a multi-index on the grouped result:
    import pandas as pd

    df = pd.DataFrame({
        'Category': ['A', 'A', 'B', 'B'],
        'Subcategory': ['X', 'Y', 'X', 'Y'],
        'Value': [10, 20, 30, 40]
    })

    # Grouping by two columns yields a two-level row index
    grouped = df.groupby(['Category', 'Subcategory']).sum()
    print(grouped)
The output will look like this:
                          Value
    Category Subcategory
    A        X               10
             Y               20
    B        X               30
             Y               40
Accessing Multi-Index Data
Access Using .loc
The syntax will be as follows:
DataFrame.loc[(level_1, level_2)]
- level_1, level_2: Keys corresponding to specific levels in the index hierarchy.
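As a minimal sketch of .loc on a multi-index, the snippet below rebuilds the Store/Quarter DataFrame used earlier (the values are sample data) and shows a full key, a partial key, and a key combined with a column label:

```python
import pandas as pd

# Sample data mirroring the earlier Store/Quarter example
index = pd.MultiIndex.from_tuples(
    [('Store A', 'Q1'), ('Store A', 'Q2'), ('Store B', 'Q1'), ('Store B', 'Q2')],
    names=['Store', 'Quarter'],
)
df = pd.DataFrame(
    {'Sales': [200, 300, 400, 500], 'Profit': [50, 80, 120, 150]},
    index=index,
)

# A full key (one value per level) selects a single row as a Series
row = df.loc[('Store A', 'Q2')]
print(row)

# A partial key (outer level only) returns every row under that key
store_b = df.loc['Store B']
print(store_b)

# A row key plus a column label selects a single value
sales = df.loc[('Store B', 'Q1'), 'Sales']
print(sales)
```

With a partial key, the matched outer level is dropped from the result, so store_b is indexed by Quarter only.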
Slicing with .xs
The syntax will be as follows:
DataFrame.xs(key, level=level_name)
- key: Value to slice on.
- level_name: The level of the index to slice on.
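The following sketch, again using sample Store/Quarter data, shows how .xs can take a cross-section on an inner level, which is awkward to express with a plain .loc key:

```python
import pandas as pd

# Sample data mirroring the earlier Store/Quarter example
index = pd.MultiIndex.from_tuples(
    [('Store A', 'Q1'), ('Store A', 'Q2'), ('Store B', 'Q1'), ('Store B', 'Q2')],
    names=['Store', 'Quarter'],
)
df = pd.DataFrame({'Sales': [200, 300, 400, 500]}, index=index)

# Cross-section on the inner level: all Q1 rows, indexed by Store
q1 = df.xs('Q1', level='Quarter')
print(q1)

# Without `level`, the key is matched against the outermost level
store_a = df.xs('Store A')
print(store_a)

# drop_level=False keeps the selected level in the result's index
q1_kept = df.xs('Q1', level='Quarter', drop_level=False)
print(q1_kept)
```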
Resetting and Setting Multi-Index
Resetting Multi-Index
The syntax will be as follows:
DataFrame.reset_index(level=None, inplace=False)
- level: The index level(s) to reset. If None, all levels are reset.
- inplace: If True, modifies the DataFrame in place (default is False).
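A short sketch of both cases, using sample Store/Quarter data: resetting every level, and resetting only one level while keeping the other as the index.

```python
import pandas as pd

# Sample data mirroring the earlier Store/Quarter example
index = pd.MultiIndex.from_tuples(
    [('Store A', 'Q1'), ('Store A', 'Q2'), ('Store B', 'Q1'), ('Store B', 'Q2')],
    names=['Store', 'Quarter'],
)
df = pd.DataFrame({'Sales': [200, 300, 400, 500]}, index=index)

# Reset all levels: Store and Quarter become ordinary columns
flat = df.reset_index()
print(flat)

# Reset only the Quarter level: Store stays as the row index
partial = df.reset_index(level='Quarter')
print(partial)
```

Because inplace defaults to False, df itself is unchanged and each call returns a new DataFrame.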
Setting Multi-Index
The syntax will be as follows:
DataFrame.set_index([column_1, column_2], inplace=False)
- column_1, column_2: Columns to set as multi-index levels.
- inplace: If True, modifies the DataFrame in place (default is False).
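As an illustration, the sketch below starts from a flat DataFrame (the column names and values are sample data) and promotes two of its columns to index levels:

```python
import pandas as pd

# A flat DataFrame with sample data; Store and Quarter are ordinary columns
flat = pd.DataFrame({
    'Store': ['Store A', 'Store A', 'Store B', 'Store B'],
    'Quarter': ['Q1', 'Q2', 'Q1', 'Q2'],
    'Sales': [200, 300, 400, 500],
})

# Passing a list of columns builds a multi-index, outermost level first
indexed = flat.set_index(['Store', 'Quarter'])
print(indexed)
print(indexed.index.names)
```

The columns used as index levels are removed from the data columns, so only Sales remains in indexed.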
Sorting Multi-Index
The syntax will be as follows:
DataFrame.sort_index(level=None, ascending=True, inplace=False)
- level: The index level(s) to sort by. If None, sorts by all levels.
- ascending: If True, sorts in ascending order; otherwise, in descending order.
- inplace: If True, modifies the DataFrame in place (default is False).
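A minimal sketch of sorting, assuming a deliberately unordered Store/Quarter index built from sample data:

```python
import pandas as pd

# Sample data with the index entries deliberately out of order
index = pd.MultiIndex.from_tuples(
    [('Store B', 'Q1'), ('Store A', 'Q2'), ('Store B', 'Q2'), ('Store A', 'Q1')],
    names=['Store', 'Quarter'],
)
df = pd.DataFrame({'Sales': [400, 300, 500, 200]}, index=index)

# Sort by all levels, outermost first (Store, then Quarter)
sorted_all = df.sort_index()
print(sorted_all)

# Sort by the Quarter level in descending order
by_quarter_desc = df.sort_index(level='Quarter', ascending=False)
print(by_quarter_desc)
```

Sorting the index before label-based slicing is generally worthwhile, since pandas can look up keys more efficiently in a sorted multi-index.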
Codebyte Example
The following example demonstrates creating, accessing, slicing, resetting, setting, sorting, and grouping a multi-indexed DataFrame in Pandas:
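One way these operations might be combined is sketched here. The Store/Quarter data and all variable names are illustrative assumptions chosen to match the earlier examples:

```python
import pandas as pd

# Creating: build a two-level index from tuples (sample values)
index = pd.MultiIndex.from_tuples(
    [('Store B', 'Q2'), ('Store A', 'Q1'), ('Store B', 'Q1'), ('Store A', 'Q2')],
    names=['Store', 'Quarter'],
)
df = pd.DataFrame({'Sales': [500, 200, 400, 300]}, index=index)

# Sorting: order rows by both index levels
df = df.sort_index()

# Accessing: select one row with a full (Store, Quarter) key
store_a_q1 = df.loc[('Store A', 'Q1')]

# Slicing: take every Q2 row across stores
q2 = df.xs('Q2', level='Quarter')

# Resetting: move both index levels back into columns
flat = df.reset_index()

# Setting: rebuild the multi-index from those columns
restored = flat.set_index(['Store', 'Quarter'])

# Grouping: total sales per store, grouping on an index level
totals = df.groupby(level='Store')['Sales'].sum()

print(df)
print(store_a_q1)
print(q2)
print(flat)
print(totals)
```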