Multi-Indexing

By itispragativerma6560850080
Published Jan 22, 2025

Multi-indexing in Pandas uses multiple index levels on rows and/or columns to organize data hierarchically. A hierarchical index (a MultiIndex) makes it possible to represent higher-dimensional data in a two-dimensional DataFrame and to group, slice, and reshape that data by level.

Creating a Multi-Index

A multi-index can be created directly when constructing a DataFrame or applied to an existing DataFrame.

From a List of Tuples

The syntax is as follows:

pd.MultiIndex.from_tuples(tuples, names=None)
  • tuples: List of tuples, where each tuple represents a multi-index entry.
  • names: List of strings representing the names of the index levels (default is None).

Here’s an example that builds a multi-index from a list of tuples, each tuple being one hierarchical key (store, quarter), and uses it as the index of a DataFrame:

import pandas as pd

data = {
    'Sales': [200, 300, 400, 500],
    'Profit': [50, 80, 120, 150]
}
index = pd.MultiIndex.from_tuples(
    [('Store A', 'Q1'), ('Store A', 'Q2'), ('Store B', 'Q1'), ('Store B', 'Q2')],
    names=['Store', 'Quarter']
)
df = pd.DataFrame(data, index=index)
print(df)

The output will be as follows:

                 Sales  Profit
Store   Quarter
Store A Q1         200      50
        Q2         300      80
Store B Q1         400     120
        Q2         500     150

From a Groupby Operation

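The syntax is as follows:

DataFrame.groupby([columns]).aggregate()
  • columns: List of columns to group by. Grouping by more than one column produces a multi-index on the result.
  • aggregate(): Stands for any aggregation method, such as sum() or mean().

Here’s an example where a DataFrame is grouped by two columns (Category and Subcategory), creating a multi-index for the grouped result: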

import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B'],
    'Subcategory': ['X', 'Y', 'X', 'Y'],
    'Value': [10, 20, 30, 40]
})
grouped = df.groupby(['Category', 'Subcategory']).sum()
print(grouped)

The output will look like this:

                      Value
Category Subcategory
A        X               10
         Y               20
B        X               30
         Y               40

Accessing Multi-Index Data

Access Using .loc

The syntax is as follows:

DataFrame.loc[(level_1, level_2)]
  • level_1, level_2: Keys corresponding to specific levels in the index hierarchy.
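As a minimal sketch of .loc on a multi-index, the snippet below reuses the store/quarter data from the from_tuples example above. The variable names (row, store_b, profit) are illustrative only.

```python
import pandas as pd

# Same store/quarter data as the from_tuples example.
index = pd.MultiIndex.from_tuples(
    [('Store A', 'Q1'), ('Store A', 'Q2'), ('Store B', 'Q1'), ('Store B', 'Q2')],
    names=['Store', 'Quarter']
)
df = pd.DataFrame(
    {'Sales': [200, 300, 400, 500], 'Profit': [50, 80, 120, 150]},
    index=index
)

# A full key (one value per level) selects a single row as a Series.
row = df.loc[('Store A', 'Q2')]
print(row['Sales'])  # 300

# A partial key (first level only) returns every row under that key;
# the matched level is dropped from the result's index.
store_b = df.loc['Store B']
print(store_b)

# A full row key plus a column label returns a single value.
profit = df.loc[('Store B', 'Q1'), 'Profit']
print(profit)  # 120
```

Partial keys work from the outermost level inward, so selecting on an inner level alone (for example, every Q1 row) is usually easier with .xs.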

Slicing with .xs

The syntax is as follows:

DataFrame.xs(key, level=level_name)
  • key: Value to slice on.
  • level_name: The level of the index to slice on.
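The sketch below shows one way .xs might be used to take a cross-section on an inner level, again assuming the store/quarter data from the from_tuples example. The names q1 and q1_full are illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('Store A', 'Q1'), ('Store A', 'Q2'), ('Store B', 'Q1'), ('Store B', 'Q2')],
    names=['Store', 'Quarter']
)
df = pd.DataFrame(
    {'Sales': [200, 300, 400, 500], 'Profit': [50, 80, 120, 150]},
    index=index
)

# Select every row whose 'Quarter' level equals 'Q1'.
# By default, xs drops the level that was selected on.
q1 = df.xs('Q1', level='Quarter')
print(q1)

# drop_level=False keeps both index levels in the result.
q1_full = df.xs('Q1', level='Quarter', drop_level=False)
print(list(q1_full.index.names))
```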

Resetting and Setting Multi-Index

Resetting Multi-Index

The syntax is as follows:

DataFrame.reset_index(level=None, inplace=False)
  • level: The index level(s) to reset. If None, all levels are reset.
  • inplace: If True, modifies the DataFrame in place (default is False).
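A short sketch of both forms of reset_index, assuming the same store/quarter DataFrame as before: resetting every level, and resetting only one. Levels that are reset become ordinary columns.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('Store A', 'Q1'), ('Store A', 'Q2'), ('Store B', 'Q1'), ('Store B', 'Q2')],
    names=['Store', 'Quarter']
)
df = pd.DataFrame(
    {'Sales': [200, 300, 400, 500], 'Profit': [50, 80, 120, 150]},
    index=index
)

# Reset all levels: 'Store' and 'Quarter' become columns and the
# result gets a default integer index.
flat = df.reset_index()
print(list(flat.columns))

# Reset only the 'Quarter' level: 'Store' stays as the index.
partial = df.reset_index(level='Quarter')
print(partial)

# inplace defaults to False, so df itself still has two index levels.
print(df.index.nlevels)
```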

Setting Multi-Index

The syntax is as follows:

DataFrame.set_index([column_1, column_2], inplace=False)
  • column_1, column_2: Columns to set as multi-index levels.
  • inplace: If True, modifies the DataFrame in place (default is False).
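As an illustration, the following sketch turns two ordinary columns of a flat DataFrame into a two-level index. The column names and values are assumed sample data mirroring the earlier store/quarter example.

```python
import pandas as pd

# A flat DataFrame with the future index levels stored as columns.
flat = pd.DataFrame({
    'Store': ['Store A', 'Store A', 'Store B', 'Store B'],
    'Quarter': ['Q1', 'Q2', 'Q1', 'Q2'],
    'Sales': [200, 300, 400, 500]
})

# Passing a list of column names produces a MultiIndex whose levels
# follow the order of the list.
indexed = flat.set_index(['Store', 'Quarter'])
print(indexed)

# With the default inplace=False, the original DataFrame keeps its columns.
print(list(flat.columns))
```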

Sorting Multi-Index

The syntax is as follows:

DataFrame.sort_index(level=None, ascending=True, inplace=False)
  • level: The index level(s) to sort by. If None, sorts by all levels.
  • ascending: If True, sorts in ascending order; otherwise, in descending order.
  • inplace: If True, modifies the DataFrame in place (default is False).
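The sketch below sorts a deliberately shuffled multi-index, first by all levels and then by a single level in descending order. The data is an assumed reshuffling of the earlier store/quarter example.

```python
import pandas as pd

# Index entries are intentionally out of order.
index = pd.MultiIndex.from_tuples(
    [('Store B', 'Q2'), ('Store A', 'Q1'), ('Store B', 'Q1'), ('Store A', 'Q2')],
    names=['Store', 'Quarter']
)
df = pd.DataFrame({'Sales': [500, 200, 400, 300]}, index=index)

# level=None (the default) sorts by every level, outermost first.
by_all = df.sort_index()
print(by_all)

# Sort on the 'Quarter' level in descending order.
by_quarter_desc = df.sort_index(level='Quarter', ascending=False)
print(by_quarter_desc)
```

Sorting the index is also useful before slicing, since pandas can look up labels more efficiently on a sorted multi-index.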

Codebyte Example

The following example demonstrates creating, accessing, slicing, resetting, setting, sorting, and grouping a multi-indexed DataFrame in Pandas:
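A minimal sketch of such a walkthrough is given here, using assumed sample data in the style of the Category/Subcategory example. The variable names are illustrative, and the steps are chained so that each one operates on the result of the previous one.

```python
import pandas as pd

# Assumed sample records with repeated (Category, Subcategory) pairs.
records = pd.DataFrame({
    'Category': ['B', 'A', 'B', 'A', 'A', 'B'],
    'Subcategory': ['Y', 'X', 'X', 'Y', 'X', 'Y'],
    'Value': [40, 10, 30, 20, 5, 15]
})

# Grouping / creating: grouping by two columns yields a two-level index.
grouped = records.groupby(['Category', 'Subcategory']).sum()
print(grouped)

# Accessing: a full key plus a column label returns one value.
a_x = grouped.loc[('A', 'X'), 'Value']
print(a_x)  # 15

# Slicing: cross-section of every row with Subcategory 'Y'.
all_y = grouped.xs('Y', level='Subcategory')
print(all_y)

# Resetting: move both index levels back into columns.
flat = grouped.reset_index()

# Setting: rebuild the index with the levels in the opposite order.
reindexed = flat.set_index(['Subcategory', 'Category'])

# Sorting: order rows by Subcategory, then Category.
sorted_df = reindexed.sort_index()
print(sorted_df)
```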

