Python:Pandas DataFrame
Published May 12, 2022. Updated May 27, 2022.
A DataFrame is the primary object used by the Pandas module to store and manipulate data. It is a structured collection of data arranged in rows and columns, similar to a database table.
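As a minimal sketch of the row-and-column layout described above, the snippet below builds a small DataFrame from a dictionary. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: each dictionary key becomes a column label,
# and each list holds that column's values, one per row.
data = {
    "name": ["Ada", "Grace", "Linus"],
    "year": [1815, 1906, 1969],
}
df = pd.DataFrame(data)

print(df)                # tabular view of rows and columns
print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels
```

Rows receive a default integer index starting at 0 when no index is supplied.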
Many Pandas functions, such as .read_csv(), return DataFrame objects, and others accept DataFrame objects as parameters. Most of Pandas’ functionality is implemented through the DataFrame object. Its methods and properties are listed below.
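The following sketch shows both directions: a function that returns a DataFrame and a function that takes DataFrames as arguments. The CSV text is an in-memory, made-up sample so that no file on disk is assumed, and `pd.concat()` is used here only as one example of a function that accepts DataFrame objects.

```python
import io

import pandas as pd

# Hypothetical CSV content held in memory for illustration.
csv_text = "city,population\nOslo,700000\nBergen,285000\n"

# pd.read_csv() accepts a path or a file-like object and returns a DataFrame.
cities = pd.read_csv(io.StringIO(csv_text))
print(type(cities).__name__)

# pd.concat() takes a list of DataFrame objects and stacks them row-wise.
more = pd.DataFrame({"city": ["Tromso"], "population": [77000]})
combined = pd.concat([cities, more], ignore_index=True)
print(combined)
```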
DataFrame
- .apply()
- Used to apply a function along one axis of the DataFrame.
- .assign()
- Creates new columns or modifies existing columns in a DataFrame while preserving the original DataFrame.
- .at[]
- Returns a specific value in a DataFrame using the row and column labels.
- .columns
- Represents the column labels of the DataFrame.
- .copy()
- Returns a copy of a DataFrame or Series.
- .drop()
- Returns a DataFrame object with rows or columns removed based on column or index names.
- .dropna()
- Returns a DataFrame object with rows or columns with NA values removed.
- .drop_duplicates()
- Removes duplicate rows from a DataFrame based on specified columns.
- .explode()
- Transforms each element of a list-like column into a separate row.
- .fillna()
- Replaces null values in a DataFrame or Series with specified values.
- .groupby()
- Groups a DataFrame using a mapper or a series of columns and returns a GroupBy object.
- .index
- Represents the row labels of the DataFrame.
- .insert()
- Inserts a new column into the DataFrame at the specified location.
- .isna()
- Checks whether each element of a DataFrame or Series is missing or null and returns a Boolean object of the same shape.
- .loc
- Accesses a group of rows and columns by label(s) or a boolean array.
- .merge()
- Merges two DataFrames based on a common key or index.
- .pop()
- Removes a specified column from a DataFrame and returns it as a Series.
- .replace()
- Returns a DataFrame object after values within the DataFrame have been changed.
- .reset_index()
- Resets the index of a DataFrame to the default integer index.
- .shape
- Returns the number of rows and columns of a DataFrame as a tuple.
- .sort_values()
- Sorts values in a DataFrame by one or more columns.
- .tail()
- Returns the last n rows of a DataFrame.
- .join()
- Combines columns from another DataFrame into the calling DataFrame based on the index or a key column.
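To show how a few of the entries above fit together, here is a short sketch that chains several of them on a small, made-up table. The `orders` data and the `amount_cents` column are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical sample data; None becomes a missing value (NaN).
orders = pd.DataFrame({
    "customer": ["ann", "bob", "ann", "cat"],
    "amount": [20.0, None, 35.0, 10.0],
})

# .isna() flags missing values; .fillna() returns a new object with them replaced.
missing = int(orders["amount"].isna().sum())
filled = orders.fillna({"amount": 0.0})

# .assign() returns a new DataFrame and leaves `filled` unchanged.
with_cents = filled.assign(amount_cents=filled["amount"] * 100)

# .groupby() returns a GroupBy object; aggregating it gives one value per group.
per_customer = filled.groupby("customer")["amount"].sum()

# .sort_values() orders rows by a column; .tail() keeps the last n rows.
top_two = filled.sort_values("amount").tail(2)

# .loc selects rows by label or Boolean array; .at[] reads a single value.
ann_rows = filled.loc[filled["customer"] == "ann"]
first_amount = filled.at[0, "amount"]

print(missing)
print(with_cents)
print(per_customer)
print(top_two)
```

Each of these calls returns a new object by default, so `orders` still contains its missing value afterwards.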