Python:Pandas DataFrame

KyraThompson
Published May 12, 2022. Updated May 27, 2022.

A DataFrame is the primary object used by the Pandas module to store and manipulate data. It is a structured collection of data arranged in rows and columns, similar to a database table.
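
As a minimal sketch of the row-and-column structure described above, a small DataFrame can be built from a dictionary of columns. The column names and values here are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data; each dictionary key becomes a column label.
df = pd.DataFrame({
    "name": ["Ada", "Grace", "Linus"],
    "age": [36, 45, 28],
})

print(df)                # rows and columns, much like a database table
print(df.shape)          # tuple of (number of rows, number of columns)
print(list(df.columns))  # the column labels
```

Each row receives a default integer label (0, 1, 2) because no index was supplied.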

Many Pandas functions, such as .read_csv(), return DataFrame objects, and others accept DataFrame objects as parameters. In addition, most of Pandas’ functionality is implemented through the DataFrame object. Methods and properties of the DataFrame object are listed below.
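
The following sketch illustrates both directions: pd.read_csv() returning a DataFrame, and pd.concat() accepting DataFrames as input. The CSV text is hypothetical and is read from an in-memory buffer so that the example does not depend on a file on disk.

```python
import io
import pandas as pd

# Hypothetical CSV content, held in memory instead of a file.
csv_text = "item,quantity\npen,3\nbook,5\n"

# pd.read_csv() returns a DataFrame.
items = pd.read_csv(io.StringIO(csv_text))
print(type(items).__name__)

# pd.concat() takes DataFrames as input and returns a new DataFrame.
combined = pd.concat([items, items], ignore_index=True)
print(len(combined))
```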

DataFrame

.apply()
Used to apply a function along one axis of the DataFrame.
.assign()
Creates new columns or modifies existing columns in a DataFrame while preserving the original DataFrame.
.at[]
Returns a specific value in a DataFrame using the row and column labels.
.columns
Represents the column labels of the DataFrame.
.copy()
Returns a copy of a DataFrame or Series.
.drop()
Returns a DataFrame object with rows or columns removed based on column or index names.
.dropna()
Returns a DataFrame object with rows or columns with NA values removed.
.drop_duplicates()
Removes duplicate rows from a DataFrame based on specified columns.
.explode()
Transforms each element of a list-like column into a separate row.
.fillna()
Replaces null values in a DataFrame or Series with specified values.
.groupby()
Groups a DataFrame using a mapper or a series of columns and returns a GroupBy object.
.index
Represents the row labels of the DataFrame.
.insert()
Inserts a new column into the DataFrame at the specified location.
.isna()
Checks whether each element of a DataFrame or Series is missing or null and returns a boolean object of the same shape.
.loc
Accesses a group of rows and columns by label(s) or a boolean array.
.merge()
Merges two DataFrames based on a common key or index.
.pop()
Removes a specified column from a DataFrame and returns it as a Series.
.replace()
Returns a DataFrame object in which specified values have been replaced with other values.
.reset_index()
Resets the index of a DataFrame to the default integer index.
.shape
Returns the number of rows and columns of a DataFrame as a tuple.
.sort_values()
Sorts values in a DataFrame by one or more columns.
.tail()
Returns the last n rows of a DataFrame.
.join()
Combines columns from another DataFrame into the calling DataFrame based on the index or a key column.
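
Several of the members listed above are often used together. The following is a rough sketch, not a canonical workflow; the orders and regions tables, their column names, and the 1.1 multiplier are all hypothetical.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
orders = pd.DataFrame({
    "customer": ["ann", "bob", "ann", "cy"],
    "amount": [10.0, None, 5.0, 20.0],
})

# .isna() flags missing values; .fillna() returns a copy with them replaced.
missing = int(orders["amount"].isna().sum())
filled = orders.fillna({"amount": 0.0})

# .assign() adds a column and leaves the original DataFrame unchanged.
with_total = filled.assign(total=filled["amount"] * 1.1)

# .loc selects by label or boolean mask; .at reads a single value.
large = with_total.loc[with_total["amount"] >= 10, ["customer", "amount"]]
first_customer = with_total.at[0, "customer"]

# .groupby() with an aggregation, then .reset_index() and .sort_values().
per_customer = (
    filled.groupby("customer")["amount"]
    .sum()
    .reset_index()
    .sort_values("amount", ascending=False)
)

# .merge() combines two DataFrames on a common key column.
regions = pd.DataFrame({
    "customer": ["ann", "bob", "cy"],
    "region": ["N", "S", "E"],
})
merged = per_customer.merge(regions, on="customer")

print(missing)
print(large)
print(first_customer)
print(merged)
print(merged.tail(1))  # .tail() returns the last n rows
```

Each call here returns a new object instead of changing orders in place, which is why the intermediate results are assigned to separate names.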
