Seaborn

Distribution Plots with Seaborn

In seaborn, distributions can be visualized using .histplot(), .kdeplot(), and .boxplot(), among other visualization functions.

The main parameters are data and x.

  • data is an optional parameter for the pandas DataFrame that holds the data.
  • x is the name of the column in data that holds the variable of interest.

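For histograms, the y-axis shows the count of observations in each bin; for KDE plots, it shows the estimated probability density. For box plots, the values are drawn along the axis the variable is assigned to: passing the column as x gives a horizontal box plot with the values on the x-axis.

For box plots with the values on x, setting the y parameter to a grouping variable will show one box plot per group on the same axes.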

import seaborn as sns
# histogram of heights
sns.histplot(data=df, x='height')
# KDE plot of heights
sns.kdeplot(data=df, x='height')
# box plot of heights
sns.boxplot(data=df, x='height')
# box plots of heights by age group
sns.boxplot(data=df, x='height', y='age_range')

Bar Plot Error Bars

By default, Seaborn’s barplot() function draws each bar at the mean of its group and places an error bar on it. Seaborn uses a bootstrapped 95% confidence interval to calculate these error bars.

The error bars can be changed to show the standard deviation by setting the parameter errorbar="sd" (seaborn 0.12 and later). Earlier versions used ci="sd", which is now deprecated.

Scatter Plots with Seaborn

In seaborn, a scatter plot can be created with .scatterplot(). The main parameters are data, x, and y.

  • data is an optional parameter for the pandas DataFrame that holds the data.
  • x is the column name for the x-axis of the plot.
  • y is the column name for the y-axis of the plot.

A scatter plot with a regression line can be created with .regplot(). This function accepts the same data, x, and y parameters as .scatterplot() and draws the same points with a fitted linear regression line on top. By default, a 95% confidence interval for the regression estimate is included as a shaded region around the line.

import seaborn as sns
# scatter plot of bird count by temperature
sns.scatterplot(data=df, x='temperature', y='bird_count')
# same plot with regression line
sns.regplot(data=df, x='temperature', y='bird_count')
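
To illustrate what the regression line represents, the sketch below compares the line drawn by .regplot() with an ordinary least-squares fit from NumPy. It assumes simulated data in which temperature is treated as the predictor and bird count as the response. The numbers, variable names, and output file name are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: bird counts that tend to rise with temperature
rng = np.random.default_rng(1)
temperature = rng.uniform(0, 30, size=100)
bird_count = 5 + 2 * temperature + rng.normal(0, 6, size=100)
df = pd.DataFrame({"temperature": temperature, "bird_count": bird_count})

fig, ax = plt.subplots()
sns.regplot(data=df, x="temperature", y="bird_count", ax=ax)

# Coordinates of the fitted line that regplot drew
line_x, line_y = ax.lines[0].get_data()

# Degree-1 least-squares fit on the same data, for comparison
slope, intercept = np.polyfit(df["temperature"], df["bird_count"], deg=1)
max_gap = float(np.max(np.abs(line_y - (slope * line_x + intercept))))

print(f"slope={slope:.2f}, intercept={intercept:.2f}, max gap={max_gap:.2e}")
fig.savefig("regplot_check.png")
```

With the default settings, the gap between the two lines should be negligible, which suggests the drawn line is a plain linear fit of y on x.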
