In seaborn, distributions can be visualized using .histplot(), .kdeplot(), and .boxplot(), among other visualization functions.
The main parameters are data and x.
data is an optional parameter for the name of the pandas DataFrame.
x is the column name for the variable of interest.
The y-axis shows the frequency for histograms, the probability density for KDE plots, and the values for box plots.
For box plots, setting the y parameter to a grouping variable will show a box plot for each group on the same plotting grid.
import seaborn as sns

# histogram of heights
sns.histplot(data=df, x='height')

# KDE plot of heights
sns.kdeplot(data=df, x='height')

# box plot of heights
sns.boxplot(data=df, x='height')

# box plots of heights by age group
sns.boxplot(data=df, x='height', y='age_range')
By default, Seaborn’s barplot() function places error bars on the bar plot. Seaborn uses a bootstrapped confidence interval to calculate these error bars.
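The error bars can be changed to show the standard deviation by setting the parameter errorbar="sd" (seaborn 0.12 and later). In earlier versions, the equivalent setting is ci="sd"; the ci parameter is deprecated in newer releases.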
In seaborn, a scatter plot can be created with .scatterplot(). The main parameters are data, x, and y.
data is an optional parameter for the name of the pandas DataFrame.
x is the column name for the x-axis of the plot.
y is the column name for the y-axis of the plot.
A scatter plot with a regression line can be created with .regplot(). This function takes the same data, x, and y parameters as .scatterplot() and produces the same scatter plot with a fitted linear regression line drawn over it. By default, a 95% confidence interval for the regression estimate is included as a shaded region around the line.
import seaborn as sns

# scatter plot of bird count by temperature
sns.scatterplot(data=df, x='bird_count', y='temperature')

# same plot with regression line
sns.regplot(data=df, x='bird_count', y='temperature')
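The shaded confidence band can be adjusted through the ci parameter of .regplot(). The sketch below uses a small made-up DataFrame (the numbers are hypothetical) to compare the default band with ci=None, which draws only the points and the regression line.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for df
df = pd.DataFrame({
    "bird_count": [3, 5, 8, 10, 12, 15, 18, 21],
    "temperature": [4, 7, 9, 13, 14, 18, 20, 24],
})

fig, (ax_band, ax_line) = plt.subplots(1, 2, figsize=(8, 3), sharey=True)

# default: regression line with a shaded 95% confidence interval
sns.regplot(data=df, x="bird_count", y="temperature", ax=ax_band)
ax_band.set_title("default (95% CI)")

# ci=None: regression line only, no shaded region
sns.regplot(data=df, x="bird_count", y="temperature", ci=None, ax=ax_line)
ax_line.set_title("ci=None")

fig.tight_layout()
```

An integer such as ci=68 can be passed instead to draw a narrower band.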