While the hue and style parameters can help add more information to the same plot, too much information in one plot can make it difficult to understand. When we want to make comparisons without overcomplicating a single plot, we can use faceting, where we make one plot for each group and align them next to each other.
In seaborn, we can use sns.FacetGrid() to build a grid of plots and then map a plotting function onto it. Alternatively, we can use one of a few plotting functions that are built on top of sns.FacetGrid(). These functions take a kind parameter to specify the type of plot, along with the familiar parameters like data, x, y, hue, or style.
- sns.relplot() plots relational data of multiple variables as 'line' or 'scatter' plots.
- sns.displot() plots distributional data as histograms ('hist') or 'kde' plots.
- sns.catplot() plots categorical data as 'bar' or 'box' plots.
To create the facets, we set one of two parameters (or both, for a two-dimensional grid) to a grouping variable: row to stack the plots vertically in rows or col to arrange the plots horizontally in columns.
The following code plots multiple line plots horizontally, one plot for each year.
sns.relplot(kind='line', data=df, x='month', y='sales', col='year')
So when might we want to switch from sns.lineplot() to sns.relplot()?
The function sns.lineplot() and the other familiar plotting functions:
- are all axes-level functions
- integrate well with other matplotlib functions
- let us control each plot individually and position plots together with more flexibility, allowing for more complex or custom layouts
Functions like sns.relplot() that build off of sns.FacetGrid():
- are figure-level functions
- make it easy to facet the same plot type across groups with a legend automatically positioned outside the plots
- don’t provide as much individual control over the plots or their positions, but lots of the work is done for us already
When using multiple plots just to explore a dataset, there are other functions in seaborn that may be more useful.
- sns.pairplot() plots multiple variables in pairs so that we can quickly check many variables for notable relationships.
- sns.jointplot() plots two variables together along with their individual distributions in the margins so that we can quickly check out a relationship on multiple levels at once.
You can find more details about these functions in the seaborn API documentation.
Instructions
Set the parameter data to plants inside sns.pairplot() to view scatter plots and histograms of the numeric variables in the plants dataset.
Let’s investigate the distribution of plant heights at different pH values. Create a histogram of Plant_height with hue set to PH.
It’s difficult to see each distribution when they are overlapping. Use sns.displot() to split the previous plot into three plots, one for each value of PH as a separate column. Keep the hue set to PH to keep each histogram a different color.
Let’s smooth the distributions to get a better sense of each distribution’s shape. Repeat the previous plot but make it a KDE plot instead of a histogram and set fill to True.