Learn

One major difference between matplotlib and seaborn is how groups are added to the same plot. In matplotlib, we have to plot and label each group ourselves. This is why wide-format data, where each group already has its own column, is sometimes easier to use for matplotlib plots.
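As a rough illustration of the matplotlib approach described above, the sketch below loops over the groups in a small long-format DataFrame and adds each one to the plot with its own label. The data values are made up for illustration, and the column names simply mirror the ones used later in this lesson.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up long-format data, for illustration only
df = pd.DataFrame({
    "daily_customers": [20, 35, 50, 25, 40, 55],
    "sales_totals": [200, 340, 510, 260, 390, 560],
    "weekday": ["Mon", "Mon", "Mon", "Tue", "Tue", "Tue"],
})

fig, ax = plt.subplots()

# matplotlib needs one plotting call per group, each with its own label
for day, group in df.groupby("weekday"):
    ax.scatter(group["daily_customers"], group["sales_totals"], label=day)

ax.set_xlabel("daily_customers")
ax.set_ylabel("sales_totals")
ax.legend(title="weekday")  # the legend has to be requested explicitly
```

Each pass through the loop draws one group, so the number of plotting calls grows with the number of groups.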

In contrast, when our data is in long format, seaborn lets us add plot elements by group simply by setting the hue parameter to the grouping variable. Most plot functions include the hue parameter, and many also include the style parameter to differentiate groups by line or point style as well. Seaborn includes a legend by default, so no extra code is required to create and label the legend.

The following code creates a scatter plot of sales_totals versus daily_customers. Setting hue to weekday gives the points a different color for each day of the week.

sns.scatterplot(data=df, x='daily_customers', y='sales_totals', hue='weekday')

Setting the style parameter to the grouping variable also gives the points a different shape for each group. The style parameter makes the plot more visually accessible and is also a great alternative to hue for publishing in grayscale. The size parameter may also be used as a grouping parameter to change the point size by group.
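To try the hue and style parameters without the lesson's dataset, a minimal self-contained sketch might look like the following. The DataFrame values are invented for illustration; only the column names follow the example above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Made-up long-format data, for illustration only
df = pd.DataFrame({
    "daily_customers": [20, 35, 50, 25, 40, 55],
    "sales_totals": [200, 340, 510, 260, 390, 560],
    "weekday": ["Mon", "Mon", "Mon", "Tue", "Tue", "Tue"],
})

# One call handles every group: hue sets the color, style sets the marker shape
ax = sns.scatterplot(
    data=df,
    x="daily_customers",
    y="sales_totals",
    hue="weekday",
    style="weekday",
)
```

Because hue and style point to the same column here, seaborn draws a single legend in which each weekday has both its own color and its own marker.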

The hue and style parameters work the same way in line plots. The following code produces a line plot of average sales per month in which each location has a line with a unique color and dash pattern. Setting the linewidth parameter to 3 makes all the lines thicker.

sns.lineplot(data=df, x='month', y='sales', hue='location', style='location', linewidth=3)

Adding the hue parameter in functions like sns.histplot(), sns.kdeplot(), and sns.boxplot() allows us to view the distributions of multiple groups. While there is no style parameter for these functions, we can adjust the multiple parameter for histograms and KDE plots to control how the groups are drawn together, for example layered on top of each other, stacked, or filled.

Instructions

1.

After running the first two code cells, make a scatter plot from the plants dataset of Leaf_length (y-axis) versus Plant_height (x-axis) with the points colored by PH.

2.

The color in the previous plot helps us see the relationship between plant height and leaf length for each pH level, but the points are pretty small and the colors may not be easy for everyone to differentiate. Make the same plot as step 1, but additionally make the point style different by PH and increase all the point sizes by setting the s parameter to 100.

3.

Make a line plot of Lateral_spread over Time using the plants dataset and color lines by PH.

4.

Let’s improve visibility of the different groups by setting the style parameter to PH and the line width to 3. Let’s also remove the confidence intervals by adding ci=None (in seaborn 0.12 and later, ci is deprecated in favor of errorbar=None).
