Aggregating by Multiple Columns

Sometimes we’ll want to aggregate our data by multiple columns to visualize nested categorical variables.

For example, consider our hospital survey data. The mean satisfaction seems to depend on Gender, but it might also depend on another column: Age Range.
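To make the idea concrete, here is a minimal sketch of the aggregation in pandas. The DataFrame is a small made-up stand-in for the survey data: only the column names Gender, Age Range, and Response come from this lesson, and every value is invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the hospital survey data (all values are invented).
df = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female", "Male", "Female"],
    "Age Range": ["18-30", "31-50", "18-30", "31-50", "18-30", "31-50"],
    "Response": [8, 6, 9, 7, 6, 5],
})

# Mean response aggregated by a single column.
mean_by_gender = df.groupby("Gender")["Response"].mean()

# Mean response aggregated by two columns at once:
# one value per (Gender, Age Range) combination.
mean_by_gender_and_age = df.groupby(["Gender", "Age Range"])["Response"].mean()

print(mean_by_gender)
print(mean_by_gender_and_age)
```

The second result has one mean per combination of Gender and Age Range, which is the kind of nested summary a grouped bar chart can display.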

We can compare both the Gender and Age Range factors at once by using the hue keyword argument.

sns.barplot(data=df, x="Gender", y="Response", hue="Age Range")

The hue parameter adds a nested categorical variable to the plot.

*Visualizing survey results by gender with age range nested*.

Notice that we keep the same x-labels, but we now have differently colored bars representing each Age Range. We can compare two bars of the same color to see how patients in the same Age Range but of different Gender rated the survey.
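The following is a self-contained sketch of a nested bar plot that can be run outside the lesson environment. It assumes a tiny invented DataFrame in place of the real survey data, selects the non-interactive Agg backend so it also runs without a display, and writes the chart to a hypothetical file name instead of opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the survey data (all values are invented).
df = pd.DataFrame({
    "Gender": ["Male", "Male", "Male", "Male",
               "Female", "Female", "Female", "Female"],
    "Age Range": ["18-30", "18-30", "31-50", "31-50",
                  "18-30", "18-30", "31-50", "31-50"],
    "Response": [8, 6, 6, 7, 9, 8, 7, 5],
})

# One group of bars per Gender; within each group, one colored bar per Age Range.
ax = sns.barplot(data=df, x="Gender", y="Response", hue="Age Range")

# Save the figure to a file (hypothetical name); in an interactive
# session, plt.show() would display it instead.
plt.savefig("survey_by_gender_and_age.png")
```

Each bar's height is the mean Response for one Gender and Age Range pair, and the legend maps each color to an Age Range.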

Instructions

1.

Use sns.barplot() to create a chart with:

  • data equal to df
  • x equal to Age Range
  • y equal to Response
  • hue equal to Gender

How is this plot different from when hue is "Age Range" and x is "Gender"?

Why might we use one and not the other?

2.

Use plt.show() to display the graph.
