Sometimes we’ll want to aggregate our data by multiple columns to visualize nested categorical variables.

For example, consider our hospital survey data. The mean satisfaction seems to depend on Gender, but it might also depend on another column: Age Range.

We can compare both the Gender and Age Range factors at once by passing the keyword argument hue.

sns.barplot(data=df, x="Gender", y="Response", hue="Age Range")

The hue parameter adds a nested categorical variable to the plot.

*Visualizing survey results by gender with age range nested*.

Notice that we keep the same x-labels, but each Gender now has one differently colored bar per Age Range. We can compare two bars of the same color to see how patients in the same Age Range but of different Gender rated the survey.
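To see what each bar represents, here is a minimal self-contained sketch. The small DataFrame is made-up illustration data, not the real hospital survey, and the name group_means is our own. It relies on sns.barplot() using the mean as its default estimator, so the grouped means computed with pandas should match the bar heights.

```python
import pandas as pd
import seaborn as sns

# Hypothetical survey rows for illustration only; the lesson's df is larger
df = pd.DataFrame({
    "Gender": ["Male", "Male", "Male", "Male",
               "Female", "Female", "Female", "Female"],
    "Age Range": ["18-30", "18-30", "31-50", "31-50",
                  "18-30", "18-30", "31-50", "31-50"],
    "Response": [6, 8, 5, 7, 9, 7, 4, 6],
})

# Mean Response for every Gender / Age Range pair: one value per bar
group_means = df.groupby(["Gender", "Age Range"])["Response"].mean()
print(group_means)

# Gender on the x-axis, with one colored bar per Age Range nested inside it
ax = sns.barplot(data=df, x="Gender", y="Response", hue="Age Range")
```

Calling plt.show() afterwards (with matplotlib.pyplot imported as plt) displays the chart, as in the lesson.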



Use sns.barplot() to create a chart with:

  • data equal to df
  • x equal to "Age Range"
  • y equal to "Response"
  • hue equal to "Gender"

How is this plot different from when hue is "Age Range" and x is "Gender"?

Why might we use one and not the other?


Use plt.show() to display the graph.
