In most cases, we’ll want to plot the mean of our data, but sometimes, we’ll want something different:
- If our data has many outliers, we may want to plot the median.
- If our data is categorical, we might want to count how many times each category appears (such as in the case of survey responses).
Seaborn is flexible and can calculate any aggregate you want. To do so, you’ll need to use the keyword argument estimator, which accepts any function that takes a list of values and returns a single number.
For example, to calculate the median, you can pass in np.median to the estimator keyword:
sns.barplot(data=df, x="x-values", y="y-values", estimator=np.median)
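To see why the choice of estimator matters, here is a minimal sketch in plain Python, independent of Seaborn. The wait_times values and the bar_height helper are made up for illustration; the helper only mimics the idea of collapsing one category's values into a single bar height.

```python
from statistics import mean, median

# Hypothetical values for a single category; 240 is an outlier.
wait_times = [10, 12, 11, 13, 240]

def bar_height(values, estimator):
    """Collapse one category's values into a single number (hypothetical helper)."""
    return estimator(values)

mean_height = bar_height(wait_times, mean)      # pulled upward by the outlier
median_height = bar_height(wait_times, median)  # stays near the typical value

print(mean_height, median_height)
```

With a single extreme value in the data, the mean lands far from the typical values, while the median does not. That is the situation in which passing a median function as the estimator is the better choice.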
Consider the data in results.csv. To calculate the number of times a particular value appears in the Response column, we pass in len:
sns.barplot(data=df, x="Patient ID", y="Response", estimator=len)
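Conceptually, an estimator of len turns each bar into a count of rows. The sketch below is a rough plain-Python model of that idea, not Seaborn's actual implementation. The rows data and the aggregate function are hypothetical stand-ins for the DataFrame and for the grouping that happens before bars are drawn.

```python
from collections import defaultdict

# Hypothetical survey rows as (category, response) pairs.
rows = [("Male", 3), ("Female", 5), ("Female", 4), ("Male", 2), ("Female", 1)]

def aggregate(rows, estimator):
    """Group values by category, then apply the estimator to each group."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    return {category: estimator(values) for category, values in groups.items()}

# With len as the estimator, each result is the number of rows in that category.
counts = aggregate(rows, len)
print(counts)
```

Because len ignores the values themselves and only reports how many there are, the heights describe how often each category occurs rather than anything about the responses.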
Instructions
Consider our hospital satisfaction survey data, which is loaded into the Pandas DataFrame df. Use print to examine the data.
We’d like to know how many men and women answered the survey. Use sns.barplot() with:
- data equal to df
- x equal to Gender
- y equal to Response
- estimator equal to len
Use plt.show() to display the graph.
Change sns.barplot() to graph the median Response aggregated by Gender using estimator=np.median.