In most cases, we’ll want to plot the mean of our data, but sometimes, we’ll want something different:
- If our data has many outliers, we may want to plot the median.
- If our data is categorical, we might want to count how many times each category appears (such as in the case of survey responses).
Seaborn is flexible and can calculate any aggregate you want. To do so, you’ll need to use the keyword argument estimator, which accepts any function that works on a list.
For example, to calculate the median, you can pass in np.median to the estimator keyword:
sns.barplot(data=df, x="x-values", y="y-values", estimator=np.median)
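As a minimal sketch of why the choice of estimator matters, the example below draws the same data twice, once with the default mean and once with np.median. The DataFrame is made up for illustration (it is not one of the lesson's datasets), and the Agg backend line is only an assumption that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data for illustration: group "B" contains one extreme outlier.
df = pd.DataFrame({
    "x-values": ["A", "A", "A", "B", "B", "B"],
    "y-values": [10, 12, 14, 10, 12, 500],
})

fig, (ax_mean, ax_median) = plt.subplots(1, 2, figsize=(8, 3))

# Left: the default estimator (the mean) is pulled upward by the outlier.
sns.barplot(data=df, x="x-values", y="y-values", ax=ax_mean)
ax_mean.set_title("mean (default)")

# Right: the median of group "B" stays at the middle value despite the outlier.
sns.barplot(data=df, x="x-values", y="y-values", estimator=np.median, ax=ax_median)
ax_median.set_title("median")

# Bar heights, read back from the rectangles seaborn drew.
mean_heights = [bar.get_height() for bar in ax_mean.patches]
median_heights = [bar.get_height() for bar in ax_median.patches]
print(mean_heights, median_heights)
```

In an interactive session, calling plt.show() after the two barplot calls would display both panels side by side.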
Consider the data in results.csv. To calculate the number of times a particular value appears in the Response column, we pass in len:
sns.barplot(data=df, x="Patient ID", y="Response", estimator=len)
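The counting idea can be sketched on a small, made-up table. The column name "Clinic" and the values are hypothetical and are not taken from results.csv; the Agg backend line again assumes a run without a display. Because len only counts entries, the numeric Response values themselves do not affect the bar heights.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up survey rows; "Clinic" is a hypothetical grouping column.
df = pd.DataFrame({
    "Clinic": ["North", "North", "South", "North", "South"],
    "Response": [7, 9, 4, 8, 6],
})

# len returns how many Response entries fall in each x category,
# so each bar's height is a count rather than an average.
ax = sns.barplot(data=df, x="Clinic", y="Response", estimator=len)
ax.set_ylabel("Number of responses")

# Read the counts back from the drawn bars (categories appear in
# order of first appearance: "North", then "South").
counts = [int(round(bar.get_height())) for bar in ax.patches]
print(counts)
```

Relabeling the y-axis is a small design choice: with estimator=len the axis no longer shows Response values, so the default label would be misleading.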
Consider our hospital satisfaction survey data, which is loaded into the Pandas DataFrame df.
We’d like to know how many men and women answered the survey. Use sns.barplot() with estimator=len to graph the number of responses for each gender, then use plt.show() to display the graph.
Use sns.barplot() to graph the median Response aggregated by gender.