Take a look at the file called results.csv. You’ll plot that data soon, but before you plot it, take a minute to understand the context behind that data, which is based on a hypothetical situation we have created:
Suppose we are analyzing data from a survey: we asked 1,000 patients at a hospital how satisfied they were with their experience. Each response was measured on a scale of 1 to 10, with 1 being extremely unsatisfied and 10 being extremely satisfied. We have summarized that data in a CSV file called results.csv.
To plot this data using Matplotlib, you would write the following:
import pandas as pd
from matplotlib import pyplot as plt
df = pd.read_csv("results.csv")
ax = plt.subplot()
plt.bar(range(len(df)), df["Mean Satisfaction"])
ax.set_xticks(range(len(df)))
ax.set_xticklabels(df.Gender)
plt.xlabel("Gender")
plt.ylabel("Mean Satisfaction")
That’s a lot of work for a simple bar chart! Seaborn gives us a much simpler option. With Seaborn, you can use the sns.barplot() command to do the same thing.
The Seaborn function sns.barplot() takes at least three keyword arguments:
data: a Pandas DataFrame that contains the data (in this example, df)
x: a string that tells Seaborn which column in the DataFrame contains our x-labels (in this case, "Gender")
y: a string that tells Seaborn which column in the DataFrame contains the heights we want to plot for each bar (in this case, "Mean Satisfaction")
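As a rough sketch of how the three arguments fit together, the snippet below builds a small DataFrame in memory instead of reading results.csv. The category names and satisfaction values are made up for illustration, and the column names are taken from the Matplotlib code earlier in this lesson.

```python
import io

import pandas as pd
import seaborn as sns

# Hypothetical stand-in for results.csv: the rows below are invented
# purely so the example runs without the real file.
csv_text = """Gender,Mean Satisfaction
Male,7.2
Female,8.1
Non-binary,6.8
"""
df = pd.read_csv(io.StringIO(csv_text))

# data: the DataFrame; x: column holding the labels; y: column holding bar heights.
ax = sns.barplot(data=df, x="Gender", y="Mean Satisfaction")

# Seaborn labels the axes from the column names, so no extra
# set_xticks / xlabel / ylabel calls are needed.
bar_heights = [patch.get_height() for patch in ax.patches]

# In an interactive session, plt.show() (from matplotlib.pyplot) would display the figure.
```

Compared with the Matplotlib version, the tick positions, tick labels, and axis labels are all inferred from the DataFrame.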
By default, Seaborn will aggregate and plot the mean of each category. In the next exercise you will learn more about aggregation and how Seaborn handles it.
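To illustrate the default aggregation, here is a minimal sketch that uses several rows per category. The column names "Gender" and "Response" and all of the values are hypothetical and are not taken from results.csv. Each bar height should match the per-category mean that Pandas computes.

```python
import pandas as pd
import seaborn as sns

# Hypothetical raw survey rows: several responses per category.
raw = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female", "Female"],
    "Response": [6, 8, 9, 7, 8],
})

# With more than one row per category, barplot draws the mean of each group.
ax = sns.barplot(data=raw, x="Gender", y="Response")
plotted = [patch.get_height() for patch in ax.patches]

# The same means computed directly with Pandas, in order of first appearance.
expected = raw.groupby("Gender", sort=False)["Response"].mean().tolist()
```

The thin lines Seaborn draws on top of each bar are error bars, which are separate from the bar heights compared here.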
Use Pandas to load in the data from results.csv and save it to the variable df.
Remove all of the # characters from in front of the sns.barplot command and fill in the missing values.
Use plt.show() to display the completed bar plot.