By default, bar plots using geom_bar() show the count of observations for each value. We can also show other types of data, such as calculating and showing the mean instead.
Let’s say we want to see how much an animal sleeps on average by the kind of food it eats, based on the msleep dataset. The code below does just that! Recall that passing stat = "identity" to a geom_bar() layer tells ggplot2 to display values as is, rather than count the number of occurrences. We can similarly use stat = "summary", which tells ggplot2 to summarize values according to a provided function. We can specify fun = "mean" to summarize our y axis variable by calculating mean values for each value in our x axis variable.
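# Load the packages used below
library(dplyr)
library(ggplot2)

# This code expects msleep to have status, diet, and hours columns,
# as in this lesson's version of the dataset

# Filter our data to include only hours spent asleep, omitting NA values
msleep_means_df <- msleep %>%
  filter(status == "asleep") %>%
  na.omit()

# Construct a bar plot calculating and displaying means
msleep_meanbar <- ggplot(msleep_means_df, aes(x = diet, y = hours)) +
  labs(title = "Mean Hours Asleep by Diet") +
  geom_bar(stat = "summary", fun = "mean")

# Print the plot
msleep_meanbar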
Here’s how this plot looks. In the msleep dataset, insectivores (animals that eat insects) sleep for fifteen hours a day on average, which is far more than animals with other diets!
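To see how stat = "summary" relates to stat = "identity", it can help to build the same bars both ways. The sketch below is only an illustration and makes one assumption that differs from the lesson: it uses the stock msleep data shipped with ggplot2, where diet is stored in the vore column and hours asleep in sleep_total. The names sleep_df, sleep_means, summary_plot, and identity_plot are made up for this example.

```r
library(dplyr)
library(ggplot2)

# Assumption: stock ggplot2::msleep, with diet in `vore` and hours asleep in `sleep_total`
sleep_df <- ggplot2::msleep %>%
  filter(!is.na(vore), !is.na(sleep_total))

# Approach 1: let ggplot2 calculate the mean for each diet
summary_plot <- ggplot(sleep_df, aes(x = vore, y = sleep_total)) +
  geom_bar(stat = "summary", fun = "mean")

# Approach 2: calculate the means ourselves, then display them as is
sleep_means <- sleep_df %>%
  group_by(vore) %>%
  summarise(mean_sleep = mean(sleep_total))

identity_plot <- ggplot(sleep_means, aes(x = vore, y = mean_sleep)) +
  geom_bar(stat = "identity")

# Print either plot to compare the bar heights
summary_plot
identity_plot
```

Both plots should show the same bar heights. The stat = "summary" version simply saves us the separate group_by() and summarise() step.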
We now want to calculate mean graduation rates for all students across all schools by year in our graduation_df dataset. We’ve filtered graduation_df to only retain rows where the Demographic column equals Total Cohort and the Status column equals a single fixed value. Use the head() function on our new graduation_means_df data frame to examine it.
Create a bar plot, mapping Year to the x axis and Pct to the y axis. Use stat = "summary" and fun = "mean" to calculate and display mean values on the y axis.
Print the plot to see the change in mean graduation rates over time!
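If you would like to try these steps outside the lesson, here is a rough sketch on a tiny stand-in data frame. Everything in it is an assumption made for illustration: the rows and Pct values are invented, the School column is made up, and the Status value "Graduated" is a hypothetical placeholder rather than the value used in the lesson. Only the column names Year, Demographic, Status, and Pct come from the text above.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the lesson's graduation_df (all values are invented)
graduation_df <- data.frame(
  School = rep(c("A", "B"), each = 4),
  Year = rep(c(2018, 2018, 2019, 2019), times = 2),
  Demographic = rep(c("Total Cohort", "Other"), times = 4),
  Status = "Graduated",  # placeholder value, not taken from the lesson
  Pct = c(70, 65, 74, 68, 80, 75, 82, 77)
)

# Keep only the rows we want to average over
graduation_means_df <- graduation_df %>%
  filter(Demographic == "Total Cohort", Status == "Graduated")

# Examine the first few rows of the filtered data
head(graduation_means_df)

# Bar plot of the mean Pct for each Year
graduation_meanbar <- ggplot(graduation_means_df, aes(x = Year, y = Pct)) +
  geom_bar(stat = "summary", fun = "mean")

# Print the plot
graduation_meanbar
```

Because Year is numeric in this toy data, ggplot2 draws it on a continuous axis. Wrapping it as factor(Year) inside aes() is one way to get a labeled bar per year instead.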