By default, bar plots made with geom_bar() show the count of observations for each value. We can also display other summaries of the data, such as the mean.
Let’s say we want to see how much an animal sleeps on average by the kind of food it eats, based on the msleep dataset. The code below does just that! Recall that passing stat = "identity" to a geom_bar() layer tells ggplot2 to display values as is, rather than count the number of occurrences. Similarly, stat = "summary" tells ggplot2 to summarize values according to a provided function. Specifying fun = "mean" summarizes our y axis variable by calculating its mean for each value of our x axis variable.
    # Filter our data to include only hours spent asleep, omitting NA values
    msleep_means_df <- msleep %>%
      filter(status == "asleep") %>%
      na.omit()

    # Construct a bar plot calculating and displaying means
    msleep_meanbar <- ggplot(msleep_means_df, aes(x = diet, y = hours)) +
      labs(title = "Mean Hours Asleep by Diet") +
      geom_bar(stat = "summary", fun = "mean")
Here’s how this plot looks. In the msleep dataset, insectivores (animals that eat insects) sleep for fifteen hours a day on average, which is far more than animals with other diets!
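To make the relationship between stat = "summary" and stat = "identity" concrete, here is a minimal sketch that builds the same bars both ways. The toy_sleep data frame and its values are hypothetical stand-ins, not the lesson's data; only the diet and hours column names are borrowed from the example above.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data (illustration only, not the lesson's msleep data)
toy_sleep <- data.frame(
  diet  = c("carni", "carni", "herbi", "herbi", "insecti"),
  hours = c(10, 12, 8, 10, 15)
)

# Option 1: let ggplot2 compute the mean of y for each x value
p_summary <- ggplot(toy_sleep, aes(x = diet, y = hours)) +
  geom_bar(stat = "summary", fun = "mean")

# Option 2: compute the means ourselves, then plot them as is
toy_means <- toy_sleep %>%
  group_by(diet) %>%
  summarise(mean_hours = mean(hours))

p_identity <- ggplot(toy_means, aes(x = diet, y = mean_hours)) +
  geom_bar(stat = "identity")
```

Both plots should show the same bar heights. The first keeps the raw data in the plot and delegates the calculation to ggplot2, while the second makes the summarized values available for inspection before plotting.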
Instructions
We now want to calculate mean graduation rates for all students across all schools by year in our graduation_df dataset. We’ve filtered graduation_df to retain only rows where the Demographic column equals "Total Cohort" and the Status column equals "Graduated".
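The filtering described above might look roughly like the following sketch. The toy_graduation_df rows are hypothetical, including the "Male" and "Dropped Out" values; only the column names Year, Demographic, Status, and Pct and the two filter values come from this lesson.

```r
library(dplyr)

# Hypothetical rows standing in for graduation_df (illustration only)
toy_graduation_df <- data.frame(
  Year        = c(2015, 2015, 2015, 2016, 2016),
  Demographic = c("Total Cohort", "Total Cohort", "Male",
                  "Total Cohort", "Total Cohort"),
  Status      = c("Graduated", "Dropped Out", "Graduated",
                  "Graduated", "Graduated"),
  Pct         = c(70, 10, 65, 72, 80)
)

# Keep only rows for the total cohort that graduated
toy_graduation_means_df <- toy_graduation_df %>%
  filter(Demographic == "Total Cohort", Status == "Graduated")

# Inspect the first rows of the filtered data frame
head(toy_graduation_means_df)
```

Passing both conditions to a single filter() call keeps a row only when both are true, which matches the "and" in the description above.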
Run the head() function on our new graduation_means_df data frame to examine it.
Create a bar plot named graduation_meanbar using graduation_means_df. Map Year to the x axis and Pct to the y axis. Use stat = "summary" and fun = "mean" to calculate and display mean values on the y axis.
Print the plot to see the change in mean graduation rates over time!