This lesson introduced you to aggregates in R using dplyr. You learned:

  • How to calculate summary statistics with summarize()
  • How to calculate aggregate statistics over groups of rows that share the same value (or combination of values) using group_by()



Let’s examine some more data from ShoeFly.com. This time, in addition to the orders data, we’ll be looking at data about user visits to the website, stored in the page_visits data frame. Inspect the columns of the data frames using the rendered notebook.

Find the average price of an order in the orders data frame using summarize() and the mean() summary function. Save the resulting data frame to a variable named average_price and view it.

Don’t forget to include na.rm = TRUE as an argument in the call to mean()!
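As a minimal sketch of this pattern, the example below builds a small stand-in for orders, since the real data frame is loaded for you in the lesson. The column name price and the values are assumptions made for illustration only.

```r
library(dplyr)

# Hypothetical stand-in for the lesson's orders data frame.
# The column name `price` and these values are assumptions.
orders <- data.frame(
  id = 1:4,
  price = c(20, 30, NA, 40)
)

# na.rm = TRUE tells mean() to skip the missing price;
# without it, the result would be NA.
average_price <- orders %>%
  summarize(mean_price = mean(price, na.rm = TRUE))

average_price
```

With this toy data, mean_price is 30, the average of the three non-missing prices.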


In the page_visits data frame, the column utm_source contains information about how users got to ShoeFly’s homepage. For instance, if utm_source = Facebook, then the user came to ShoeFly by clicking on an ad on Facebook.com.

Use group_by() together with summarize() to calculate how many visits came from each of the different sources. Save your answer to the variable click_source, and view it.
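One way to approach this is to group by utm_source and count the rows in each group with n(). The sketch below uses a made-up page_visits data frame; only the utm_source column name comes from the lesson, and the result column name count is an arbitrary choice.

```r
library(dplyr)

# Hypothetical stand-in for the lesson's page_visits data frame.
page_visits <- data.frame(
  id = 1:5,
  utm_source = c("Facebook", "Google", "Facebook", "Email", "Google"),
  stringsAsFactors = FALSE
)

# n() returns the number of rows in each utm_source group.
click_source <- page_visits %>%
  group_by(utm_source) %>%
  summarize(count = n())

click_source
```

Here the result has one row per source: Email with 1 visit, and Facebook and Google with 2 visits each.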


Our Marketing department thinks that the traffic to our site has been changing over the past few months. Use group_by() to calculate the number of visits to our site from each utm_source for each month. Save your answer to the variable click_source_by_month, and view it.
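Grouping by more than one column works the same way: pass both columns to group_by(). The sketch below assumes the month is stored in a column named month, which you should confirm by inspecting page_visits; the data values are invented for illustration.

```r
library(dplyr)

# Hypothetical stand-in for page_visits.
# The `month` column name and all values are assumptions.
page_visits <- data.frame(
  utm_source = c("Facebook", "Facebook", "Google", "Google", "Google"),
  month = c("January", "February", "January", "January", "February"),
  stringsAsFactors = FALSE
)

# One output row per (utm_source, month) combination.
# ungroup() removes the grouping that summarize() leaves behind.
click_source_by_month <- page_visits %>%
  group_by(utm_source, month) %>%
  summarize(count = n()) %>%
  ungroup()

click_source_by_month
```

With this toy data there are four source and month combinations, and Google in January is the only one with 2 visits. Recent versions of dplyr may print an informational message about the remaining grouping when summarize() runs on multiple grouping columns; it does not affect the result.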
