Skip to Content
Learn
Aggregates in R
Calculating Aggregate Functions II

Sometimes, we want to group by more than one column. We can do this by passing multiple column names as arguments to the group_by function.

Imagine that we run a chain of stores and have data about the number of sales at different locations on different days:

location date day_of_week total_sales
West Village February 1 W 400
West Village February 2 Th 450
Chelsea February 1 W 375
Chelsea February 2 Th 390

We suspect that sales are different at different locations on different days of the week. In order to test this hypothesis, we could calculate the average sales for each store on each day of the week across multiple months. The code would look like this:

df %>% group_by(location,day_of_week) %>% summarize(mean_total_sales = mean(total_sales))

And the results might look something like this:

location day_of_week mean_total_sales
Chelsea M 402.50
Chelsea Tu 422.75
Chelsea W 452.00
West Village M 390
West Village Tu 400

Instructions

1.

At ShoeFly.com, our Purchasing team thinks that certain shoe_type/shoe_color combinations are particularly popular this year (for example, blue ballet flats are all the rage in Paris).

Find the total number of shoes of each shoe_type/shoe_color combination purchased using group_by, summarize() and n(). Save your result to the variable shoe_counts, and view it.

2.

The Marketing team wants to better understand the different price levels of the kinds of shoes that have been sold on the website, in particular looking at shoe_type/shoe_material combinations.

Find the mean price of each shoe_type/shoe_material combination purchased using group_by, summarize() and mean(). Save your result to the variable shoe_prices, and view it.

Don’t forget to include na.rm = TRUE as an argument in the summary function that you call!

Folder Icon

Take this course for free

Already have an account?