It is easy to do this kind of matching for one row, but hard to do it for multiple rows.
Luckily, dplyr can do this efficiently for the entire table using the inner_join() function.
The inner_join() function looks for columns that have the same name in both data frames and then looks for rows where the values in those columns are the same. It then combines each pair of matching rows into a single row in a new table.
We can call inner_join() with two data frames like this:
joined_df <- orders %>% inner_join(customers)
This will match up all of the customer information to the orders that each customer made.
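As a minimal sketch of what this looks like in practice, the example below builds two small data frames and joins them. The column names (order_id, customer_id, customer_name) and all values are made up for illustration and are not taken from the lesson's data. It also names the join column explicitly with by; if by is left out, as in the call above, dplyr joins on every column the two data frames share and prints a message saying which ones it used.

```r
library(dplyr)

# Hypothetical data: column names and values are assumptions for illustration only.
orders <- data.frame(
  order_id = c(1, 2, 3, 4),
  customer_id = c(10, 11, 10, 13),
  stringsAsFactors = FALSE
)
customers <- data.frame(
  customer_id = c(10, 11, 12),
  customer_name = c("Ada", "Ben", "Cy"),
  stringsAsFactors = FALSE
)

# customer_id is the only column the two data frames share, so it is the join key.
joined_df <- orders %>%
  inner_join(customers, by = "customer_id")

print(joined_df)
```

Only rows with a match in both data frames are kept: order 4 is dropped because customer 13 is not in customers, and customer 12 is dropped because they have no orders.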
Instructions
You are an analyst at Cool T-Shirts Inc. You are going to help them analyze some of their sales data.
There are two data frames defined in the file notebook.Rmd:
sales contains the monthly revenue for Cool T-Shirts Inc. It has two columns: month and revenue.
targets contains the goals for monthly revenue for each month. It has two columns: month and target.
Create a new data frame sales_vs_targets which contains the inner_join() of sales and targets.
Cool T-Shirts Inc. wants to know the months when they crushed their targets.
Filter sales_vs_targets to only include the rows where revenue is greater than target. Save these rows to the variable crushing_it.