In the previous exercise, we saw that when we join two data frames whose rows don’t match perfectly, we lose the unmatched rows.
This type of join (where we only include matching rows) is called an inner join. There are other types of joins that we can use when we want to keep information from the unmatched rows.
Suppose that two companies, Company A and Company B have just merged. They each have a list of customers, but they keep slightly different data. Company A has each customer’s name and email. Company B has each customer’s name and phone number. They have some customers in common, but some are different.
company_a
name | |
---|---|
Sally Sparrow | [email protected] |
Peter Grant | [email protected] |
Leslie May | [email protected] |
company_b
name | phone |
---|---|
Peter Grant | 212-345-6789 |
Leslie May | 626-987-6543 |
Aaron Burr | 303-456-7891 |
If we wanted to combine the data from both companies without losing the customers who are missing from one of the tables, we could use a Full Join. A Full Join would include all rows from both tables, even if they don’t match. Any missing values are filled in with NA
.
full_joined_dfs <- company_a %>% full_join(company_b)
The resulting table would look like this:
name | phone | |
---|---|---|
Sally Sparrow | [email protected] | NA |
Peter Grant | [email protected] | 212-345-6789 |
Leslie May | [email protected]om | 626-987-6543 |
Aaron Burr | NA |
303-456-7891 |
Instructions
There are two hardware stores in town: Store A and Store B. Store A’s inventory is in data frame store_a
and Store B’s inventory is in data frame store_b
. They have decided to merge into one big Super Store!
Combine the inventories of Store A and Store B using a full join. Save the results to the variable store_a_b_full
.