When you create new columns from a data frame, you sometimes want to keep only the new columns and drop the ones you no longer need. dplyr’s transmute() function adds new columns while dropping the existing columns that may no longer be useful for your analysis. Let’s go back to the original inventory data frame for your store, The Handy Woman.
product_id | product_description | cost_to_manufacture | price |
---|---|---|---|
1 | 3 inch screw | 0.50 | 0.75 |
2 | 2 inch nail | 0.10 | 0.25 |
3 | hammer | 3.00 | 5.50 |
4 | screwdriver | 2.50 | 3.00 |
Like mutate(), transmute() takes name-value pairs as arguments. The names will be the names of the new columns you are adding, and the values are expressions defining the values of the new columns. The difference, however, is that transmute() returns a data frame with only the new columns.
To add sales_tax and profit columns while dropping all other columns from the data frame:
df %>% transmute(sales_tax = price * 0.075, profit = price - cost_to_manufacture)
The data frame returned by transmute() will look like this (sales_tax is shown rounded to two decimal places):
sales_tax | profit |
---|---|
0.06 | 0.25 |
0.02 | 0.15 |
0.41 | 2.5 |
0.22 | 0.5 |
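To see the difference between mutate() and transmute() side by side, here is a minimal sketch that rebuilds the inventory table by hand and runs both calls. The data.frame() construction and the variable names with_mutate and with_transmute are assumptions for illustration; in the lesson, df is already loaded for you.

```r
library(dplyr)

# Rebuild the inventory table from the lesson (assumed construction;
# the lesson provides df already loaded)
df <- data.frame(
  product_id = 1:4,
  product_description = c("3 inch screw", "2 inch nail", "hammer", "screwdriver"),
  cost_to_manufacture = c(0.50, 0.10, 3.00, 2.50),
  price = c(0.75, 0.25, 5.50, 3.00)
)

# mutate() keeps the original columns and appends the new ones
with_mutate <- df %>%
  mutate(sales_tax = price * 0.075, profit = price - cost_to_manufacture)

# transmute() returns only the columns named in the call
with_transmute <- df %>%
  transmute(sales_tax = price * 0.075, profit = price - cost_to_manufacture)

names(with_mutate)     # the four original columns plus sales_tax and profit
names(with_transmute)  # only sales_tax and profit
```

Note that neither call changes df itself. Each returns a new data frame, so assign the result to a name if you want to keep it.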
Instructions
Update the code in the last code block to add the columns avg_height, avg_weight, and rank_change_13_to_16 to dogs while dropping all existing columns.
Inspect the new data frame with head().
The new columns have been added, and all the old columns have been dropped. But, wait! What breed do the column values refer to?
Add breed back into the dogs data frame by adding the following line of code as the first argument of the call to transmute():
breed = breed
This will keep the breed column from being dropped in the transmuted data frame!
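The same name = name pattern works on the inventory data frame from earlier. The sketch below is only an illustration under assumptions: it rebuilds df by hand, and the variable name kept is hypothetical. It carries product_description through transmute() so each profit value can still be matched to a product.

```r
library(dplyr)

# Rebuild the inventory table from the lesson (assumed construction)
df <- data.frame(
  product_id = 1:4,
  product_description = c("3 inch screw", "2 inch nail", "hammer", "screwdriver"),
  cost_to_manufacture = c(0.50, 0.10, 3.00, 2.50),
  price = c(0.75, 0.25, 5.50, 3.00)
)

# Assigning an existing column to itself keeps it in the result
kept <- df %>%
  transmute(
    product_description = product_description,  # carried over unchanged
    profit = price - cost_to_manufacture         # newly computed column
  )

names(kept)  # product_description and profit, in that order
```

Listing the identifying column first, as the instructions do with breed, puts it at the left of the result, which makes the output easier to read.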