Learn

Sometimes, rather than specifying which columns you want to select from a data frame, it’s easier to state which columns you do not want. dplyr’s select() function lets you do that as well. Consider a customers data frame that contains biographical information for the customers of your business:

name          address           phone         age
Martha Jones  123 Main St.      234-567-8910  28
Rose Tyler    456 Maple Ave.    212-867-5309  22
Donna Noble   789 Broadway      949-123-4567  35
Amy Pond      98 West End Ave.  646-555-1234  29
Clara Oswald  54 Columbus Ave.  714-225-1957  31

You are interested in analyzing where your customers live and how old they are. For your analysis, you do not care about the name and phone associated with a customer, only their address and age. To exclude the columns you do not need:

customers %>% select(-name, -phone)
  • the data frame customers is piped into select()
  • the columns to remove, prepended with a -, are given as arguments
  • a new data frame without the name and phone columns is returned
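As a minimal runnable sketch of the exclusion described above, the snippet below rebuilds the example customers data frame from the table and drops the two unwanted columns. The result names address_age and address_age_alt are hypothetical, chosen only for illustration, and the column types (character for text, numeric for age) are an assumption.

```r
library(dplyr)

# Rebuild the example customers data frame (column types are assumed)
customers <- data.frame(
  name = c("Martha Jones", "Rose Tyler", "Donna Noble",
           "Amy Pond", "Clara Oswald"),
  address = c("123 Main St.", "456 Maple Ave.", "789 Broadway",
              "98 West End Ave.", "54 Columbus Ave."),
  phone = c("234-567-8910", "212-867-5309", "949-123-4567",
            "646-555-1234", "714-225-1957"),
  age = c(28, 22, 35, 29, 31)
)

# Drop name and phone; the remaining columns keep their original order
address_age <- customers %>% select(-name, -phone)

# Equivalent form: negate a vector of column names in one argument
address_age_alt <- customers %>% select(-c(name, phone))

print(address_age)
```

Note that select() does not modify customers itself. The original data frame still has all four columns unless you assign the result back to it.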

Instructions

1. Select all columns of artists except albums using select() and save the result to no_albums. View no_albums.

2. Select all columns of artists except genre, spotify_monthly_listeners, and year_founded using select() and save the result to df_cols_removed. View df_cols_removed.
