Suppose you have a data frame called customers, which contains the names, ages, and genders of your business’s customers:
name | age | gender |
---|---|---|
Rebecca Erikson | 35 | F |
Thomas Roberson | 28 | M |
Diane Ochoa | 42 | NA |
For your analysis, you only care about the age and gender of your customers, not their names. The data frame you want looks like this:
age | gender |
---|---|
35 | F |
28 | M |
42 | NA |
You can select the appropriate columns for your analysis using dplyr’s select() function:
select(customers, age, gender)
select() takes a data frame as its first argument; all additional arguments are the columns to select.
select() returns a new data frame containing only the selected columns.
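To try the call yourself, the sketch below rebuilds the example table by hand and applies select() to it. The data.frame() construction and the name age_gender are assumptions made for illustration; the lesson does not say how customers is created.

```r
library(dplyr)

# Hypothetical reconstruction of the customers table shown above
customers <- data.frame(
  name = c("Rebecca Erikson", "Thomas Roberson", "Diane Ochoa"),
  age = c(35, 28, 42),
  gender = c("F", "M", NA),
  stringsAsFactors = FALSE
)

# First argument is the data frame; the rest are the columns to keep
age_gender <- select(customers, age, gender)

# The result holds only the requested columns
print(names(age_gender))

# The original data frame is left unchanged
print(names(customers))
```

Because select() returns a new data frame rather than modifying its input, the result has to be assigned to a name if you want to keep it.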
But what about the pipe %>%, you ask? Great question. You can improve the readability of your code by using the pipe:
customers %>% select(age, gender)
When using the pipe, you can read the code as: from the customers table, select() the age and gender columns. From now on we will use the pipe symbol where appropriate to simplify our code.
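As a quick check that the piped form and the nested form do the same thing, the sketch below runs both on a hand-built copy of the example table and compares the results. The data.frame() construction and the names piped and nested are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical reconstruction of the customers table from the example
customers <- data.frame(
  name = c("Rebecca Erikson", "Thomas Roberson", "Diane Ochoa"),
  age = c(35, 28, 42),
  gender = c("F", "M", NA),
  stringsAsFactors = FALSE
)

# The pipe passes customers in as the first argument of select()
piped <- customers %>% select(age, gender)

# The same call written without the pipe
nested <- select(customers, age, gender)

# Compare the two results
print(identical(piped, nested))
```

The pipe does not change what select() does; it only moves the data frame from inside the parentheses to the left of %>%.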
Instructions
Select the group column of artists using select() and save the result to artist_groups. View artist_groups.
Select the group, spotify_monthly_listeners, and year_founded columns of artists using select() and save the result to group_info. View group_info.