Data lies at the heart of nearly every problem in the business world and society. Having the right tools to manipulate data and organize it in a meaningful way is integral to performing data analyses and discovering unique insights!
The dplyr package in R is designed to make data manipulation tasks simpler and more intuitive than working with base R functions alone. Called a “grammar of data manipulation,” dplyr provides functions that solve many of the challenges that arise when organizing tabular data (i.e., data in a table with rows and columns). A table in R behaves much like a table in SQL or a spreadsheet in Excel, and dplyr lets you work with it using the full power of R.
In addition to learning how to load data into R with the readr package, this lesson will introduce how to perform the following data manipulation tasks with dplyr:
- select columns of a table
- filter rows of a table
- arrange rows of a table in order
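As a rough preview of the three tasks listed above, here is a minimal sketch that chains them together. The data frame `artists`, its column names, and its values are invented for illustration and are not the lesson's data set.

```r
library(dplyr)

# Hypothetical data frame; names and numbers are made up for this sketch
artists <- data.frame(
  group = c("A", "B", "C", "D"),
  genre = c("pop", "rock", "pop", "hip hop"),
  listeners = c(300, 120, 450, 200)
)

result <- artists %>%
  select(group, listeners) %>%  # keep only these two columns
  filter(listeners > 150) %>%   # keep rows with more than 150 listeners
  arrange(desc(listeners))      # sort the remaining rows, largest first

print(result)
```

Each function takes a data frame and returns a new one, which is why the steps can be piped together with `%>%`.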
dplyr and readr are a part of the tidyverse, a collection of R packages designed for data science. In this and future lessons, you will use different packages of the tidyverse to more easily analyze and visualize data!
The tidyverse is a package itself, and it can be imported at the top of your file if you need to use any of the packages it contains.
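For reference, loading the whole collection looks like the sketch below. It assumes the tidyverse package is already installed.

```r
# Attaches the core tidyverse packages, including readr and dplyr
library(tidyverse)
```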
In our lessons, however, we will explicitly import the packages within the tidyverse that we are using. To get started with readr and dplyr, you can import them at the top of your .Rmd R Markdown file.
The notebook.Rmd file in the code editor contains the code to load, inspect, and manipulate a data frame containing data on the top 7 most popular music groups of 2018.
Load the readr and dplyr libraries in the code block at the top of the notebook so that the rest of the code runs properly.
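A minimal sketch of what loading the two libraries and reading a table might look like is shown here. The CSV contents and the names `csv_path` and `groups` are hypothetical and are written to a temporary file only so the example is self-contained; the notebook's actual file is not reproduced.

```r
# Load the two tidyverse packages used in this lesson
library(readr)
library(dplyr)

# Hypothetical CSV written to a temporary file for illustration
csv_path <- tempfile(fileext = ".csv")
writeLines(c("group,listeners", "A,300", "B,120", "C,450"), csv_path)

# read_csv() from readr loads the file into a data frame (a tibble)
groups <- read_csv(csv_path)

# Inspect the first rows of the loaded table
head(groups)
```

With both libraries loaded, the resulting data frame can be passed directly to dplyr functions such as `select()`, `filter()`, and `arrange()`.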
By the end of this lesson, you will be able to perform this same analysis!