We were able to find quartiles manually by looking at the dataset and finding the correct division points. But that gets much harder when the dataset starts to get bigger. Luckily, there is a function in base R that will find the quartiles for you.
The base R function that we’ll be using is named quantile(). You can learn more about quantiles in our quantiles lesson, but for right now all you need to know is that a quartile is a specific kind of quantile.
The code below calculates the third quartile of the given dataset:
dataset <- c(50, 10, 4, -3, 4, -20, 2)
third_quartile <- quantile(dataset, 0.75)
The quantile() function takes two parameters. The first is the dataset you’re interested in. The second is a number between 0 and 1. Since we calculated the third quartile, we used 0.75: we want the point that splits the lowest 75% of the data from the rest.
For the second quartile, we’d use 0.5. This will give you the point that 50% of the data is below and 50% is above.
Notice that the dataset doesn’t need to be sorted for R’s function to work!
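As a minimal sketch of the points above, the snippet below asks for all three quartiles in one call by passing a vector of probabilities as the second argument, then checks that sorting the data first makes no difference. The variable names quartiles and same_result are illustrative and not part of the lesson.

```r
dataset <- c(50, 10, 4, -3, 4, -20, 2)

# Passing a vector of probabilities returns all three quartiles in one call.
quartiles <- quantile(dataset, c(0.25, 0.5, 0.75))
print(quartiles)

# Sorting the data first should not change the result.
sorted_quartiles <- quantile(sort(dataset), c(0.25, 0.5, 0.75))
same_result <- isTRUE(all.equal(quartiles, sorted_quartiles))
print(same_result)
```

With R’s default settings, this should give -0.5, 4, and 7 for the first, second, and third quartiles. Note that quantile() interpolates between data points when a quartile falls between two values, so the result is not always a value that appears in the dataset.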
We’ve brought back our music dataset. The lengths of 9,975 songs (in seconds) are stored in a variable named songs. Use the quantile() function to find the first quartile. Store the result in a variable named
Find the second and third quartile of the dataset and store the values in two variables named
Look up the length of your favorite song in seconds. Store that value in a variable named
Does that song fall in the first, second, third, or fourth quarter of the data? Create a variable named quarter equal to 1 if your favorite song falls in the first quarter of the data. Set it equal to 2 if your song falls in the second quarter. Set it equal to 3 if your song falls in the third quarter. And set it to 4 if your song falls in the final quarter of the data.
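One way to approach the comparison is sketched below. It uses a small made-up vector rather than the real songs data, and the names example_lengths, my_song, and cuts are hypothetical. It also assumes that a value exactly equal to a quartile counts as part of the lower quarter, which is a choice the exercise does not specify.

```r
# Hypothetical stand-in data; the lesson's real `songs` vector is not reproduced here.
example_lengths <- c(120, 150, 180, 200, 210, 240, 260, 300, 420)
my_song <- 215  # hypothetical favorite-song length in seconds

# First, second, and third quartiles of the stand-in data.
cuts <- quantile(example_lengths, c(0.25, 0.5, 0.75))

# Compare the song against each quartile in turn.
# Assumption: a value equal to a quartile belongs to the lower quarter.
if (my_song <= cuts[1]) {
  quarter <- 1
} else if (my_song <= cuts[2]) {
  quarter <- 2
} else if (my_song <= cuts[3]) {
  quarter <- 3
} else {
  quarter <- 4
}

print(quarter)
```

For the exercise itself, the same comparisons would be run against the quartiles computed from songs instead of the stand-in vector.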