Quartiles

A common way to communicate a high-level overview of a dataset is to find the values that split the data into four groups of equal size.

By doing this, we can then say whether a new datapoint falls in the first, second, third, or fourth quarter of the data.

The values that split the data into fourths are the *quartiles*.

Those values are called the first quartile (Q1), the second quartile (Q2), and the third quartile (Q3).

In the image above, Q1 is `10`, Q2 is `13`, and Q3 is `22`. Those three values split the data into four groups that each contain five datapoints.

In this lesson, you will learn to calculate the quartiles both by hand and with Python’s NumPy library.
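As a quick preview of the NumPy approach, here is a minimal sketch. The array `data` is a made-up example (it is not the dataset from the image above), and the sketch relies on `np.quantile` with its default settings. Note that there are several conventions for computing quartiles, so NumPy's default interpolation can give slightly different values than a by-hand method on some datasets.

```python
import numpy as np

# Hypothetical dataset for illustration only: nine sorted values.
data = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18])

# np.quantile takes fractions between 0 and 1.
# 0.25, 0.5, and 0.75 correspond to Q1, Q2, and Q3.
q1, q2, q3 = np.quantile(data, [0.25, 0.5, 0.75])

print(q1, q2, q3)  # 6.0 10.0 14.0
```

Here Q2 is also the median of the data, since the second quartile splits the dataset in half.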

In this lesson we’ll be looking at a dataset about music. We’ve plotted a histogram of song lengths (measured in seconds) of 9,975 random songs.

Look up the length of a favorite song of yours. Do you think that song falls in the first, second, third or fourth quarter of the data?

For example, we’ve picked one of our favorite songs, *Chicago* by Sufjan Stevens. *Chicago* is `364` seconds long, and we’ve plotted it as a red vertical line. It looks like *Chicago* is in either the third or fourth quarter of the data, but it’s hard to say for sure. Let’s find the quartiles of the dataset!
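Once the quartiles are known, deciding which quarter a song falls in takes only a few comparisons. The sketch below shows one way to do that. The helper `quarter_of` and the array `song_lengths` are hypothetical: the array holds nine made-up lengths, not the 9,975 songs in the lesson's dataset, so the result says nothing about where *Chicago* falls in the real data. Counting a value equal to a quartile as part of the lower quarter is also an assumption of this sketch.

```python
import numpy as np

def quarter_of(value, data):
    """Return 1, 2, 3, or 4: the quarter of `data` that `value` falls in.

    Assumption: a value exactly equal to a quartile is counted
    in the lower of the two quarters it borders.
    """
    q1, q2, q3 = np.quantile(data, [0.25, 0.5, 0.75])
    if value <= q1:
        return 1
    if value <= q2:
        return 2
    if value <= q3:
        return 3
    return 4

# Made-up song lengths in seconds, for illustration only.
song_lengths = np.array([120, 160, 190, 210, 230, 250, 270, 310, 400])

# For this made-up array, Q1, Q2, and Q3 are 190, 230, and 270.
print(quarter_of(364, song_lengths))  # 4
```

With the real dataset, you would pass the full array of song lengths as `data` instead of the made-up values.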