Quantiles are points that split a dataset into groups of equal size. For example, let’s say you just took a test and want to know whether you’re in the top 10% of the class. One way to find out is to split the scores into ten groups with an equal number of datapoints in each group and see which group you fall into.
In a class of 30 students, nine values split the test scores into ten groups of equal size, and each group contains three test scores.
Those nine values that split the data are quantiles! Specifically, they are the 10-quantiles, or deciles.
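Here is a minimal sketch of the decile idea using NumPy's `np.quantile`. The 30 test scores and the score of 95 are made up for illustration; they are not part of the lesson's data.

```python
import numpy as np

# Hypothetical test scores for a class of 30 students (made up for illustration).
scores = np.array([51, 54, 57, 58, 60, 62, 63, 65, 66, 68,
                   69, 71, 72, 73, 75, 76, 78, 79, 80, 82,
                   83, 85, 86, 88, 89, 91, 92, 94, 96, 99])

# The nine deciles sit at the fractions 0.1, 0.2, ..., 0.9.
decile_fractions = np.linspace(0.1, 0.9, 9)
deciles = np.quantile(scores, decile_fractions)

# Count how many scores fall into each of the ten groups.
group_index = np.searchsorted(deciles, scores)
group_sizes = np.bincount(group_index, minlength=10)

# A hypothetical score is in the top 10% if it is above the ninth decile.
my_score = 95
in_top_10_percent = my_score > deciles[-1]

print(deciles)
print(group_sizes)
print(in_top_10_percent)
```

With these scores, every one of the ten groups ends up holding three values, which matches the description above.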
You can find any number of quantiles. For example, if you split the dataset into 100 groups of equal size, the 99 values that split the data are the 100-quantiles, or percentiles.
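As a rough sketch, the 99 percentiles can be computed with `np.percentile`, which takes percentages from 0 to 100 instead of fractions from 0 to 1. The song lengths below are randomly generated stand-ins, not the lesson's dataset.

```python
import numpy as np

# Hypothetical song lengths in seconds, randomly generated for illustration.
rng = np.random.default_rng(seed=0)
song_lengths = rng.normal(loc=240, scale=40, size=1000)

# np.percentile expects percentages: 1, 2, ..., 99.
percentiles = np.percentile(song_lengths, np.arange(1, 100))

# np.quantile gives the same 99 values when passed fractions instead.
same_values = np.quantile(song_lengths, np.arange(1, 100) / 100)

print(len(percentiles))
```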
The quartiles are some of the most commonly used quantiles. The quartiles split the data into four groups of equal size.
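The quartiles can be sketched the same way by asking `np.quantile` for the fractions 0.25, 0.5, and 0.75. The eight song lengths here are invented for the example.

```python
import numpy as np

# Hypothetical song lengths in seconds (made up for illustration).
song_lengths = np.array([150, 180, 200, 210, 230, 240, 260, 300])

# Three quartiles split the data into four groups of equal size.
quartiles = np.quantile(song_lengths, [0.25, 0.5, 0.75])

# The second quartile is the median of the dataset.
median = np.median(song_lengths)

print(quartiles)  # the three quartiles: 195, 220, and 245 seconds
print(median)
```

Here each of the four groups contains two songs: 150 and 180, then 200 and 210, then 230 and 240, then 260 and 300.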
In this lesson, we’ll show you how to calculate quantiles using NumPy and discuss some of the most commonly used quantiles.
We’ve imported a dataset of song lengths (measured in seconds) and drawn a few histograms showing different quantiles.
What do you think a histogram that shows the 100-quantiles would look like?