Codecademy Logo

Quartiles, Quantiles, and IQR

Quantiles

Quantiles are the set of values/points that divides the dataset into groups of equal size. For example, in the figure, there are nine values that splits the dataset. Those nine values are quantiles.

Quartiles

The three dividing points (or quantiles) that split data into four equally sized groups are called quartiles. For example, in the figure, the three dividing points Q1, Q2, Q3 are quartiles.

An example of quartiles. 

The quartile example has a horizontal number line with a range from the number 5 to the number 30 with a vertical mark on the line for each number between. The number line only has labels for the numbers 5, 10, 15, 20, 25, and 30. The number line has three long vertical lines representing the points that divide the data into four equally sized groups. There is a label above each line. The left-most vertical line, at the number 10 on the number line, is red and is labeled 'Q1=10'. The middle vertical line, at the number 13 on the number line, is green and is labeled 'Q2=13'. The right-most vertical line, at the number 22 on the number line, is yellow and is labeled 'Q3=22'. 

At different points on the number line there are purple dots that represent data. The data is distributed as follows: The number 6 has one dot. The number 7 has two dots. The numbers 8 and 9 each have a single dot. The number 11 has three dots, and 12 has two. The numbers 14, 15, 17, 19 and 20 each have one dot. The number 24 has one dot, 25 has two dots, and the numbers 26 and 28 each have a single dot.

Numpy’s Quantile() Function

In Python, the numpy.quantile() function takes an array and a number say q between 0 and 1. It returns the value at the qth quantile. For example, numpy.quantile(data, 0.25) returns the value at the first quartile of the dataset data.

Quantiles and Groups

If the number of quantiles is n, then the number of equally sized groups in a dataset is n+1.

Median in Quantiles

The median is the divider between the upper and lower halves of a dataset. It is the 50%, 0.5 quantile, also known as the 2-quantile.

# The value 5 is both the median and the 2-quantile
data = [1, 3, 5, 9, 20]
Second_quantile = 5

Interquartile Range Definition

The interquartile range is the difference between the first(Q1) and third quartiles(Q3). It can be mathematically represented as IQR = Q3 - Q1.

Interquartile Range and Outliers

The interquartile range is considered to be a robust statistic because it is not distorted by outliers like the average (or mean).

# Eventhough d_2 has an outlier, the IQR is identical for the 2 datasets
d_1 = [1,2,3,4,5,6,7,8,9]
d_2 = [-100,2,3,4,5,6,7,8,9]

Learn more on Codecademy