Introduction to Statistics with NumPy
Sorting and Outliers

One way to quickly identify outliers is by sorting our data. Once the data is sorted, we can glance at the beginning or end of the array to see whether any values lie far beyond the expected range. We can use the NumPy function np.sort to sort our data.

Let’s go back to our 3rd grade height example, and imagine an 8th grader walked into our experiment:

>>> heights = np.array([49.7, 46.9, 62, 47.2, 47, 48.3, 48.7])

If we use np.sort, we can immediately identify the taller student since their height (62”) is noticeably outside the range of the dataset:

>>> np.sort(heights)
array([46.9, 47. , 47.2, 48.3, 48.7, 49.7, 62. ])
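
The same check can be run as a standalone script. This is a minimal sketch that assumes NumPy is imported as np (the import is not shown in the session above); the name sorted_heights is introduced here only for illustration. It also shows that np.sort returns a sorted copy, so the original heights array keeps its order.

```python
import numpy as np

heights = np.array([49.7, 46.9, 62, 47.2, 47, 48.3, 48.7])

# np.sort returns a new, sorted array; heights itself is not modified
sorted_heights = np.sort(heights)
print(sorted_heights)

# After sorting, the smallest and largest values sit at the two ends,
# so those are the first places to look for outliers
print("lowest:", sorted_heights[0])
print("highest:", sorted_heights[-1])
```

Here the lowest value sits close to its neighbors, while the highest value (62) is far from the rest, which is what flags it as a likely outlier.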



You’ve been tracking temperature data over the summer on your back porch, but realized that you placed your sensor right over a grill! Before you can use your data, you need to check to see if the heat from the grill caused any weird readings that could skew your data.

First, sort the temps data array and save the sorted data to a sorted_temps variable.


Now, print the sorted_temps array. What do we see? Did the grill, in fact, create outliers in our data?
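
The two steps above might look like the following sketch. The readings in temps are made up for illustration, since the exercise's actual temps array is not shown here; only the names temps and sorted_temps come from the instructions.

```python
import numpy as np

# Hypothetical readings in degrees Fahrenheit (assumed values, not the
# exercise's real data); two of them mimic heat from the grill
temps = np.array([86.0, 88.5, 94.0, 91.2, 163.0, 89.9, 92.3, 170.5, 87.4])

# Step 1: sort the readings and save the result
sorted_temps = np.sort(temps)

# Step 2: print the sorted array and inspect both ends for extreme values
print(sorted_temps)
```

With this made-up data, the two largest values land at the end of the sorted array, well above the other readings, which is the pattern to look for in the real data.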
