Key Concepts

Review core concepts you need to learn to master this subject

Mean of a Dataset

The mean, or average, of a dataset is calculated by adding all the values in the dataset and then dividing by the number of values in the set.

For example, for the dataset [1,2,3], the mean is (1 + 2 + 3) / 3 = 2.
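The two steps above (add the values, then divide by how many there are) can be sketched in a few lines of Python. This is a minimal illustration using the standard library's statistics module as a cross-check; the variable names are chosen here for illustration only.

```python
import statistics

data = [1, 2, 3]

# Step 1: add all the values. Step 2: divide by the number of values.
manual_mean = sum(data) / len(data)

# The standard library computes the same quantity.
library_mean = statistics.mean(data)

print(manual_mean)                  # 2.0
print(manual_mean == library_mean)  # True
```

The parentheses matter when writing the formula by hand: the sum is completed first and only then divided by the count.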

Median of a Dataset

The median of a dataset is the value that, assuming the dataset is ordered from smallest to largest, falls in the middle. If there is an even number of values in a dataset, there are two middle values; the median is then reported either as both of those values or, more commonly, as their average.

Say we have a dataset with the following ten numbers:

24, 16, 30, 10, 12, 28, 38, 2, 4, 36

We can order this dataset from smallest to largest:

2, 4, 10, 12, 16, 24, 28, 30, 36, 38

The two middle values of this dataset are 16 and 24, the fifth and sixth observations in the sorted list: there are four observations to the left of 16 and four observations to the right of 24. Reported as a single number, the median is their average, 20.
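As a quick check on this ten-number dataset, Python's standard statistics module exposes the different conventions for an even-sized dataset separately. This sketch assumes nothing beyond the standard library; the variable names are illustrative.

```python
import statistics

data = [24, 16, 30, 10, 12, 28, 38, 2, 4, 36]

low = statistics.median_low(data)    # lower of the two middle values
high = statistics.median_high(data)  # upper of the two middle values
mid = statistics.median(data)        # average of the two middle values

print(low, high, mid)  # 16 24 20.0
```

None of these calls requires the input to be sorted first; the sorting happens internally.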


If we added another value (say, 28) near the middle of this dataset:

2, 4, 10, 12, 16, 24, 28, 28, 30, 36, 38

The new median is equal to 24, because there are 5 values smaller than it, and 5 values larger than it.
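The manual procedure (sort, then pick the one or two values in the middle) can be sketched as a small function. The helper name middle_values is hypothetical, introduced here only for illustration, and it returns a list so that both the odd and even cases are visible.

```python
def middle_values(values):
    """Return the one or two middle values of a dataset as a list."""
    if not values:
        raise ValueError("dataset is empty")
    ordered = sorted(values)  # Step 1: order from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return [ordered[mid]]                # odd count: one middle value
    return [ordered[mid - 1], ordered[mid]]  # even count: two middle values


ten = [24, 16, 30, 10, 12, 28, 38, 2, 4, 36]
eleven = ten + [28]  # the extra observation added above

print(middle_values(ten))     # [16, 24]
print(middle_values(eleven))  # [24]
```

Averaging the returned list gives a single-number median in either case.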

  1. Finding the center of a dataset is one of the most common ways to summarize statistical findings. Often, people communicate the center of data using words like on average, usually, or often…
  2. The mean, often referred to as the average, is a way to measure the center of a dataset. The average of a set is calculated using a two-step process: 1. Add all of the observations in your da…
  3. While you’ve shown that you can calculate the average yourself, it becomes time-consuming as the size of your dataset increases: imagine adding all of the numbers in a dataset with 10,000 ob…
  4. In this lesson, you learned how to calculate the average of a dataset using the formula \bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} and the NumPy function np.average(my_array). Circling back…
  1. In this lesson, you will learn how to find the median of a dataset, a common measure of a dataset’s center. Each of the next three exercises will cover the following topics: - Manually fin…
  2. The formal definition for the median of a dataset is: The value that, assuming the dataset is ordered from smallest to largest, falls in the middle. If there are an even number of values in a dat…
  3. Finding the median of a dataset becomes increasingly time-consuming as the size of your dataset increases: imagine finding the median of an unsorted dataset with 10,000 observations. The Nu…
  4. In this lesson, you learned how to find the median of a dataset in two steps: 1. Sort the dataset 2. Identify the one or two numbers that fall in the middle of the sorted dataset You also learned …
  1. In this lesson, you will learn how to find the mode of a dataset. Each of the next three exercises will cover the following: - Manually finding the mode of a dataset - Using Python’s SciPy librar…
  2. The formal definition for the mode of a dataset is: The most frequently occurring observation in the dataset. A dataset can have multiple modes if there is more than one value with the same maxim…
  3. Finding the mode of a dataset becomes increasingly time-consuming as the size of your dataset increases: imagine finding the mode of a dataset with 10,000 observations. The SciPy stats.mode…
  4. In this lesson, you learned how to find the mode of a dataset in two steps: 1. Find the frequency of every unique number in the dataset 2. Determine which number has the highest frequency You also…
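
The two steps for finding the mode (count how often each value occurs, then pick the values with the highest count) can be sketched with the standard library's collections.Counter. The helper name modes and the sample data are made up for illustration; this is not the SciPy function mentioned above, just a plain-Python equivalent of the manual procedure.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)          # Step 1: frequency of each unique value
    if not counts:
        return []
    highest = max(counts.values())    # Step 2: the highest frequency
    return sorted(v for v, c in counts.items() if c == highest)


sample = [2, 4, 4, 10, 12, 12, 16]  # illustrative data, not from the lesson

print(modes(sample))     # [4, 12]
print(modes([1, 1, 2]))  # [1]
```

Returning a list keeps the multi-modal case explicit: a dataset in which several values tie for the highest frequency has several modes.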
