So far, we’ve defined the term sampling distribution and shown how to approximate a sampling distribution by simulation for a few different statistics (mean, maximum, variance, etc.). The Central Limit Theorem (CLT) lets us describe the sampling distribution of the mean specifically.

The CLT states that the sampling distribution of the mean is approximately normal as long as the sample size is large enough. How large depends on the population: the more skewed it is, the larger the sample needs to be. A sample size of n > 30 is a common rule of thumb for most populations, although a heavily skewed population may need more. If the population itself is normally distributed, the sample size can be smaller than that.
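For readers who want the statement in symbols, one standard formulation of the theorem is sketched below. The notation is not used in the lesson itself and is introduced here only for illustration: X_1, ..., X_n are independent draws from a population with mean \mu and finite variance \sigma^2, and \bar{X}_n is their sample mean.

```latex
\[
\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{d} \mathcal{N}(0, 1)
\quad \text{as } n \to \infty,
\qquad \text{so for large } n: \quad
\bar{X}_n \overset{\text{approx.}}{\sim} \mathcal{N}\!\left(\mu, \frac{\sigma^2}{n}\right).
\]
```

In words: the sampling distribution of the mean is centered on the population mean, and its spread shrinks in proportion to \sigma / \sqrt{n} as the sample size grows.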

Let’s take another look at the salmon weights to see how the CLT applies here. The first plot below shows the population distribution. It is skewed right, meaning the tail of the distribution is longer on the right than on the left.

This graph shows the distribution of salmon weights across the entire population. The distribution is right-skewed, with weights ranging from 0 to almost 300 pounds.

Next, we’ve simulated a sampling distribution of the mean (using a sample size of 100) and superimposed a normal curve on it. Note how closely the estimated sampling distribution follows the normal curve.

This graph shows the simulated sampling distribution of the mean salmon weight. The sampling distribution is approximately normal even though the population distribution is right-skewed, which illustrates one of the key ideas behind the Central Limit Theorem.
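The simulation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lesson's own code: the salmon data is not reproduced here, so `population` is a hypothetical right-skewed stand-in drawn from a gamma distribution, and `skewness` is a small helper defined only for this example. Instead of plotting, it prints the skewness of the population and of the sample means (a value near 0 indicates a roughly symmetric shape).

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical right-skewed population (an assumption, NOT the lesson's salmon data).
population = rng.gamma(shape=2.0, scale=30.0, size=100_000)

sample_size = 100   # size of each individual sample
n_samples = 5_000   # number of samples used to approximate the sampling distribution

# Draw all samples at once (one sample per row), then take the mean of each row.
samples = rng.choice(population, size=(n_samples, sample_size), replace=True)
sample_means = samples.mean(axis=1)


def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    centered = values - values.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


print(f"population mean:      {population.mean():.2f}")
print(f"mean of sample means: {sample_means.mean():.2f}")
print(f"population skewness:  {skewness(population):.2f}")
print(f"sample-mean skewness: {skewness(sample_means):.2f}")
```

The sample means should cluster around the population mean, and their skewness should be much closer to 0 than the population's.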

Note that the CLT as described here applies only to the sampling distribution of the mean, not to other statistics such as the maximum, minimum, or variance!
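To make this contrast concrete, the sketch below compares the sampling distribution of the mean with that of the maximum for the same set of samples. As before, `population` is a hypothetical gamma-distributed stand-in (an assumption, not the lesson's data) and `skewness` is a helper written for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical right-skewed population (an assumption, not the lesson's data).
population = rng.gamma(shape=2.0, scale=30.0, size=100_000)


def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    centered = values - values.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


# 5,000 samples of size 100, one sample per row.
samples = rng.choice(population, size=(5_000, 100), replace=True)

# Compute two different statistics on the very same samples.
skew_of_means = skewness(samples.mean(axis=1))
skew_of_maxes = skewness(samples.max(axis=1))

print(f"skewness of sample means:  {skew_of_means:.2f}")
print(f"skewness of sample maxima: {skew_of_maxes:.2f}")
```

For a population like this one, the sample maxima should remain clearly right-skewed while the sample means are nearly symmetric.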



In order to see the Central Limit Theorem in action, let’s look at another population of fish that is not normally distributed.

We have loaded this data on the weight of cod fish into the workspace.

Uncomment the three lines underneath ## Checkpoint 1 to see the plot of the distribution of cod fish weights. Note the shape of the distribution.


Now that we have seen the skewed population distribution, let’s simulate a sampling distribution of the mean. According to the CLT, the sampling distribution will look approximately normal once the sample size is large enough. To start, we have set the sample size to 6.

Uncomment the five lines at the very bottom, run the code once, and take a look at the sampling distribution.

Remember to scroll down to see the second plot.


Now change the sample size to 50 and run the code. Does the estimated sampling distribution look more normal now?
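If you want to try the same comparison outside the workspace, here is a rough sketch. It does not use the cod data or the workspace code: `population` is a hypothetical exponential stand-in, and `simulate_sample_means` and `skewness` are helper names invented for this example. It reports skewness numerically rather than drawing the plots.

```python
import numpy as np


def simulate_sample_means(population, sample_size, n_samples, rng):
    """Approximate the sampling distribution of the mean by repeated sampling."""
    samples = rng.choice(population, size=(n_samples, sample_size), replace=True)
    return samples.mean(axis=1)


def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    centered = values - values.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


rng = np.random.default_rng(seed=1)

# Hypothetical skewed population (an assumption, NOT the lesson's cod data).
population = rng.exponential(scale=20.0, size=100_000)

results = {}
for sample_size in (6, 50):
    means = simulate_sample_means(population, sample_size, 5_000, rng)
    results[sample_size] = skewness(means)
    print(f"sample size {sample_size:>2}: skewness of sample means = {results[sample_size]:.2f}")
```

The skewness for the larger sample size should be noticeably closer to 0, which matches what the second plot suggests when you increase the sample size.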
