Now that we’ve generated some random samples from a population using an applet, let’s code this ourselves in Python. The numpy.random module has several functions that we could use to simulate random sampling. In this exercise, we’ll use the function np.random.choice(), which draws a random sample of a specified size from a given array.

For this exercise, we’ll pretend that we’re all-powerful and actually have a list of the weights of every Atlantic salmon that currently exists.

In the example code to the right, we have done the following:

  • Loaded in the weights of all salmon into a dataframe called population.
  • Plotted the distribution of population and calculated the mean.
  • Used the np.random.choice() function to generate a sample of size 30 called sample (the samp_size variable is set to 30).
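The steps above can be sketched roughly as follows. This is a minimal sketch, not the lesson's actual editor code: the real salmon data file is not shown here, so the population is simulated from a normal distribution with made-up parameters, and the use of replace=False (each fish drawn at most once) is an assumption.

```python
import numpy as np

# ASSUMPTION: stand-in for the real salmon weights, which the lesson loads
# from a file. The mean (60) and spread (10) are invented for illustration.
np.random.seed(0)
population = np.random.normal(loc=60, scale=10, size=10_000)

# Mean of the whole population, rounded to 3 decimal places
population_mean = round(np.mean(population), 3)

# Draw a random sample of 30 weights from the population.
# ASSUMPTION: replace=False, so no fish is picked twice.
samp_size = 30
sample = np.random.choice(population, samp_size, replace=False)

print("Population mean:", population_mean)
print("Sample size:", len(sample))
```

Removing the np.random.seed(0) line makes each run draw a different sample, which is closer to how the applet behaved.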



Find the mean of the sample, round it to 3 decimal places, and assign it to a variable called sample_mean.
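One possible way to do this uses np.mean() together with Python's built-in round(). The five weights below are hypothetical values used only so the snippet runs on its own; in the exercise, sample is the array produced by np.random.choice().

```python
import numpy as np

# Hypothetical sample weights, for illustration only
sample = np.array([61.2, 58.7, 64.1, 59.9, 62.4])

# Mean of the sample, rounded to 3 decimal places
sample_mean = round(np.mean(sample), 3)
print(sample_mean)
```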


Uncomment the last 4 lines at the bottom of the editor to plot the histogram of the sample data.

You might have to scroll down to see the second plot. To avoid scrolling each time, you can comment out the first plot’s plt.show().

Run the code a couple of times. This code should behave similarly to the applet we used in the last exercise.


Change the sample size to 10. Does the mean change more or less each time you run it with a smaller sample size?
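Rather than eyeballing a handful of runs, one way to check this is to repeat the sampling many times for each sample size and compare how widely the sample means vary. The sketch below again uses a simulated population with made-up parameters, and the helper sample_means() is a hypothetical name introduced for illustration.

```python
import numpy as np

# ASSUMPTION: simulated stand-in for the salmon weights
np.random.seed(1)
population = np.random.normal(loc=60, scale=10, size=10_000)

def sample_means(samp_size, n_runs=1000):
    """Draw n_runs samples of samp_size and return the mean of each one."""
    return np.array([
        np.mean(np.random.choice(population, samp_size, replace=False))
        for _ in range(n_runs)
    ])

# Standard deviation of the sample means: larger means more run-to-run change
spread_10 = np.std(sample_means(10))
spread_30 = np.std(sample_means(30))

print("Spread of sample means, n=10:", round(spread_10, 3))
print("Spread of sample means, n=30:", round(spread_30, 3))
```

Comparing the two printed values shows which sample size produces the less stable sample mean.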
