Now that we’ve generated some random samples from a population using an applet, let’s code this ourselves in Python. The numpy.random
package has several functions that we could use to simulate random sampling. In this exercise, we’ll use the function np.random.choice(), which generates a sample of some size from a given array.
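To get a feel for np.random.choice() before turning to the salmon data, here is a minimal sketch. The values array and the seed are made up for illustration; they are not part of the exercise.

```python
import numpy as np

# Hypothetical population: ten made-up values, purely for illustration
values = np.array([3, 7, 1, 9, 4, 6, 2, 8, 5, 10])

# Seeded only so the example is repeatable; remove the seed to get a new sample each run
np.random.seed(0)

# Draw 4 values at random; by default np.random.choice samples with replacement,
# so the same value can appear more than once
sample = np.random.choice(values, size=4)

print(sample)
```

Passing replace=False instead would guarantee that no element is drawn twice.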
For this example, we’ll pretend that we’re all-powerful and actually have a list of the weights of all Atlantic salmon that currently exist.
In the example code to the right, we have done the following:
- Loaded the weights of all salmon into a dataframe called population.
- Plotted the distribution of population and calculated the mean.
- Used the np.random.choice() function to generate a sample called sample of size 30 (the samp_size variable is equal to 30).
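The steps above can be sketched roughly as follows. This is not the code from the editor: we don’t have the real salmon weights here, so population is a synthetic stand-in drawn from a normal distribution with made-up parameters, it is held in a plain NumPy array rather than a dataframe, and the plotting step is left out. Using replace=False is also an assumption, chosen so that no fish is picked twice.

```python
import numpy as np

np.random.seed(42)  # seeded only to make this sketch repeatable

# ASSUMPTION: synthetic stand-in for the salmon weights (made-up mean and spread),
# stored as a NumPy array instead of the dataframe used in the exercise
population = np.random.normal(loc=60.0, scale=10.0, size=10_000)

# Mean of the whole population, rounded to 3 decimal places
population_mean = round(float(np.mean(population)), 3)

# Draw a sample of 30 weights; replace=False means each fish is picked at most once
samp_size = 30
sample = np.random.choice(population, samp_size, replace=False)

print("population mean:", population_mean)
print("sample size:", len(sample))
```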
Instructions
Find the mean of the sample, round it to 3 decimal places, and assign it to a variable called sample_mean.
Uncomment the last 5 lines at the bottom of the editor to plot the histogram of the sample data.
You might have to scroll down to see the 2nd plot. You can comment out the first plot’s plt.show() in order to avoid scrolling down each time.
Run the code a couple of times. This code should behave similarly to the applet we used in the last exercise.
Change the sample size to 10. Does the mean change more or less each time you run it with a smaller sample size?
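Rather than re-running the code by hand, one way to explore this question is to repeat the sampling many times for each sample size and compare how much the sample means spread out. The sketch below again uses a synthetic population with made-up parameters, and sample_means is a hypothetical helper written for this illustration; neither comes from the exercise.

```python
import numpy as np

np.random.seed(1)  # seeded only to make this sketch repeatable

# ASSUMPTION: synthetic stand-in for the salmon weights, not the exercise's data
population = np.random.normal(loc=60.0, scale=10.0, size=10_000)


def sample_means(samp_size, n_repeats=1000):
    """Hypothetical helper: draw n_repeats samples of samp_size and
    return their means, each rounded to 3 decimal places."""
    return np.array([
        round(float(np.mean(np.random.choice(population, samp_size))), 3)
        for _ in range(n_repeats)
    ])


# Standard deviation of the sample means measures how much they vary from run to run
spread_10 = float(np.std(sample_means(10)))
spread_30 = float(np.std(sample_means(30)))

print("spread of sample means, n=10:", round(spread_10, 3))
print("spread of sample means, n=30:", round(spread_30, 3))
```

With a population like this one, the means of the size-10 samples should bounce around noticeably more than the means of the size-30 samples.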