Earlier, we mentioned that the parameter lambda (λ) is the expected value (or average value) of the Poisson distribution. But what does this mean?
Let’s put this into context. Suppose we are salespeople and, after many weeks of work, we calculate our average to be 10 sales per week. If we take this value as the expected value of a Poisson distribution, the probability mass function looks as follows:
The tallest bars mark the values with the highest probability of occurring. In this case, the bar at 10 is among the tallest (when λ is a whole number, λ − 1 and λ are equally likely, so 9 ties with 10). This does not, however, mean that we will make exactly 10 sales in any given week. It means that, averaged across all weeks, we expect about 10 sales per week.
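The bar heights can also be checked numerically. The following is a minimal sketch, assuming SciPy is installed; the names lam and pmf are illustrative and not part of the lesson. It evaluates the probability mass function for 0 through 20 sales and prints the values around the peak, then asks the distribution for its mean.

```python
from scipy import stats

lam = 10  # average sales per week, as in the example above

# Probability of each sales count from 0 to 20 under a Poisson(lam) model
pmf = {k: stats.poisson.pmf(k, lam) for k in range(0, 21)}

# Print the probabilities around the peak of the distribution
for k in (8, 9, 10, 11):
    print(k, round(pmf[k], 4))

# The expected value reported by the distribution itself
print(stats.poisson.mean(lam))
```

Even the most likely counts each have a probability well below one half, which is why a single week rarely lands exactly on the expected value.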
Let’s look at this another way. Let’s take a sample of 1000 random values from the Poisson distribution with an expected value of 10. We can use the poisson.rvs() method in the scipy.stats module to generate random values:
import scipy.stats as stats

# generate random variable
# stats.poisson.rvs(lambda, size = num_values)
rvs = stats.poisson.rvs(10, size = 1000)
The histogram of this sampling looks like the following:
We can see observations as low as 2 and as high as 20. The tallest bars are at 9 and 10. If we took the average of the 1000 random samples, we would get:
print(rvs.mean())
Output:
10.009
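This value is very close to 10, which is what we expect: over the 1000 observations, the sample average lands near the expected value of 10. Because the sample is random, the exact output will differ slightly from run to run.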
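The spread and the average of such a sample can also be inspected without a plot. This is a rough sketch, assuming NumPy and SciPy are available; the seed passed to random_state is an arbitrary choice made only so the run is repeatable, and the name counts is illustrative.

```python
import numpy as np
from scipy import stats

# Draw 1000 values from Poisson(10); the seed value 42 is arbitrary
rvs = stats.poisson.rvs(10, size=1000, random_state=42)

# counts[k] is the number of samples equal to k
counts = np.bincount(rvs)

print("smallest value:", rvs.min())
print("largest value:", rvs.max())
print("most frequent value:", counts.argmax())
print("sample mean:", rvs.mean())
```

With a different seed the smallest, largest, and most frequent values will shift, while the sample mean should stay near 10.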
When we talk about the expected value, we mean the average over many observations. This relates to the Law of Large Numbers: as the number of samples grows, the sample mean tends toward the expected value, and the sample as a whole resembles the true population more closely. So a salesperson may make 3 sales one week, 16 the next, and 11 the week after, yet in the long run, after many weeks, the average would still be close to the expected value of 10.
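The Law of Large Numbers can be illustrated by tracking the running average as more weeks are included. This is a minimal sketch, assuming SciPy is installed; the sample sizes and the seed are arbitrary choices, and the name weeks is illustrative.

```python
from scipy import stats

lam = 10  # expected sales per week

# Simulate many weeks of sales; the seed value 0 is arbitrary
weeks = stats.poisson.rvs(lam, size=100_000, random_state=0)

# Compare the mean of the first n weeks with the expected value
for n in (10, 100, 1_000, 10_000, 100_000):
    sample_mean = weeks[:n].mean()
    print(n, sample_mean, abs(sample_mean - lam))
```

The gap between the sample mean and 10 generally shrinks as n grows, although it does not have to shrink at every single step.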
Instructions
Uncomment rand_vars and set it equal to 1000 random variates from the Poisson distribution with lambda equal to 15.
Calculate and print the mean of rand_vars.
We have preloaded a function histogram_function() that takes a list of random variates and plots them as a histogram. Using the same rand_vars variable, call histogram_function() with rand_vars as the argument to produce the histogram of the data.