A probability mass function (PMF) is a type of probability distribution that defines the probability of observing a particular value of a discrete random variable. For example, a PMF can be used to calculate the probability of rolling a three on a fair six-sided die.
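The die example can be sketched in a few lines of Python. This is only an illustration: the function name `fair_die_pmf` and its `sides` parameter are made up for this sketch, and exact fractions are used so the result is not affected by floating-point rounding.

```python
from fractions import Fraction

def fair_die_pmf(value, sides=6):
    """Probability that a fair die with `sides` faces shows `value`.

    Hypothetical helper for illustration: every face from 1 to `sides`
    is equally likely, and any other value has probability 0.
    """
    if isinstance(value, int) and 1 <= value <= sides:
        return Fraction(1, sides)
    return Fraction(0)

# Probability of rolling a three on a fair six-sided die
prob_three = fair_die_pmf(3)

# The probabilities of all possible outcomes add up to 1
total = sum(fair_die_pmf(v) for v in range(1, 7))

print(prob_three)  # 1/6
print(total)       # 1
```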

There are certain kinds of random variables (and associated probability distributions) that are relevant for many different kinds of problems. These commonly used probability distributions have names and parameters that make them adaptable for different situations.

For example, suppose that we flip a fair coin some number of times and count the number of heads. The probability mass function that describes the likelihood of each possible outcome (e.g., 0 heads, 1 head, 2 heads, etc.) is called the binomial distribution. The parameters for the binomial distribution are:

  • n for the number of trials (e.g., n = 10 if we flip a coin 10 times)
  • p for the probability of success in each trial, where "success" means observing a particular outcome (in this example, p = 0.5 because the probability of observing heads on a fair coin flip is 0.5)

If we flip a fair coin 10 times, we say that the number of observed heads follows a Binomial(n=10, p=0.5) distribution. The graph below shows the probability mass function for this experiment. The heights of the bars represent the probability of observing each possible outcome as calculated by the PMF.

[Figure: bar chart of the Binomial(n=10, p=0.5) probability mass function. The x-axis is marked from 0 to 10, and the height of the bar at each marker is the probability of observing that number of heads.]
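The bar heights for this experiment can be reproduced with the standard binomial formula, P(X = k) = C(n, k) * p^k * (1 - p)^(n - k), where C(n, k) is the number of ways to choose k successes out of n trials. The following is a minimal sketch using only the Python standard library; the function name `binomial_pmf` is an assumption made for this example, not something defined in the lesson.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); hypothetical helper for illustration."""
    if not 0 <= k <= n:
        return 0.0
    # comb(n, k) counts the ways to place k successes among n trials
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # 10 flips of a fair coin

# One probability per possible outcome: 0 heads, 1 head, ..., 10 heads
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"{k:2d} heads: {prob:.4f}")
```

The most likely outcome is 5 heads, with probability 252/1024, or about 0.2461. The distribution is symmetric around 5 because p = 0.5.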


Let’s see how the shape of the binomial distribution changes as the number of coin flips changes.

Consider x fair coin flips, where x ranges from one to ten. For each value of x, the heights of the bars represent the probability of observing each possible number of heads in x fair coin flips. Taller bars represent more likely outcomes.

Notice that as x increases, the bars get shorter. This is because the heights of all the bars always sum to 1. When x is larger, there are more possible values for the number of heads (0 through x), so the total probability has to be divided among more values.
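This behavior can be checked numerically. The sketch below, which assumes the same hypothetical `binomial_pmf` helper built on the standard binomial formula, computes the bar heights for x = 1 through 10 fair coin flips and records how many bars there are, what they sum to, and how tall the tallest one is.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); hypothetical helper for illustration."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

summary = []
for x in range(1, 11):
    # Bar heights for 0, 1, ..., x heads in x fair coin flips
    heights = [binomial_pmf(k, x, 0.5) for k in range(x + 1)]
    summary.append((x, len(heights), sum(heights), max(heights)))

for x, n_bars, total, tallest in summary:
    print(f"x={x:2d}  bars={n_bars:2d}  sum={total:.4f}  tallest={tallest:.4f}")
```

For every x the heights sum to 1, while the tallest bar drops from 0.5 at x = 1 to about 0.2461 at x = 10. The tallest bar does not shrink at every single step (for example, it is 0.5 for both x = 1 and x = 2), but it never grows.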
