Probability Distributions
In statistics and probability theory, a probability distribution describes how the values of a random variable are distributed. It gives the probabilities of the possible outcomes of an experiment or event. In the context of SciPy, the scipy.stats module provides a wide range of probability distributions that can be used for modelling, simulating, and analyzing random processes.
SciPy’s scipy.stats module includes continuous and discrete distributions. Continuous distributions, such as normal or exponential distributions, are used to model variables that can take any real value within a range. Discrete distributions, such as the binomial or Poisson distributions, model scenarios where outcomes are limited to specific, countable values.
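The difference between the two families can be seen in how probabilities are computed. The following is a minimal sketch, with parameter values chosen only for illustration: for a continuous distribution a probability is attached to an interval (via the CDF), while for a discrete distribution each individual outcome has its own probability (via the PMF).

```python
from scipy import stats

# Continuous: probability is assigned to intervals, computed from the CDF.
normal = stats.norm(loc=0, scale=1)
p_interval = normal.cdf(1) - normal.cdf(-1)  # P(-1 <= X <= 1)

# Discrete: each countable outcome has its own probability (the PMF).
poisson = stats.poisson(mu=3)  # mu=3 is an illustrative choice
p_exact = poisson.pmf(2)  # P(K == 2)

print(f"P(-1 <= X <= 1) for N(0, 1): {p_interval:.4f}")
print(f"P(K == 2) for Poisson(mu=3): {p_exact:.4f}")
```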
The scipy.stats module offers various functions for each distribution type, including:
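- Probability Density Function (PDF): Describes the relative likelihood (density) of a value under a continuous distribution; probabilities are obtained by integrating it over an interval. Discrete distributions use a Probability Mass Function (PMF) instead.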
- Cumulative Distribution Function (CDF): Gives the probability that a random variable will take a value less than or equal to a specified value.
- Random Variates: Functions to generate random samples from a specified distribution.
- Statistical Moments: Functions for calculating the mean, variance, skewness, and kurtosis of the distribution.
These distributions are essential tools for tasks such as hypothesis testing, statistical modelling, simulations, and generating synthetic data that follows known statistical properties.
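As a rough illustration of the four kinds of functions listed above, the sketch below evaluates each of them on an exponential distribution. The choice of distribution, the scale of 2.0, and the seed are assumptions made for the example only.

```python
from scipy import stats

# Exponential distribution with mean 2 (illustrative choice)
dist = stats.expon(scale=2.0)

density = dist.pdf(1.0)  # PDF: density at x = 1
prob_le_1 = dist.cdf(1.0)  # CDF: P(X <= 1)
samples = dist.rvs(size=5, random_state=0)  # five random variates, seeded for repeatability

# Statistical moments: mean, variance, skewness, and (excess) kurtosis
mean, var, skew, kurt = dist.stats(moments="mvsk")

print(f"pdf(1) = {density:.4f}, cdf(1) = {prob_le_1:.4f}")
print(f"samples = {samples}")
print(f"mean = {float(mean)}, variance = {float(var)}, skewness = {float(skew)}, kurtosis = {float(kurt)}")
```

Note that the kurtosis reported by stats() is the excess (Fisher) kurtosis, so a normal distribution would report 0 rather than 3.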
Syntax
Each distribution is available as an object in scipy.stats (for example, stats.norm or stats.binom). Calling that object with parameters returns a "frozen" distribution whose methods compute properties like probability density, cumulative distribution, and random sampling.
from scipy import stats
# For continuous distributions
dist = stats.norm(loc=0, scale=1) # Normal distribution with mean 0 and std 1
# For discrete distributions
dist_binom = stats.binom(n=10, p=0.5) # Binomial distribution with 10 trials and 0.5 probability of success
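The same methods can also be called directly on stats.norm with the parameters passed on each call. The sketch below, using arbitrary illustrative values for loc and scale, suggests that both styles give the same result; fixing the parameters once is mainly a convenience when the same distribution is reused.

```python
import numpy as np
from scipy import stats

x = np.array([-1.0, 0.0, 1.0])

# Style 1: pass the parameters on every call.
direct = stats.norm.pdf(x, loc=2, scale=3)

# Style 2: fix ("freeze") the parameters once, then reuse the object.
frozen = stats.norm(loc=2, scale=3)
via_frozen = frozen.pdf(x)

print(np.allclose(direct, via_frozen))  # True
```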
Methods available for probability distributions
- pdf(x): Probability Density Function (for continuous distributions)
- pmf(k): Probability Mass Function (for discrete distributions)
- cdf(x): Cumulative Distribution Function
- rvs(size): Random variates (sampling from the distribution)
- mean(): Mean of the distribution
- std(): Standard deviation of the distribution
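A short sketch of these methods on a standard normal distribution follows. The seed passed through random_state is an assumption added here so that the sampled values are repeatable; it is not required.

```python
from scipy import stats

dist = stats.norm(loc=0, scale=1)

pdf_at_mean = dist.pdf(0)  # density at the mean, equal to 1 / sqrt(2 * pi)
cdf_at_mean = dist.cdf(0)  # half of the probability mass lies below the mean
samples = dist.rvs(size=3, random_state=42)  # three random variates

print(f"pdf(0) = {pdf_at_mean:.4f}")
print(f"cdf(0) = {cdf_at_mean}")
print(f"mean = {dist.mean()}, std = {dist.std()}")
print(f"samples shape: {samples.shape}")
```

For a discrete distribution such as stats.binom, the density method is replaced by pmf(k), as in the binomial example in this entry.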
Examples
Normal Distribution
In this example, the probability density function (PDF) will be calculated, and random samples from a normal distribution will be generated.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Define a normal distribution with mean 0 and standard deviation 1
dist = stats.norm(loc=0, scale=1)

# Generate 1000 random samples from the distribution
samples = dist.rvs(size=1000)

# Plot the histogram of the samples
plt.hist(samples, bins=30, density=True, alpha=0.6, color='g')

# Plot the PDF of the normal distribution
x = np.linspace(-4, 4, 100)
pdf = dist.pdf(x)
plt.plot(x, pdf, 'k', linewidth=2)
plt.title('Normal Distribution: Mean = 0, Std = 1')
plt.show()
This code generates random samples from a standard normal distribution and visualizes both the histogram of the samples and the probability density function (PDF).
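The output is a plot titled "Normal Distribution: Mean = 0, Std = 1", showing the green histogram of the samples with the black PDF curve drawn over it.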
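The agreement between the histogram and the PDF curve in that plot can also be checked numerically. The following sketch is an addition to the example, not part of it: it uses a larger sample size and a fixed seed (both assumptions made here) and compares the normalized histogram heights with the PDF evaluated at the bin centers.

```python
import numpy as np
from scipy import stats

dist = stats.norm(loc=0, scale=1)
samples = dist.rvs(size=100_000, random_state=0)

# density=True scales the bar heights so the histogram integrates to 1,
# which makes them directly comparable to the PDF.
heights, edges = np.histogram(samples, bins=30, range=(-4, 4), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# Largest absolute difference between a bar height and the PDF at its center
max_gap = np.max(np.abs(heights - dist.pdf(centers)))

print(f"sample mean: {samples.mean():.3f}, sample std: {samples.std():.3f}")
print(f"largest gap between histogram and PDF: {max_gap:.3f}")
```

The gap shrinks as the sample size grows, which is why density=True is used in the plotting code: without it the bar heights would be raw counts and would not line up with the PDF.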
Binomial Distribution
In this example, a binomial distribution is used to compute the probability of a specific number of successes in a series of trials.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Define a binomial distribution with 10 trials and probability of success 0.5
dist_binom = stats.binom(n=10, p=0.5)

# Probability of getting exactly 5 successes
prob_5_successes = dist_binom.pmf(5)
print(f"Probability of 5 successes: {prob_5_successes}")

# Generate 1000 random samples
samples_binom = dist_binom.rvs(size=1000)

# Plot the histogram of the binomial samples, with one bin centered on each outcome 0..10
plt.hist(samples_binom, bins=np.arange(12) - 0.5, density=True, alpha=0.6, color='blue')
plt.title('Binomial Distribution: n = 10, p = 0.5')
plt.show()
In this case, a binomial distribution with 10 trials and a 50% chance of success is defined. The example computes the probability of obtaining exactly 5 successes and visualizes the distribution of random samples.
The output will be:
Probability of 5 successes: 0.2460937500000002
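The printed probability can be cross-checked against the binomial formula, C(n, k) * p^k * (1 - p)^(n - k). The sketch below is an added verification, not part of the original example; it computes the value by hand with math.comb and compares it with the result from pmf().

```python
from math import comb

import numpy as np
from scipy import stats

n, p, k = 10, 0.5, 5

# Binomial formula: C(10, 5) * 0.5**5 * 0.5**5 = 252 / 1024
by_formula = comb(n, k) * p**k * (1 - p) ** (n - k)
by_scipy = stats.binom(n=n, p=p).pmf(k)

print(by_formula)  # 0.24609375
print(abs(by_scipy - by_formula) < 1e-12)  # True

# The PMF over all possible outcomes (0 to n successes) sums to 1.
total = stats.binom(n=n, p=p).pmf(np.arange(n + 1)).sum()
print(abs(total - 1.0) < 1e-12)  # True
```

The trailing digits in the output shown above come from floating-point rounding inside the library; the exact value is 252/1024 = 0.24609375.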