Student's t Distribution
The Student’s t distribution is a probability distribution used in statistical inference when working with small sample sizes or when the population standard deviation is unknown. It resembles the normal distribution but features heavier tails, making it more appropriate for estimating population parameters with limited data. This distribution is fundamental in hypothesis testing, confidence interval construction, and statistical modeling.
The formula for a t-statistic is given by:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
Where:
- $t$: t-statistic value
- $\bar{x}$: sample mean
- $\mu$: population mean
- $s$: sample standard deviation
- $n$: sample size
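As a minimal sketch of the formula above, the snippet below computes the t-statistic by hand and cross-checks it against `scipy.stats.ttest_1samp`. The sample values and the hypothesized mean `mu_0` are made up for illustration and are not taken from any real dataset.

```python
import math
import statistics

from scipy import stats

# Hypothetical sample of 8 measurements (illustrative values only)
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 4.7, 5.4]
mu_0 = 5.0  # hypothesized population mean (an assumption for this sketch)

n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# t = (x_bar - mu) / (s / sqrt(n))
t_manual = (x_bar - mu_0) / (s / math.sqrt(n))

# Cross-check against SciPy's one-sample t-test
result = stats.ttest_1samp(sample, popmean=mu_0)

print(f"manual t = {t_manual:.4f}")
print(f"scipy  t = {result.statistic:.4f}, p-value = {result.pvalue:.4f}")
```

The two t values should agree up to floating-point error, since the one-sample t-test is built on the same statistic.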
The probability density function (PDF) of the t-distribution with v degrees of freedom is:
$$f(t) = \frac{\Gamma(\frac{v+1}{2})}{\sqrt{v\pi}\Gamma(\frac{v}{2})} (1 + \frac{t^2}{v})^{-\frac{v+1}{2}}$$
Where:
- $\Gamma$: the gamma function
- $v$: the degrees of freedom (df = n-1)
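To make the density formula concrete, here is a small sketch that evaluates it directly with `math.gamma` and compares the result with `scipy.stats.t.pdf`. The helper name `t_pdf` and the evaluation points are chosen only for this illustration.

```python
import math

from scipy.stats import t


def t_pdf(x, v):
    """Density of the t-distribution with v degrees of freedom at x."""
    coeff = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return coeff * (1 + x**2 / v) ** (-(v + 1) / 2)


# Compare the hand-written formula with SciPy at a few points
for v in (1, 5, 30):
    for x in (0.0, 1.5):
        manual = t_pdf(x, v)
        library = t.pdf(x, v)
        print(f"v={v:>2}, x={x}: manual={manual:.6f}, scipy={library:.6f}")
```

For very large `v`, `math.gamma` can overflow, so a production implementation would more likely work with log-gamma values; this sketch keeps the direct form to mirror the formula.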
Key Properties
The t-distribution has several distinctive characteristics:
- Degrees of Freedom: Calculated as $n - 1$ (sample size minus one), this parameter determines the shape of the distribution.
- Symmetry: Like the normal distribution, the t-distribution is symmetric around zero.
- Heavier Tails: Compared to the normal distribution, the t-distribution has heavier tails, meaning extreme values are more probable.
- Convergence to Normal Distribution: As degrees of freedom increase, the t-distribution approaches the standard normal distribution. A common rule of thumb is that for df > 30 the two are close enough to be used interchangeably in many practical settings, although the t-distribution's tails remain slightly heavier at any finite df.
- Mean, Median, and Mode: The median and mode are 0 for any degrees of freedom. The mean is 0 when degrees of freedom > 1 and undefined otherwise.
- Variance: Equal to $v/(v-2)$ for v > 2, infinite for 1 < v ≤ 2, and undefined for v ≤ 1.
Note: The heavier tails of the t-distribution account for the additional uncertainty introduced when estimating the population standard deviation from a sample.
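The heavier tails, the convergence to the normal distribution, and the variance formula can all be checked numerically. The following sketch uses `scipy.stats.t` and `scipy.stats.norm`; the cutoff of 2.0 and the list of degrees of freedom are arbitrary choices for illustration.

```python
from scipy.stats import norm, t

threshold = 2.0  # arbitrary cutoff chosen for illustration

# Two-sided tail probability P(|T| > threshold) for several degrees of freedom
for df in (3, 10, 30, 1000):
    tail = 2 * t.sf(threshold, df)
    variance = t.var(df)  # matches v / (v - 2) for v > 2
    print(f"df={df:>4}: P(|T| > {threshold}) = {tail:.4f}, variance = {variance:.4f}")

# Standard normal reference values
print(f"normal : P(|Z| > {threshold}) = {2 * norm.sf(threshold):.4f}, variance = 1.0000")
```

The tail probability and the variance both shrink toward the normal values as `df` grows, which is the convergence described in the list above.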
Applications
The Student’s t distribution is widely used in various statistical scenarios:
- Hypothesis Testing: In t-tests to determine if there’s a significant difference between sample means and population means, or between two sample means when sample sizes are small (n<30) or population standard deviation is unknown.
- Confidence Intervals: To establish intervals for population parameters when the population standard deviation is unknown or sample sizes are small (n<30).
- Regression Analysis: In determining the significance of regression coefficients.
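As one illustration of these applications, the sketch below builds a 95% confidence interval for a population mean when the population standard deviation is unknown. The sample values are hypothetical, and the interval assumes the data are roughly normally distributed.

```python
import math
import statistics

from scipy.stats import t

# Hypothetical sample (illustrative values only)
sample = [12.1, 11.8, 12.6, 12.4, 11.9, 12.8, 12.2]
confidence = 0.95

n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation
df = n - 1

# Critical value t* such that P(-t* < T < t*) = confidence
t_crit = t.ppf(1 - (1 - confidence) / 2, df)
margin = t_crit * s / math.sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
print(f"{confidence:.0%} CI for the mean: ({lower:.3f}, {upper:.3f})")

# Cross-check with SciPy's built-in interval helper
print(t.interval(confidence, df, loc=x_bar, scale=s / math.sqrt(n)))
```

Using the t critical value instead of the normal one widens the interval slightly, which reflects the extra uncertainty from estimating the standard deviation with a small sample.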
Example: Plotting a t-Distribution in Python
This example demonstrates how to generate and visualize a Student’s t-distribution for different degrees of freedom (df):
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm, t

# Define x values
x = np.linspace(-4, 4, 1000)

# Plot t-distributions for different degrees of freedom
dfs = [1, 5, 10, 30]  # Degrees of freedom to compare
for df in dfs:
    plt.plot(x, t.pdf(x, df), label=f'df = {df}')

# Plot standard normal distribution for comparison
plt.plot(x, norm.pdf(x), 'k--', label='Normal (df → ∞)')

# Labels and legend
plt.title("Student's t-Distribution for Different Degrees of Freedom")
plt.xlabel("t value")
plt.ylabel("Probability Density")
plt.legend()
plt.grid()

# Show plot
plt.show()
```
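This example produces a plot of the four t-distribution density curves together with the dashed standard normal curve. Curves with fewer degrees of freedom have lower peaks and heavier tails, and the df = 30 curve lies close to the normal curve.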
For a comprehensive understanding of statistical distributions and their applications, consider exploring Codecademy’s Master Statistics with Python skill path.