anova_lm
anova_lm() is a function in Python’s statsmodels library that produces an ANOVA table for one or more fitted linear models. ANOVA (Analysis of Variance) is a statistical method used to determine whether there are significant differences between the means of three or more groups. Researchers and analysts can use anova_lm() to evaluate the effects of categorical variables on a continuous outcome and to compare nested linear models to assess their relative explanatory power.
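Both uses mentioned above can be sketched on a small synthetic dataset. The data, the column names (group, x, y), and the effect sizes are assumptions made up for illustration; only ols() and anova_lm() come from statsmodels.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Hypothetical data: three groups with different means plus one covariate
rng = np.random.default_rng(0)
n = 30
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], n),
    "x": rng.normal(size=3 * n),
})
group_means = df["group"].map({"a": 0.0, "b": 0.5, "c": 1.0})
df["y"] = group_means + 0.8 * df["x"] + rng.normal(scale=0.5, size=3 * n)

# Effect of a categorical variable on a continuous outcome (one-way ANOVA)
one_way = ols("y ~ C(group)", data=df).fit()
one_way_table = anova_lm(one_way)
print(one_way_table)

# Nested comparison: does adding x improve on the group-only model?
full = ols("y ~ C(group) + x", data=df).fit()
nested_table = anova_lm(one_way, full)
print(nested_table)
```

When several models are passed, they should be ordered from the most restricted to the least restricted, and the resulting table has one row per model rather than one row per term.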
Syntax
statsmodels.stats.anova.anova_lm(*args, **kwargs)
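- *args: One or more fitted model results (e.g., OLS results instances) to perform ANOVA on.
- **kwargs: Optional keyword arguments to customize the ANOVA test, such as:
  - scale: An estimate of the error variance. If omitted, it is estimated from the largest model.
  - test: The test statistic to report: 'F' (the default), 'Chisq', or 'Cp'.
  - typ: The type of sum of squares to calculate: 1, 2, or 3 (also accepted as 'I', 'II', or 'III'). The default is Type I.
  - robust: The heteroscedasticity-corrected covariance estimator to use: 'hc0', 'hc1', 'hc2', or 'hc3'. The default, None, applies no correction.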
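As a rough illustration of how the keyword arguments change the result, the sketch below fits one model on made-up data and requests three different tables. The data and the column names (x1, x2, y) are hypothetical, and the robust option is shown with Type II sums of squares only.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Hypothetical data: a continuous outcome driven by two predictors
rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 1.0 + 2.0 * df["x1"] + 0.5 * df["x2"] + rng.normal(size=100)

model = ols("y ~ x1 + x2", data=df).fit()

# Default: Type I (sequential) sums of squares
type1 = anova_lm(model)

# Type II sums of squares: each term adjusted for the others
type2 = anova_lm(model, typ=2)

# Type II with a heteroscedasticity-corrected covariance matrix
type2_robust = anova_lm(model, typ=2, robust="hc3")

print(type1)
print(type2)
print(type2_robust)
```

Type I results depend on the order of terms in the formula, while Type II results do not, which is one reason to set typ explicitly when a model has more than one predictor.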
Example
This example demonstrates how to use the anova_lm() function in the statsmodels library to perform analysis of variance on a fitted linear model:
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm
import pandas as pd

# Load example data
data = sm.datasets.get_rdataset('iris').data

# Rename columns to make them Python-friendly
data.rename(columns=lambda x: x.replace('.', '_'), inplace=True)

# Fit a linear regression model using Ordinary Least Squares (OLS)
model = ols('Sepal_Length ~ Petal_Length + Petal_Width', data=data).fit()

# Perform ANOVA on the fitted model
anova_results = anova_lm(model, typ=2)

# Print the ANOVA table
print(anova_results)
The ANOVA table produced shows the sum of squares, degrees of freedom, F-statistics, and p-values, helping to evaluate the significance of each predictor in the model:
                 sum_sq     df          F        PR(>F)
Petal_Length   9.934196    1.0  61.150938  9.414477e-13
Petal_Width    0.644340    1.0   3.966300  4.827246e-02
Residual      23.880694  147.0        NaN           NaN
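Because the result is a pandas DataFrame, the table can be filtered and checked like any other. The sketch below rebuilds the table above by hand so that it runs without downloading the dataset; the 0.05 threshold is an assumed significance level, not something anova_lm() sets.

```python
import numpy as np
import pandas as pd

# Values copied from the ANOVA table above; with a live model you would
# use the DataFrame returned by anova_lm() directly.
anova_results = pd.DataFrame(
    {
        "sum_sq": [9.934196, 0.644340, 23.880694],
        "df": [1.0, 1.0, 147.0],
        "F": [61.150938, 3.966300, np.nan],
        "PR(>F)": [9.414477e-13, 4.827246e-02, np.nan],
    },
    index=["Petal_Length", "Petal_Width", "Residual"],
)

alpha = 0.05  # assumed significance threshold

# Keep only the model terms, then select those with p-values below alpha
effects = anova_results.drop(index="Residual")
significant = effects[effects["PR(>F)"] < alpha]
print(significant.index.tolist())

# Each F-statistic is the term's mean square over the residual mean square
resid_ms = anova_results.loc["Residual", "sum_sq"] / anova_results.loc["Residual", "df"]
f_check = (effects["sum_sq"] / effects["df"]) / resid_ms
print(f_check)
```

Here both predictors fall below the threshold, although Petal_Width does so only narrowly (p ≈ 0.048), so its contribution is much weaker than that of Petal_Length.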