Python anova_lm
anova_lm() is the function in Python's statsmodels library that produces an ANOVA table for one or more fitted linear models. ANOVA (Analysis of Variance) is a statistical method for testing whether the means of two or more groups differ significantly; it is most often applied to three or more groups. Researchers and analysts can use anova_lm() to evaluate the effect of each predictor on a continuous outcome and to compare nested linear models by their relative explanatory power.
Syntax
statsmodels.stats.anova.anova_lm(*args, **kwargs)
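- *args: One or more fitted model results (e.g., OLS results instances fitted through the formula API) to perform ANOVA on. Statistics are reported in the order the models are passed.
- **kwargs: Optional keyword arguments to customize the ANOVA test, such as:
  - scale: An estimate of the error variance. If omitted, it is estimated from the largest model.
  - test: The type of test ('F', 'Chisq', or 'Cp'). The default is 'F'.
  - typ: The type of sum of squares to calculate (1, 2, or 3, or equivalently 'I', 'II', or 'III').
  - robust: The heteroscedasticity-corrected covariance estimator to use ('hc0', 'hc1', 'hc2', or 'hc3'). The default, None, uses the ordinary covariance. This is a string option, not a boolean.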
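The following is a minimal sketch of how these arguments might be combined, not a definitive recipe. It uses synthetic data so that it runs without a download; the column names x1, x2, and y and the variable names are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Synthetic data with hypothetical columns x1, x2, y (seeded for reproducibility)
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 1.0 + 2.0 * df["x1"] + 0.5 * df["x2"] + rng.normal(size=n)

# Two nested models: the restricted model leaves out x2
restricted = ols("y ~ x1", data=df).fit()
full = ols("y ~ x1 + x2", data=df).fit()

# Single model: typ selects the type of sum of squares
type1_table = anova_lm(full, typ=1)
type2_table = anova_lm(full, typ=2)

# Single model with a heteroscedasticity-corrected covariance estimator
robust_table = anova_lm(full, typ=2, robust="hc3")

# Several models: each row is compared with the model in the previous row
comparison = anova_lm(restricted, full)

print(type2_table)
print(robust_table)
print(comparison)
```

When more than one model is passed, the table has one row per model, and the F-test in the second row assesses whether adding x2 reduces the residual sum of squares by more than chance would explain.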
Example
This example demonstrates how to use the anova_lm() function in the statsmodels library to perform analysis of variance on a fitted linear model:
    import statsmodels.api as sm
    from statsmodels.formula.api import ols
    from statsmodels.stats.anova import anova_lm
    import pandas as pd

    # Load example data
    data = sm.datasets.get_rdataset('iris').data

    # Rename columns to make them Python-friendly
    data.rename(columns=lambda x: x.replace('.', '_'), inplace=True)

    # Fit a linear regression model using Ordinary Least Squares (OLS)
    model = ols('Sepal_Length ~ Petal_Length + Petal_Width', data=data).fit()

    # Perform ANOVA on the fitted model
    anova_results = anova_lm(model, typ=2)

    # Print the ANOVA table
    print(anova_results)
The ANOVA table produced shows the sum of squares, degrees of freedom, F-statistics, and p-values, helping to evaluate the significance of each predictor in the model:
                     sum_sq     df          F        PR(>F)
    Petal_Length   9.934196    1.0  61.150938  9.414477e-13
    Petal_Width    0.644340    1.0   3.966300  4.827246e-02
    Residual      23.880694  147.0        NaN           NaN
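To see how the columns relate, each F-statistic can be recomputed as the term's mean square divided by the residual mean square. The sketch below uses only the numbers printed in the table above; the variable names are illustrative and not part of statsmodels.

```python
# Values copied from the ANOVA table above
sum_sq = {"Petal_Length": 9.934196, "Petal_Width": 0.644340}
df_term = {"Petal_Length": 1.0, "Petal_Width": 1.0}
resid_sum_sq, resid_df = 23.880694, 147.0

# Residual mean square: the estimate of the error variance
mean_sq_resid = resid_sum_sq / resid_df

# F = (term sum of squares / term df) / residual mean square
f_stats = {
    term: (sum_sq[term] / df_term[term]) / mean_sq_resid
    for term in sum_sq
}

for term, f_value in f_stats.items():
    print(f"{term}: F = {f_value:.4f}")
```

The results agree with the F column to within rounding. The Residual row has no F-statistic or p-value of its own, which is why it shows NaN.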