Chi-Squared tests
Anonymous contributor
Published Jan 14, 2025
The Chi-Square test is used to test whether observed frequencies differ from expected frequencies across groups or categories. It is applied in two contexts: goodness-of-fit (to see if the observed proportions match an expected distribution) and test of independence (to assess if two categorical variables are independent). The function documented here, scipy.stats.chisquare(), comes from SciPy rather than statsmodels and performs the goodness-of-fit test. For a test of independence on a contingency table, SciPy provides scipy.stats.chi2_contingency().
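The test of independence mentioned above is handled in SciPy by a separate function, scipy.stats.chi2_contingency(). The following is a minimal sketch, assuming a made-up 2x3 contingency table (the counts are hypothetical and chosen only for illustration):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows are two groups, columns are three categories
observed = np.array([
    [30, 20, 10],
    [20, 25, 15],
])

# Returns the statistic, the p-value, the degrees of freedom,
# and the expected counts under the hypothesis of independence
chi2_stat, p_value, dof, expected = chi2_contingency(observed)

print(f"Chi-square statistic: {chi2_stat}")
print(f"P-value: {p_value}")
print(f"Degrees of freedom: {dof}")
print("Expected counts:")
print(expected)
```

The degrees of freedom here are (rows - 1) * (columns - 1), and the expected counts are derived from the row and column totals of the table.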
Syntax
scipy.stats.chisquare(f_obs, f_exp=None, ddof=0, axis=0)
- f_obs: The observed frequencies. This is typically a 1D or 2D array where each value is the observed count in a category or group.
- f_exp: The expected frequencies. This is an array of the same shape as f_obs, where each value is the expected count in the corresponding category or group. If it is omitted (None), all categories are assumed to be equally likely.
- ddof: The "Delta Degrees of Freedom" adjustment for the p-value. The p-value is computed from a chi-squared distribution with k - 1 - ddof degrees of freedom, where k is the number of observed frequencies. ddof=0 is standard for a goodness-of-fit test; increase it when parameters of the expected distribution were estimated from the data.
- axis: The axis along which the test is computed. The default, axis=0, runs one test per column of a 2D array, and axis=1 runs one test per row. If axis is set to None, all values in f_obs are treated as a single data set.
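The effect of the parameters described above can be sketched as follows. The second column of the 2D table is made up for illustration; the first column reuses the counts from the example in this entry:

```python
import numpy as np
from scipy.stats import chisquare

counts = [150, 80, 100, 70]

# f_exp omitted: every category is assumed equally likely (100 each here)
default_result = chisquare(counts)
explicit_result = chisquare(counts, f_exp=[100, 100, 100, 100])

# ddof changes only the degrees of freedom used for the p-value
# (k - 1 - ddof); the statistic itself stays the same
adjusted_result = chisquare(counts, ddof=1)

# Hypothetical 2D input: each column is a separate set of category counts
table = np.array([
    [150, 30],
    [80, 20],
    [100, 25],
    [70, 25],
])

# axis=0 (default): one test per column, so two statistics are returned
per_column = chisquare(table, axis=0)

# axis=None: all eight values are treated as a single data set
flattened = chisquare(table, axis=None)

print(default_result.statistic, explicit_result.statistic)
print(default_result.pvalue, adjusted_result.pvalue)
print(per_column.statistic)
print(flattened.statistic)
```

The returned object exposes the statistic and pvalue attributes and can also be unpacked as a two-element tuple.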
Example
In this example, a chi-square goodness-of-fit test compares the observed counts in four categories with the counts expected under equal proportions to determine whether they differ:
from scipy.stats import chisquare

# Observed counts
counts = [150, 80, 100, 70]

# For equal expected proportions (null hypothesis)
# Expected counts would be total/number of categories
n_categories = len(counts)
total = sum(counts)
expected = [total/n_categories] * n_categories

# Perform chi-square test
chi2_stat, p_value = chisquare(
    f_obs=counts,    # Observed frequencies
    f_exp=expected,  # Expected frequencies
    ddof=0           # Degrees of freedom adjustment
)

# Print results
print(f"Chi-square statistic: {chi2_stat}")
print(f"P-value: {p_value}")

# Interpret results
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: The proportions are significantly different.")
else:
    print("Fail to reject the null hypothesis: The proportions are not significantly different.")
The code above produces the following output:
Chi-square statistic: 38.0
P-value: 2.8264748814532456e-08
Reject the null hypothesis: The proportions are significantly different.
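To see where these numbers come from, the statistic and p-value can be recomputed by hand. This is a sketch of the Pearson chi-square formula, sum of (observed - expected)^2 / expected, using the same counts as above; scipy.stats.chi2.sf() supplies the upper-tail probability:

```python
from scipy.stats import chi2

counts = [150, 80, 100, 70]
expected = [sum(counts) / len(counts)] * len(counts)

# Pearson statistic: sum over categories of (observed - expected)^2 / expected
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(counts, expected))

# Degrees of freedom: k - 1 - ddof, with k = 4 categories and ddof = 0
dof = len(counts) - 1

# p-value: probability that a chi-squared variable with dof degrees
# of freedom exceeds the observed statistic
p_value = chi2.sf(chi2_stat, dof)

print(f"Chi-square statistic: {chi2_stat}")
print(f"P-value: {p_value}")
```

The per-category contributions are 25, 4, 0, and 9, which sum to the statistic of 38.0 reported by chisquare().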