Regression
Regression is a mathematical process used to model data by identifying a function that best represents its patterns. In machine learning, regression functions are used for predictive analysis.
There are various regression techniques and the choice depends on factors such as data distribution. A simple form is linear regression, represented by the equation:
y = a * x + b
Visualizing this equation as a straight line on a 2D graph:
- y: The dependent (outcome) variable, plotted on the y-axis (vertical).
- x: The independent (predictor) variable, plotted on the x-axis (horizontal).
- b: The intercept, representing the value of y when x = 0.
- a: The slope, indicating how y changes when x increases by one unit.
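The roles of the slope and intercept can be checked with a few lines of Python. This is a minimal sketch: the function name `predict` and the coefficient values are hypothetical, chosen only for illustration.

```python
def predict(x, a, b):
    """Return y for a straight line with slope a and intercept b."""
    return a * x + b

# Hypothetical coefficients, chosen only for illustration
a = 2  # slope
b = 1  # intercept

# At x = 0 the line returns the intercept b
y_at_zero = predict(0, a, b)

# Increasing x by one unit changes y by the slope a
change_per_unit = predict(4, a, b) - predict(3, a, b)

print(y_at_zero, change_per_unit)
```

With these values, the first number printed is the intercept and the second is the slope.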
Example
The following code fits a linear model that predicts a person's weight from their height:
```python
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Sample data
heights = [150, 152, 160, 172, 176, 176, 180, 189]
weights = [50, 65, 65, 70, 80, 90, 90, 89]

# Create a DataFrame
measurements = pd.DataFrame({'height': heights, 'weight': weights})

# Fit the linear regression model
model = sm.OLS.from_formula("weight ~ height", data=measurements)
results = model.fit()

# Print the summary of the model
print(results.summary())

# Plot the data and the regression line
plt.scatter(measurements['height'], measurements['weight'], label='Data')
plt.plot(measurements['height'], results.predict(measurements), color='red', label='Regression Line')
plt.xlabel('Height (cm)')
plt.ylabel('Weight (kg)')
plt.title('Height vs Weight with Regression Line')
plt.legend()

# Save the plot as an image file
plt.savefig('height-vs-weight-plot.png')

# Show the plot
plt.show()
```
This code performs linear regression using statsmodels to analyze the relationship between height and weight. It fits a model of the form weight = a * height + b, prints the regression summary, and visualizes the data with a scatter plot and a best-fit line.
Running the code prints the regression summary to the console, saves the plot as height-vs-weight-plot.png, and displays it.
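To see where the fitted a and b come from, the slope and intercept can also be computed by hand with the closed-form least-squares formulas, using only the standard library. This is a rough sketch on the same sample data. The helper name `fit_line` and the example height of 170 cm are assumptions made for illustration, and the results should agree, up to rounding, with the height and Intercept coefficients reported in the statsmodels summary.

```python
# Same sample data as in the example above
heights = [150, 152, 160, 172, 176, 176, 180, 189]
weights = [50, 65, 65, 70, 80, 90, 90, 89]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of co-deviations of x and y divided by the sum of squared deviations of x
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # Intercept: chosen so the line passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

a, b = fit_line(heights, weights)

# Predict the weight for a hypothetical height of 170 cm
predicted = a * 170 + b

print(f"slope a = {a:.3f}, intercept b = {b:.3f}")
print(f"predicted weight at 170 cm = {predicted:.1f} kg")
```

For this data the slope comes out at roughly 0.96, meaning each additional centimeter of height is associated with about 0.96 kg of additional weight.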