Linear Regression Analysis
In sklearn, linear regression analysis is a machine learning technique used to predict a continuous dependent variable from one or more independent variables, assuming a linear relationship between them.
In simple linear regression, we predict the dependent variable Y using a single independent variable X, fitting the data to a straight line, often called the regression line. The equation for the line is:
$$ Y = \beta_0 + \beta_1 X + \epsilon $$
- Y: The dependent variable that is to be predicted.
- beta_0: The intercept, i.e. the predicted value of Y when X is 0.
- beta_1: The coefficient (slope) that measures the relationship between X and Y, i.e. the change in Y for a one-unit change in X.
- X: The independent variable that is used to predict Y.
- epsilon: The error term, i.e. the difference between the observed and predicted values of Y.
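To make the roles of beta_0, beta_1, and epsilon concrete, here is a minimal NumPy sketch that estimates the intercept and slope with the ordinary least squares formulas for a single independent variable, then computes the residuals. The sample data is made up for illustration and is not part of the original example.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Ordinary least squares estimates for a single independent variable
x_mean, y_mean = X.mean(), y.mean()
beta_1 = np.sum((X - x_mean) * (y - y_mean)) / np.sum((X - x_mean) ** 2)
beta_0 = y_mean - beta_1 * x_mean

# Predicted values and residuals (the sample estimate of epsilon)
y_pred = beta_0 + beta_1 * X
residuals = y - y_pred

print(f"beta_0 (intercept): {beta_0:.3f}")
print(f"beta_1 (slope): {beta_1:.3f}")
print(f"Residuals: {np.round(residuals, 3)}")
```

Because the fit includes an intercept, the residuals sum to approximately zero, which is a quick sanity check on the estimates.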
Syntax
Here’s the basic syntax for implementing linear regression analysis in Python:
from sklearn.linear_model import LinearRegression
# Create the model
model = LinearRegression()
# Fit the model
model.fit(X, y)
# Predict the dependent variable
predictions = model.predict(X) # X is the input for which predictions are to be made
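The snippet above assumes that X and y already exist. As a minimal self-contained sketch, the same calls can be applied to more than one independent variable and to inputs the model has not seen. The data below is hypothetical and constructed without noise, so the fitted values should recover the numbers used to build it, up to floating-point error.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: two independent variables per row (n_samples x n_features)
X = np.array([[1, 1], [2, 1], [3, 2], [4, 3], [5, 5]])
# Built for illustration as y = 1 + 2*x1 + 3*x2, with no noise
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]

# Create and fit the model
model = LinearRegression()
model.fit(X, y)

# Predict for an input the model has not seen; it must also be a 2D array
X_new = np.array([[6, 4]])
prediction = model.predict(X_new)

print(f"Intercept: {model.intercept_}")
print(f"Coefficients: {model.coef_}")
print(f"Prediction for {X_new[0]}: {prediction[0]}")
```

Note that `fit()` and `predict()` both expect X with one row per sample and one column per independent variable, which is why a single-feature array needs `reshape(-1, 1)`.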
Example
In this example, the dependent variable y is directly proportional to the independent variable X, showing a basic linear regression:
# Import required libraries
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
# Sample data (Simple Linear Regression)
# X: Independent variable, y: Dependent variable
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1) # Reshaping to 2D array for sklearn
y = np.array([1, 2, 3, 4, 5])
# Create the Linear Regression model
model = LinearRegression()
# Fit the model
model.fit(X, y)
# Make predictions
predictions = model.predict(X)
# Output the results
print(f"Intercept (β₀): {model.intercept_}")
print(f"Coefficient (β₁): {model.coef_[0]}")
print(f"Predictions: {predictions}")
# Plot the data and regression line
plt.scatter(X, y, color='red', label='Data Points')
plt.plot(X, predictions, color='green', label='Regression Line')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Simple Linear Regression')
plt.legend()
plt.show()
The code outputs the following result:
Intercept (β₀): 0.0
Coefficient (β₁): 1.0
Predictions: [1. 2. 3. 4. 5.]
![A scatter plot showing the data points (red dots) and the corresponding regression line (green) representing the simple linear regression model.](https://raw.githubusercontent.com/Codecademy/docs/main/media/linear-regressin-analysis.png)
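The data in the example above lies exactly on a line, so the fit is perfect. As a rough sketch of the more typical case, the snippet below fits the same kind of model to synthetic noisy data. The true intercept (1.0), true slope (2.0), noise level, and random seed are all assumptions made for illustration. `model.score()` returns the coefficient of determination (R²) for the given data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y = 1 + 2*x plus Gaussian noise (all values are illustrative)
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = 1.0 + 2.0 * X.ravel() + rng.normal(scale=0.5, size=50)

# Fit the model; fit() returns the fitted estimator
model = LinearRegression().fit(X, y)

# R^2 on the training data: closer to 1 means the line explains more variance
r_squared = model.score(X, y)

print(f"Intercept: {model.intercept_:.3f}")
print(f"Coefficient: {model.coef_[0]:.3f}")
print(f"R^2: {r_squared:.3f}")
```

With noise present, the estimated intercept and coefficient should land near, but not exactly on, the values used to generate the data.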