Linear Regression Analysis

Anonymous contributor's avatar
Anonymous contributor
Published Dec 5, 2024Updated Feb 3, 2025
Contribute to Docs

In sklearn, Linear Regression Analysis is a machine learning technique used to predict a dependent variable based on one or more independent variables, assuming a linear relationship.

In simple linear regression, we predict the dependent variable Y using a single independent variable X, fitting the data to a straight line, often called as the regression line. The equation for the line is:

$$ Y = \beta_0 + \beta_1 X + \epsilon $$

  • Y: The dependent variable to be predicted.
  • β₀: The intercept represents Y’s predicted value when X has no effect.
  • β₁: The coefficient measures the relationship between X and Y.
  • X: The independent variable used to predict Y.
  • ε: The error term representing the difference between the observed and predicted values of Y.

Syntax

Here’s the basic syntax for implementing linear regression analysis in Python:

from sklearn.linear_model import LinearRegression

# Create the model
model = LinearRegression()

# Fit the model
model.fit(X, y)

# Predict the dependent variable
predictions = model.predict(X)  # X is the input for which predictions are to be made

Example

In this example, the dependent variable yis directly proportional to the independent variable X, showing a basic linear regression:

# Import required libraries
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
# Sample data (Simple Linear Regression)
# X: Independent variable, y: Dependent variable
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1) # Reshaping to 2D array for sklearn
y = np.array([1, 2, 3, 4, 5])
# Create the Linear Regression model
model = LinearRegression()
# Fit the model
model.fit(X, y)
# Make predictions
predictions = model.predict(X)
# Output the results
print(f"Intercept (β₀): {model.intercept_}")
print(f"Coefficient (β₁): {model.coef_[0]}")
print(f"Predictions: {predictions}")
# Plot the data and regression line
plt.scatter(X, y, color='red', label='Data Points')
plt.plot(X, predictions, color='green', label='Regression Line')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Simple Linear Regression')
plt.legend()
plt.show()

The code outputs the following result:

Intercept (β₀): 0.0
Coefficient (β₁): 1.0
Predictions: [1. 2. 3. 4. 5.]

A scatter plot showing the data points (red dots) and the corresponding regression line (green) representing the simple linear regression model.

Codebyte Example

Code
Output
Loading...

All contributors

Contribute to Docs

Learn Python:Sklearn on Codecademy