Now that we have the training set and the test set, let’s use scikit-learn to build the linear regression model!
The steps for multiple linear regression in scikit-learn are identical to the steps for simple linear regression. Just as with simple linear regression, we need to import LinearRegression from the linear_model module:
from sklearn.linear_model import LinearRegression
Then, create a LinearRegression model and fit it to your x_train and y_train data:
mlr = LinearRegression()
mlr.fit(x_train, y_train) # finds the coefficients and the intercept value
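To make the comment on `.fit()` concrete, here is a minimal sketch on synthetic data. The arrays and names below (`x_demo`, `y_demo`, `demo_model`) are made up for illustration and are not the apartment dataset. After fitting, scikit-learn stores the learned values in the model's `coef_` and `intercept_` attributes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data generated from y = 3*x1 + 2*x2 + 5 (no noise),
# so we know which coefficients the model should recover.
x_demo = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)
y_demo = 3 * x_demo[:, 0] + 2 * x_demo[:, 1] + 5

demo_model = LinearRegression()
demo_model.fit(x_demo, y_demo)

coefficients = demo_model.coef_     # one coefficient per feature column
intercept = demo_model.intercept_   # the constant term
print(coefficients, intercept)
```

Because the synthetic data has no noise, the fitted coefficients should match the ones used to generate it, up to floating-point error. Real data will not fit this cleanly.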
We can also use the .predict() method to pass in x-values. It returns the y-values that the fitted plane would predict:
# takes the values calculated by `.fit()` and the `x` values, plugs them into the multiple linear regression equation, and calculates the predicted y-values
y_predict = mlr.predict(x_test)
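As a rough sketch of what that comment describes: for a single row, the prediction is the intercept plus each coefficient multiplied by its feature value. The helper `predict_one` and all the numbers below are hypothetical placeholders, not values fitted from the apartment data.

```python
def predict_one(intercept, coefficients, features):
    """Return intercept + sum(coefficient * feature) for one row."""
    return intercept + sum(m * x for m, x in zip(coefficients, features))

# Hypothetical values, for illustration only.
b = 100.0          # intercept
m = [2.0, 50.0]    # one coefficient per feature
row = [600.0, 3.0] # one feature value per coefficient

prediction = predict_one(b, m, row)  # 100 + 2*600 + 50*3
print(prediction)
```

A fitted scikit-learn model applies the same arithmetic to every row of the input at once.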
We will start by using two of these columns to teach you how to predict the values of the dependent variable, prices.
Instructions
Import LinearRegression from scikit-learn’s linear_model module.
Create a Linear Regression model and call it mlr.
Fit the model using x_train and y_train.
Use the model to predict y-values from x_test. Store the predictions in a variable called y_predict.
Now we have x_test, x_train, y_test, y_train, and y_predict!
To see this model in action, let’s test it on Sonny’s apartment in Greenpoint, Brooklyn!
Or, if you live in New York, plug in your own apartment’s values and see if you are overpaying or underpaying!
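When plugging in a single apartment, note that `.predict()` expects a two-dimensional input with one row per apartment, even if there is only one. The sketch below assumes a toy model with two hypothetical features (bedrooms and size in square feet) and made-up rents. It is not the lesson's dataset and says nothing about real New York prices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy training data: columns are hypothetical features
# (bedrooms, size in square feet); the rents are made up.
x_toy = np.array([[1, 500], [2, 700], [2, 800], [3, 1000], [1, 450]], dtype=float)
y_toy = np.array([2000, 2900, 3100, 4000, 1900], dtype=float)

toy_model = LinearRegression().fit(x_toy, y_toy)

# One apartment is one row, so wrap the feature list in another list.
one_apartment = [[2, 750]]
predicted_rent = toy_model.predict(one_apartment)

print(predicted_rent.shape)  # one prediction per input row
```

The feature values in the row must appear in the same order as the columns used for training.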