Now that we have the training set and the test set, let’s use scikit-learn to build the linear regression model!

The steps for multiple linear regression in scikit-learn are identical to the steps for simple linear regression. Just like simple linear regression, we need to import LinearRegression from the linear_model module:

from sklearn.linear_model import LinearRegression

Then, create a LinearRegression model and fit it to your x_train and y_train data:

mlr = LinearRegression()
mlr.fit(x_train, y_train)  # finds the coefficients and the intercept value
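To make the fitting step concrete, here is a minimal sketch on made-up data. The feature meanings (size and bedrooms) and the numbers are assumptions for illustration only, not the lesson's dataset. Because the targets are generated from a known equation, the fitted coefficients and intercept should come out very close to the values used to build them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: each row is [size_sqft, bedrooms] (not the lesson's dataset)
x_train = np.array([
    [500.0, 1.0],
    [750.0, 2.0],
    [900.0, 2.0],
    [1200.0, 3.0],
    [650.0, 1.0],
])

# Targets built exactly from: rent = 1000 + 2 * size_sqft + 300 * bedrooms
y_train = 1000.0 + 2.0 * x_train[:, 0] + 300.0 * x_train[:, 1]

mlr = LinearRegression()
mlr.fit(x_train, y_train)

# After fitting, the learned parameters are stored on the model
coefficients = mlr.coef_      # one coefficient per feature column
intercept = mlr.intercept_    # the constant term

print(coefficients, intercept)
```

With real data the relationship is noisy, so the coefficients would only approximate the underlying trend.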

We can also use the .predict() method to pass in x-values. It returns the y-values that the fitted model would predict for them:

# .predict() plugs the x-values into the multiple linear regression equation, using the coefficients and intercept found by .fit(), and returns the predicted y-values
y_predict = mlr.predict(x_test)
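As a rough check of what the prediction step does, the sketch below compares the model's predictions with the regression equation computed by hand from the fitted coefficients and intercept. The data values are arbitrary placeholders chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Arbitrary placeholder data with two features per row
x_train = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]])
y_train = np.array([8.0, 7.0, 18.0, 17.0, 28.0])
x_test = np.array([[6.0, 5.0], [0.0, 1.0]])

mlr = LinearRegression()
mlr.fit(x_train, y_train)

# Predictions from the fitted model, one per row of x_test
y_predict = mlr.predict(x_test)

# The same calculation by hand: y = b + m1 * x1 + m2 * x2
manual = x_test @ mlr.coef_ + mlr.intercept_

print(y_predict)
print(manual)
```

The two printed arrays should agree up to floating-point rounding.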

We will start by using two of these columns to teach you how to predict the values of the dependent variable, prices.



Import LinearRegression from scikit-learn’s linear_model module.


Create a Linear Regression model and call it mlr.

Fit the model using x_train and y_train.


Use the model to predict y-values from x_test. Store the predictions in a variable called y_predict.

Now we have:

  • x_test
  • x_train
  • y_test
  • y_train
  • and y_predict!

To see this model in action, let’s test it on Sonny’s apartment in Greenpoint, Brooklyn!

Or, if you live in New York, plug in your own apartment’s values and see whether you are overpaying or underpaying!
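Predicting for a single apartment could look like the sketch below. The two features, the training rows, and the apartment's values are all hypothetical stand-ins, since the real columns come from the lesson's dataset. The one detail worth noting is that scikit-learn expects a 2D input, so a single apartment is wrapped in an outer list.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training rows: [bedrooms, size_sqft] with made-up monthly rents
x_train = np.array([[1, 500], [2, 800], [2, 950], [3, 1200], [1, 620]], dtype=float)
y_train = np.array([2100, 3000, 3300, 4200, 2400], dtype=float)

mlr = LinearRegression()
mlr.fit(x_train, y_train)

# One apartment's features, nested in an outer list so the shape is (1, 2)
my_apartment = [[1, 620]]

# The result is an array with one predicted rent
predicted_rent = mlr.predict(my_apartment)

print("Predicted rent: $%.2f" % predicted_rent[0])
```

Comparing the printed value with the actual rent gives a rough sense of whether the apartment is priced above or below what the model expects.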
