We have seen how to implement a linear regression algorithm in Python, and how to use the linear regression model from scikit-learn. We learned:

  • We can measure how well a line fits by measuring loss.
  • The goal of linear regression is to minimize loss.
  • To find the line of best fit, we try to find the b value (intercept) and the m value (slope) that minimize loss.
  • Convergence refers to when the parameters stop changing with each iteration.
  • Learning rate refers to how much the parameters are changed on each iteration.
  • We can use Scikit-learn’s LinearRegression() model to perform linear regression on a set of points.

These are important tools to have in your toolkit as you continue your exploration of data science.


Find another dataset, maybe in scikit-learn’s example datasets. Or on Kaggle, a great resource for tons of interesting data.

Try to perform linear regression on your own! If you find any cool linear correlations, make sure to share them!

As a starter, we’ve loaded in the Boston housing dataset. We made the X values the nitrogen oxides concentration (parts per 10 million), and the y values the housing prices. See if you can perform regression on these houses!

Sign up to start coding

Mini Info Outline Icon
By signing up for Codecademy, you agree to Codecademy's Terms of Service & Privacy Policy.

Or sign up using:

Already have an account?