Learn
We have seen how to implement a linear regression algorithm in Python, and how to use the linear regression model from scikit-learn. We learned:
- We can measure how well a line fits by measuring loss.
- The goal of linear regression is to minimize loss.
- To find the line of best fit, we try to find the
b
value (intercept) and them
value (slope) that minimize loss. - Convergence refers to when the parameters stop changing with each iteration.
- Learning rate refers to how much the parameters are changed on each iteration.
- We can use Scikit-learn’s
LinearRegression()
model to perform linear regression on a set of points.
These are important tools to have in your toolkit as you continue your exploration of data science.
Instructions
Find another dataset, maybe in scikit-learn’s example datasets. Or on Kaggle, a great resource for tons of interesting data.
Try to perform linear regression on your own! If you find any cool linear correlations, make sure to share them!
As a starter, we’ve loaded in the Boston housing dataset. We made the X
values the nitrogen oxides concentration (parts per 10 million), and the y
values the housing prices. See if you can perform regression on these houses!
Sign up to start coding
By signing up for Codecademy, you agree to Codecademy's Terms of Service & Privacy Policy.