We saw that predicted outcomes from a linear regression model range from negative to positive infinity. These predictions don’t make sense for a classification problem, where we want a probability between 0 and 1. Enter logistic regression!

To build a logistic regression model, we apply a logit link function to the left-hand side of our linear regression function. Remember the equation for a linear model looks like this:

$y = b_{0} + b_{1}x_{1} + b_{2}x_{2} +\cdots + b_{n}x_{n}$

When we apply the logit function, we get the following:

$\ln\left(\frac{y}{1-y}\right) = b_{0} + b_{1}x_{1} + b_{2}x_{2} +\cdots + b_{n}x_{n}$
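
To see why this helps, we can solve the equation for $y$. The symbol $z$ below is shorthand introduced here only for convenience; it stands for the right-hand side, $z = b_{0} + b_{1}x_{1} + b_{2}x_{2} +\cdots + b_{n}x_{n}$. Exponentiating both sides gives:

$\frac{y}{1-y} = e^{z}$

Solving for $y$:

$y = \frac{e^{z}}{1+e^{z}} = \frac{1}{1+e^{-z}}$

Because $e^{-z}$ is always positive, $y$ is always strictly between 0 and 1, no matter how large or small $z$ becomes.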

For the Codecademy University example, this means that we are fitting an S-shaped curve to our data instead of a line, as in linear regression. Notice that the fitted curve (the red line in the plot) stays between 0 and 1 on the y-axis. It now makes sense to interpret this value as a probability of group membership, whereas that would have been nonsensical for regular linear regression.
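
To make the bounded curve concrete, here is a minimal Python sketch. It is not the lesson's provided code: the helper names `sigmoid` and `logit` and the coefficients `b0` and `b1` are hypothetical, chosen only for illustration and not fitted to the Codecademy University data. It shows that the linear part can take any value while the resulting probability stays between 0 and 1.

```python
import math

def sigmoid(z):
    """Inverse of the logit: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log-odds of a probability p, where 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients for illustration only
# (NOT fitted to the Codecademy University data).
b0, b1 = -4.0, 0.8

for hours in [0, 3, 10, 20]:
    linear_part = b0 + b1 * hours  # unbounded, like a linear regression output
    prob = sigmoid(linear_part)    # always strictly between 0 and 1
    print(f"hours={hours:2d}  linear={linear_part:6.2f}  probability={prob:.3f}")
```

Applying `logit` to one of these probabilities recovers the linear part, which is exactly the relationship in the equation above.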

Note that this is a pretty nifty trick for adapting a linear regression model to solve classification problems! There are actually many other kinds of link functions that we can use for different adaptations.

### Instructions

1. We’ve provided the code to build a logistic regression model on the Codecademy University data and plot the fitted curve. Take a look at the plot. Expand the plot to fullscreen for a larger view.

Using this curve, estimate the probability that a student who studied for five hours will pass the exam. Save the result as five_hour_studier and press “Run”.