
So far, we’ve learned that the equation for a logistic regression model looks like this:

$ln(\frac{p}{1-p}) = b_{0} + b_{1}x_{1} + b_{2}x_{2} +\cdots + b_{n}x_{n}$

Note that we’ve replaced y with the letter p because we are going to interpret it as a probability (e.g., the probability of a student passing the exam). The whole left-hand side of this equation is called the log-odds because it is the natural logarithm (ln) of the odds (p/(1-p)). The right-hand side of this equation looks exactly like regular linear regression!

In order to understand how this link function works, let’s dig into the interpretation of log-odds a little more. The odds of an event occurring are:

$Odds = \frac{p}{1-p} = \frac{P(event\ occurring)}{P(event\ not\ occurring)}$

For example, suppose that the probability a student passes an exam is 0.7. That means the probability of failing is 1 - 0.7 = 0.3. Thus, the odds of passing are:

$Odds\ of\ passing = \frac{0.7}{0.3} = 2.\overline{3}$

This means that a student is about 2.33 times as likely to pass as to fail.
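
The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `odds` and the variable names `p_pass` and `odds_of_passing` are made up for illustration and are not part of the lesson.

```python
def odds(p):
    """Return the odds p / (1 - p) for a probability with 0 <= p < 1."""
    if not 0 <= p < 1:
        # p = 1 would divide by zero, and values outside [0, 1) are not valid here
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


p_pass = 0.7                    # probability of passing, from the example above
odds_of_passing = odds(p_pass)  # 0.7 / 0.3

print(round(odds_of_passing, 2))  # 2.33
```

A probability of exactly 0.5 gives odds of 1, the point at which the event is equally likely to happen or not.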

For any probability strictly between 0 and 1, the odds are a positive number. When we take the natural log of the odds (the log odds), we transform that positive value into a number between negative and positive infinity, which is exactly what we need! The logit function (log odds) transforms a probability (a number between 0 and 1) into a continuous value that can be positive or negative.
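
To see this transformation in action, the sketch below computes the odds and log odds for a handful of probabilities. The probability values and variable names are chosen purely for illustration, and it uses `math.log` from the standard library, which computes the natural log of a single number.

```python
import math

# Illustrative probabilities: close to 0, below 0.5, exactly 0.5, above 0.5, close to 1
probabilities = [0.001, 0.2, 0.5, 0.7, 0.999]

# Odds are always positive for probabilities strictly between 0 and 1
odds_values = [p / (1 - p) for p in probabilities]

# Log odds can be negative, zero, or positive
log_odds_values = [math.log(o) for o in odds_values]

for p, o, lo in zip(probabilities, odds_values, log_odds_values):
    print(f"p = {p:<5}  odds = {o:10.3f}  log odds = {lo:+.3f}")
```

Probabilities below 0.5 produce odds below 1 and negative log odds, a probability of 0.5 produces log odds of 0, and probabilities above 0.5 produce positive log odds. The closer the probability gets to 0 or 1, the larger the log odds become in magnitude.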

### Instructions

1.

Suppose that there is a 40% probability of rain today (p = 0.4). Calculate the odds of rain and save it as odds_of_rain. Note that the odds are less than 1 because the probability of rain is less than 0.5.

Feel free to print odds_of_rain to see the results.

2.

Use the odds that you calculated above to calculate the log odds of rain and save it as log_odds_of_rain. You can calculate the natural log of a value using the numpy.log() function. Note that the log odds are negative because the probability of rain was less than 0.5.

Feel free to print log_odds_of_rain to see the results.

3.

Suppose that there is a 90% probability that my train to work arrives on time. Calculate the odds of my train being on time and save it as odds_on_time. Note that the odds are greater than 1 because the probability is greater than 0.5.

Feel free to print odds_on_time to see the results.

4.

Use the odds that you calculated above to calculate the log odds of an on-time train and save it as log_odds_on_time. Note that the log odds are positive because the probability of an on-time train was greater than 0.5.

Feel free to print log_odds_on_time to see the results.