Let’s return to the logistic regression equation and demonstrate how this works by fitting a model in sklearn. The equation is:

`$ln(\frac{p}{1-p}) = b_{0} + b_{1}x_{1} + b_{2}x_{2} +\cdots + b_{n}x_{n}$`

Suppose that we want to fit a model that predicts whether a visitor to a website will make a purchase. We’ll use the number of minutes they spent on the site as a predictor. The following code fits the model:

```
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(min_on_site, purchase)
```
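For context, here is a minimal runnable sketch of the same fit with made-up data (the variable names `min_on_site` and `purchase` come from the lesson, but the values below are purely illustrative). Note that sklearn expects the predictor to be a 2D array of shape `(n_samples, n_features)`:

```
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative data (not the lesson's actual dataset)
min_on_site = np.array([[1.5], [3.0], [4.2], [5.3], [6.1]])  # 2D: one feature column
purchase = np.array([0, 0, 1, 1, 1])                          # 1 = made a purchase

model = LogisticRegression()
model.fit(min_on_site, purchase)
print(model.intercept_, model.coef_)
```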

Next, just as with linear regression, we can use the right-hand side of our regression equation to make predictions for each of our original datapoints as follows:

```
log_odds = model.intercept_ + model.coef_ * min_on_site
print(log_odds)
```

Output:

```
[[-3.28394203]
 [-1.46465328]
 [-0.02039445]
 [ 1.22317391]
 [ 2.18476234]]
```

Notice that these predictions range from negative to positive infinity: these are log odds. In other words, for the first datapoint, we have:

`$ln(\frac{p}{1-p}) = -3.28394203$`

We can turn log odds into a probability as follows:

```
$\begin{aligned}
ln(\frac{p}{1-p}) = -3.28 \\
\frac{p}{1-p} = e^{-3.28} \\
p = e^{-3.28} (1-p) \\
p = e^{-3.28} - e^{-3.28}*p \\
p + e^{-3.28}*p = e^{-3.28} \\
p * (1 + e^{-3.28}) = e^{-3.28} \\
p = \frac{e^{-3.28}}{1 + e^{-3.28}} \\
p = 0.04
\end{aligned}$
```
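We can check this arithmetic directly in Python (a quick verification, assuming NumPy is available):

```
import numpy as np

# Log odds for the first datapoint, taken from the model output above
log_odds_first = -3.28394203

# Convert log odds to a probability: p = e^x / (1 + e^x)
p = np.exp(log_odds_first) / (1 + np.exp(log_odds_first))
print(round(p, 2))  # 0.04
```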

In Python, we can do this simultaneously for all of the datapoints using NumPy (loaded as `np`):

```
np.exp(log_odds) / (1 + np.exp(log_odds))
```

Output:

```
array([[0.0361262 ],
       [0.18775665],
       [0.49490156],
       [0.77262162],
       [0.89887279]])
```

The calculation that we just did used something called the *sigmoid function*, which is the inverse of the logit function. The sigmoid function produces the S-shaped curve we saw previously.
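As a sketch, the sigmoid can be written as a standalone function applied to all of the log odds at once (SciPy also provides this function as `scipy.special.expit`):

```
import numpy as np

def sigmoid(x):
    # Inverse of the logit function: maps log odds to probabilities in (0, 1)
    return np.exp(x) / (1 + np.exp(x))

# Log odds from the model output above
log_odds = np.array([-3.28394203, -1.46465328, -0.02039445, 1.22317391, 2.18476234])
print(sigmoid(log_odds))
```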

### Instructions

**1.**

In the workspace, we’ve fit a logistic regression on the Codecademy University data and saved the intercept and coefficient on `hours_studied` as `intercept` and `coef`, respectively.

For each student in the dataset, use the intercept and coefficient to calculate the log odds of passing the exam. Save the result as `log_odds`.

**2.**

Now, convert the predicted log odds for each student into a predicted probability of passing the exam. Save the predicted probabilities as `pred_probability_passing`.