Building on the last exercise, let’s run a model predicting `happy` from `stress` and `freetime` with an interaction term for the predictors.

```python
import statsmodels.api as sm

modelQ = sm.OLS.from_formula('happy ~ stress + freetime + stress:freetime', data=happiness).fit()
print(modelQ.params)
# Output:
# Intercept          7.731785
# stress            -0.551098
# freetime           0.187882
# stress:freetime    0.040401
```

We form the regression equation from the coefficients just as we did for an interaction term with a binary variable.

`$\text{happy} = 7.73 - 0.55*\text{stress} + 0.19*\textbf{freetime} + 0.04*\text{stress}*\textbf{freetime}$`

We can write a new equation for participants with differing amounts of daily free time.

For participants with **0** hours of free time, the equation is:

```
$\begin{aligned}
\text{happy} = 7.73 - 0.55*\text{stress} + 0.19*\bm{0} + 0.04*\text{stress}*\bm{0} \\
\text{happy} = 7.73 - 0.55*\text{stress} + 0 + 0 \\
\text{happy} = 7.73 - 0.55*\text{stress} \\
\end{aligned}$
```

For participants with **1** hour of free time, the equation is:

```
$\begin{aligned}
\text{happy} = 7.73 - 0.55*\text{stress} + 0.19*\bm{1} + 0.04*\text{stress}*\bm{1} \\
\text{happy} = 7.73 - 0.55*\text{stress} + 0.19 + 0.04*\text{stress} \\
\text{happy} = (7.73+0.19) + (- 0.55 + 0.04)*\text{stress} \\
\end{aligned}$
```

When we simplify and combine terms, we see the intercept increases by 0.19 and the slope increases by 0.04 compared to the participants with 0 hours of free time. An additional 0.19 and 0.04 get added to the intercept and slope, respectively, when we increase `freetime` to **2** hours:

```
$\begin{aligned}
\text{happy} = 7.73 - 0.55*\text{stress} + 0.19*\bm{2} + 0.04*\text{stress}*\bm{2} \\
\text{happy} = 7.73 - 0.55*\text{stress} + 0.19*2 + 0.04*\text{stress}*2 \\
\text{happy} = (7.73+0.19*2) + (- 0.55 + 0.04*2)*\text{stress} \\
\end{aligned}$
```
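
The three cases follow one pattern, which can be sketched in general form. The symbols here are assumptions introduced for illustration and do not appear in the lesson: $b_0$, $b_1$, $b_2$, and $b_3$ stand for the intercept and the coefficients on `stress`, `freetime`, and `stress:freetime` (roughly 7.73, -0.55, 0.19, and 0.04), and $f$ is a fixed number of hours of free time.

```
$\begin{aligned}
\text{happy} &= b_0 + b_1*\text{stress} + b_2*f + b_3*\text{stress}*f \\
\text{happy} &= (b_0 + b_2*f) + (b_1 + b_3*f)*\text{stress}
\end{aligned}$
```

Setting $f$ to 0, 1, or 2 reproduces the simplified equations for each group of participants.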

The 0.19 that gets added to the intercept with each increase in `freetime` is the coefficient on `freetime` from our regression. The 0.04 that gets added to the slope of `stress` with each increase in `freetime` is the coefficient on `stress:freetime` from our regression.
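
To check this arithmetic, here is a minimal sketch that computes the intercept and the slope on `stress` for any fixed value of `freetime`. It assumes the coefficient values printed from `modelQ.params` earlier. The `params` dictionary and the `stress_line` helper are hypothetical names introduced for illustration, not part of the lesson or of statsmodels.

```python
# Coefficients copied by hand from the printed modelQ.params output
# (assumption: these four values are the full set of model parameters).
params = {
    "Intercept": 7.731785,
    "stress": -0.551098,
    "freetime": 0.187882,
    "stress:freetime": 0.040401,
}


def stress_line(freetime_hours, params):
    """Return (intercept, slope) of happy vs. stress at a fixed freetime value.

    Hypothetical helper: it plugs a freetime value into the regression
    equation and groups the terms that do and do not multiply stress.
    """
    intercept = params["Intercept"] + params["freetime"] * freetime_hours
    slope = params["stress"] + params["stress:freetime"] * freetime_hours
    return intercept, slope


for hours in (0, 1, 2):
    intercept, slope = stress_line(hours, params)
    print(f"freetime={hours}: happy = {intercept:.2f} + ({slope:.2f})*stress")
```

Each extra hour of free time shifts the intercept by the `freetime` coefficient and the slope by the `stress:freetime` coefficient, which is the pattern described above.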

### Instructions

**1.**

The `plants` dataset has been loaded for you in **script.py**. Fit a model predicting `growth` with `water`, `fertilizer`, and an interaction between them as predictors, and save this model as `model3`.

**2.**

Print the intercept and coefficients from `model3`. Are they what you expected?

**3.**

Save the coefficient that represents the change in the slope on `water` for each one-unit increase in `fertilizer` as `slopeDiff`. What do we learn about the relationship between plant growth and the number of times watered as the amount of fertilizer increases?

**4.**

Using the model coefficients, write the regression equation for when `fertilizer = 3`. In **script.py**, save the intercept of the equation as `intercept3` and the coefficient on `water` as `slope3`. Do the same for when `fertilizer = 5`, and save the intercept as `intercept5` and the coefficient on `water` as `slope5`. Why are these slopes farther apart than 0.774034?