
In the last exercise, we ran a regression predicting happiness scores from stress scores and exercise participation without an interaction term. We got the following model coefficients:

# Output:
# Intercept    10.256296
# stress       -0.707925
# exercise     -0.894058

Using these coefficients, our regression equation is:

$\text{happy} = 10.3 - 0.7*\text{stress} - 0.9*\text{exercise}$
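As a quick worked example, assume exercise is coded as 0 (does not exercise) and 1 (exercises); the text above does not state the coding explicitly. Plugging each value into the equation gives one line per group:

$\text{exercise}=0:\quad \text{happy} = 10.3 - 0.7*\text{stress}$

$\text{exercise}=1:\quad \text{happy} = (10.3 - 0.9) - 0.7*\text{stress} = 9.4 - 0.7*\text{stress}$

Under that assumption, a model without an interaction term lets the two groups differ only in intercept. Both lines share the slope of -0.7, so they are parallel.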

Using the formula interface of the Python library statsmodels, we can add an interaction term to the model formula by including a third predictor that joins stress and exercise with a colon (stress:exercise). The code to run the updated model and print the coefficients is shown below.

import statsmodels.api as sm
model = sm.OLS.from_formula('happy ~ stress + exercise + stress:exercise', data=happiness).fit()
print(model.params)

# Output:
# Intercept          12.053583
# stress             -0.971225
# exercise           -3.135705
# stress:exercise     0.357365

When we add the interaction term, the coefficient table shows one new entry alongside the intercept, stress, and exercise: stress:exercise. The coefficient on stress:exercise is really a coefficient on a whole new variable formed by multiplying stress by exercise. Thus, our regression equation for this model looks like this:

$\text{happy} = 12.1 - 1.0*\text{stress} - 3.1*\text{exercise} + 0.4*\text{stress}*\text{exercise}$
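To make the equation above concrete, here is a minimal sketch that splits it into one line per group. It assumes exercise is coded 0/1 (not stated explicitly in the text), and the helper name `line_for` is made up for illustration. The coefficient values are copied from the model output above.

```python
# Coefficients copied from the printed model output above.
params = {
    "Intercept": 12.053583,
    "stress": -0.971225,
    "exercise": -3.135705,
    "stress:exercise": 0.357365,
}

def line_for(exercise):
    """Return (intercept, slope) of the happy-vs-stress line
    for a given exercise value (assumed to be 0 or 1)."""
    intercept = params["Intercept"] + params["exercise"] * exercise
    slope = params["stress"] + params["stress:exercise"] * exercise
    return intercept, slope

for group in (0, 1):
    intercept, slope = line_for(group)
    print(f"exercise={group}: happy = {intercept:.2f} + ({slope:.2f}) * stress")
```

With the interaction term, the two groups get different slopes as well as different intercepts: the interaction coefficient is the amount added to the stress slope when exercise goes from 0 to 1.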

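Note that our other coefficients changed with the additional predictor, and the exercise coefficient changed considerably (from about -0.9 to about -3.1). This is because the coefficients now mean something different. With the interaction in the model, the stress coefficient is the slope on stress when exercise equals 0, and the exercise coefficient is the difference associated with exercise when stress equals 0. The interaction coefficient captures how the stress slope shifts as exercise changes.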

### Instructions

1.

The plants dataset has been loaded for you in script.py and the regression predicting height from the predictors weight and species has been saved as model1. Run a regression predicting height with predictors weight, species, and an interaction term for these two predictors. Save the results of this model as model2.

2.

Print the coefficients from model1. What do we learn from the coefficients about each species’ regression line?

3.

Print the coefficients from model2. How did the coefficients and regression equation change when the interaction term was added?