By adding an interaction term for our binary predictor, we have made our model more complex and have therefore also added complexity to its interpretation.
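Returning to our multiple regression equation with an interaction term from the last exercise, we have:

$$\text{happiness} = 12.1 - 1.0\,\text{stress} - 3.1\,\text{exercise} + 0.4\,(\text{stress} \times \text{exercise})$$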
We can rewrite this equation for the group that doesn’t exercise regularly (exercise = 0) and for the one that does (exercise = 1).
When exercise = 0, the last two terms become zero and go away:

$$\text{happiness} = 12.1 - 1.0\,\text{stress}$$
When exercise = 1, the intercept goes down by 3.1 and the coefficient on stress increases by 0.4:

$$\text{happiness} = (12.1 - 3.1) + (-1.0 + 0.4)\,\text{stress} = 9.0 - 0.6\,\text{stress}$$
We can see the coefficient on exercise tells us the difference in INTERCEPTS between the two exercise groups. In this case, the intercept of the regression line for the group that exercises (9.0) is 3.1 units lower than that of the group that doesn’t exercise (12.1).
On the other hand, the coefficient on the interaction term tells us the difference in SLOPES between the two regression lines. The slope on stress for the group that exercises (-0.6) is 0.4 units greater than that of the group that doesn’t exercise (-1.0). This would lead us to conclude that stress lowers happiness less steeply for the group that exercises.
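As a quick check on the arithmetic above, here is a minimal Python sketch that derives each group's intercept and slope from the four coefficients quoted in the text. The constant names and the helpers `predict_happiness` and `group_line` are hypothetical, introduced only for illustration; they are not part of the lesson's code.

```python
# Coefficients read off the fitted equation in the text.
INTERCEPT = 12.1       # intercept for the group that doesn't exercise
B_STRESS = -1.0        # slope on stress for the group that doesn't exercise
B_EXERCISE = -3.1      # change in intercept when exercise = 1
B_INTERACTION = 0.4    # change in the stress slope when exercise = 1


def predict_happiness(stress, exercise):
    """Full model covering both groups (exercise is 0 or 1)."""
    return (
        INTERCEPT
        + B_STRESS * stress
        + B_EXERCISE * exercise
        + B_INTERACTION * stress * exercise
    )


def group_line(exercise):
    """Return (intercept, slope) of the simplified line for one group."""
    intercept = INTERCEPT + B_EXERCISE * exercise
    slope = B_STRESS + B_INTERACTION * exercise
    return intercept, slope


for group in (0, 1):
    intercept, slope = group_line(group)
    print(f"exercise = {group}: intercept = {intercept:.1f}, slope = {slope:.1f}")
```

The full model and the simplified per-group lines give the same prediction for any stress value, which is another way of seeing that the interaction term only changes the slope for the group with exercise = 1.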
The output of the regression predicting height from predictors weight and species with an interaction term from the last exercise is shown below.
# Output:
# Intercept              8.168619
# species[T.B]          -3.580515
# weight                 1.658621
# weight:species[T.B]    1.115071
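To see how these four numbers combine, the sketch below evaluates the full, unsimplified equation in Python. It assumes that `species[T.B]` acts as an indicator equal to 1 for species B and 0 for species A. The dictionary `coefs`, the function `predict_height`, and the weight of 2.0 are hypothetical and chosen only for illustration.

```python
# Coefficients copied from the regression output above.
coefs = {
    "Intercept": 8.168619,
    "species[T.B]": -3.580515,
    "weight": 1.658621,
    "weight:species[T.B]": 1.115071,
}


def predict_height(weight, species):
    """Evaluate the full (unsimplified) regression equation for one plant."""
    # Assumed indicator coding: 1 for species B, 0 for species A.
    is_b = 1 if species == "B" else 0
    return (
        coefs["Intercept"]
        + coefs["species[T.B]"] * is_b
        + coefs["weight"] * weight
        + coefs["weight:species[T.B]"] * weight * is_b
    )


# Predicted heights for an arbitrary weight of 2.0 in each species.
for species in ("A", "B"):
    print(species, round(predict_height(2.0, species), 3))
```

Setting the indicator to 0 or 1 by hand and collecting terms gives the simplified equation for each species, which is what the tasks below ask for.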
Write out and simplify the regression equation for species A. In script.py, save the value of the coefficient on weight as a variable named slopeA. What do we learn about the relationship between height and weight for species A?
Write out and simplify the regression equation for species B. In script.py, save the value of the coefficient on weight as a variable named slopeB. Why might plants of different species show different relationships between their heights and weights?