Now that you’ve seen that an interaction term for two quantitative variables creates a separate regression line for each value of one of the interacting variables, let’s visualize this in a plot. We start with code for a scatter plot of happy against stress, colored by freetime:
import seaborn as sns
import matplotlib.pyplot as plt

sns.lmplot(x='stress', y='happy', hue='freetime', palette='Purples', fit_reg=False, data=happiness)
Next, we’ll add lines to the plot for a few sample values of freetime: 0, 3, and 6 hours. Rather than write out each model coefficient, we can call them directly from modelQ, where we stored our regression results in the last exercise. Here is the code to add lines for 3 and 6 hours of free time:
# Assumes modelQ.params is ordered: intercept, stress, freetime, stress:freetime
plt.plot(happiness.stress, modelQ.params[0]+modelQ.params[1]*happiness.stress+modelQ.params[2]*3+modelQ.params[3]*happiness.stress*3, color='mediumpurple', linewidth=3)
plt.plot(happiness.stress, modelQ.params[0]+modelQ.params[1]*happiness.stress+modelQ.params[2]*6+modelQ.params[3]*happiness.stress*6, color='indigo', linewidth=3)

# Add legend and show plot
plt.legend(['0 hours','3 hours','6 hours'])
plt.show()
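The long expression inside each plt.plot() call is the same calculation repeated with a different value of freetime. As a rough sketch of that idea, the helper below computes the fitted values for any fixed number of hours. The name line_values, the stress range, and the first two coefficients are made up for illustration; only 0.19 and 0.04 come from this lesson.

```python
# Hypothetical coefficients, ordered like the terms in the plt.plot() expression:
# [intercept, stress, freetime, stress:freetime]. Only 0.19 and 0.04 come from
# the lesson text; the first two values are placeholders for illustration.
params = [5.0, -0.5, 0.19, 0.04]

def line_values(params, x_values, moderator):
    """Return fitted y-values along x_values with the moderator held fixed."""
    b0, b1, b2, b3 = params
    return [b0 + b1 * x + b2 * moderator + b3 * x * moderator for x in x_values]

# Two endpoints are enough to draw a straight line (the 0-10 range is assumed).
stress_range = [0, 10]
lines = {hours: line_values(params, stress_range, hours) for hours in (0, 3, 6)}

for hours, ys in lines.items():
    print(f"{hours} hours: y at stress={stress_range[0]} is {ys[0]:.2f}, "
          f"y at stress={stress_range[1]} is {ys[1]:.2f}")
```

Each pair of values could then be passed to plt.plot(stress_range, ys, ...) in place of the written-out expression, which keeps the arithmetic in one place when more lines are added.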
Just as we saw in the regression equations in the last exercise, the intercept increases by 0.19 and the slope increases by 0.04 for each additional hour of free time. In context, the relationship between stress and happiness appears to be negative. However, the slope flattens as free time increases, which might be interpreted as stress impacting happiness less negatively as people have more free time.
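One way to see why both the intercept and the slope shift is to regroup the model equation around stress. This is a sketch that assumes the model has the usual two-variable interaction form. The symbols b_0 through b_3 are notation introduced here, not taken from the lesson.

```latex
\begin{aligned}
\widehat{\text{happy}} &= b_0 + b_1\,\text{stress} + b_2\,\text{freetime} + b_3\,(\text{stress}\times\text{freetime}) \\
&= \underbrace{(b_0 + b_2\,\text{freetime})}_{\text{intercept}} + \underbrace{(b_1 + b_3\,\text{freetime})}_{\text{slope}}\,\text{stress}
\end{aligned}
```

With b_2 = 0.19 and b_3 = 0.04, each additional hour of free time adds 0.19 to the intercept and 0.04 to the slope. Because the slope on stress starts out negative, adding 0.04 per hour moves it toward zero, which is the flattening seen in the plot.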
This is NOT to say that the data shows that stress CAUSES unhappiness, or that free time CAUSES stress to be less impactful on happiness, just that the association between happiness and stress looks different for people with different amounts of free time.
In script.py, create a scatter plot of growth on the y-axis and water on the x-axis. Color the points by fertilizer. Note that you’ll need to uncomment plt.show() a few lines down in order to display the scatter plot.
Uncomment the rest of the plt.plot() calls, which draw the regression lines for when fertilizer is equal to 2 and 4. Using these as a guide, along with the regression coefficients from the model results, add one more line to the plot for when fertilizer is equal to 6 and set its color argument to indigo. Make sure to uncomment the code for the legend as well.
Are the slopes of the lines what you expected based on the model coefficients you obtained?