In the last few exercises we examined interactions between a quantitative predictor and a binary predictor, but we may also wish to use an interaction term for two quantitative variables. Consider the scatter plot of stress, this time colored by the quantitative variable freetime, which represents the number of hours of free time a participant has on average each day.
If we divided the points into groups based on their freetime value and fit a regression line for each group, would all the lines have the same slope?
- If we wanted to fit a line among the darker points (freetime between 5 and 6), the line might have a slightly negative, nearly flat slope.
- In contrast, a line for the lighter points (freetime between 0 and 2) might have a steeper slope.
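The grouping idea in the two bullets above can be tried out numerically. The sketch below is only an illustration on synthetic data, not the lesson's dataset: the variable `outcome`, the helper `group_slope`, and the coefficients used to generate the data are all assumptions chosen so that the slope on stress flattens as freetime increases.

```python
import numpy as np

# Synthetic data for illustration only; this is NOT the lesson's dataset.
rng = np.random.default_rng(0)
n = 600
stress = rng.uniform(0, 10, n)
freetime = rng.uniform(0, 6, n)

# Assumed relationship: the slope on stress gets flatter as freetime grows.
outcome = 20 + (-2.0 + 0.3 * freetime) * stress + rng.normal(0, 0.5, n)

def group_slope(lo, hi):
    """Least-squares slope of outcome on stress for points with lo <= freetime <= hi."""
    mask = (freetime >= lo) & (freetime <= hi)
    # polyfit with degree 1 returns [slope, intercept].
    slope, _intercept = np.polyfit(stress[mask], outcome[mask], 1)
    return slope

slope_low = group_slope(0, 2)    # "lighter" points: little free time
slope_high = group_slope(5, 6)   # "darker" points: lots of free time
print(slope_low, slope_high)
```

If the two printed slopes differ noticeably, as they do by construction here, a single line across all points would misrepresent at least one of the groups.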
Thus, if we wanted to fit a regression for this data, we might consider fitting several lines for different values of freetime rather than a single one across all points. Much like in the previous exercises, we can achieve this by adding a term to the model for the interaction of
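In equation form, a model with an interaction between two quantitative predictors can be sketched as follows. The symbol $y$ stands in for the outcome variable and the $\beta$ coefficients are generic placeholders; neither is taken from the lesson.

```latex
y = \beta_0 + \beta_1\,\text{stress} + \beta_2\,\text{freetime}
      + \beta_3\,(\text{stress} \times \text{freetime})

% Grouping the terms that multiply stress:
y = (\beta_0 + \beta_2\,\text{freetime})
      + (\beta_1 + \beta_3\,\text{freetime})\,\text{stress}
```

Read this way, the slope on stress is $\beta_1 + \beta_3\,\text{freetime}$, so each value of freetime gets its own line. When $\beta_3 = 0$ the lines are parallel and no interaction term is needed.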
The scatter plot below shows data from the plants dataset. The plot shows the amount of new growth in centimeters (growth) versus the number of times a plant was watered monthly (water), colored by amount of fertilizer the plant received (
Imagine drawing in a regression line for each of three different groups: when the amount of fertilizer is 1 (very light points), 4 (medium), or 8 (dark). Just by inspecting the graph, estimate the slopes of these imaginary lines and save these in script.py as
slope8, respectively. Do the slopes indicate that we may need to include an interaction term in our model?
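After estimating slopes by eye, one way to check the same idea numerically is to fit a model with an interaction term and read off the slope it implies at each fertilizer level. The sketch below uses made-up data, not the plants dataset, and fits the model with NumPy's least-squares solver; the generating coefficients and the helper `water_slope` are assumptions for illustration, and the column names simply mirror the lesson's growth, water, and fertilizer.

```python
import numpy as np

# Made-up data standing in for the plants dataset; values are illustrative only.
rng = np.random.default_rng(1)
n = 200
water = rng.uniform(1, 10, n)
fertilizer = rng.choice([1.0, 4.0, 8.0], size=n)

# Assumed truth: each extra unit of fertilizer adds 0.25 to the slope on water.
growth = (1.0 + 0.5 * water + 0.2 * fertilizer
          + 0.25 * water * fertilizer + rng.normal(0, 0.5, n))

# Design matrix columns: intercept, water, fertilizer, water * fertilizer.
X = np.column_stack([np.ones(n), water, fertilizer, water * fertilizer])
coef, *_ = np.linalg.lstsq(X, growth, rcond=None)
b0, b_water, b_fert, b_inter = coef

def water_slope(fert_level):
    """Slope of growth on water implied by the fitted model at a fertilizer level."""
    return b_water + b_inter * fert_level

for level in (1, 4, 8):
    print(level, round(water_slope(level), 2))
```

If the implied slopes differ clearly across fertilizer levels, that is the numerical counterpart of seeing non-parallel lines in the plot, and it suggests an interaction term is worth including.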