Learn

In the last few exercises we examined interactions between a quantitative predictor and a binary predictor, but we may also wish to use an interaction term for two quantitative variables. Consider the scatter plot of happy versus stress, this time colored by the quantitative variable freetime, which represents the number of hours of free time a participant has on average each day.

Scatter plot showing happy versus stress, with points colored by the number of hours of free time (0 to 6) and becoming progressively darker as the value increases. The darkest points cluster in the upper left-hand corner, and the points become progressively lighter as they trend downward to the right across the plot.

If we divided the points into groups based on their freetime value and fit a regression line for each group, would all the lines have the same slope?

  • If we fit a line among the darker points (freetime between 5 and 6), the line might have a shallow negative slope.
  • In contrast, a line for the lighter points (freetime between 0 and 2) might have a steeper negative slope.

Thus, if we wanted to fit a regression to this data, we might consider fitting several lines for different values of freetime rather than a single line across all points. Much like in the previous exercises, we can achieve this by adding a term to the model for the interaction of stress and freetime, which lets the slope on stress change as freetime changes.
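As a rough illustration of what the interaction term does, the sketch below fits a model with a stress-by-freetime product term using plain NumPy least squares. The data are synthetic and the coefficients are made up for the example (they are assumptions, not values from the lesson's dataset); the variable names happy, stress, and freetime simply mirror the ones above.

```python
import numpy as np

# Synthetic stand-in data (NOT the lesson's dataset); all coefficients are invented.
rng = np.random.default_rng(0)
n = 200
stress = rng.uniform(0, 10, n)
freetime = rng.uniform(0, 6, n)
noise = rng.normal(0, 0.5, n)
happy = 8 - 1.0 * stress + 0.2 * freetime + 0.12 * stress * freetime + noise

# Design matrix: intercept, stress, freetime, and the stress*freetime interaction.
X = np.column_stack([np.ones(n), stress, freetime, stress * freetime])
coef, *_ = np.linalg.lstsq(X, happy, rcond=None)
b0, b_stress, b_freetime, b_interaction = coef

def stress_slope(freetime_value):
    """Slope of happy on stress when freetime is held at a fixed value."""
    return b_stress + b_interaction * freetime_value

# One fitted model, but a different stress slope for each freetime value.
for ft in (1, 3, 6):
    print(f"freetime={ft}: slope of happy on stress = {stress_slope(ft):.2f}")
```

Because the slope on stress is `b_stress + b_interaction * freetime`, a single fitted model yields a different line for each freetime value. In a formula-based interface such as the one in statsmodels, the same term is commonly written as `stress:freetime`.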

Instructions

1. The scatter plot below shows data from the plants dataset. The plot shows the amount of new growth in centimeters (growth) versus the number of times a plant was watered each month (water), colored by the amount of fertilizer the plant received (fertilizer).

Scatter plot showing growth versus water with points that get darker as the value of fertilizer increases. The points show a positive relationship moving left to right across the plot. When taken as groups, darker points seem to align more steeply than lighter points.

Imagine drawing a regression line for each of three groups: plants with a fertilizer amount of 1 (very light points), 4 (medium), or 8 (dark). Just by inspecting the graph, estimate the slopes of these imaginary lines and save them in script.py as slope1, slope4, and slope8, respectively. Do the slopes indicate that we may need to include an interaction term in our model?
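One way to turn an eyeballed line into a slope is rise over run between two points read off that line. The sketch below shows the arithmetic only: the helper slope_from_points is hypothetical, and the coordinates are placeholder values, not readings from the plot, so substitute your own estimates.

```python
def slope_from_points(p1, p2):
    """Rise over run between two (water, growth) points read off a line."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)

# HYPOTHETICAL placeholder points, not taken from the lesson's plot.
slope1 = slope_from_points((0, 2.0), (10, 7.0))   # fertilizer = 1
slope4 = slope_from_points((0, 2.0), (10, 12.0))  # fertilizer = 4
slope8 = slope_from_points((0, 2.0), (10, 22.0))  # fertilizer = 8

print(slope1, slope4, slope8)
```

If the three estimates are clearly different, the relationship between growth and water appears to depend on fertilizer, which is the pattern an interaction term is meant to capture. If they are roughly equal, a model without the interaction may be adequate.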
