The results of the regression from the previous exercise, saved as `modelP`, modeled happiness level from hours of sleep with a polynomial term. The model coefficients are given below:

```python
import statsmodels.api as sm
import numpy as np

modelP = sm.OLS.from_formula('happy ~ sleep + np.power(sleep, 2)', data=happiness).fit()
print(modelP.params)
# Output:
# Intercept            -0.058995
# sleep                 1.320429
# np.power(sleep, 2)   -0.061827
```

It is generally difficult to interpret the coefficients on polynomial terms directly, but we can interpret the overall relationship if we visualize our regression line on the scatter plot. `lmplot()` fits and plots the straight regression line for `happy` predicted by `sleep` WITHOUT a polynomial term by default (passing `fit_reg=False` would suppress it). Note that we've added `ci=None` to prevent the function from plotting a confidence interval.

```python
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

sns.lmplot(x='sleep', y='happy', ci=None, data=happiness)
```

To add our curved regression line to the scatter plot, we first create a dataset of 100 values of `sleep` ranging from 2 to 14 (saved as `x`) and 100 predicted values of `happy` from our regression equation (saved as `y`). Then we plot our `x` and `y` values using `plt.plot()` and add a legend.

```python
x = np.linspace(2, 14, 100)
# Apply the fitted regression equation: intercept + sleep term + squared term
y = modelP.params.iloc[0] + modelP.params.iloc[1]*x + modelP.params.iloc[2]*np.power(x, 2)
plt.plot(x, y, linestyle='dashed', linewidth=4, color='black')
plt.legend(['Simple Model', 'Polynomial Model'])
plt.show()
```

From the plot of the polynomial regression line, we see that happiness increases as sleep increases, but that the increase slows as sleep reaches around 10 hours and then begins to decrease with further hours of sleep.

The simple regression line misses these details and describes the relationship as a steady increase in happiness for any additional hour of sleep. Thus, the polynomial model captures a more detailed relationship that will make our predictions better.
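The turning point we eyeballed from the plot can be checked directly: a quadratic $ax^2 + bx + c$ with $a < 0$ peaks at $x = -b/(2a)$. A quick sketch using the printed coefficients from `modelP`:

```python
# Coefficients printed from modelP.params above
b = 1.320429   # coefficient on sleep
a = -0.061827  # coefficient on np.power(sleep, 2)

# Vertex of the fitted parabola: hours of sleep at peak predicted happiness
peak_sleep = -b / (2 * a)
print(round(peak_sleep, 1))  # → 10.7
```

This agrees with what the plot shows: predicted happiness tops out a little past 10 hours of sleep and declines afterward.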

### Instructions

**1.** The `plants` dataset has been loaded for you in **script.py**. A simple linear regression predicting `dead` with predictor `light` has been saved for you as `simple`. Print the regression coefficients and inspect them. According to this model, what do we learn about the relationship between the number of dead leaves and the amount of light a plant receives?

**2.** A new regression model predicting `dead` from `light`, but with an additional squared term for `light`, has been saved for you as `polynomial`. Print the resulting coefficients. Why can't we interpret the coefficient on `light` directly anymore?

**3.** Uncomment the code for the scatter plot that automatically plots the simple regression line for you. Be sure to uncomment `plt.show()` a few lines down as well to display the plot.

**4.** A set of 100 values of `light` has been saved for you as `x`, and 100 values of `dead`, predicted by substituting those `light` values into the regression equation of `polynomial`, has been saved for you as `y`.

Use `x` and `y` to add the curved polynomial line, dashed and in black. What features of the relationship between `dead` and `light` does the polynomial model capture that the straight line of the simple model misses?