In addition to the quantitative measures that characterize our model's accuracy, it is always a best practice to produce visual summaries to assess our model. First, we should always visualize our model against our data. For simple linear regression this is straightforward: we can use `geom_point()` to plot our observed values and `geom_smooth(method = "lm")` to plot our model. In addition, we can include a second call to `geom_smooth()` with the arguments `se = FALSE` and `color = "red"`. This combination of function calls lets us assess how linear the relationship is: the model is visualized below as the blue line, with its 95% confidence interval as the shaded region, alongside a non-linear LOESS smoother visualized in red.

```r
library(ggplot2)

ggplot(train, aes(podcast, sales)) +
  geom_point() +                  # observed values
  geom_smooth(method = "lm") +    # linear model with its 95% confidence band
  # With no method given, geom_smooth() defaults to LOESS for fewer than 1,000 observations
  geom_smooth(se = FALSE, color = "red")
```

LOESS smoothers fit a curve from locally weighted data points: at each position along the x-axis, nearby observations count more than distant ones, so the resulting line behaves much like a moving average as our x-axis variable increases. The smoother should not be used to predict new values, as it relies heavily on our training data, but it is a helpful tool for visualizing where our linear model diverges from our training data.
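To make the "weighted moving average" idea concrete, here is a minimal sketch of a local weighted fit in base R. It is a simplified illustration of the idea behind LOESS, not the exact algorithm that `geom_smooth()` runs. The data are simulated, and the helper `local_estimate()` and the variables `budget` and `sales` are hypothetical names introduced only for this example.

```r
# Simulated data for illustration only; this is not the lesson's `train` dataset.
set.seed(42)
budget <- runif(200, min = 0, max = 100)
sales <- 10 * sqrt(budget) + rnorm(200, sd = 5)  # hypothetical curved relationship

# Estimate the smoothed value at one target point, x0, from its nearest neighbours.
local_estimate <- function(x, y, x0, span = 0.75) {
  k <- ceiling(span * length(x))             # number of neighbours to use
  d <- abs(x - x0)                           # distance of each point from x0
  h <- sort(d)[k]                            # distance to the k-th nearest neighbour
  w <- ifelse(d < h, (1 - (d / h)^3)^3, 0)   # tricube weights: near points count most
  local_fit <- lm(y ~ x, weights = w)        # weighted straight-line fit around x0
  unname(predict(local_fit, newdata = data.frame(x = x0)))
}

# Repeating the local estimate along the x-axis traces out a smooth curve.
x_grid <- seq(10, 90, by = 20)
smoothed <- sapply(x_grid, function(x0) local_estimate(budget, sales, x0))
print(data.frame(budget = x_grid, smoothed = smoothed))
```

Because every estimate is rebuilt from the neighbouring training points, the curve can bend wherever the data bend, which is also why it is tied so closely to the training data.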

Because the LOESS smoother remains within the confidence interval of our model, we can conclude that a linear trend describes this relationship reasonably well. However, we should note that as the podcast advertising budget gets closer to 0, sales drop off more sharply than the linear trend predicts; this means that our model might be less accurate in instances where the podcast budget is very low.
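The same visual check can be approximated numerically. The sketch below uses simulated data rather than the lesson's `train` dataset, with a hypothetical curved relationship chosen so that sales fall away at low budgets. It assumes the default settings of base R's `loess()`, and it flags the budget values where the LOESS curve falls outside the linear model's 95% confidence band.

```r
# Simulated data for illustration only; this is not the lesson's `train` dataset.
set.seed(7)
sim <- data.frame(podcast = runif(200, min = 0, max = 100))
# Hypothetical curved relationship, so sales fall away faster at low budgets.
sim$sales <- 10 * sqrt(sim$podcast) + rnorm(200, sd = 5)

linear_fit <- lm(sales ~ podcast, data = sim)
loess_fit <- loess(sales ~ podcast, data = sim)

# Evaluate both fits on a grid that stays inside the observed range.
grid <- data.frame(podcast = seq(5, 95, by = 5))
band <- predict(linear_fit, newdata = grid, interval = "confidence", level = 0.95)
smooth <- predict(loess_fit, newdata = grid)

# Flag grid points where the LOESS curve leaves the linear model's 95% band.
grid$outside <- smooth < band[, "lwr"] | smooth > band[, "upr"]
print(grid$podcast[grid$outside])
```

Any budget values printed here mark regions where the straight line and the smoother disagree by more than the band allows, which are the regions worth inspecting on the plot.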

### Instructions

1.

We’ve plotted clicks against total conversions. Let’s add a LOESS smoother alongside the linear fit. Add two calls to `geom_smooth()` to `plot`. The first should use the parameter `method = "lm"`. The second should use the parameters `se = FALSE` and `color = "red"`.

2.

How closely does the relationship between clicks and conversions follow a linear trend? Set the variable `linear_relationship` equal to `"a"`, `"b"`, `"c"`, or `"d"` depending on the statement that best characterizes the relationship:

A. The relationship is less linear when `clicks` approaches large values.

B. There is a clear divergence from a linear relationship when `clicks` approaches zero or when `clicks` approaches infinity.

C. The relationship between `clicks` and `total_conversion` is perfectly linear.

D. There is no linear relationship between `clicks` and `total_conversion`.

3.

Let’s extend our linearity analysis to `model2`, which describes the relationship between `impressions` and `total_conversion`. Add the same two calls to `geom_smooth()` to `plot_2` to compare the linear fit with a LOESS smoother.

4.

How closely does the relationship between impressions and conversions follow a linear trend? Set the variable `linear_relationship_2` equal to `"a"`, `"b"`, `"c"`, or `"d"` depending on the statement that best characterizes the relationship:

A. The relationship between `impressions` and `total_conversion` is perfectly linear.

B. There is a clear divergence from a linear relationship when `impressions` approaches zero and when `impressions` is around 500,000.

C. The relationship is less linear when `impressions` approaches very large values and when `impressions` is around 500,000.

D. There is no linear relationship between `impressions` and `total_conversion`.