Linear Regression

Points and Lines

In the last exercise, you were probably able to make a rough estimate about the next data point for Sandra’s lemonade stand without thinking too hard about it. For our program to make a comparable guess, we have to determine what a line through those data points would look like.

A line is determined by its *slope* and its *intercept*. In other words, for each point `(x, y)` on a line we can say:

`$y = m x + b$`

where `m` is the slope, and `b` is the intercept. `y` is a given point on the y-axis, and it corresponds to a given `x` on the x-axis.

The slope measures how steep the line is: it is how much `y` changes when `x` increases by 1. The intercept is where the line crosses the y-axis, that is, the value of `y` when `x` is 0.
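To make the two parameters concrete, here is a small sketch that evaluates a line at a few `x` values. The values of `m` and `b` and the helper name `line` are made up for illustration; they are not taken from the exercise.

```python
# Hypothetical slope and intercept, chosen only for illustration.
m = 2
b = 5


def line(x):
    """Return the y value on the line y = m*x + b for a given x."""
    return m * x + b


# At x = 0 the line crosses the y-axis, so y equals the intercept b.
print(line(0))  # 5

# Each step of 1 along the x-axis changes y by exactly the slope m.
print(line(1) - line(0))  # 2
print(line(4) - line(3))  # 2
```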

When we perform Linear Regression, the goal is to get the “best” `m` and `b` for our data. We will determine what “best” means in the next exercises.

We have provided a slope, `m`, and an intercept, `b`, that seem to describe the revenue data you have been given.

Create a new list, `y`, that has every element in `months` multiplied by `m` and added to `b`.

A list comprehension is probably the easiest way to do this!
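As a rough sketch of that list comprehension, the snippet below computes one `y` value per month. The contents of `months` and the values of `m` and `b` are placeholders invented for this example, not the ones supplied in the exercise.

```python
# Placeholder inputs: invented for illustration, not the exercise's data.
months = [1, 2, 3, 4, 5, 6]
m = 10  # hypothetical slope
b = 40  # hypothetical intercept

# Apply y = m*x + b to every month to get the matching point on the line.
y = [m * month + b for month in months]

print(y)  # [50, 60, 70, 80, 90, 100]
```

The result has the same length as `months`, so the two lists can be plotted against each other directly.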

Plot the `y` values against `months` as a line on top of the scatterplot that was plotted with the line `plt.plot(months, revenue, "o")`.
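One possible way to overlay the line on the scatterplot is sketched below. It assumes Matplotlib is installed; the helper name `plot_line_over_scatter` and all of the data values are hypothetical and stand in for the exercise's own `months` and `revenue` lists.

```python
import matplotlib.pyplot as plt


def plot_line_over_scatter(months, revenue, m, b):
    """Draw the data as points and the line y = m*x + b on the same axes."""
    y = [m * month + b for month in months]
    plt.figure()                    # start from a fresh figure
    plt.plot(months, revenue, "o")  # the data, drawn as unconnected points
    plt.plot(months, y)             # the candidate line, drawn on top
    return plt.gca()                # the axes that now hold both plots


# Placeholder data: invented for illustration, not the exercise's revenue.
months = [1, 2, 3, 4, 5, 6]
revenue = [52, 71, 68, 83, 88, 101]

ax = plot_line_over_scatter(months, revenue, m=10, b=40)
```

Calling `plt.show()` afterwards displays the figure in an interactive session. Because both `plt.plot` calls target the same axes, the line is drawn over the points rather than in a separate chart.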

Change `m` and `b` to the values that you think match the data the best.

What does the slope look like it should be? And the intercept?