We often write the equation of a line in the form y = mx + b, where m is the slope of the line and b is the y-intercept. Since a multiple regression equation contains at least two predictors, it is helpful to modify the ordering and notation of this equation:
- First, we may rewrite this equation by putting the intercept term first and the slope term second:
  y = b + mx
- Next, instead of using the names b and m, we use the names b0 and b1, respectively:
  y = b0 + b1*x1
Notice that we’ve also changed our variable name x to x1 because it is our first predictor.
- We are now able to add as many predictors as we need in the form
  y = b0 + b1*x1 + b2*x2 + ... + bn*xn
where y is the response variable, b0 is the intercept, and bi is the coefficient on the ith predictor variable.
- The “slopes” (b1, b2, b3, etc.) on the variables in multiple regression are called partial regression coefficients.
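To make the general form concrete, here is a minimal Python sketch that evaluates b0 + b1*x1 + ... + bn*xn for a single observation. The function name predict and the numbers passed to it are hypothetical, chosen only to show the arithmetic; they do not come from a fitted model.

```python
def predict(b0, coefficients, predictors):
    """Return b0 + b1*x1 + ... + bn*xn for one observation."""
    if len(coefficients) != len(predictors):
        raise ValueError("need exactly one coefficient per predictor")
    # Multiply each coefficient by its predictor value, then add the intercept.
    return b0 + sum(b * x for b, x in zip(coefficients, predictors))


# Hypothetical values: intercept 1, coefficients 2 and 3, predictors 4 and 5.
example = predict(1.0, [2.0, 3.0], [4.0, 5.0])  # 1 + 2*4 + 3*5
print(example)  # prints 24.0
```

Adding another predictor only means adding one more coefficient and one more predictor value to the lists; the intercept stays a single number.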
While this is the proper mathematical way to write a multiple regression equation, it is often easier to write out the equation using actual variable names. For example, if we are modeling test scores (score) based on number of hours studied (hours_studied) and another variable that indicates whether a student ate breakfast (breakfast), our multiple regression equation might look like this:
  score = b0 + b1*hours_studied + b2*breakfast
Of course, after fitting our model, the intercept (b0) and coefficients (b1 and b2) could be filled in with actual numbers from the output of our regression. For instance, our final equation might have an intercept of 32.7, a coefficient of 8.5 on hours_studied, and a coefficient of 22.5 on breakfast:
  score = 32.7 + 8.5*hours_studied + 22.5*breakfast
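As a rough illustration, the example values above can be plugged into a short Python sketch to produce predicted scores. The function name predicted_score and the student used here are hypothetical, and the sketch assumes breakfast is coded as 1 for a student who ate breakfast and 0 otherwise, which the text does not state explicitly.

```python
# Values quoted in the example above.
b0 = 32.7  # intercept
b1 = 8.5   # coefficient on hours_studied
b2 = 22.5  # coefficient on breakfast


def predicted_score(hours_studied, breakfast):
    """Evaluate score = b0 + b1*hours_studied + b2*breakfast."""
    return b0 + b1 * hours_studied + b2 * breakfast


# Hypothetical student who studied 3 hours.
# Assumption: breakfast is coded 1 (ate breakfast) or 0 (did not).
with_breakfast = predicted_score(3, 1)
without_breakfast = predicted_score(3, 0)

print(round(with_breakfast, 1))                      # prints 80.7
print(round(without_breakfast, 1))                   # prints 58.2
print(round(with_breakfast - without_breakfast, 1))  # prints 22.5
```

The last line shows how a coefficient is read: with hours_studied held fixed, the two predictions differ by exactly the coefficient on breakfast.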
Inspect the multiple regression equation:
In script.py, save the value of the intercept as a variable named
b0, and the values of the coefficients for
independent as variables named
How are the associations of
rating different from each other?