While we can view a binary categorical variable as a way of creating two new regression equations with different intercepts, we don’t need to make these equations every time we want to interpret a binary predictor in a multiple regression equation.
Here, breakfast is a binary variable that is equal to 1 for students who ate breakfast on test day and 0 for those who didn’t. For predicting score based on hours_studied and breakfast, the multiple regression equation is:
Take a look at the scatter plot with regression lines on top:
We can interpret the regression coefficients as follows:
The breakfast variable has a coefficient of 22.5. The interpretation is: holding all other variables constant, students who ate breakfast scored 22.5 points higher, on average, than students who did not. “Holding all other variables constant” means that we’re comparing breakfast groups among students who studied the same number of hours. Visually, this means that the vertical distance between the two regression lines is always 22.5 for any value of hours_studied (the dotted lines in the picture above are all the same length).
The intercept (32.7) is the average value of the response variable when all predictors in the equation are equal to 0. According to our full regression equation, this means that students who didn’t study (hours_studied = 0) and didn’t eat breakfast (breakfast = 0) earned an average score of 32.7 (the y-intercept for the blue line).
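To make the "constant gap" idea concrete, here is a minimal sketch that evaluates the regression equation for both breakfast groups. The intercept (32.7) and the breakfast coefficient (22.5) come from the text; the slope on hours_studied is not stated in this section, so hours_slope is a hypothetical placeholder, and the function names are illustrative only.

```python
# Coefficients quoted in the interpretation above.
INTERCEPT = 32.7
BREAKFAST_COEF = 22.5


def predicted_score(hours_studied, breakfast, hours_slope):
    """Predicted score; hours_slope is a hypothetical placeholder value."""
    return INTERCEPT + hours_slope * hours_studied + BREAKFAST_COEF * breakfast


def breakfast_gap(hours_studied, hours_slope):
    """Vertical distance between the two regression lines at one x value."""
    ate = predicted_score(hours_studied, 1, hours_slope)
    skipped = predicted_score(hours_studied, 0, hours_slope)
    return ate - skipped


# Try two made-up slopes and several study times: the gap does not change.
for slope in (2.0, 5.0):
    for hours in (0, 1, 4):
        print(slope, hours, round(breakfast_gap(hours, slope), 6))
```

Whatever slope is plugged in, the hours_studied term cancels when the two predictions are subtracted, so every printed gap should equal the breakfast coefficient. Setting both predictors to 0 returns the intercept.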
Suppose that we fit a model to predict port3 (final Portuguese score) with predictors math1 (first semester math score) and address (urban or rural residence). The coefficients are printed below.
# Output:
# Intercept       3.234071
# address[T.U]    0.557631
# math1           0.475892
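As a rough sketch of how these coefficients combine into a prediction, the snippet below plugs them into the regression equation by hand. It assumes that address[T.U] is an indicator equal to 1 for urban students and 0 for rural students; the function name and the math1 values tried are illustrative, not part of the original exercise.

```python
# Coefficients copied from the printed output above.
INTERCEPT = 3.234071
ADDRESS_URBAN_COEF = 0.557631  # address[T.U]
MATH1_COEF = 0.475892


def predicted_port3(math1, is_urban):
    """Predicted final Portuguese score for a given math1 and residence."""
    urban_indicator = 1 if is_urban else 0  # assumed coding: urban = 1, rural = 0
    return INTERCEPT + ADDRESS_URBAN_COEF * urban_indicator + MATH1_COEF * math1


# Compare rural and urban predictions at a few arbitrary math1 scores.
for math1 in (0, 10, 15):
    rural = predicted_port3(math1, is_urban=False)
    urban = predicted_port3(math1, is_urban=True)
    print(math1, round(rural, 3), round(urban, 3), round(urban - rural, 6))
```

The last column should be the same on every row, because the math1 term cancels when the rural prediction is subtracted from the urban one.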
In the file interpretations.txt write a one-sentence interpretation for the intercept. Does this interpretation make practical sense?
Add a one-sentence interpretation to interpretations.txt for the coefficient on address in terms of the average Portuguese scores (port3) of students from rural areas (address = 0) and students from urban areas (address = 1). Check your solution against the sample solutions in solutions.txt.