*R-squared* is one of the most common metrics to evaluate linear regression models. We can interpret R-squared as the proportion of variation in an outcome variable that is explained by a linear regression model. More explained variation is generally better.

For example, suppose we have a dataset containing information about apartment rentals for NYC apartments. We can build two different models to predict rental price and print out the R-Squared for each model as follows:

# Create and fit the first model to predict rent model1 = sm.OLS.from_formula('rent ~ bedrooms + size_sqft + min_to_subway', data=rentals).fit() # Create and fit the second model model2 = sm.OLS.from_formula('rent ~ bathrooms + building_age_yrs + borough', data=rentals).fit() # Print out R-squared for both models print(model1.rsquared) #Output: 0.664 print(model2.rsquared) #Output: 0.596

This tells us that the first model (using bedrooms, square-footage, and minutes to the subway) explains about 66.4% of the variation in rental prices, whereas the second model only explains about 59.6% of the variation. This would lead us to choose the first model over the second.

### Instructions

**1.**

Using the `bikes`

dataset, fit a model to predict `cnt`

(the number of bike rentals) based on the temperature (`temp`

), windspeed (`windspeed`

), and whether or not it is a holiday (`holiday`

). Save the fitted model as `model1`

.

**2.**

Using the `bikes`

dataset, fit a second model to predict `cnt`

(the number of bike rentals) based on humidity (`hum`

), season (`season`

), and the day of the week (`weekday`

). Save the fitted model as `model2`

.

**3.**

Print out the R-squared for both models.

**4.**

Based on the R-squared values, which model would you choose? Indicate your answer by setting a variable named `which_model`

equal to `1`

if you would choose `model1`

and equal to `2`

if you would choose `model2`

.