Learn

Congratulations! In this lesson, you’ve learned a number of different methods for model comparison:

  • For choosing a model that best represents the data we have:
    • R-squared
    • Adjusted R-squared
    • F-test
  • For choosing a model for accurate out-of-sample prediction:
    • Log likelihood
    • AIC/BIC
    • Training/test sets

Note that we’ve covered many different methods for choosing a model, and they don’t always agree. To choose a method, consider your ultimate goal (analysis vs. prediction) and what you want to prioritize (simplicity and interpretability vs. accuracy).
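
To see how these methods can disagree, here is a minimal sketch that computes adjusted R-squared, AIC, and BIC by hand for two made-up linear models. The residual and total sums of squares, the sample size, and the helper function names are all illustrative assumptions, not values from the lesson. The formulas assume normally distributed errors and count the intercept as a parameter, which is one common convention.

```python
import math

def adjusted_r2(rss, tss, n, k):
    """Adjusted R-squared for a model with k predictors (plus an intercept) fit on n rows."""
    r2 = 1 - rss / tss
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def gaussian_log_likelihood(rss, n):
    """Maximized log likelihood of a linear model with normal errors."""
    return -n / 2 * (math.log(2 * math.pi) + math.log(rss / n) + 1)

def aic(rss, n, k):
    """AIC: penalty of 2 per estimated coefficient (k slopes + 1 intercept)."""
    return 2 * (k + 1) - 2 * gaussian_log_likelihood(rss, n)

def bic(rss, n, k):
    """BIC: penalty of log(n) per estimated coefficient, so it grows with sample size."""
    return (k + 1) * math.log(n) - 2 * gaussian_log_likelihood(rss, n)

# Hypothetical numbers: the larger model adds 4 predictors and lowers
# the residual sum of squares only slightly.
n, tss = 100, 1000.0
small = {"rss": 300.0, "k": 2}
large = {"rss": 285.0, "k": 6}

for name, m in [("small", small), ("large", large)]:
    print(
        name,
        "adj R2:", round(adjusted_r2(m["rss"], tss, n, m["k"]), 4),
        "AIC:", round(aic(m["rss"], n, m["k"]), 1),
        "BIC:", round(bic(m["rss"], n, m["k"]), 1),
    )
```

With these made-up numbers, the larger model has the higher adjusted R-squared, while the smaller model has the lower (better) AIC and BIC. The criteria penalize extra predictors by different amounts, which is why the goal of the analysis should guide which one you rely on.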

Instructions

In this final workspace, we’ve loaded the StreetEasy dataset for you to investigate further. The dataset contains the following columns:

  • rent: the monthly rental price in dollars
  • bedrooms: the number of bedrooms
  • bathrooms: the number of bathrooms
  • size_sqft: the area in square feet
  • min_to_subway: minutes walking distance to the nearest subway station
  • building_age_yrs: age of the building in years
  • no_fee: whether or not the listing is free of a broker fee
  • has_roofdeck: whether or not there is a roofdeck
  • has_washer_dryer: whether or not there is a washer and dryer
  • has_doorman: whether or not there is a doorman
  • elevator: whether or not there is an elevator
  • has_dishwasher: whether or not there is a dishwasher
  • has_patio: whether or not there is a patio
  • has_gym: whether or not there is a gym
  • neighborhood: neighborhood where the apartment is located
  • borough: borough where the apartment is located

Which predictors do you think will be most important in predicting the rental price of an apartment in NYC? Using the predictors you think are most relevant:

  1. Fit a few different models
  2. Compare the models based on adjusted R-squared. Which would you choose?
  3. Compare the models using an F-test. Which would you choose?
  4. Compare the models using AIC/BIC. Which would you choose?
  5. Overall, think about which model you would choose based on your analysis. Did these comparison methods agree or disagree in terms of what was considered “best”?
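
The steps above could be sketched as follows with statsmodels. This is one possible approach, not the lesson's official solution. The DataFrame name `rentals` and the model names are hypothetical, and the data here is randomly generated so the example runs on its own; it is not the real StreetEasy data. In the workspace, you would use the DataFrame that has been loaded for you and the predictors you chose.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Synthetic stand-in data (NOT real StreetEasy values); column names
# follow the list above, but the numbers are invented for illustration.
rng = np.random.default_rng(0)
n = 500
rentals = pd.DataFrame({
    "bedrooms": rng.integers(0, 4, n),
    "bathrooms": rng.integers(1, 3, n),
    "size_sqft": rng.normal(900, 250, n).clip(250),
    "min_to_subway": rng.integers(1, 20, n),
    "has_doorman": rng.integers(0, 2, n),
})
rentals["rent"] = (
    1500
    + 3.0 * rentals["size_sqft"]
    + 400 * rentals["bedrooms"]
    - 20 * rentals["min_to_subway"]
    + rng.normal(0, 500, n)
)

# Step 1: fit a few nested models (each one adds predictors to the previous one).
model1 = smf.ols("rent ~ size_sqft", data=rentals).fit()
model2 = smf.ols("rent ~ size_sqft + bedrooms + min_to_subway", data=rentals).fit()
model3 = smf.ols(
    "rent ~ size_sqft + bedrooms + min_to_subway + bathrooms + has_doorman",
    data=rentals,
).fit()

# Steps 2 and 4: higher adjusted R-squared is better; lower AIC/BIC is better.
for name, model in [("model1", model1), ("model2", model2), ("model3", model3)]:
    print(name, round(model.rsquared_adj, 4), round(model.aic, 1), round(model.bic, 1))

# Step 3: F-test for nested models. Each row compares a model with the one
# before it; a small Pr(>F) suggests the added predictors improve the fit.
anova_results = anova_lm(model1, model2, model3)
print(anova_results)
```

Note that the F-test only applies when the models are nested, meaning the larger model contains every predictor in the smaller one. Adjusted R-squared, AIC, and BIC can also be compared across non-nested models fit on the same rows.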
