Just like decision trees, random forests can be used for regression as well! Knowing which one to use comes down to the type of your target variable. Previously, our target was a binary categorical variable (acceptable vs. not), so we used a classification model.
We will now consider a hypothetical new target variable for this dataset, price, which is continuous. We’ve generated some fake prices in the dataset so that we have numerical values instead of the previous categorical ones. (Please note that these are not reflective of the previous categories of high and low prices; we just wanted some numeric values so we can perform regression! :) )
Now, instead of a classification task, this is a regression task, so we will use a regression model.
Fit a RandomForestRegressor() model on the training data. Print the default scores (the R^2 values) on the train and test sets.
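As a rough sketch of this step, the snippet below fits a regressor and prints both R^2 scores. The car data itself is not shown here, so the features and prices are synthetic stand-ins, and the variable names (x_train, y_train, rfr, and so on) are assumptions rather than the names used in the exercise.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 500 cars, 6 encoded features, and a made-up price.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(500, 6)).astype(float)
y = 20000 + 3000 * X[:, 0] - 1500 * X[:, 1] + rng.normal(0, 2000, size=500)

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the regressor on the training data only.
rfr = RandomForestRegressor(random_state=0)
rfr.fit(x_train, y_train)

# For regressors, .score() returns the coefficient of determination (R^2).
r_squared_train = rfr.score(x_train, y_train)
r_squared_test = rfr.score(x_test, y_test)
print(f"Train set R^2: {r_squared_train:.3f}")
print(f"Test set R^2: {r_squared_test:.3f}")
```

A train score noticeably higher than the test score is common for random forests with default settings, since each tree is grown deep enough to fit the training data closely.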
Print the average price of a car. Print the MAE (Mean Absolute Error) for the test and train sets to see how the error compares to the mean.
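One way this comparison might look is sketched below. It rebuilds the same kind of synthetic stand-in data so that it runs on its own; the prices, feature matrix, and variable names are all assumptions, not the exercise's actual dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 500 cars, 6 encoded features, and a made-up price.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(500, 6)).astype(float)
y = 20000 + 3000 * X[:, 0] - 1500 * X[:, 1] + rng.normal(0, 2000, size=500)

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
rfr = RandomForestRegressor(random_state=0).fit(x_train, y_train)

# Average price across the whole (synthetic) dataset.
avg_price = y.mean()
print(f"Average price: {avg_price:.2f}")

# MAE is the mean of |actual - predicted|, in the same units as price.
mae_train = mean_absolute_error(y_train, rfr.predict(x_train))
mae_test = mean_absolute_error(y_test, rfr.predict(x_test))
print(f"Train set MAE: {mae_train:.2f}")
print(f"Test set MAE: {mae_test:.2f}")
```

Because MAE is expressed in the same units as the target, dividing it by the average price gives a quick sense of how large a typical error is relative to a typical car price.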