Now that we can make different decision trees, it’s time to plant a whole forest! Let’s say we make 8 different trees using bagging and feature bagging. We can now take a new unlabeled point, give that point to each tree in the forest, and count how many times each label is predicted.
The trees give us their votes and the label that is predicted most often will be our final classification! For example, if we gave our random forest of 8 trees a new data point, we might get the following results:
["vgood", "vgood", "good", "vgood", "acc", "vgood", "good", "vgood"]
Since the most commonly predicted classification was
"vgood", this would be the random forest’s final classification.
Let’s write some code that can classify an unlabeled point!
At the top of your code, we’ve included a new unlabeled car named
unlabeled_point that we want to classify. We’ve also created a tree named
subset_tree that was created using bagging and feature bagging.
Let’s see how that tree classifies this point. Print the result of classifying
unlabeled_point using subset_tree.
That’s the prediction using one tree. Let’s make
20 trees and record the prediction of each one!
Take all of your code between creating
indices and the print statement, and put it inside a for loop that runs 20 times.
Above your for loop, create a variable named
predictions and set it equal to an empty list. Inside your for loop, instead of printing the prediction, use
.append() to add it to
predictions. Finally, after your for loop, print
predictions.
We now have a list of 20 predictions, so let’s find the most common one! You can find the most common element in a list in a single line of Python by calling the built-in max() on the list and passing the list’s .count method as the key.
Outside of your for loop, store the most common element in a variable named
final_prediction and print that variable.