Wow, that was a lot to take in! Let’s take this opportunity to implement AdaBoost on a real dataset and solve a classification problem.
We will be using a dataset from UCI's Machine Learning Repository to evaluate the acceptability of a car based on a set of features that encompass its price and technical characteristics.
Instructions
Create the base estimator for the AdaBoost classifier in the form of a decision stump using DecisionTreeClassifier() and store it in a variable named decision_stump. Recall that a decision stump is a decision tree with only two leaf nodes.
Print the parameters of the decision stump using the .get_params() method.
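These two steps could be sketched as follows (assuming scikit-learn; limiting the tree to max_depth=1 is one common way to get a stump with exactly two leaves):

```python
from sklearn.tree import DecisionTreeClassifier

# A decision stump is a depth-1 tree: one split, two leaf nodes
decision_stump = DecisionTreeClassifier(max_depth=1)

# Inspect the stump's hyperparameters
print(decision_stump.get_params())
```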
Create an AdaBoost classification model with the base_estimator parameter set to decision_stump and n_estimators set to 5. Store the model in a variable named ada_classifier.
Print the parameters of the AdaBoost model using the .get_params() method.
Fit ada_classifier using the training features (X_train) and corresponding labels (y_train).
Predict the classes of the testing dataset (X_test) and store them as an array in a variable named y_pred.
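The fit-and-predict steps might look like the sketch below. Since the car-evaluation data isn't loaded in this excerpt, a synthetic binary dataset from make_classification stands in for the encoded X_train/X_test, y_train/y_test splits:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier

# Stand-in for the exercise's preprocessed car-evaluation splits
X, y = make_classification(n_samples=400, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

decision_stump = DecisionTreeClassifier(max_depth=1)
try:
    ada_classifier = AdaBoostClassifier(base_estimator=decision_stump,
                                        n_estimators=5)
except TypeError:  # scikit-learn >= 1.4 uses `estimator` instead
    ada_classifier = AdaBoostClassifier(estimator=decision_stump,
                                        n_estimators=5)

# Fit on the training set, then predict labels for the test set
ada_classifier.fit(X_train, y_train)
y_pred = ada_classifier.predict(X_test)
```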
Now we will explore some of the most common evaluation metrics for classification on our trained AdaBoost model.
- Calculate the accuracy and store it in a variable named accuracy.
- Calculate the precision and store it in a variable named precision.
- Calculate the recall and store it in a variable named recall.
- Calculate the f1-score and store it in a variable named f1.
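One way to compute the four metrics, using scikit-learn's metrics module (again with a synthetic stand-in dataset, since the car data isn't loaded in this excerpt):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

# Stand-in for the exercise's preprocessed splits
X, y = make_classification(n_samples=400, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

decision_stump = DecisionTreeClassifier(max_depth=1)
try:
    ada_classifier = AdaBoostClassifier(base_estimator=decision_stump,
                                        n_estimators=5)
except TypeError:  # scikit-learn >= 1.4 uses `estimator` instead
    ada_classifier = AdaBoostClassifier(estimator=decision_stump,
                                        n_estimators=5)
ada_classifier.fit(X_train, y_train)
y_pred = ada_classifier.predict(X_test)

# Each metric compares the true test labels to the predictions
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print(f"Accuracy: {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall: {recall:.3f}")
print(f"F1-score: {f1:.3f}")
```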
Remove the comments from the code block to print the evaluation metrics you just stored.
Take a look at the confusion matrix by removing the comments in the following code block.
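The confusion matrix can be produced with scikit-learn's confusion_matrix function; the sketch below reuses the same stand-in pipeline as above:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import confusion_matrix

# Stand-in for the exercise's preprocessed splits
X, y = make_classification(n_samples=400, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

decision_stump = DecisionTreeClassifier(max_depth=1)
try:
    ada_classifier = AdaBoostClassifier(base_estimator=decision_stump,
                                        n_estimators=5)
except TypeError:  # scikit-learn >= 1.4 uses `estimator` instead
    ada_classifier = AdaBoostClassifier(estimator=decision_stump,
                                        n_estimators=5)
ada_classifier.fit(X_train, y_train)
y_pred = ada_classifier.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)
```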