Now that we have taken a look at what is going on under the hood, we are ready to implement Gradient Boosting on a real dataset and solve a classification problem.
We will be using a dataset from UCI's Machine Learning Repository to evaluate the acceptability of a car based on a set of features covering its price and technical characteristics.
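As a rough sketch of what the data looks like, the snippet below builds a tiny frame with the column names and value format documented for the UCI Car Evaluation dataset; the sample rows are illustrative, and the actual file path or URL used in the lesson may differ.

```python
from io import StringIO

import pandas as pd

# Three illustrative rows in the dataset's comma-separated format
# (values follow the UCI Car Evaluation documentation).
sample = StringIO(
    "vhigh,vhigh,2,2,small,low,unacc\n"
    "high,med,3,4,med,high,acc\n"
    "low,low,5more,more,big,high,vgood\n"
)

# The raw file has no header row, so we supply the documented column names.
columns = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]
cars = pd.read_csv(sample, names=columns)
print(cars.head())
```

Note that every feature is categorical, so in practice the columns are encoded numerically before training.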
Create a Gradient Boosted Trees classification model using `GradientBoostingClassifier()` with `n_estimators` set to 15. Leave all other parameters at their default values. Store the model in a variable named `grad_classifier`.
Print the parameters of the Gradient Boosted Trees model using its `get_params()` method.
Fit `grad_classifier` using the training features (`X_train`) and the corresponding labels (`y_train`).
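These steps can be sketched as follows. Since the car data is loaded earlier in the lesson, this standalone example substitutes synthetic data from `make_classification`; the variable name `grad_classifier` comes from the exercise.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Stand-in data so the snippet runs on its own.
X, y = make_classification(n_samples=200, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 15 boosting stages; everything else stays at its default.
grad_classifier = GradientBoostingClassifier(n_estimators=15)

# get_params() returns every hyperparameter, including the defaults.
print(grad_classifier.get_params())

# Train on the training split.
grad_classifier.fit(X_train, y_train)
```

Printing the parameters before fitting is a quick way to see which defaults (learning rate, tree depth, and so on) the model will use.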
Predict the classes of the testing dataset (`X_test`) and store them as an array in a new variable.
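The prediction step looks roughly like this, again on synthetic stand-in data; `y_pred` is a conventional name assumed here, not necessarily the one the exercise checks for.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Stand-in data and a fitted model, as in the previous steps.
X, y = make_classification(n_samples=200, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
grad_classifier = GradientBoostingClassifier(n_estimators=15).fit(X_train, y_train)

# predict() returns a NumPy array with one predicted class per test row.
y_pred = grad_classifier.predict(X_test)
print(y_pred[:10])
```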
Now we will explore some of the most common evaluation metrics for classification on our trained Gradient Boosted Trees model.
- Calculate the accuracy and store it in a variable.
- Calculate the precision and store it in a variable.
- Calculate the recall and store it in a variable.
- Calculate the F1-score and store it in a variable.
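A minimal sketch of computing these four metrics with `sklearn.metrics`, on synthetic multiclass stand-in data. Because the car dataset has more than two classes, the per-class precision, recall, and F1 scores have to be averaged somehow; `average="weighted"` is one common choice and is an assumption here, not necessarily what the lesson uses.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Stand-in multiclass data (the car dataset has four classes; three suffice here).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = GradientBoostingClassifier(n_estimators=15, random_state=42)
y_pred = model.fit(X_train, y_train).predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
# Multiclass scores are averaged across classes, weighted by class frequency.
precision = precision_score(y_test, y_pred, average="weighted")
recall = recall_score(y_test, y_pred, average="weighted")
f1 = f1_score(y_test, y_pred, average="weighted")

print(f"Accuracy: {accuracy:.3f}  Precision: {precision:.3f}  "
      f"Recall: {recall:.3f}  F1: {f1:.3f}")
```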
Remove the comments from the code block to print the evaluation metrics you just stored.
Take a look at the confusion matrix by removing the comments in the following code block.