Now that we have taken a look at what is going on under the hood, we are ready to implement Gradient Boosting on a real dataset and solve a classification problem.
We will be using a dataset from UCI's Machine Learning Repository to evaluate the acceptability of a car based on a set of features covering its price and technical characteristics.
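As a rough sketch of what the data looks like, the snippet below builds a tiny frame with the column names and value format documented for the UCI Car Evaluation dataset; the sample rows are illustrative, and the actual file path or URL used in the lesson may differ.

```python
from io import StringIO

import pandas as pd

# Three illustrative rows in the dataset's comma-separated format
# (values follow the UCI Car Evaluation documentation).
sample = StringIO(
    "vhigh,vhigh,2,2,small,low,unacc\n"
    "high,med,3,4,med,high,acc\n"
    "low,low,5more,more,big,high,vgood\n"
)

# The raw file has no header row, so we supply the documented column names.
columns = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]
cars = pd.read_csv(sample, names=columns)
print(cars.head())
```

Note that every feature is categorical, so in practice the columns are encoded numerically before training.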
Create a Gradient Boosted Trees classification model using `GradientBoostingClassifier()` with `n_estimators` set to 15. Leave all other parameters at their default values. Store the model in a variable named `grad_classifier`.
Print the parameters of the Gradient Boosted Trees model using its `get_params()` method.
Fit `grad_classifier` using the training features (`X_train`) and the corresponding labels (`y_train`).
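These steps can be sketched as follows. Since the car data is loaded earlier in the lesson, this standalone example substitutes synthetic data from `make_classification`; the variable name `grad_classifier` comes from the exercise.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Stand-in data so the snippet runs on its own.
X, y = make_classification(n_samples=200, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 15 boosting stages; everything else stays at its default.
grad_classifier = GradientBoostingClassifier(n_estimators=15)

# get_params() returns every hyperparameter, including the defaults.
print(grad_classifier.get_params())

# Train on the training split.
grad_classifier.fit(X_train, y_train)
```

Printing the parameters before fitting is a quick way to see which defaults (learning rate, tree depth, and so on) the model will use.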
Predict the classes of the testing dataset (`X_test`) and store them as an array in a new variable.
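The prediction step looks roughly like this, again on synthetic stand-in data; `y_pred` is a conventional name assumed here, not necessarily the one the exercise checks for.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Stand-in data and a fitted model, as in the previous steps.
X, y = make_classification(n_samples=200, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
grad_classifier = GradientBoostingClassifier(n_estimators=15).fit(X_train, y_train)

# predict() returns a NumPy array with one predicted class per test row.
y_pred = grad_classifier.predict(X_test)
print(y_pred[:10])
```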
Now we will explore some of the most common evaluation metrics for classification on our trained Gradient Boosted Trees model.
- Calculate the accuracy and store it in a variable.
- Calculate the precision and store it in a variable.
- Calculate the recall and store it in a variable.
- Calculate the F1-score and store it in a variable.
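A minimal sketch of computing these four metrics with `sklearn.metrics`, on synthetic multiclass stand-in data. Because the car dataset has more than two classes, the per-class precision, recall, and F1 scores have to be averaged somehow; `average="weighted"` is one common choice and is an assumption here, not necessarily what the lesson uses.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Stand-in multiclass data (the car dataset has four classes; three suffice here).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = GradientBoostingClassifier(n_estimators=15, random_state=42)
y_pred = model.fit(X_train, y_train).predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
# Multiclass scores are averaged across classes, weighted by class frequency.
precision = precision_score(y_test, y_pred, average="weighted")
recall = recall_score(y_test, y_pred, average="weighted")
f1 = f1_score(y_test, y_pred, average="weighted")

print(f"Accuracy: {accuracy:.3f}  Precision: {precision:.3f}  "
      f"Recall: {recall:.3f}  F1: {f1:.3f}")
```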
Remove the comments from the code block to print the evaluation metrics you just stored.
Take a look at the confusion matrix by removing the comments in the following code block.