Why is Boosting Important?
Boosting is an ensemble technique in which new models are added to correct the errors made by existing models. Gradient boosting uses gradient descent to minimize the loss function.
XGBoost further improves the generalization of gradient boosting by adding regularization terms to its objective, based on the L1 norm (sum of the absolute values of a vector) and the L2 norm (square root of the sum of the squared vector values).
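The two norms described above can be computed directly. The following is a minimal sketch in plain Python, not XGBoost's internal code. The `penalty` function is a hypothetical helper that combines the norms in the usual textbook form: an L1 term scaled by `alpha` and a squared L2 term scaled by `lam`, applied to a vector of leaf weights. XGBoost exposes comparable knobs through its `alpha` and `lambda` parameters, but the exact form of its objective should be checked against the XGBoost documentation.

```python
import math


def l1_norm(weights):
    """L1 norm: sum of the absolute values of the vector."""
    return sum(abs(w) for w in weights)


def l2_norm(weights):
    """L2 norm: square root of the sum of the squared values."""
    return math.sqrt(sum(w * w for w in weights))


def penalty(weights, alpha=0.0, lam=1.0):
    """Illustrative regularization penalty (hypothetical helper).

    alpha scales the L1 norm; lam scales half of the squared L2 norm.
    """
    squared_l2 = sum(w * w for w in weights)
    return alpha * l1_norm(weights) + 0.5 * lam * squared_l2


leaf_weights = [3.0, -4.0]
print(l1_norm(leaf_weights))                      # 7.0
print(l2_norm(leaf_weights))                      # 5.0
print(penalty(leaf_weights, alpha=1.0, lam=1.0))  # 19.5
```

Larger weights produce a larger penalty, so the optimizer is pushed toward simpler trees that tend to generalize better.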
Intel has contributed several optimizations to XGBoost that accelerate gradient boosting models and improve training and inference performance, including XGBoost Optimized for Intel® Architecture.
Installing and using XGBoost optimizations
For versions of XGBoost higher than 0.81, Intel has upstreamed training optimizations that are enabled through the ‘hist’ tree method. Optimizations for the inference stage were upstreamed after version 1.3.1, so any version higher than 1.3.1 has Intel optimizations for both the training and inference phases. To get an extra inference boost on top of the upstreamed optimizations, or if you are using a version of XGBoost older than 0.81, you can convert the XGBoost model to daal4py (an API to the Intel® oneAPI Data Analytics Library (oneDAL)).
Intel-optimized XGBoost can be installed in the following ways:
- As a part of Intel® AI Analytics Toolkit
- From PyPI repository, using pip package manager: pip install xgboost
- From Anaconda package manager:
  - Using Intel channel: conda install xgboost -c intel
  - Using conda-forge channel: conda install xgboost -c conda-forge
- As a Docker container (provided you have a DockerHub account)
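After installing, it can be useful to check which of the upstreamed optimizations a given version includes. The sketch below encodes only the two thresholds stated above (higher than 0.81 for training, higher than 1.3.1 for inference). The helper names `parse_version` and `upstreamed_optimizations` are hypothetical and are not part of XGBoost.

```python
import re

# Thresholds taken from the text: optimizations are present in versions
# strictly higher than these.
TRAINING_OPT_AFTER = (0, 81)
INFERENCE_OPT_AFTER = (1, 3, 1)


def parse_version(version):
    """Turn a string such as '1.4.2' into a tuple of ints, e.g. (1, 4, 2)."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def upstreamed_optimizations(version):
    """Report which upstreamed Intel optimizations a version should include."""
    parsed = parse_version(version)
    return {
        "training": parsed > TRAINING_OPT_AFTER,
        "inference": parsed > INFERENCE_OPT_AFTER,
    }


for v in ("0.81", "0.90", "1.3.1", "1.4.2"):
    print(v, upstreamed_optimizations(v))
```

With XGBoost installed, the same check can be run against the installed package by passing `xgboost.__version__` to `upstreamed_optimizations`.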
ML training using XGBoost: Comparative performance analysis with and without Intel optimizations
We have created a code sample that compares the performance of XGBoost without (version 0.81) and with (version 1.4.2) Intel optimizations. In this sample, we used the popular Higgs dataset, which contains particle features and functions of those features, to distinguish between a signal process that produces Higgs bosons and a background process that does not. The Higgs boson is an elementary particle in the Standard Model, produced by the quantum excitation of the Higgs field. Here, an XGBoost model is trained, and the results with and without Intel’s optimizations are compared.
Important note: To leverage the training optimizations of XGBoost for Intel Architecture, set the ‘tree_method’ parameter to ‘hist’, as shown in the code sample.
You can learn more about the XGBoost parameters mentioned in the code sample and other available parameters here.
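As a rough illustration of the note above, the following sketch shows one way to set ‘tree_method’ to ‘hist’ with XGBoost's native training API. It is not the code sample referenced in this article. The helper names `make_params` and `train_hist` and all parameter values other than ‘tree_method’ are assumptions chosen for a binary classification task such as Higgs.

```python
def make_params(n_threads=4):
    """Build a parameter dict that selects the histogram-based tree method.

    Every value except tree_method is an illustrative assumption.
    """
    return {
        "objective": "binary:logistic",  # two classes (signal vs. background)
        "tree_method": "hist",           # enables the optimized training path
        "max_depth": 6,
        "eta": 0.1,
        "nthread": n_threads,
        "eval_metric": "logloss",
    }


def train_hist(X, y, num_boost_round=50, n_threads=4):
    """Train a booster on features X and 0/1 labels y using 'hist'."""
    # Imported here so the parameter dict can be inspected even on a
    # machine where xgboost is not installed.
    import xgboost as xgb

    dtrain = xgb.DMatrix(X, label=y)
    return xgb.train(make_params(n_threads), dtrain,
                     num_boost_round=num_boost_round)


print(make_params()["tree_method"])  # hist
```

Calling `train_hist(X_train, y_train)` with a NumPy feature matrix and a label vector returns a trained booster. Timing that call on versions with and without the optimizations is one simple way to reproduce this kind of comparison.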
The performance of XGBoost with (0.9, 1.0, 1.1 versions) and without Intel optimizations (0.81 version) is compared on the following real-life datasets:
- Higgs (1,000,000 rows, 2 classes)
- Airline-OHE (1,000,000 rows, 2 classes)
- MSRank (3,000,000 rows, 5 classes)
- Letters (20,000 rows, 26 classes)
Comparing the execution times for training shows up to 16X improvement with the Intel optimizations.
Figure 1: Release-to-release acceleration of XGBoost training (See configuration details below [1])
ML inference using XGBoost: Comparative performance analysis using up-streamed Intel optimizations and with oneDAL
For Python 3.6 and higher, the inference speedup attained by XGBoost optimized by Intel can be increased further using the daal4py API of Intel® oneDAL. Install daal4py, import it, and add a single line of code between your model training and prediction phases to accelerate the inference stage. It is that simple.
Here is the way to expedite your inference process using XGBoost optimized by Intel.
Install the daal4py API:
!pip install daal4py
For more ways to install daal4py and detailed information on the API, check out this link.
Import daal4py as follows:
import daal4py as dp
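The sketch below shows where the conversion step fits between training and prediction. It assumes a daal4py release that provides `get_gbt_model_from_xgboost` and `gbt_classification_prediction`. These names and arguments have changed across daal4py versions, so check them against the documentation for the version you install. The helpers `timed` and `predict_with_daal4py` are hypothetical names introduced for illustration.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def predict_with_daal4py(booster, X_test, n_classes=2):
    """Convert a trained XGBoost booster to daal4py and predict class labels.

    Assumes the daal4py API names noted in the lead-in are available.
    """
    import daal4py as dp

    # The single added line: convert the trained XGBoost model.
    daal_model = dp.get_gbt_model_from_xgboost(booster)

    algo = dp.gbt_classification_prediction(
        nClasses=n_classes,
        resultsToEvaluate="computeClassLabels",
    )
    return algo.compute(X_test, daal_model).prediction
```

Wrapping both prediction paths in `timed`, for example `timed(predict_with_daal4py, booster, X_test)` and `timed(booster.predict, dtest)`, gives a simple side-by-side measurement of inference time.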
The performance of stock XGBoost prediction and daal4py-accelerated prediction is compared on the following datasets:
- Mortgage (45 features, ~9M observations)
- Airline (691 features, one-hot encoding, ~1M observations)
- Higgs (28 features, 1M observations)
- MSRank (136 features, 3M observations)
The results show up to a 36X improvement using daal4py and the oneDAL library.
Figure 2: Performance comparison: stock XGBoost prediction vs. daal4py prediction (See configuration details below [2])
Additionally, here is a simple example demonstrating the use of daal4py. In the example, an XGBoost model is trained, and results are predicted using the daal4py prediction method. The code sample also compares the performance of XGBoost prediction and daal4py prediction at the same accuracy.
If XGBoost is your library of choice for AI/ML workflows, you can get Intel's performance optimizations automatically by updating to the latest library version, which can make your gradient boosted tree algorithm run several times faster. You can also use the daal4py API of the Intel oneAPI Data Analytics Library if you want even greater performance for inference workloads. We also encourage you to check out Intel’s other AI/ML framework optimizations and end-to-end portfolio of tools and incorporate them into your AI workflow. If you are interested, you can also learn about the unified, open, standards-based oneAPI programming model that forms the foundation of these tools and optimizations.
We would like to thank Vadim Sherman, Rachel Oberman, Preethi Venkatesh, Praveen Kundurthy, John Kinsky, Jimmy Wei, Louie Tsai, Tom Lenth, Jeff Reilly, Monique Torres, Keenan Connolly, and John Somoza for their review and approval help.