Learn Predictive Modeling with Intel® AI Tools
Get hands-on experience using the Intel® AI Analytics Toolkit (AI Kit) to explore predictive modeling techniques based on decision trees. This workshop takes popular decision-tree algorithms, which are useful for both regression and classification tasks, and addresses the challenge of training them efficiently as data sizes grow.
Starting with a basic decision tree and advancing to techniques that balance the tradeoffs of speed and accuracy, this workshop explores the ways that the AI Kit improves implementations of predictive modeling.
The workshop uses Intel® Developer Cloud to show how these algorithms are implemented.
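As a minimal sketch of the workflow the workshop walks through, the snippet below trains a scikit-learn decision tree on a toy dataset (the iris dataset here is an assumption, not the workshop's own data). If the AI Kit's Intel® Extension for Scikit-learn is installed, a one-time patch reroutes training to optimized oneDAL kernels; otherwise stock scikit-learn is used unchanged.

```python
# Optional AI Kit acceleration: patch scikit-learn if sklearnex is available.
try:
    from sklearnex import patch_sklearn  # ships with the AI Kit
    patch_sklearn()                      # reroutes sklearn estimators to oneDAL
except ImportError:
    pass                                 # stock scikit-learn still works

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# A shallow tree: limiting depth is one way to control a tree's high variance.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```

The same script runs with or without the AI Kit; the patch changes the backend, not the scikit-learn API.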
Workshop Objectives
- Learn the capabilities and offerings of the AI Kit
- Discover three ways to install and use the AI Kit
- Review the fundamentals of decision trees and see how to create them using the AI Kit
- Explore advanced forms of decision trees and their advantages
- See how the performance features of XGBoost can accelerate operations with minimal sacrifice of accuracy
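The tree fundamentals listed above rest on a few impurity measures. As a self-contained worked example (the class counts are made up for illustration), the functions below compute entropy, the Gini index, and the information gained by a split:

```python
from math import log2

def entropy(counts):
    """Shannon entropy (in bits) of a node's class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def gini(counts):
    """Gini impurity of a node's class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def information_gain(parent, children):
    """Entropy reduction from splitting `parent` into `children`."""
    n = sum(parent)
    weighted = sum(sum(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

# Hypothetical split: 10 samples (5 per class) into leaves of [4, 1] and [1, 4].
parent = [5, 5]
print(round(entropy(parent), 4))                              # 1.0
print(round(gini(parent), 4))                                 # 0.5
print(round(information_gain(parent, [[4, 1], [1, 4]]), 4))   # 0.2781
```

A split is chosen to maximize this gain (equivalently, to minimize the weighted child impurity).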
Highlights
0:00 Introductions
1:39 Agenda
3:20 Intel® oneAPI software tools for AI and analytics
4:33 Key features and benefits
5:54 Portfolio of Intel® AI software
6:37 Machine learning performance with Intel®-optimized XGBoost
7:34 Customer use cases
8:08 Decision trees
9:16 Overview of classifier characteristics
9:55 Introduction to decision trees
11:18 Regression trees predict continuous values
13:10 Build a decision tree
15:17 Splitting based on classification error
17:25 Splitting based on entropy
19:03 Classification error versus entropy
20:08 Information gained by splitting
20:48 The Gini index
21:15 Decision trees are high variance
21:52 Prune decision trees
22:40 Strengths of decision trees
23:16 DecisionTreeClassifier: The syntax
24:14 How Intel Developer Cloud for oneAPI works
26:20 Get started
45:33 A quick look at regression
48:10 Bagging/RandomForest
49:35 How to create multiple trees
50:58 Distribution of data in bootstrapped samples
51:35 Aggregate results
52:55 Bagging error calculations
53:53 How many trees to fit?
54:40 Strengths of bagging
55:46 BaggingClassifier: The syntax
56:43 Reduction in variance due to bagging
57:32 Introducing more randomness
58:44 How many random forest trees?
59:10 RandomForest: The syntax
59:46 Create even more randomness
1:00:44 ExtraTreesClassifier: The syntax
1:18:46 XGBoost
1:19:10 Gradient boosting
1:20:55 Decision Stump: The boosting base learner
1:21:54 Overview of boosting
1:25:50 Boosting specifics
1:26:50 Gradient boosting loss function
1:27:45 Bagging versus boosting
1:28:33 Tune a gradient-boosted model
1:30:00 Intel® optimizations
1:30:04 GradientBoostingClassifier: The syntax
1:32:20 Advanced memory prefetching
1:34:30 Histogram building code sample
1:36:00 Partition algorithm
1:37:00 Get started with the AI Kit
1:37:30 XGBoost hands-on lab
1:48:40 Results
2:03:00 Q&A
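The "The syntax" segments above use scikit-learn's ensemble estimators. A minimal side-by-side sketch (synthetic data standing in for the workshop's hands-on dataset) compares the four ensemble methods covered, from plain bagging through gradient boosting:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier, RandomForestClassifier)
from sklearn.model_selection import cross_val_score

# Assumed synthetic data; the workshop itself uses a hands-on lab dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

models = {
    "bagging":       BaggingClassifier(n_estimators=50, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "extra trees":   ExtraTreesClassifier(n_estimators=50, random_state=0),
    "grad boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}
scores = {}
for name, model in models.items():
    scores[name] = cross_val_score(model, X, y, cv=3).mean()
    print(f"{name:14s} mean CV accuracy: {scores[name]:.3f}")
```

Each step adds randomness or reweighting on top of the last: bagging averages trees fit on bootstrapped samples, random forests also subsample features at each split, extra trees randomize the split thresholds themselves, and gradient boosting fits each tree to the previous ensemble's errors.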
Intel® AI Analytics Toolkit
Accelerate end-to-end machine learning and data science pipelines with optimized deep learning frameworks and high-performing Python* libraries.
Get It Now
New AI Reference Kits Enable Scaling of Machine Learning and Deep Learning Models
Accelerate Linear Regression Models for Machine Learning
Hunt Dinosaurs with Intel® AI Tools for Computer Vision
Machine Learning Tricks to Optimize CatBoost* Performance Up to 4x
Optimize End-to-End AI Pipelines