Get hands-on experience using the Intel® AI Analytics Toolkit (AI Kit) to explore predictive modeling techniques based on decision trees. This workshop takes popular decision-tree algorithms, useful for both regression and classification tasks, and addresses the challenge of training them efficiently as data sizes grow.

Starting with a basic decision tree and advancing to techniques that balance the tradeoff between speed and accuracy, this workshop explores how the AI Kit accelerates predictive modeling implementations.

The workshop uses Intel® Developer Cloud to show how these algorithms are implemented.

Workshop Objectives

  • Learn the capabilities and offerings of the AI Kit
  • Discover three ways to install and use the AI Kit
  • Review the fundamentals of decision trees and see how to create them using the AI Kit
  • Explore advanced forms of decision trees and their advantages
  • See how the performance features of XGBoost can accelerate operations with minimal sacrifice of accuracy
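As a preview of the hands-on material, here is a minimal decision-tree sketch using stock scikit-learn on the built-in Iris dataset (a stand-in for the workshop's lab data). The AI Kit includes the Intel® Extension for Scikit-learn, whose `patch_sklearn()` call (left commented out here, since it requires the extension to be installed) routes supported estimators to Intel-optimized implementations with no other code changes.

```python
# With the AI Kit installed, uncommenting the next two lines
# swaps in Intel-optimized implementations of supported estimators:
# from sklearnex import patch_sklearn
# patch_sklearn()

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# criterion="gini" selects splits by the Gini index (entropy is the
# alternative); max_depth caps tree growth, a simple guard against
# the high variance of unpruned trees.
tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=42)
tree.fit(X_train, y_train)
print(f"test accuracy: {tree.score(X_test, y_test):.2f}")
```

The same `fit`/`score` pattern carries through the ensemble estimators covered later in the workshop.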


0:00 Introductions

1:39 Agenda

3:20 Intel® oneAPI software tools for AI and analytics

4:33 Key features and benefits

5:54 Portfolio of Intel® AI software

6:37 Machine learning performance with Intel®-optimized XGBoost

7:34 Customer use cases

8:08 Decision trees

9:16 Overview of classifier characteristics

9:55 Introduction to decision trees

11:18 Regression trees predict continuous values

13:10 Build a decision tree

15:17 Splitting based on a classification error

17:25 Splitting based on entropy

19:03 Classification error versus entropy

20:08 Information gained by splitting

20:48 The Gini index

21:15 Decision trees are high variance

21:52 Prune decision trees

22:40 Strengths of decision trees

23:16 DecisionTreeClassifier: The syntax

24:14 How Intel Developer Cloud for oneAPI works

26:20 Get started

45:33 A quick look at regression

48:10 Bagging/RandomForest

49:35 How to create multiple trees

50:58 Distribution of data in bootstrapped samples

51:35 Aggregate results

52:55 Bagging error calculations

53:53 How many trees to fit?

54:40 Strengths of bagging

55:46 BaggingClassifier: The syntax

56:43 Reduction in variance due to bagging

57:32 Introducing more randomness

58:44 How many random forest trees?

59:10 RandomForest: The syntax

59:46 Create even more randomness

1:00:44 ExtraTreesClassifier: The syntax

1:18:46 XGBoost

1:19:10 Gradient boosting

1:20:55 Decision Stump: The boosting base learner

1:21:54 Overview of boosting

1:25:50 Boosting specifics

1:26:50 Gradient boosting loss function

1:27:45 Bagging versus boosting

1:28:33 Tune a gradient-boosted model

1:30:04 GradientBoostingClassifier: The syntax

1:30:00 Intel® Optimization

1:32:20 Advanced memory prefetching

1:34:30 Histogram building code sample

1:36:00 Partition algorithm

1:37:00 Get started with the AI Kit

1:37:30 XGBoost hands-on lab

1:48:40 Results

2:03:00 Q&A
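The ensemble progression in the agenda (bagging, random forest, extra trees, gradient boosting) maps directly onto scikit-learn estimators. A minimal sketch, using a synthetic dataset as an assumed stand-in for the workshop's lab data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    RandomForestClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
)
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the workshop's lab data.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    # Bagging: trees fit on bootstrapped samples, predictions aggregated.
    "bagging": BaggingClassifier(n_estimators=100, random_state=0),
    # Random forest: bagging plus a random feature subset at each split.
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # Extra trees: still more randomness via randomized split thresholds.
    "extra trees": ExtraTreesClassifier(n_estimators=100, random_state=0),
    # Boosting: shallow trees fit sequentially to the previous ensemble's
    # errors (depth-1 stumps are a common base learner; scikit-learn
    # defaults to depth 3).
    "gradient boosting": GradientBoostingClassifier(
        n_estimators=100, random_state=0
    ),
}

for name, model in models.items():
    score = cross_val_score(model, X, y, cv=3).mean()
    print(f"{name:18s} accuracy: {score:.3f}")
```

The workshop's XGBoost lab follows the same fit/predict pattern; XGBoost adds the histogram-based split finding and memory prefetching optimizations discussed in the Intel® Optimization chapters.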



Intel® AI Analytics Toolkit

Accelerate end-to-end machine learning and data science pipelines with optimized deep learning frameworks and high-performing Python* libraries.
