Regression Decision Tree
A regression decision tree is a kind of decision tree, as described in
Classification and Regression > Decision Tree.
Details
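Given:
- n feature vectors $x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})$ of size p
- The vector of responses $y = (y_1, \ldots, y_n)$, where $y_i \in \mathbb{R}$ describes the dependent variable for independent variables $x_i$.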
The problem is to build a regression decision tree.
Split Criterion
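The library provides the decision tree regression algorithm based
on the mean-squared error (MSE) [Breiman84]:
$$\Delta I_{MSE}(D, \tau) = I_{MSE}(D) - \sum_{v \in O(\tau)} \frac{|D_v|}{|D|} I_{MSE}(D_v)$$
where $O(\tau)$ is the set of all possible outcomes of test $\tau$, and $D_v$ is the subset of $D$ for which the outcome of $\tau$ is $v$, that is, $D_v = \{d \in D \mid \tau(d) = v\}$.
The test used in the node is selected as $\operatorname{argmax}_{\tau} \Delta I_{MSE}(D, \tau)$. For a binary decision tree with “true” and “false” branches,
$$\Delta I_{MSE}(D, \tau) = I_{MSE}(D) - \frac{|D_{true}|}{|D|} I_{MSE}(D_{true}) - \frac{|D_{false}|}{|D|} I_{MSE}(D_{false})$$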
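The split selection can be illustrated with a minimal Python sketch. It is not the library's implementation. It assumes that the MSE impurity of a set of observations is the mean squared deviation of its responses from their mean, and that a binary test has the form "feature value <= threshold". The function names `mse_impurity`, `mse_decrease`, and `best_threshold` are hypothetical.

```python
from statistics import fmean


def mse_impurity(responses):
    """Assumed impurity: mean squared deviation of the responses from their mean."""
    mean = fmean(responses)
    return fmean([(y - mean) ** 2 for y in responses])


def mse_decrease(responses, outcomes):
    """Impurity of the node minus the size-weighted impurity of each outcome subset."""
    subsets = {}
    for y, v in zip(responses, outcomes):
        subsets.setdefault(v, []).append(y)
    weighted = sum(len(s) / len(responses) * mse_impurity(s) for s in subsets.values())
    return mse_impurity(responses) - weighted


def best_threshold(feature, responses):
    """Pick the threshold t that maximizes the decrease for the test x <= t.

    Assumes the feature has at least two distinct values.
    """
    # Skip the largest value: it would leave the "false" branch empty.
    candidates = sorted(set(feature))[:-1]
    return max(candidates, key=lambda t: mse_decrease(responses, [x <= t for x in feature]))


feature = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
responses = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
threshold = best_threshold(feature, responses)
decrease = mse_decrease(responses, [x <= threshold for x in feature])
print(threshold, decrease)
```

On this toy data the responses form two constant groups, so the best test separates them exactly and the impurity of both branches drops to zero.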
Training Stage
The regression decision tree follows the algorithmic framework of
decision tree training described in Decision Tree.
Prediction Stage
The regression decision tree follows the algorithmic framework of
decision tree prediction described in Decision Tree.
Given the regression decision tree and vectors $x_1, \ldots, x_r$,
the problem is to calculate the responses for those vectors.
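As a rough illustration of the prediction stage, the sketch below walks a small hand-built binary tree in Python. The `Node` class, the "feature value <= threshold" test, and the constant response stored in each leaf are assumptions made for this example and do not describe the library's internal model format.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # Hypothetical layout: a leaf sets only `response`;
    # a split node sets the test and both children.
    response: Optional[float] = None
    feature_index: int = 0
    threshold: float = 0.0
    true_branch: Optional["Node"] = None
    false_branch: Optional["Node"] = None


def predict_one(node: Node, x: Sequence[float]) -> float:
    """Apply the test in each split node until a leaf is reached."""
    while node.response is None:
        go_true = x[node.feature_index] <= node.threshold
        node = node.true_branch if go_true else node.false_branch
    return node.response


def predict(root: Node, vectors):
    """Compute one response per input vector."""
    return [predict_one(root, x) for x in vectors]


tree = Node(
    feature_index=0,
    threshold=3.0,
    true_branch=Node(response=1.0),
    false_branch=Node(
        feature_index=1,
        threshold=0.5,
        true_branch=Node(response=4.0),
        false_branch=Node(response=6.0),
    ),
)
predictions = predict(tree, [[2.0, 0.0], [10.0, 0.2], [10.0, 0.9]])
print(predictions)
```

Each prediction costs at most one comparison per tree level, which is why limiting the tree depth also bounds prediction time.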
Batch Processing
Decision tree regression follows the general workflow described
in Regression Usage Model.
Training
At the training stage, decision tree regression has the following
parameters:
Parameter | Default Value | Description |
---|---|---|
algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
method | defaultDense | The computation method used by the decision tree regression. The only training method supported so far is the default dense method. |
pruning | reducedErrorPruning | Method to perform post-pruning. Available options for the pruning parameter: reducedErrorPruning (reduced error pruning) and none (no pruning). |
maxTreeDepth | | Maximum tree depth. Zero value means unlimited depth. Can be any non-negative number. |
minObservationsInLeafNodes | | Minimum number of observations in the leaf node. Can be any positive number. |
pruningFraction | | Fraction of observations from the training dataset to be used as observations for post-pruning via random sampling. The remaining observations (a fraction of 1 - pruningFraction) are used to build the decision tree. |
engine | SharedPtr<engines::mt19937::Batch<> >() | Pointer to the random number engine to be used for random sampling for reduced error post-pruning. |
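The role of pruningFraction and engine can be sketched in Python as a random partition of the training observations. This is an illustration under stated assumptions, not the library's code: the function `split_for_pruning`, the seed value, the rounding rule, and the requirement that the fraction lie strictly between 0 and 1 are all hypothetical. Python's `random.Random` is also a Mersenne Twister generator, but it will not reproduce the library engine's stream.

```python
import random


def split_for_pruning(n_observations, pruning_fraction, seed=777):
    """Return (grow_indices, prune_indices) chosen by random sampling."""
    # Assumption for this sketch: the fraction must leave both parts non-trivial.
    if not 0.0 < pruning_fraction < 1.0:
        raise ValueError("pruning_fraction must lie strictly between 0 and 1")
    rng = random.Random(seed)  # seeded so the split is reproducible
    n_prune = round(n_observations * pruning_fraction)
    prune = set(rng.sample(range(n_observations), n_prune))
    # Everything not held out for pruning is used to grow the tree.
    grow = [i for i in range(n_observations) if i not in prune]
    return grow, sorted(prune)


grow_indices, prune_indices = split_for_pruning(n_observations=10, pruning_fraction=0.2)
print(len(grow_indices), len(prune_indices))
```

Holding out observations this way lets reduced error pruning judge each subtree on data the tree was not grown from.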
Prediction
At the prediction stage, decision tree regression has the following
parameters:
Parameter | Default Value | Description |
---|---|---|
algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
method | defaultDense | The computation method used by the decision tree regression. The only prediction method supported so far is the default dense method. |
Examples
C++ (CPU)
Batch Processing:
Java*
There is no support for Java on GPU.
Batch Processing:
Python*
Batch Processing: