Developer Guide and Reference

  • 2021.4
  • 09/27/2021
  • Public Content

Cross-entropy Loss

Cross-entropy loss is an objective function minimized in the process of logistic regression training when a dependent variable takes more than two values.

Details

Given a set LaTex Math image. of n p-dimensional feature vectors and a vector of class labels LaTex Math image., where LaTex Math image. describes the class to which the feature vector LaTex Math image. belongs and T is the number of classes, the optimization solver optimizes the cross-entropy loss objective function with respect to the argument LaTex Math image., which is a matrix of size LaTex Math image.. The cross-entropy loss objective function LaTex Math image. has the following format: LaTex Math image., where
  • LaTex Math image., with LaTex Math image. and LaTex Math image., LaTex Math image., LaTex Math image.
  • LaTex Math image.
For a given set of indices LaTex Math image., LaTex Math image., LaTex Math image., the value and the gradient of the sum of functions in the argument X respectively have the format:
LaTex Math image.
LaTex Math image.
where
LaTex Math image.
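
As an illustration of evaluating the value and the gradient over a set of indices, the following Python sketch computes both for the smooth term. It assumes the standard multinomial (softmax) cross-entropy averaged over the batch, with an L2 penalty on the non-intercept coefficients. The argument layout (one row per class, intercept in column 0) and all function names are hypothetical and are not part of the library API.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(theta, x):
    """Linear score per class; theta[t][0] is the intercept (assumed layout)."""
    return [row[0] + sum(w * v for w, v in zip(row[1:], x)) for row in theta]


def batch_value_and_gradient(theta, data, labels, batch, penalty_l2=0.0):
    """Value and gradient of the smooth term over the terms listed in `batch`.

    Assumed form: mean negative log-likelihood over the batch plus
    penalty_l2 times the sum of squared non-intercept coefficients.
    """
    m = len(batch)
    n_classes, width = len(theta), len(theta[0])
    value = 0.0
    grad = [[0.0] * width for _ in range(n_classes)]
    for i in batch:
        probs = softmax(class_scores(theta, data[i]))
        value -= math.log(probs[labels[i]]) / m
        for t in range(n_classes):
            # Derivative of -log p_y with respect to the score of class t
            # is p_t - [t == y].
            err = (probs[t] - (1.0 if t == labels[i] else 0.0)) / m
            grad[t][0] += err
            for j, v in enumerate(data[i], start=1):
                grad[t][j] += err * v
    for t in range(n_classes):
        for j in range(1, width):  # the intercept column is not penalized
            value += penalty_l2 * theta[t][j] ** 2
            grad[t][j] += 2.0 * penalty_l2 * theta[t][j]
    return value, grad


# Usage: three classes, two features, batch made of terms 0 and 2.
zero_theta = [[0.0, 0.0, 0.0] for _ in range(3)]
sample_data = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
sample_labels = [0, 2, 1]
zero_value, zero_grad = batch_value_and_gradient(
    zero_theta, sample_data, sample_labels, batch=[0, 2]
)
# With an all-zero argument every class has probability 1/3,
# so zero_value equals log(3).
```

Passing a subset of indices plays the role of batchIndices; passing every index reproduces the full sum.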
The Hessian matrix is a symmetric matrix of size LaTex Math image., where LaTex Math image.
LaTex Math image.
LaTex Math image.
LaTex Math image., where LaTex Math image. is the learning rate
LaTex Math image.
For more details, see [Hastie2009].
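
The proximal projection for an L1 penalty is commonly implemented as soft-thresholding. The Python sketch below assumes that the non-smooth term is penaltyL1 times the sum of absolute coefficient values and that the threshold is the product of penaltyL1 and the learning rate. The function names and the choice to leave column 0 (an assumed intercept slot) unpenalized are illustrative and are not taken from the library.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |value| (soft-thresholding)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def proximal_projection(theta, penalty_l1, learning_rate, skip_intercept=True):
    """Apply soft-thresholding to every coefficient of theta.

    Assumption: column 0 of each row holds the intercept and is left
    unchanged when skip_intercept is True.
    """
    threshold = penalty_l1 * learning_rate
    result = []
    for row in theta:
        new_row = [soft_threshold(v, threshold) for v in row]
        if skip_intercept:
            new_row[0] = row[0]
        result.append(new_row)
    return result


# Usage: a single class row with intercept 0.3 and three coefficients.
projected = proximal_projection(
    [[0.3, 1.0, -0.05, -2.0]], penalty_l1=0.5, learning_rate=0.2
)
```

Coefficients whose magnitude does not exceed the threshold become exactly zero, which is how the L1 penalty produces sparse solutions.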

Computation

Algorithm Input
The cross-entropy loss algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

| Input ID | Input |
|---|---|
| argument | A numeric table of size LaTex Math image. with the input argument LaTex Math image. of the objective function. The sizes of the argument, gradient, and hessian numeric tables do not depend on interceptFlag. When interceptFlag is set to false, the computation of the LaTex Math image. value is skipped, but the sizes of the tables should remain the same. |
| data | A numeric table of size LaTex Math image. with the data LaTex Math image.. This parameter can be an object of any class derived from NumericTable. |
| dependentVariables | A numeric table of size LaTex Math image. with dependent variables LaTex Math image.. This parameter can be an object of any class derived from NumericTable, except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable. |

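
The interceptFlag behavior can be pictured with a short Python sketch: the argument keeps its intercept slot in both cases, and the slot is ignored when the flag is off. The layout (intercept in column 0 of each class row) and the function name are assumptions made for illustration only.

```python
def class_scores(theta, x, intercept_flag=True):
    """Linear score per class from a fixed-size argument table.

    Assumed layout: theta[t][0] is the intercept of class t and
    theta[t][1:] are the feature coefficients. The intercept slot exists
    whether or not it is used.
    """
    scores = []
    for row in theta:
        score = sum(w * v for w, v in zip(row[1:], x))
        if intercept_flag:
            score += row[0]  # the intercept contributes only when the flag is set
        scores.append(score)
    return scores


# Two classes, two features; each row still has three entries.
theta = [[5.0, 1.0, 0.0], [-5.0, 0.0, 1.0]]
x = [2.0, 3.0]
with_intercept = class_scores(theta, x, intercept_flag=True)      # [7.0, -2.0]
without_intercept = class_scores(theta, x, intercept_flag=False)  # [2.0, 3.0]
```
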
Algorithm Parameters
The cross-entropy loss algorithm has the following parameters. Some of them are required only for specific values of the computation method parameter method:

| Parameter | Default value | Description |
|---|---|---|
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | Performance-oriented computation method. |
| numberOfTerms | Not applicable | The number of terms in the objective function. |
| batchIndices | Not applicable | The numeric table of size LaTex Math image., where m is the batch size, with a batch of indices to be used to compute the function results. If no indices are provided, the implementation uses all the terms in the computation. This parameter can be an object of any class derived from NumericTable except PackedTriangularMatrix and PackedSymmetricMatrix. |
| resultsToCompute | gradient | The 64-bit integer flag that specifies which characteristics of the objective function to compute. Provide one of the values listed after this table to request a single characteristic, or use bitwise OR to request a combination of the characteristics. |
| interceptFlag | true | A flag that indicates a need to compute LaTex Math image.. |
| penaltyL1 | 0 | L1 regularization coefficient. |
| penaltyL2 | 0 | L2 regularization coefficient. |
| nClasses | Not applicable | The number of classes (different values of the dependent variable). |

Values of resultsToCompute:

| Value | Characteristic |
|---|---|
| value | Value of the objective function |
| nonSmoothTermValue | Value of the non-smooth term of the objective function |
| gradient | Gradient of the smooth term of the objective function |
| hessian | Hessian of the smooth term of the objective function |
| proximalProjection | Projection of the proximal operator for the non-smooth term of the objective function |
| lipschitzConstant | Lipschitz constant of the smooth term of the objective function |
| gradientOverCertainFeature | Certain component of the gradient vector |
| hessianOverCertainFeature | Certain component of the Hessian diagonal |
| proximalProjectionOfCertainFeature | Certain component of the proximal projection |

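
Combining characteristics with bitwise OR, as described for resultsToCompute, can be sketched in Python with an integer flag type. The bit values and the class and function names below are hypothetical placeholders chosen for illustration; they are not the constants defined by the library.

```python
from enum import IntFlag


class ResultToCompute(IntFlag):
    # Hypothetical bit positions for illustration; not the library's values.
    VALUE = 1 << 0
    NON_SMOOTH_TERM_VALUE = 1 << 1
    GRADIENT = 1 << 2
    HESSIAN = 1 << 3
    PROXIMAL_PROJECTION = 1 << 4
    LIPSCHITZ_CONSTANT = 1 << 5


def requested(flags):
    """Return the names of the characteristics selected in `flags`."""
    return [member.name for member in ResultToCompute if member in flags]


# Request the value and the gradient in a single flag.
flags = ResultToCompute.VALUE | ResultToCompute.GRADIENT
selected = requested(flags)
```

Each characteristic occupies its own bit, so any combination fits in one integer and can be tested independently.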
Algorithm Output
For the output of the cross-entropy loss algorithm, see Output for objective functions.
