Intel® oneAPI Data Analytics Library Developer Guide and Reference
Quality Metrics for Binary Classification Algorithms
For two classes $C_1$ and $C_2$, given a vector $X = (x_1, \ldots, x_n)$ of class labels computed at the prediction stage of the classification algorithm and a vector $Y = (y_1, \ldots, y_n)$ of expected class labels, the problem is to evaluate the classifier by computing the confusion matrix and connected quality metrics: precision, recall, and so on.
QualityMetricsId for binary classification is confusionMatrix.
Details
Further definitions use the following notations:
| Notation | Name | Definition |
|---|---|---|
| tp | true positive | the number of correctly recognized observations for class $C_1$ |
| tn | true negative | the number of correctly recognized observations that do not belong to the class $C_1$ |
| fp | false positive | the number of observations that were incorrectly assigned to the class $C_1$ |
| fn | false negative | the number of observations that were not recognized as belonging to the class $C_1$ |
The library uses the following quality metrics for binary classifiers:
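| Quality Metric | Definition |
|---|---|
| Accuracy | $\frac{\mathrm{tp} + \mathrm{tn}}{\mathrm{tp} + \mathrm{fn} + \mathrm{fp} + \mathrm{tn}}$ |
| Precision | $\frac{\mathrm{tp}}{\mathrm{tp} + \mathrm{fp}}$ |
| Recall | $\frac{\mathrm{tp}}{\mathrm{tp} + \mathrm{fn}}$ |
| F-score | $\frac{(\beta^2 + 1)\,\mathrm{tp}}{(\beta^2 + 1)\,\mathrm{tp} + \beta^2\,\mathrm{fn} + \mathrm{fp}}$ |
| Specificity | $\frac{\mathrm{tn}}{\mathrm{fp} + \mathrm{tn}}$ |
| Area under curve (AUC) | $\frac{1}{2}\left(\frac{\mathrm{tp}}{\mathrm{tp} + \mathrm{fn}} + \frac{\mathrm{tn}}{\mathrm{fp} + \mathrm{tn}}\right)$ |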
For more details of these metrics, including the evaluation focus, refer to [Sokolova09].
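As a minimal sketch, the six metrics named above can be evaluated from the four counts tp, fn, fp, and tn. The formulas in the comments are the standard definitions given in [Sokolova09], with the F-score in its weighted form controlled by beta. The names `BinaryMetrics` and `binary_metrics` are hypothetical and are not part of the library's API. The sketch does not guard against zero denominators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    """Hypothetical container for the six binary quality metrics."""
    accuracy: float
    precision: float
    recall: float
    fscore: float
    specificity: float
    auc: float


def binary_metrics(tp: int, fn: int, fp: int, tn: int, beta: float = 1.0) -> BinaryMetrics:
    """Evaluate the metrics from the confusion-matrix counts.

    Assumes every denominator is non-zero; no special handling is done.
    """
    b2 = beta * beta
    recall = tp / (tp + fn)            # tp / (tp + fn)
    specificity = tn / (fp + tn)       # tn / (fp + tn)
    return BinaryMetrics(
        accuracy=(tp + tn) / (tp + fn + fp + tn),
        precision=tp / (tp + fp),
        recall=recall,
        # ((beta^2 + 1) tp) / ((beta^2 + 1) tp + beta^2 fn + fp)
        fscore=(b2 + 1) * tp / ((b2 + 1) * tp + b2 * fn + fp),
        specificity=specificity,
        # Balanced form: mean of recall and specificity
        auc=0.5 * (recall + specificity),
    )


# Illustrative counts only (not taken from any real data set)
m = binary_metrics(tp=40, fn=10, fp=5, tn=45)
m_beta2 = binary_metrics(tp=40, fn=10, fp=5, tn=45, beta=2.0)
print(m)
print(m_beta2.fscore)
```

With beta greater than 1, the F-score weights false negatives more heavily than false positives, so it moves toward recall. With beta less than 1, it moves toward precision.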
The confusion matrix is defined as follows:
|  | Classified as Class $C_1$ | Classified as Class $C_2$ |
|---|---|---|
| Actual Class $C_1$ | tp | fn |
| Actual Class $C_2$ | fp | tn |
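The layout above (rows for the actual class, columns for the assigned class) can be sketched in plain Python as follows. This is an illustration only, not the library's implementation. The function name `confusion_matrix` and the `positive` argument, which selects the label value treated as the first class, are assumptions introduced here. The label encoding the library expects is not specified in this section.

```python
from typing import List, Sequence


def confusion_matrix(predicted: Sequence[int],
                     expected: Sequence[int],
                     positive: int = 1) -> List[List[int]]:
    """Return [[tp, fn], [fp, tn]] for two equally long label vectors.

    `positive` is the label value treated as the first class (an assumption
    of this sketch); every other value counts as the second class.
    """
    if len(predicted) != len(expected):
        raise ValueError("label vectors must have the same length")

    tp = fn = fp = tn = 0
    for x, y in zip(predicted, expected):
        if y == positive:
            if x == positive:
                tp += 1   # actual first class, classified as first class
            else:
                fn += 1   # actual first class, classified as second class
        else:
            if x == positive:
                fp += 1   # actual second class, classified as first class
            else:
                tn += 1   # actual second class, classified as second class
    return [[tp, fn], [fp, tn]]


# Made-up labels for illustration
predicted = [1, 0, 1, 1, 0, 0, 1, 0]
expected = [1, 0, 0, 1, 1, 0, 1, 0]
cm = confusion_matrix(predicted, expected)
print(cm)
```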
Batch Processing
Algorithm Input
The quality metric algorithm for binary classifiers accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
| Input ID | Input |
|---|---|
| predictedLabels | Pointer to the $n \times 1$ numeric table, where $n$ is the number of observations, that contains labels computed at the prediction stage of the classification algorithm. This input can be an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable. |
| groundTruthLabels | Pointer to the $n \times 1$ numeric table that contains expected labels. This input can be an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable. |

Algorithm Parameters
The quality metric algorithm has the following parameters:

| Parameter | Default Value | Description |
|---|---|---|
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | Performance-oriented computation method, the only method supported by the algorithm. |
| beta | 1 | The $\beta$ parameter of the F-score quality metric provided by the library. |

Algorithm Output
The quality metric algorithm calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

| Result ID | Result |
|---|---|
| confusionMatrix | Pointer to the $2 \times 2$ numeric table with the confusion matrix. NOTE: By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable. |
| binaryMetrics | Pointer to the $1 \times 6$ numeric table that contains quality metrics, which you can access by an appropriate Binary Metrics ID. NOTE: By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable. |