C++ API Reference for Intel® Data Analytics Acceleration Library 2020 Update 1

daal::algorithms::kmeans::init Namespace Reference

Contains classes for computing initial centroids for the K-Means algorithm.

Namespaces

 interface1
 Contains version 1.0 of the Intel(R) Data Analytics Acceleration Library (Intel(R) DAAL) interface.
 

Enumerations

enum  Method {
  deterministicDense = 0, defaultDense = 0, randomDense = 1, plusPlusDense = 2,
  parallelPlusDense = 3, deterministicCSR = 4, randomCSR = 5, plusPlusCSR = 6,
  parallelPlusCSR = 7
}
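
The values listed for Method imply that defaultDense and deterministicDense are the same enumerator value. The minimal sketch below illustrates one consequence when dispatching on the method. It uses a standalone copy of the enumeration in a hypothetical `sketch` namespace so that it compiles without the library headers, and `expectsCsr` is a hypothetical helper, not part of the library.

```cpp
#include <cassert>

// Hypothetical standalone mirror of the Method enumeration, copied from the
// listing above so that the sketch compiles without the library headers.
namespace sketch {

enum Method {
    deterministicDense = 0, defaultDense = 0, randomDense = 1, plusPlusDense = 2,
    parallelPlusDense = 3, deterministicCSR = 4, randomCSR = 5, plusPlusCSR = 6,
    parallelPlusCSR = 7
};

// Hypothetical helper: true when the method expects data in a CSR numeric table.
// defaultDense is deliberately not a case label: it has the same value as
// deterministicDense, and duplicate case values do not compile.
inline bool expectsCsr(Method m) {
    switch (m) {
        case deterministicCSR:
        case randomCSR:
        case plusPlusCSR:
        case parallelPlusCSR:
            return true;
        case deterministicDense:
        case randomDense:
        case plusPlusDense:
        case parallelPlusDense:
            return false;
    }
    assert(false && "unknown Method value");
    return false;
}

} // namespace sketch
```

With this helper, `sketch::expectsCsr(sketch::defaultDense)` takes the deterministicDense branch because the two enumerators compare equal.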
 
enum  InputId { data }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm.
 
enum  DistributedStep2MasterInputId { partialResults }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm in the distributed processing mode.
 
enum  DistributedLocalPlusPlusInputDataId { internalInput = lastDistributedStep2MasterInputId + 1 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm; used only with the plusPlus and parallelPlus methods on a local node.
 
enum  DistributedStep2LocalPlusPlusInputId { inputOfStep2 = lastDistributedLocalPlusPlusInputDataId + 1 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm; used only with the plusPlus and parallelPlus methods on the 2nd step on a local node.
 
enum  DistributedStep3MasterPlusPlusInputId { inputOfStep3FromStep2 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm; used only with the plusPlus and parallelPlus methods on the 3rd step on a master node.
 
enum  DistributedStep4LocalPlusPlusInputId { inputOfStep4FromStep3 = lastDistributedLocalPlusPlusInputDataId + 1 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm; used only with the plusPlus and parallelPlus methods on a local node.
 
enum  DistributedStep5MasterPlusPlusInputId { inputCentroids, inputOfStep5FromStep2 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm; used only with the parallelPlus method on a master node.
 
enum  DistributedStep5MasterPlusPlusInputDataId { inputOfStep5FromStep3 = lastDistributedStep5MasterPlusPlusInputId + 1 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm; used only with the parallelPlus method on the 5th step on a master node.
 
enum  PartialResultId { partialCentroids, partialClusters = partialCentroids, partialClustersNumber }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode.
 
enum  DistributedStep2LocalPlusPlusPartialResultId { outputOfStep2ForStep3, outputOfStep2ForStep5 }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode; used only with the plusPlus and parallelPlus methods on the 2nd step on a local node.
 
enum  DistributedStep2LocalPlusPlusPartialResultDataId { internalResult = lastDistributedStep2LocalPlusPlusPartialResultId + 1 }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode; used only with the plusPlus and parallelPlus methods on the 2nd step on a local node.
 
enum  DistributedStep3MasterPlusPlusPartialResultId { outputOfStep3ForStep4 }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode; used only with the plusPlus and parallelPlus methods on the 3rd step on a master node.
 
enum  DistributedStep3MasterPlusPlusPartialResultDataId { rngState = lastDistributedStep3MasterPlusPlusPartialResultId + 1, outputOfStep3ForStep5 = rngState }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode; used only with the parallelPlus method on the 3rd step on a master node.
 
enum  DistributedStep4LocalPlusPlusPartialResultId { outputOfStep4 }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode; used only with the plusPlus and parallelPlus methods on the 4th step on a local node.
 
enum  DistributedStep5MasterPlusPlusPartialResultId { candidates, weights }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode; used only with the parallelPlus method on the 5th step on a master node.
 
enum  ResultId { centroids }
 Available identifiers of the results of computing initial centroids for the K-Means algorithm.
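
Several of the enumerations above start at `last... + 1` of another enumeration. One plausible reading is that each identifier group continues numbering where a previous group ended. The sketch below mirrors a few of them in a hypothetical `sketch` namespace to show the resulting values. Only the `= last... + 1` relations come from the listing; the definitions of the `last*` sentinels are assumptions made for this sketch and may differ from the library headers.

```cpp
#include <cassert>

// Hypothetical standalone mirror of part of the identifier layout.
namespace sketch {

enum DistributedStep2MasterInputId {
    partialResults,
    lastDistributedStep2MasterInputId = partialResults // assumed sentinel
};

enum DistributedLocalPlusPlusInputDataId {
    internalInput = lastDistributedStep2MasterInputId + 1,
    lastDistributedLocalPlusPlusInputDataId = internalInput // assumed sentinel
};

// The step 2 and step 4 identifiers both continue after the shared
// local-node identifier, so under these assumptions they get the same value.
enum DistributedStep2LocalPlusPlusInputId {
    inputOfStep2 = lastDistributedLocalPlusPlusInputDataId + 1
};

enum DistributedStep4LocalPlusPlusInputId {
    inputOfStep4FromStep3 = lastDistributedLocalPlusPlusInputDataId + 1
};

} // namespace sketch
```

Under these assumptions `internalInput` is 1, and `inputOfStep2` and `inputOfStep4FromStep3` are both 2, which is harmless if the two identifiers are only ever used with the inputs of different steps.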
 

Enumeration Type Documentation

enum DistributedLocalPlusPlusInputDataId

Enumerator
internalInput 

DataCollection with internal algorithm data calculated by previous steps on this node

enum DistributedStep2LocalPlusPlusInputId

Enumerator
inputOfStep2 

Numeric table with the new centroids calculated by the previous steps of the initialization algorithm

enum DistributedStep2LocalPlusPlusPartialResultDataId

Enumerator
internalResult 

DataCollection with internal algorithm data required as an input for the future steps on the node

enum DistributedStep2LocalPlusPlusPartialResultId

Enumerator
outputOfStep2ForStep3 

Numeric table containing output from step 2 on the local node used by step 3 on a master node

outputOfStep2ForStep5 

Numeric table containing output from step 2 on the local node used by step 5 on a master node

enum DistributedStep2MasterInputId

Enumerator
partialResults 

Collection of partial results computed on local nodes

enum DistributedStep3MasterPlusPlusInputId

Enumerator
inputOfStep3FromStep2 

Numeric table with the data calculated on step 2 on local nodes

enum DistributedStep3MasterPlusPlusPartialResultDataId

Enumerator
rngState 

Service data generated as the output of step3Master to be used in step5Master

outputOfStep3ForStep5 

Service data generated as the output of step3Master to be used in step5Master

enum DistributedStep3MasterPlusPlusPartialResultId

Enumerator
outputOfStep3ForStep4 

KeyValueDataCollection with the input for local nodes on step 4

enum DistributedStep4LocalPlusPlusInputId

Enumerator
inputOfStep4FromStep3 

Numeric table with the data calculated on step 3 on the master node

enum DistributedStep4LocalPlusPlusPartialResultId

Enumerator
outputOfStep4 

NumericTable with the new centroids calculated on step 4 on the local node

enum DistributedStep5MasterPlusPlusInputDataId

Enumerator
inputOfStep5FromStep3 

Service data generated as the output of step3Master

enum DistributedStep5MasterPlusPlusInputId

Enumerator
inputCentroids 

DataCollection of NumericTables with the new centroids

inputOfStep5FromStep2 

DataCollection of NumericTables with the rating of the new centroids

enum DistributedStep5MasterPlusPlusPartialResultId

Enumerator
candidates 

NumericTable with the new centroids calculated on the previous steps

weights 

NumericTable with the weights of the new centroids calculated on the previous steps

enum InputId

Enumerator
data 

Input data table

enum Method

Available methods for computing initial centroids for the K-Means algorithm

Enumerator
deterministicDense 

Default: uses the first nClusters points as initial centroids

defaultDense 

Synonym of deterministicDense

randomDense 

Uses nClusters randomly selected points as initial centroids

plusPlusDense 

Kmeans++ algorithm by Arthur and Vassilvitskii (2007), http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf [1]: the first center is selected at random, and each subsequent center is selected with a probability proportional to its contribution to the overall error

parallelPlusDense 

Kmeans|| algorithm: scalable Kmeans++ by Bahmani et al. (2012) http://vldb.org/pvldb/vol5/p622_bahmanbahmani_vldb2012.pdf [2]

deterministicCSR 

Uses the first nClusters points as initial centroids for data in a CSR numeric table

randomCSR 

Uses nClusters randomly selected points as initial centroids for data in a CSR numeric table

plusPlusCSR 

Kmeans++ algorithm by Arthur and Vassilvitskii (2007), http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf [1], for data in a CSR numeric table: the first center is selected at random, and each subsequent center is selected with a probability proportional to its contribution to the overall error

parallelPlusCSR 

Kmeans|| algorithm: scalable Kmeans++ by Bahmani et al. (2012) http://vldb.org/pvldb/vol5/p622_bahmanbahmani_vldb2012.pdf [2] for the data in a CSR numeric table
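
The deterministic and Kmeans++ selection rules described above can be sketched in standalone C++ as follows. This is an illustration of the selection rules only, not the library's implementation: `Point`, `squaredDistance`, `seedDeterministic` and `seedPlusPlus` are hypothetical names, the data is a plain in-memory vector rather than a numeric table, and "contribution to the overall error" is assumed to mean the squared Euclidean distance to the nearest center already chosen, as in the cited paper [1].

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One observation; all points are assumed to have the same dimension.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squaredDistance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t j = 0; j < a.size(); ++j) {
        const double d = a[j] - b[j];
        sum += d * d;
    }
    return sum;
}

// Deterministic rule: the indices of the first nClusters points.
std::vector<std::size_t> seedDeterministic(const std::vector<Point>& points,
                                           std::size_t nClusters) {
    assert(nClusters <= points.size());
    std::vector<std::size_t> chosen(nClusters);
    for (std::size_t i = 0; i < nClusters; ++i) chosen[i] = i;
    return chosen;
}

// Kmeans++ rule. Assumes the data contains at least nClusters distinct points,
// so that the sampling weights never all become zero.
std::vector<std::size_t> seedPlusPlus(const std::vector<Point>& points,
                                      std::size_t nClusters, std::mt19937& rng) {
    assert(nClusters >= 1 && nClusters <= points.size());
    std::vector<std::size_t> chosen;
    chosen.reserve(nClusters);

    // The first center is selected uniformly at random.
    std::uniform_int_distribution<std::size_t> uniform(0, points.size() - 1);
    chosen.push_back(uniform(rng));

    // minDist[i]: squared distance from point i to its nearest chosen center,
    // used here as the point's contribution to the overall error.
    std::vector<double> minDist(points.size());
    for (std::size_t i = 0; i < points.size(); ++i)
        minDist[i] = squaredDistance(points[i], points[chosen[0]]);

    while (chosen.size() < nClusters) {
        // Each subsequent center is drawn with probability proportional to minDist.
        std::discrete_distribution<std::size_t> byError(minDist.begin(), minDist.end());
        const std::size_t next = byError(rng);
        chosen.push_back(next);
        for (std::size_t i = 0; i < points.size(); ++i)
            minDist[i] = std::min(minDist[i], squaredDistance(points[i], points[next]));
    }
    return chosen;
}
```

Both helpers return indices into `points` rather than copies of the points. A point that has already been chosen has zero weight in the next draw, so `seedPlusPlus` does not select the same index twice.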

enum PartialResultId

Enumerator
partialCentroids 

Table with the sum of observations assigned to centroids

partialClusters 

Table with the sum of observations assigned to centroids

Deprecated: This item will be removed in a future release.

partialClustersNumber

Table with the number of observations assigned to centroids

Deprecated: This item will be removed in a future release.

enum ResultId

Enumerator
centroids 

Table for cluster centroids
