Intel® oneAPI Data Analytics Library Developer Guide and Reference
K-Means initialization
The K-Means initialization algorithm receives n feature vectors as input and chooses k initial centroids. The K-Means algorithm then uses these centroids as its starting point when it partitions the input data into k clusters.
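To make the input/output contract concrete, here is a minimal standalone sketch of one possible initialization strategy: picking k distinct input rows uniformly at random. It is not the oneDAL implementation; the `matrix` alias and the `pick_random_centroids` helper are hypothetical names introduced only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <stdexcept>
#include <vector>

// One feature vector per row (illustrative stand-in for a data table).
using matrix = std::vector<std::vector<float>>;

// Hypothetical helper, not part of oneDAL: returns k distinct rows of `data`
// chosen uniformly at random, to be used as initial centroids.
matrix pick_random_centroids(const matrix& data, std::size_t k, std::uint64_t seed) {
    if (k == 0 || k > data.size()) {
        throw std::invalid_argument("k must be in the range [1, n]");
    }
    // Shuffle the row indices and keep the first k of them.
    std::vector<std::size_t> indices(data.size());
    std::iota(indices.begin(), indices.end(), std::size_t{0});
    std::mt19937_64 rng(seed);
    std::shuffle(indices.begin(), indices.end(), rng);

    matrix centroids;
    centroids.reserve(k);
    for (std::size_t i = 0; i < k; ++i) {
        centroids.push_back(data[indices[i]]);
    }
    return centroids;
}
```

Calling `pick_random_centroids(data, 2, 42)` on a four-row input returns two of those rows; which two depends on the seed and on the standard library's shuffle.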
Mathematical formulation
Refer to Developer Guide: K-Means Initialization.
Programming Interface
All types and functions in this section are declared in the oneapi::dal::kmeans_init namespace and are available via inclusion of the oneapi/dal/algo/kmeans_init.hpp header file.
Descriptor
template <typename Float = float, typename Method = method::by_default, typename Task = task::by_default> class descriptor
- Template Parameters
  - Float – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
  - Method – Tag-type that specifies an implementation of the K-Means initialization algorithm.
  - Task – Tag-type that specifies the type of the problem to solve. Can be task::init.
 
Constructors
descriptor(std::int64_t cluster_count = 2)
Creates a new instance of the class with the given cluster_count.
Properties
auto& seed
- Getter & Setter
  - template <typename M = Method, typename None = detail::v1::enable_if_not_default_dense<M>> auto& get_seed() const
  - template <typename M = Method, typename None = detail::v1::enable_if_not_default_dense<M>> auto& set_seed(std::int64_t value)
 
auto& local_trials_count
The number of attempts to find the best sample in terms of potential value. If the value is equal to -1, the number of trials is 2 + int(log(cluster_count)). Default value: -1.
- Getter & Setter
  - template <typename M = Method, typename None = detail::v1::enable_if_plus_plus_dense<M>> auto& get_local_trials_count() const
  - template <typename M = Method, typename None = detail::v1::enable_if_plus_plus_dense<M>> auto& set_local_trials_count(std::int64_t value = -1)
- Invariants
  - local_trials_count > 0 or local_trials_count = -1
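The default rule for the number of trials can be checked with a small worked computation. The sketch below is a standalone illustration, not a oneDAL function: `effective_local_trials` is a hypothetical helper, and it assumes that log in the formula denotes the natural logarithm.

```cpp
#include <cmath>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper, not part of oneDAL. Resolves the number of trials
// from a local_trials_count setting, assuming "log" is the natural logarithm.
std::int64_t effective_local_trials(std::int64_t local_trials_count,
                                    std::int64_t cluster_count) {
    if (cluster_count <= 0) {
        throw std::invalid_argument("cluster_count must be positive");
    }
    if (local_trials_count == -1) {
        // Default rule: 2 + int(log(cluster_count)), truncating toward zero.
        const double ln_k = std::log(static_cast<double>(cluster_count));
        return 2 + static_cast<std::int64_t>(ln_k);
    }
    if (local_trials_count <= 0) {
        throw std::invalid_argument("local_trials_count must be positive or -1");
    }
    return local_trials_count;
}
```

Under that assumption, the default resolves to 2 trials for cluster_count = 2, 4 trials for cluster_count = 10, and 6 trials for cluster_count = 100.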
 
std::int64_t cluster_count
The number of clusters k. Default value: 2.
- Getter & Setter
  - std::int64_t get_cluster_count() const
  - auto& set_cluster_count(std::int64_t value)
- Invariants
  - cluster_count > 0
 
Method tags
struct dense
Tag-type that denotes the dense computational method.
struct parallel_plus_dense
Tag-type that denotes the parallel_plus_dense computational method.
struct plus_plus_dense
Tag-type that denotes the plus_plus_dense computational method.
struct random_dense
Tag-type that denotes the random_dense computational method.
using by_default = dense
Alias tag-type for the dense computational method.
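Method tags are empty types whose only job is to select an implementation at compile time. The sketch below illustrates the general tag-dispatch technique with stand-in types in a hypothetical `sketch` namespace. It mirrors the tag names listed above but does not show how oneDAL dispatches internally.

```cpp
#include <string>

namespace sketch {

namespace method {
// Empty tag types mirroring the names in this section (stand-ins only).
struct dense {};
struct random_dense {};
struct plus_plus_dense {};
struct parallel_plus_dense {};
using by_default = dense;
} // namespace method

// Stand-in descriptor that carries the method tag as a template parameter.
template <typename Method = method::by_default>
class descriptor {};

// One overload per tag; overload resolution picks the match at compile time.
inline std::string describe(method::dense) { return "dense"; }
inline std::string describe(method::random_dense) { return "random_dense"; }
inline std::string describe(method::plus_plus_dense) { return "plus_plus_dense"; }
inline std::string describe(method::parallel_plus_dense) { return "parallel_plus_dense"; }

// Dispatches on the tag stored in the descriptor's type.
template <typename Method>
std::string method_name(const descriptor<Method>&) {
    return describe(Method{});
}

} // namespace sketch
```

Here `sketch::method_name(sketch::descriptor<>{})` resolves to the dense overload because `by_default` aliases `dense`, while `sketch::descriptor<sketch::method::plus_plus_dense>` selects the plus_plus_dense overload.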
Task tags
struct init
Tag-type that parameterizes entities used for obtaining the initial K-Means centroids.
using by_default = init
Alias tag-type for the initialization task.
Computing compute(...)
Input
template <typename Task = task::by_default> class compute_input
- Template Parameters
  - Task – Tag-type that specifies the type of the problem to solve. Can be task::init.
 
Constructors
compute_input(const table& data)
Creates a new instance of the class with the given data.
Properties
const table& data
An n × p table with the data to be clustered, where each row stores one feature vector and p is the number of features. Default value: table{}.
- Getter & Setter
  - const table& get_data() const
  - auto& set_data(const table& data)
 
Result
template <typename Task = task::by_default> class compute_result
- Template Parameters
  - Task – Tag-type that specifies the type of the problem to solve. Can be task::init.
 
Constructors
compute_result()
Creates a new instance of the class with the default property values.
Properties
const table& centroids
A k × p table with the initial centroids, where p is the number of features. Each row of the table stores one centroid. Default value: table{}.
- Getter & Setter
  - const table& get_centroids() const
  - auto& set_centroids(const table& value)
 
Operation
template <typename Descriptor> kmeans_init::compute_result compute(const Descriptor& desc, const kmeans_init::compute_input& input)
- Parameters
  - desc – K-Means initialization algorithm descriptor kmeans_init::descriptor
  - input – Input data for the computing operation
 
Usage Example
Computing
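// Assumes oneapi/dal/algo/kmeans_init.hpp is included, the oneapi::dal names
// are in scope, and print_table is a helper from the oneDAL example utilities.
table run_compute(const table& data) {
   // Configure the dense initialization method with 10 clusters.
   const auto kmeans_desc = kmeans_init::descriptor<float,
                                                   kmeans_init::method::dense>{}
      .set_cluster_count(10);
   // Choose the initial centroids from the input data.
   const auto result = compute(kmeans_desc, data);
   print_table("centroids", result.get_centroids());
   return result.get_centroids();
}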
  Examples
oneAPI DPC++
Batch Processing:
oneAPI C++
Batch Processing: