Developer Guide and Reference

  • 2021.4
  • 09/27/2021
  • Public Content

Computation Modes

The library algorithms support the following computation modes:
  • Batch processing
  • Online processing
  • Distributed processing
You can select the computation mode during initialization of the algorithm.
For the computation parameters of a specific algorithm in each computation mode, the possible input types, and the output results, refer to the description of the relevant algorithm.

Batch processing

All oneDAL algorithms support at least the batch processing computation mode. In the batch processing mode, only the compute() method of a particular algorithm class is used.
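
To make the batch pattern concrete, here is a minimal standalone sketch in Python. It does not call oneDAL; the class BatchColumnMeans and its set_input() method are hypothetical names chosen for illustration, and column means stand in for a real algorithm. The point is only that the whole data set is supplied up front and a single compute() call yields the final result.

```python
from typing import List, Sequence


class BatchColumnMeans:
    """Hypothetical batch-mode algorithm (not a oneDAL class): column means."""

    def __init__(self) -> None:
        self._rows: List[Sequence[float]] = []

    def set_input(self, rows: Sequence[Sequence[float]]) -> None:
        # In batch mode the entire data set is available before computing.
        self._rows = list(rows)

    def compute(self) -> List[float]:
        # A single call produces the final result; no finalize step is needed.
        if not self._rows:
            raise ValueError("no input data")
        n = len(self._rows)     # number of observations
        p = len(self._rows[0])  # number of features
        return [sum(row[j] for row in self._rows) / n for j in range(p)]


algorithm = BatchColumnMeans()
algorithm.set_input([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
batch_means = algorithm.compute()
print(batch_means)
```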

Online processing

Some oneDAL algorithms enable processing of data sets in blocks. In the online processing mode, the compute() and finalizeCompute() methods of a particular algorithm class are used. This computation mode assumes that the data arrives in blocks. Call the compute() method each time a new block of input becomes available. After compute() has processed the last block of data, call the finalizeCompute() method to produce final results. If the input data arrives asynchronously, you can use the getStatus() method of a given data source to check whether a new block of data is available for loading.
The following diagram illustrates the computation schema for online processing:
While different data blocks may have different numbers of observations, they must have the same number of features p.
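
The online call sequence described above can be sketched as follows. This is a standalone Python illustration, not oneDAL code: OnlineColumnMeans and finalize_compute() are hypothetical stand-ins for an algorithm class and its finalizeCompute() method, and the running sums play the role of the partial result. Blocks of different sizes are accepted, while a block with a different number of features is rejected.

```python
from typing import List, Optional, Sequence


class OnlineColumnMeans:
    """Hypothetical online-mode algorithm (not a oneDAL class): column means."""

    def __init__(self) -> None:
        self._n = 0                               # observations seen so far
        self._sums: Optional[List[float]] = None  # per-feature running sums

    def compute(self, block: Sequence[Sequence[float]]) -> None:
        """Update the partial result with one block of observations."""
        for row in block:
            if self._sums is None:
                self._sums = [0.0] * len(row)
            if len(row) != len(self._sums):
                # Blocks may differ in observation count, not in feature count.
                raise ValueError("all blocks must have the same number of features")
            for j, value in enumerate(row):
                self._sums[j] += value
            self._n += 1

    def finalize_compute(self) -> List[float]:
        """Convert the accumulated partial result into the final result."""
        if self._sums is None or self._n == 0:
            raise ValueError("compute() has not processed any data")
        return [s / self._n for s in self._sums]


blocks = [
    [[1.0, 10.0], [2.0, 20.0]],               # 2 observations
    [[3.0, 30.0]],                            # 1 observation
    [[4.0, 40.0], [5.0, 50.0], [6.0, 60.0]],  # 3 observations
]

online = OnlineColumnMeans()
for block in blocks:
    online.compute(block)  # called once per arriving block
online_means = online.finalize_compute()  # called once, after the last block
print(online_means)
```

Keeping only the observation count and the per-feature sums means that memory use does not grow with the number of blocks, which is the usual motivation for this mode.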

Distributed processing

Some oneDAL algorithms enable processing of data sets distributed across several devices. In the distributed processing mode, the compute() and finalizeCompute() methods of a particular algorithm class are used. This computation mode assumes that the data set is split into nblocks blocks across computation nodes.
Computation is done in several steps. Define the computation step for an algorithm by providing the computeStep value to the constructor during initialization of the algorithm. Use the compute() method on each computation node to compute partial results. Use the input.add() method on the master node to add pointers to the partial results processed on each computation node. When the last partial result arrives, call the compute() method followed by finalizeCompute() to produce final results. If the input data arrives asynchronously, you can use the getStatus() method of a given data source to check whether a new block of data is available for loading.
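
The steps above can be sketched as a standalone Python illustration. It does not use oneDAL and runs in a single process: the names ComputeStep, DistributedColumnMeans, set_input(), and add_input() are hypothetical, with add_input() standing in for input.add() and the constructor argument standing in for the computeStep value. Real deployments would move the partial results between nodes with a communication layer of their choice, which is outside this sketch.

```python
from enum import Enum
from typing import List, Optional, Sequence, Tuple

# A partial result: (number of observations, per-feature sums).
PartialResult = Tuple[int, List[float]]


class ComputeStep(Enum):
    LOCAL = "local"    # hypothetical: runs on each computation node
    MASTER = "master"  # hypothetical: runs on the node that merges results


class DistributedColumnMeans:
    """Hypothetical distributed-mode algorithm (not a oneDAL class)."""

    def __init__(self, step: ComputeStep) -> None:
        self.step = step
        self._block: Sequence[Sequence[float]] = []
        self._partials: List[PartialResult] = []
        self._merged: Optional[PartialResult] = None

    def set_input(self, block: Sequence[Sequence[float]]) -> None:
        self._block = block            # local step: this node's data block

    def add_input(self, partial: PartialResult) -> None:
        self._partials.append(partial)  # master step: one partial per node

    def compute(self) -> PartialResult:
        if self.step is ComputeStep.LOCAL:
            if not self._block:
                raise ValueError("local step has no data block")
            p = len(self._block[0])
            sums = [sum(row[j] for row in self._block) for j in range(p)]
            return len(self._block), sums
        if not self._partials:
            raise ValueError("master step has no partial results")
        p = len(self._partials[0][1])
        n = sum(count for count, _ in self._partials)
        sums = [sum(part[j] for _, part in self._partials) for j in range(p)]
        self._merged = (n, sums)
        return self._merged

    def finalize_compute(self) -> List[float]:
        if self._merged is None:
            raise ValueError("call compute() on the master step first")
        n, sums = self._merged
        return [s / n for s in sums]


node_blocks = [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0]],
    [[4.0, 40.0], [5.0, 50.0], [6.0, 60.0]],
]

master = DistributedColumnMeans(ComputeStep.MASTER)
for block in node_blocks:
    local = DistributedColumnMeans(ComputeStep.LOCAL)  # one per node
    local.set_input(block)
    master.add_input(local.compute())  # the partial result travels to the master

master.compute()
distributed_means = master.finalize_compute()
print(distributed_means)
```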
The computation schema is algorithm-specific. The following diagram illustrates a typical computation schema for distributed processing:
For the algorithm-specific computation schema, refer to the Distributed Processing section in the description of the relevant algorithm.
Distributed algorithms in oneDAL are abstracted from the underlying cross-device communication technology, which enables use of the library in a variety of multi-device computing and data transfer scenarios. These include, but are not limited to, MPI*-based cluster environments, Hadoop*- or Spark*-based cluster environments, and low-level data exchange protocols.

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.