Distributed Processing
The distributed processing mode assumes that the data set R is split in nblocks blocks across computation nodes.
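As a rough illustration of what splitting into nblocks blocks can look like, the Python sketch below computes contiguous block boundaries for a given number of rows or columns. The helper name `block_offsets` and the near-even split are assumptions made for this example; the library does not necessarily distribute the data this way.

```python
def block_offsets(total, nblocks):
    """Return nblocks + 1 boundaries that split range(total) into contiguous blocks.

    Hypothetical helper: a near-even split is assumed for illustration only.
    """
    base, extra = divmod(total, nblocks)
    offsets = [0]
    for b in range(nblocks):
        # The first `extra` blocks receive one additional element.
        offsets.append(offsets[-1] + base + (1 if b < extra else 0))
    return offsets


# Block b covers the half-open index range [offsets[b], offsets[b + 1]).
offsets = block_offsets(10, 3)
blocks = [range(offsets[b], offsets[b + 1]) for b in range(3)]
```

Storing nblocks + 1 boundaries rather than nblocks sizes makes both the start and the length of every block available from a single list.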
Parameters
In the distributed processing mode, initialization of item factors for the implicit ALS algorithm has the following parameters:
Parameter | Default Value | Description |
---|---|---|
algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
method | fastCSR | Performance-oriented computation method for CSR numeric tables, the only method supported by the algorithm. |
nFactors | 10 | The total number of factors. |
fullNUsers | 0 | The total number of users m. |
partition | Not applicable | A numeric table of size either |
engine | SharedPtr<engines::mt19937::Batch>() | Pointer to the random number generator engine that is used internally at the initialization step. |
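The role of the nFactors and engine parameters can be pictured with a small Python sketch that fills an item-factor matrix with pseudo-random values. This is not the library API: the function name `init_item_factors`, the seed, and the uniform distribution on [0, 1) are assumptions for illustration. The only link to the table above is that CPython's `random.Random` is also a Mersenne Twister (MT19937) generator.

```python
import random


def init_item_factors(n_items, n_factors=10, seed=42):
    """Return an n_items x n_factors list of lists with pseudo-random values.

    Hypothetical helper: the seed and the uniform [0, 1) distribution are
    assumptions; the library's actual initialization may differ.
    """
    rng = random.Random(seed)  # Mersenne Twister (MT19937) generator
    return [[rng.random() for _ in range(n_factors)] for _ in range(n_items)]


# Five items with the default number of factors (10).
factors = init_item_factors(5)
```

Passing an explicit seed keeps the initialization reproducible across runs, which is one reason to expose the generator as a parameter.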
To initialize the implicit ALS algorithm in the distributed processing mode, use the two-step process illustrated by the following diagram:
![](/content/dam/docs/us/en/developer-guide-reference/2024-0/E22C09B8-C9EE-4330-AA61-C9D6190F7606-low.png)
Step 1 - on Local Nodes
![](/content/dam/docs/us/en/developer-guide-reference/2024-0/F7646C00-EC7D-41BD-A986-235260814D67-low.png)
Input
In the distributed processing mode, initialization of item factors for the implicit ALS algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
Input ID | Input |
---|---|
dataColumnSlice | The input must be an object of the CSRNumericTable class. |
Output
In the distributed processing mode, initialization of item factors for the implicit ALS algorithm calculates the results described below. Pass the Partial Result ID as a parameter to the methods that access the results of your algorithm. Partial results that correspond to the outputOfInitForComputeStep3 and offsets Partial Result IDs should be transferred to Step 3 of the distributed ALS training algorithm.
Output of Initialization for Computing Step 3 (outputOfInitForComputeStep3) is a key-value data collection that maps components of the partial model on the i-th node to all local nodes. Keys in this data collection are indices of the nodes and the value that corresponds to each key i is a numeric table that contains indices of the factors of the items to be transferred to the i-th node on Step 3 of the distributed ALS training algorithm.
User Offsets (offsets) is a key-value data collection where the keys are indices of the nodes and the value that corresponds to key i is a numeric table that contains the starting offset of the user factors stored on the i-th node.
For more details, see Algorithms.
Partial Result ID | Result |
---|---|
partialModel | The model with initialized item factors. The result can only be an object of the PartialModel class. |
outputOfInitForComputeStep3 | A key-value data collection that maps components of the partial model to the local nodes. |
offsets | A key-value data collection of size nblocks that holds the starting offsets of the factor indices on each node. |
outputOfStep1ForStep2 | A key-value data collection of size nblocks that contains the parts of the input numeric table: j-th element of this collection is a numeric table of size |
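The mapping held in outputOfInitForComputeStep3 can be sketched in plain Python. The sketch assumes that node i needs the factors of every local item that has at least one nonzero entry from a user stored on node i; this interpretation, the function name `transfer_indices`, and the inputs `nonzeros` and `user_offsets` are all hypothetical and are not part of the library API.

```python
from bisect import bisect_right


def transfer_indices(nonzeros, user_offsets):
    """Map each node index to the sorted local item indices it needs.

    nonzeros: iterable of (user, local_item) pairs from the local column slice.
    user_offsets: sorted starting user index of each node (first entry is 0).
    Assumption: node i needs an item's factors if any of its users rated it.
    """
    needed = {i: set() for i in range(len(user_offsets))}
    for user, item in nonzeros:
        # Find the node whose user range contains this user.
        node = bisect_right(user_offsets, user) - 1
        needed[node].add(item)
    return {i: sorted(items) for i, items in needed.items()}


# Three nodes whose user ranges start at 0, 2, and 4.
mapping = transfer_indices(
    [(0, 0), (1, 2), (3, 0), (4, 1), (5, 1)],
    user_offsets=[0, 2, 4],
)
```

In this example node 0 would receive the factors of local items 0 and 2, node 1 those of item 0, and node 2 those of item 1.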
Step 2 - on Local Nodes
![](/content/dam/docs/us/en/developer-guide-reference/2024-0/F5974773-D5F9-40D1-91D3-0602FE14712A-low.png)
Input
This step uses the results of the previous step.
Input ID | Input |
---|---|
inputOfStep2FromStep1 | A key-value data collection of size nblocks that contains the parts of the input data set: i-th element of this collection is a numeric table of size |
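One way to picture how the Step 2 input is assembled from the Step 1 results is an all-to-all exchange: node j collects the j-th part produced by every node. The Python sketch below assumes exactly that; the function name `gather_step2_input` and the use of plain dictionaries in place of key-value data collections are hypothetical, and the actual transfer mechanism depends on the application.

```python
def gather_step2_input(step1_outputs, j):
    """Build the Step 2 input for node j.

    step1_outputs[i] stands in for the outputOfStep1ForStep2 collection
    produced on node i, represented here as a dict: target node -> part.
    Assumption: the i-th element of the result comes from node i.
    """
    return {i: parts[j] for i, parts in enumerate(step1_outputs)}


# Two nodes; strings stand in for the numeric tables.
step1_outputs = [
    {0: "part 0->0", 1: "part 0->1"},
    {0: "part 1->0", 1: "part 1->1"},
]
input_for_node_1 = gather_step2_input(step1_outputs, 1)
```

After the exchange each node holds nblocks parts, one from every node, which matches the stated size of the inputOfStep2FromStep1 collection.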
Output
In this step, implicit ALS initialization calculates the partial results described below. Pass the Partial Result ID as a parameter to the methods that access the results of your algorithm. Partial results that correspond to the outputOfInitForComputeStep3 and offsets Partial Result IDs should be transferred to Step 3 of the distributed ALS training algorithm.
Output of Initialization for Computing Step 3 (outputOfInitForComputeStep3) is a key-value data collection that maps components of the partial model on the i-th node to all local nodes. Keys in this data collection are indices of the nodes and the value that corresponds to each key i is a numeric table that contains indices of the user factors to be transferred to the i-th node on Step 3 of the distributed ALS training algorithm.
Item Offsets (offsets) is a key-value data collection where the keys are indices of the nodes and the value that corresponds to key i is a numeric table that contains the starting offset of the item factors stored on the i-th node.
For more details, see Algorithms.
Partial Result ID | Result |
---|---|
dataRowSlice | An |
outputOfInitForComputeStep3 | A key-value data collection that maps components of the partial model to the local nodes. |
offsets | A key-value data collection of size nblocks that holds the starting offsets of the factor indices on each node. |