The distributed_descriptor class template
This page describes the distributed_descriptor class template and its member functions. The class template belongs to the oneapi::mkl::experimental::dft namespace and is declared in oneapi/mkl/experimental/distributed_dft.hpp (the header file to include).
namespace oneapi::mkl::experimental::dft {
template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
class distributed_descriptor;
}
The usage of prepended namespace specifiers oneapi::mkl::experimental::dft is omitted below for conciseness while the namespace specifier oneapi::mkl::dft is kept for clarity.
Users of the distributed DFT DPC++ interface of oneMKL must use instances of a specialization of this class template to specify and fully configure the required global DFT computation(s): successfully-committed objects of a (specialized) distributed_descriptor class are required arguments to the distributed DFT specific compute functions.
Template parameters
The distributed_descriptor class template is parameterized by two non-type template parameters, in the following order:
a value of type oneapi::mkl::dft::precision, that determines the floating-point format to be considered by its instances;
a value of type oneapi::mkl::dft::domain, that determines the type of forward domain to be considered by its instances.
Instances of a distributed_descriptor class specialized with value oneapi::mkl::dft::precision::SINGLE (resp. oneapi::mkl::dft::precision::DOUBLE) for the former are referred to as “single-precision descriptors” (resp. “double-precision descriptors”). Similarly, instances of a distributed_descriptor class specialized with value oneapi::mkl::dft::domain::COMPLEX (resp. oneapi::mkl::dft::domain::REAL) for the latter are referred to as “complex descriptors” (resp. “real descriptors”).
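To make the effect of the two template parameters concrete, the following minimal sketch shows how a precision value can select a scalar type at compile time. It uses stand-in enums so that it compiles without oneMKL; the selection rule for real_scalar_t matches the alias shown in the set_value declarations on this page, while forward_element_t is a hypothetical alias introduced here only for illustration (it assumes complex descriptors operate on std::complex data and real descriptors on real scalars in the forward domain).

```cpp
#include <complex>
#include <type_traits>

// Stand-in enums so the sketch compiles without oneMKL; the real types are
// oneapi::mkl::dft::precision and oneapi::mkl::dft::domain.
enum class precision { SINGLE, DOUBLE };
enum class domain { REAL, COMPLEX };

// Same selection rule as the real_scalar_t alias in the class declaration.
template <precision prec>
using real_scalar_t = std::conditional_t<prec == precision::DOUBLE, double, float>;

// Hypothetical alias (an assumption, not a oneMKL name): element type of
// forward-domain data for a given precision and forward domain.
template <precision prec, domain dom>
using forward_element_t =
    std::conditional_t<dom == domain::COMPLEX,
                       std::complex<real_scalar_t<prec>>,
                       real_scalar_t<prec>>;

// Compile-time checks of the mapping.
static_assert(std::is_same_v<real_scalar_t<precision::SINGLE>, float>);
static_assert(std::is_same_v<real_scalar_t<precision::DOUBLE>, double>);
static_assert(std::is_same_v<forward_element_t<precision::SINGLE, domain::REAL>, float>);
```

With the real headers, the corresponding specialization would be spelled distributed_descriptor<oneapi::mkl::dft::precision::SINGLE, oneapi::mkl::dft::domain::COMPLEX>, as described above.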
This section describes the constructors and destructor of the distributed_descriptor class template.
namespace oneapi::mkl::experimental::dft {
template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
class distributed_descriptor {
public:
// parameterized constructors:
distributed_descriptor(MPI_Comm Comm, std::vector<std::int64_t> dimensions);
// destructor
~distributed_descriptor();
// unsupported copy constructor and assignment operator:
distributed_descriptor(const distributed_descriptor&) = delete;
distributed_descriptor& operator=(const distributed_descriptor&) = delete;
// unsupported move constructor and assignment operator:
distributed_descriptor(distributed_descriptor&&) = delete;
distributed_descriptor& operator=(distributed_descriptor&&) = delete;
};
}
Parameterized constructors
The parameterized constructors of any distributed_descriptor class allocate memory for an object’s data structures and default-configure it for the precision, forward domain, and length(s) of the transform it defines. These constructors do not trigger any significant computational work in preparation for the DFT that the object defines; all such tasks are performed when the object is committed to its DFT definition and to a given sycl::queue instance.
Input parameters
The input parameters of the constructor are detailed in the table below. The table uses the notations presented in the introduction when referring to parameters of the DFT operation defined by the object.
Name | Type | Description |
---|---|---|
Comm | MPI_Comm | The MPI communicator containing the processes among which the given DFT problem will be distributed. |
lengths | std::vector<std::int64_t> | Vector of size |
Exceptions
The parameterized constructors may throw a oneapi::mkl::exception if
or
;
The dimensions have length less than the number of processes.
Any MPI related error occurs.
The construction of the distributed_descriptor object fails to allocate its required resources.
Copy constructor and assignment operator
Copy construction and assignment are not supported.
Move constructor and assignment operator
Move construction and assignment are not supported.
Destructor
The destructor of any distributed_descriptor class frees all resources allocated for and by objects of that class within the respective MPI process.
This section describes the overloaded configuration-setting member functions set_value of the distributed_descriptor class template.
namespace oneapi::mkl::experimental::dft {
template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
class distributed_descriptor {
using real_scalar_t = std::conditional_t<prec == oneapi::mkl::dft::precision::DOUBLE, double, float>;
public:
void set_value(oneapi::mkl::dft::config_param, oneapi::mkl::dft::config_value);
void set_value(oneapi::mkl::dft::config_param, std::int64_t);
void set_value(distributed_config_param, std::int64_t);
void set_value(oneapi::mkl::dft::config_param, const std::vector<std::int64_t>&);
template <typename T, std::enable_if_t<std::is_integral_v<T>, bool> = true>
void set_value(distributed_config_param param, T value) {
set_value(param, static_cast<std::int64_t>(value));
}
void set_value(oneapi::mkl::dft::config_param, real_scalar_t);
template <typename T, std::enable_if_t<std::is_integral_v<T>, bool> = true>
void set_value(oneapi::mkl::dft::config_param param, T value) {
set_value(param, static_cast<std::int64_t>(value));
}
template <typename T, std::enable_if_t<std::is_floating_point_v<T>, bool> = true>
void set_value(oneapi::mkl::dft::config_param param, T value) {
set_value(param, static_cast<real_scalar_t>(value));
}
void set_value(distributed_config_param param,
const std::vector<std::int64_t> &lower_bound,
const std::vector<std::int64_t> &upper_bound,
const std::vector<std::int64_t> &strides);
};
}
The set_value functions enable oneMKL users to assign a configuration value to any writable configuration parameter, of type either oneapi::mkl::dft::config_param or distributed_config_param, of a distributed_descriptor object. Most of the set_value functions behave like those described in configuration-setting-functions, which should be consulted for setting global transform configuration such as the forward/backward scale. These parameters should be set uniformly across all processes. Deprecated functionality is not supported by the distributed DFT: attempting to set any deprecated parameter throws a oneapi::mkl::invalid_argument exception. The set_value overloads specific to the distributed DFT are discussed below, along with the default configuration values of the respective configuration parameters.
Configuration values may be successfully and correctly set for a distributed_descriptor object but found invalid later on, when attempting to commit that object. Assessing the validity of a distributed_descriptor object’s configuration requires knowledge (and analysis) of all its configuration values (considered “frozen” when the object is committed). Any call to set_value resulting in a configuration change for a committed distributed_descriptor object effectively uncommits that object: such a change typically invalidates the compute-readiness preparation steps performed when the object was last committed. As a consequence, any such operation changes the object’s (read-only) configuration value associated with configuration parameter oneapi::mkl::dft::config_param::COMMIT_STATUS from oneapi::mkl::dft::config_value::COMMITTED to oneapi::mkl::dft::config_value::UNCOMMITTED.
Setting slab distribution
The following member functions
template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
void distributed_descriptor<prec, dom>::set_value(distributed_config_param param, std::int64_t value);
template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
template <typename T, std::enable_if_t<std::is_integral_v<T>, bool> = true>
void distributed_descriptor<prec, dom>::set_value(distributed_config_param param, T value) {
set_value(param, static_cast<std::int64_t>(value));
}
enable users to set the dimension to be slab-decomposed in the respective domain (forward or backward).
Accepted configuration parameters | Accepted values | Default value |
---|---|---|
distributed_config_param::fwd_divided_dimension | 0 to rank-1 | 0 |
distributed_config_param::bwd_divided_dimension | 0 to rank-1 | 1 |
Setting custom distribution
The following member function
template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
void distributed_descriptor<prec, dom>::set_value(distributed_config_param param,
const std::vector<std::int64_t> &lower_bound,
const std::vector<std::int64_t> &upper_bound,
const std::vector<std::int64_t> &strides);
enables users to set the per-process rectangle/block’s lower and upper bounds along with the strides within that local portion.
Parameter | Accepted values | Definition |
---|---|---|
param | distributed_config_param::fwd_distribution or distributed_config_param::bwd_distribution | distributed_config_param determining the domain for which the custom decomposition is being set. |
lower_bound | std::vector<std::int64_t> object of size equal to the rank of the transform | A std::vector of type std::int64_t whose length equals the rank of the transform, representing the lower corner of the portion of the global array owned by the current process. |
upper_bound | std::vector<std::int64_t> object of size equal to the rank of the transform | A std::vector of type std::int64_t whose length equals the rank of the transform, representing the upper corner of the portion of the global array owned by the current process. |
strides | std::vector<std::int64_t> object of size equal to the rank of the transform | A std::vector of type std::int64_t whose length equals the rank of the transform, representing the local data layout in memory for the forward or backward domain, respectively. Strides must be positive and in decreasing order. |
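As an illustration of what the three vectors might contain, the sketch below computes them for one process when a single dimension of the global array is split into contiguous blocks. Everything in it is an assumption made for the example: the local_block type and make_slab_block helper are hypothetical (not part of oneMKL), the split rule (remainder indices go to the lowest MPI ranks) is just one possible choice, the strides describe a packed row-major local layout, and the upper bound is computed as one past the last owned index. Whether the library expects inclusive or exclusive upper bounds is not stated on this page and should be checked before use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type (not a oneMKL type): one process's block.
struct local_block {
    std::vector<std::int64_t> lower_bound;  // first owned index per dimension
    std::vector<std::int64_t> upper_bound;  // one past the last owned index (assumed convention)
    std::vector<std::int64_t> strides;      // packed row-major strides of the local block
};

// Splits dimension `divided_dim` of a global array with the given lengths
// into mpi_size contiguous pieces and returns the piece owned by mpi_rank.
local_block make_slab_block(const std::vector<std::int64_t>& lengths,
                            std::size_t divided_dim,
                            std::int64_t mpi_rank, std::int64_t mpi_size) {
    assert(divided_dim < lengths.size());
    assert(mpi_size > 0 && mpi_rank >= 0 && mpi_rank < mpi_size);

    const std::int64_t n = lengths[divided_dim];
    const std::int64_t base = n / mpi_size;
    const std::int64_t extra = n % mpi_size;
    // The first `extra` processes own one additional index.
    const std::int64_t begin = mpi_rank * base + std::min(mpi_rank, extra);
    const std::int64_t count = base + (mpi_rank < extra ? 1 : 0);

    local_block b;
    b.lower_bound.assign(lengths.size(), 0);
    b.upper_bound = lengths;
    b.lower_bound[divided_dim] = begin;
    b.upper_bound[divided_dim] = begin + count;

    // Packed layout: the last dimension is contiguous, and each earlier
    // stride is the next stride times the local extent of the next dimension.
    b.strides.assign(lengths.size(), 1);
    for (std::size_t i = lengths.size() - 1; i > 0; --i) {
        const std::int64_t extent = b.upper_bound[i] - b.lower_bound[i];
        b.strides[i - 1] = b.strides[i] * extent;
    }
    return b;
}
```

On each process, the resulting vectors would then be passed to the overload shown above, e.g. set_value(distributed_config_param::fwd_distribution, b.lower_bound, b.upper_bound, b.strides). Note that a local extent of 1 yields equal consecutive strides, which may not satisfy the "decreasing order" requirement.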
Exceptions
The configuration-setting member functions may throw
an std::runtime_error exception if an issue is found with the calling object;
a oneapi::mkl::invalid_argument exception if
the parameter being set is not writable;
the parameter being set is rejected, e.g., if it is inconsistent with the type of configuration value being used;
the configuration value being set is rejected for the specific configuration parameter being set.
a oneapi::mkl::unimplemented exception if the parameter being set is yet to be implemented.
This section describes the overloaded configuration-querying member functions get_value of the distributed_descriptor class template.
namespace oneapi::mkl::experimental::dft {
template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
class distributed_descriptor {
using real_scalar_t = std::conditional_t<prec == oneapi::mkl::dft::precision::DOUBLE, double, float>;
public:
// for the type of forward domain:
void get_value(oneapi::mkl::dft::config_param, oneapi::mkl::dft::domain*) const;
// for the floating-point format:
void get_value(oneapi::mkl::dft::config_param, oneapi::mkl::dft::precision*) const;
// for integer-valued parameters:
void get_value(oneapi::mkl::dft::config_param, std::int64_t*) const;
void get_value(distributed_config_param, std::int64_t*) const;
// for vector-valued parameters:
void get_value(oneapi::mkl::dft::config_param, std::vector<std::int64_t>*) const;
// for real-valued parameters:
void get_value(oneapi::mkl::dft::config_param, real_scalar_t*) const;
// for custom distribution:
void get_value(distributed_config_param param,
std::vector<std::int64_t> *lower_bound,
std::vector<std::int64_t> *upper_bound,
std::vector<std::int64_t> *strides) const;
// for other parameters:
void get_value(oneapi::mkl::dft::config_param, oneapi::mkl::dft::config_value*) const;
};
}
The get_value functions enable oneMKL users to query the configuration value associated with a oneapi::mkl::dft::config_param or distributed_config_param of a distributed_descriptor object. Most of the get_value functions behave like those described in configuration-querying-functions, which should be consulted for querying transform configuration such as the forward/backward scale. The calling distributed_descriptor object is left unchanged by any call to a configuration-querying member function. Deprecated functionality is not supported by the distributed DFT: attempting to query any deprecated parameter throws an exception. The get_value overloads specific to the distributed DFT are discussed below.
Querying integer-valued parameters
The following member function
template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
void distributed_descriptor<prec, dom>::get_value(distributed_config_param param, std::int64_t* value_ptr) const;
enables users to query the configuration value for the configuration parameters in the table below.
Accepted configuration parameters | Value written by oneMKL |
---|---|
distributed_config_param::fwd_divided_dimension, distributed_config_param::bwd_divided_dimension, distributed_config_param::fwd_local_data_size_bytes, distributed_config_param::bwd_local_data_size_bytes | value_ptr[0] |
The distributed DFT may require more memory than that which can be deduced from the data distribution for the forward or backward domains within each process. The configuration parameters distributed_config_param::fwd_local_data_size_bytes and distributed_config_param::bwd_local_data_size_bytes can be used to query the process-specific number of bytes in the respective domain (forward or backward) for which the device-accessible memory must be allocated and then initialized with input data. This must be done after the distributed_descriptor has been committed (only after which the exact number of bytes that must be allocated is known).
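Because the queried size is expressed in bytes, it usually has to be converted into an element count before a typed allocation. The sketch below shows one way to do that conversion, rounding up so that the allocation never falls short of the queried size. The helper name elements_for_bytes is hypothetical, and the byte count is assumed to have been obtained beforehand from get_value with distributed_config_param::fwd_local_data_size_bytes or distributed_config_param::bwd_local_data_size_bytes on a committed descriptor.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a oneMKL function): number of elements of type T
// needed to cover `bytes` bytes, rounded up to a whole element.
template <typename T>
std::size_t elements_for_bytes(std::int64_t bytes) {
    assert(bytes >= 0);
    const std::size_t n = static_cast<std::size_t>(bytes);
    return (n + sizeof(T) - 1) / sizeof(T);
}
```

For a single-precision complex descriptor, elements_for_bytes<std::complex<float>>(size_bytes) would give the number of elements to request from a typed device allocator; allocating the raw byte count directly avoids the conversion altogether.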
Querying custom distribution
The following member function
template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
void distributed_descriptor<prec, dom>::get_value(distributed_config_param param,
std::vector<std::int64_t> *lower_bound,
std::vector<std::int64_t> *upper_bound,
std::vector<std::int64_t> *strides) const;
enables users to query the custom distribution bounds for that process.
Parameter | Accepted values | Definition |
---|---|---|
param | distributed_config_param::fwd_distribution or distributed_config_param::bwd_distribution | distributed_config_param determining the domain for which the custom decomposition is being queried. |
lower_bound | std::vector<std::int64_t> * pointing to an object of size equal to the rank of the transform | A pointer to a std::vector of type std::int64_t whose length equals the rank of the transform, to query the lower corner of the portion of the global array owned by the current process. |
upper_bound | std::vector<std::int64_t> * pointing to an object of size equal to the rank of the transform | A pointer to a std::vector of type std::int64_t whose length equals the rank of the transform, to query the upper corner of the portion of the global array owned by the current process. |
strides | std::vector<std::int64_t> * pointing to an object of size equal to the rank of the transform | A pointer to a std::vector of type std::int64_t whose length equals the rank of the transform, to query the local data layout in memory for the forward or backward domain, respectively. |
Exceptions
The configuration-querying member functions may throw
an std::runtime_error exception if an issue is found with the calling object;
a oneapi::mkl::uninitialized exception if the calling object is uncommitted yet queried about a configuration parameter that requires the calling object to be committed;
a oneapi::mkl::unimplemented exception if the queried parameter corresponds to a feature that is not implemented for the calling object;
a oneapi::mkl::invalid_argument exception if
the pointer arguments are nullptr;
the parameter being queried is rejected, e.g., if it is inconsistent with the pointer type used as second argument;
the size of the vector pointed to by value_ptr is not as required;
the configuration value to be returned cannot be safely or accurately converted into the desired type, e.g., if querying a scaling factor that happens to be
using the configuration-querying member function specific to integer-valued configuration parameters.
This section describes the configuration- and queue-committing member function commit of the distributed_descriptor class template.
namespace oneapi::mkl::experimental::dft {
template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
class distributed_descriptor {
public:
void commit(sycl::queue &user_queue);
};
}
Invoking this function signals that the calling object’s configuration is complete and triggers the required initialization steps (e.g., pre-computing data, exploring various factorizations, assessing suitability of various algorithms) relevant to enqueueing the DFT computations that it defines to the user-provided user_queue. That sycl::queue object is mapped by MPI to a physical device among the available devices.
Upon successful completion, the calling object is “committed” (its configuration value associated with config_param::COMMIT_STATUS is then config_value::COMMITTED). distributed_descriptor objects must be committed to be used in any compute function. Note that all the processes should successfully commit their distributed_descriptor object to obtain correct results.
Changing any configuration setting of a committed object effectively leaves it “uncommitted” (its configuration value associated with config_param::COMMIT_STATUS is then config_value::UNCOMMITTED). As a consequence, it is best to avoid any call to any set_value member function after invoking the commit member function.
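The commit lifecycle described above can be pictured with a small toy model. The class below is purely illustrative and is not how oneMKL is implemented: descriptor_model, its integer parameter keys, and the commit_status enum are hypothetical stand-ins, and the model assumes that only a set_value call that actually changes a stored value uncommits the object.

```cpp
#include <cstdint>
#include <map>

// Hypothetical stand-in for the COMMITTED / UNCOMMITTED configuration values.
enum class commit_status { UNCOMMITTED, COMMITTED };

// Toy model of the commit lifecycle only; it performs no DFT work.
class descriptor_model {
public:
    void set_value(int param, std::int64_t value) {
        const auto it = config_.find(param);
        const bool changed = (it == config_.end()) || (it->second != value);
        config_[param] = value;
        if (changed) {
            // A configuration change invalidates the last commit.
            status_ = commit_status::UNCOMMITTED;
        }
    }

    void commit() { status_ = commit_status::COMMITTED; }

    commit_status status() const { return status_; }

private:
    std::map<int, std::int64_t> config_;
    commit_status status_ = commit_status::UNCOMMITTED;
};
```

In this model, an object that is configured, committed, and then reconfigured with a different value must be committed again before it is usable, which is why finishing all set_value calls before calling commit is the simpler pattern.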
Input parameter
Name | Type | Description |
---|---|---|
user_queue | sycl::queue | Queue to which the local DFT computations are to be enqueued by the calling object, when used in compute functions thereafter |
Exceptions
The configuration- and queue-committing member function may throw
- a oneapi::mkl::unimplemented exception, e.g., if the calling object’s configuration is not supported (yet):
  - Non-default slab distribution is used.
- a oneapi::mkl::exception if
  - Device is not an Intel® Data Center GPU Max Series.
  - The environment variable I_MPI_OFFLOAD is not set to 1.
  - SYCL Backend is not Level Zero.
  - Default packed layouts are not used for the global array.
  - Batching is used (setting oneapi::mkl::dft::config_param::NUMBER_OF_TRANSFORMS > 1).
  - Failure to allocate and initialize resources required for the distributed_descriptor object.