dgmm_batch
Computes a group of diagonal matrix-matrix product (dgmm) operations.
Description
The dgmm_batch routines perform multiple diagonal matrix-matrix product (dgmm) operations in a single call.
The diagonal matrices are stored as dense vectors and the operations are performed with groups of matrices and vectors.
dgmm_batch supports the following precisions:
T |
---|
float |
double |
std::complex<float> |
std::complex<double> |
dgmm_batch (Buffer Version)
The buffer version of dgmm_batch supports only the strided API.
Strided API
The strided API operation is defined as:
for i = 0 … batch_size - 1
    A and C are matrices at offset i * stridea in a, i * stridec in c.
    X is a vector at offset i * stridex in x
    if (left_right == side::left)
        C = diag(X) * A
    else
        C = A * diag(X)
end for
where:
- A is a matrix
- X is a diagonal matrix stored as a vector
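The loop above can be read as the following host-side reference, given here only as a minimal sketch. It is plain C++ rather than a oneMKL call: the name dgmm_batch_strided_ref, the local side enum (a stand-in for oneapi::mkl::side), and the use of std::vector in place of sycl::buffer are assumptions made for illustration. The sketch assumes column major layout and a positive incx.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for oneapi::mkl::side, so the sketch needs no SYCL.
enum class side { left, right };

// Host reference for the strided operation, column major layout only.
// Element (r, col) of matrix i is read from a[i*stridea + r + col*lda].
template <typename T>
void dgmm_batch_strided_ref(side left_right, std::int64_t m, std::int64_t n,
                            const std::vector<T> &a, std::int64_t lda, std::int64_t stridea,
                            const std::vector<T> &x, std::int64_t incx, std::int64_t stridex,
                            std::vector<T> &c, std::int64_t ldc, std::int64_t stridec,
                            std::int64_t batch_size) {
    assert(m >= 0 && n >= 0 && batch_size >= 0);
    assert(lda > 0 && ldc > 0 && lda >= m && ldc >= m);
    assert(incx > 0);  // assumption of this sketch; negative increments are not handled

    for (std::int64_t i = 0; i < batch_size; ++i) {
        const T *A = a.data() + i * stridea;
        const T *X = x.data() + i * stridex;
        T *C = c.data() + i * stridec;
        for (std::int64_t col = 0; col < n; ++col) {
            for (std::int64_t r = 0; r < m; ++r) {
                // diag(X) * A scales row r by X[r]; A * diag(X) scales column col by X[col].
                const T d = (left_right == side::left) ? X[r * incx] : X[col * incx];
                C[r + col * ldc] = d * A[r + col * lda];
            }
        }
    }
}
```

A reference like this is mainly useful for checking the results of a real dgmm_batch call on small inputs.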
For the strided API, all matrices A and C and all vectors X have the same parameters (size, increments) and are stored at constant strides, given by stridea, stridec, and stridex, from each other.
The a and x buffers contain all the input matrices and vectors. The total number of matrices in a and of vectors in x is given by the batch_size parameter.
Syntax
namespace oneapi::mkl::blas::column_major {
    void dgmm_batch(sycl::queue &queue,
                    oneapi::mkl::side left_right,
                    std::int64_t m,
                    std::int64_t n,
                    sycl::buffer<T,1> &a,
                    std::int64_t lda,
                    std::int64_t stridea,
                    sycl::buffer<T,1> &x,
                    std::int64_t incx,
                    std::int64_t stridex,
                    sycl::buffer<T,1> &c,
                    std::int64_t ldc,
                    std::int64_t stridec,
                    std::int64_t batch_size);
}
namespace oneapi::mkl::blas::row_major {
    void dgmm_batch(sycl::queue &queue,
                    oneapi::mkl::side left_right,
                    std::int64_t m,
                    std::int64_t n,
                    sycl::buffer<T,1> &a,
                    std::int64_t lda,
                    std::int64_t stridea,
                    sycl::buffer<T,1> &x,
                    std::int64_t incx,
                    std::int64_t stridex,
                    sycl::buffer<T,1> &c,
                    std::int64_t ldc,
                    std::int64_t stridec,
                    std::int64_t batch_size);
}
Input Parameters
- queue
- The queue where the routine should be executed.
- left_right
- Specifies the position of the diagonal matrix in the product. See Data Types for more details.
- m
- Number of rows of matrix A and matrix C. Must be at least zero.
- n
- Number of columns of matrix A and matrix C. Must be at least zero.
- a
- Buffer holding input matrices A. Size of the buffer must be at least lda*k + stridea*(batch_size - 1), where k is n if column major layout or m if row major layout is used.
- lda
- Leading dimension of matrices A. Must be at least m if column major layout or n if row major layout is used. Must be positive.
- stridea
- Stride between two consecutive A matrices. Must be at least zero. See Matrix Storage for more details.
- x
- Buffer holding input vectors X. Size of the buffer must be at least (1 + (len - 1)*abs(incx)) + stridex*(batch_size - 1), where len is n if the diagonal matrix is on the right of the product or m otherwise.
- incx
- Stride between two consecutive elements of the X vectors.
- stridex
- Stride between two consecutive X vectors. Must be at least zero. See Matrix Storage for more details.
- c
- Buffer holding input/output matrices C. Size of the buffer must be at least batch_size*stridec.
- ldc
- Leading dimension of matrices C. Must be at least m if column major layout or n if row major layout is used. Must be positive.
- stridec
- Stride between two consecutive C matrices. Must be at least ldc*n if column major layout or ldc*m if row major layout is used. See Matrix Storage for more details.
- batch_size
- Number of dgmm computations to perform. Must be at least zero.
Output Parameters
- c
- Buffer holding output matrices C, overwritten by batch_size dgmm operations.
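The minimum sizes stated for the a, x, and c buffers can be computed on the host before allocating. The helper below is a sketch under stated assumptions: min_strided_sizes, strided_sizes, and the local side and layout enums are hypothetical names, not part of oneMKL, and the sketch assumes m, n, and batch_size are all at least one so that every formula is well defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; the real library selects layout by namespace.
enum class side { left, right };
enum class layout { column_major, row_major };

// Minimum element counts for the a, x, and c buffers of the strided API.
struct strided_sizes {
    std::int64_t a;
    std::int64_t x;
    std::int64_t c;
};

inline strided_sizes min_strided_sizes(layout l, side left_right,
                                       std::int64_t m, std::int64_t n,
                                       std::int64_t lda, std::int64_t stridea,
                                       std::int64_t incx, std::int64_t stridex,
                                       std::int64_t stridec, std::int64_t batch_size) {
    // Assumption of this sketch: non-empty problems only.
    assert(m >= 1 && n >= 1 && batch_size >= 1);

    // k is n for column major layout and m for row major layout.
    const std::int64_t k = (l == layout::column_major) ? n : m;
    // len is n when the diagonal matrix is on the right, m otherwise.
    const std::int64_t len = (left_right == side::right) ? n : m;
    const std::int64_t abs_incx = (incx < 0) ? -incx : incx;

    strided_sizes s{};
    s.a = lda * k + stridea * (batch_size - 1);
    s.x = (1 + (len - 1) * abs_incx) + stridex * (batch_size - 1);
    s.c = batch_size * stridec;
    return s;
}
```

The returned counts are in elements of type T, not bytes.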
dgmm_batch (USM Version)
The USM version of dgmm_batch supports the group API and the strided API.
Group API
The group API operation is defined as:
idx = 0
for i = 0 … group_count - 1
    for j = 0 … group_size[i] - 1
        A and C are matrices at a[idx] and c[idx]
        X is a vector at x[idx]
        if (left_right[i] == side::left)
            C = diag(X) * A
        else
            C = A * diag(X)
        idx = idx + 1
    end for
end for
where:
- A is a matrix
- X is a diagonal matrix stored as a vector
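For the group API, each group contains matrices and vectors with the same parameters (size, increment).
The a and x arrays contain the pointers for all the input matrices and vectors. The total number of matrices in a and of vectors in x is given by:
total_batch_count = group_size[0] + group_size[1] + … + group_size[group_count - 1]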
Syntax
namespace oneapi::mkl::blas::column_major {
    sycl::event dgmm_batch(sycl::queue &queue,
                           oneapi::mkl::side *left_right,
                           std::int64_t *m,
                           std::int64_t *n,
                           const T **a,
                           std::int64_t *lda,
                           const T **x,
                           std::int64_t *incx,
                           T **c,
                           std::int64_t *ldc,
                           std::int64_t group_count,
                           std::int64_t *group_size,
                           const std::vector<sycl::event> &dependencies = {});
}
namespace oneapi::mkl::blas::row_major {
    sycl::event dgmm_batch(sycl::queue &queue,
                           oneapi::mkl::side *left_right,
                           std::int64_t *m,
                           std::int64_t *n,
                           const T **a,
                           std::int64_t *lda,
                           const T **x,
                           std::int64_t *incx,
                           T **c,
                           std::int64_t *ldc,
                           std::int64_t group_count,
                           std::int64_t *group_size,
                           const std::vector<sycl::event> &dependencies = {});
}
Input Parameters
- queue
- The queue where the routine should be executed.
- left_right
- Array of group_count parameters. left_right[i] specifies the position of the diagonal matrix in group i. See Data Types for more details.
- m
- Array of group_count integers. m[i] specifies the number of rows of A for every matrix in group i. All entries must be at least zero.
- n
- Array of group_count integers. n[i] specifies the number of columns of A for every matrix in group i. All entries must be at least zero.
- a
- Array of total_batch_count pointers to input matrices A. The array holding each matrix in group i must have at least lda[i]*n[i] elements if column major layout or at least lda[i]*m[i] elements if row major layout is used. See Matrix Storage for more details.
- lda
- Array of group_count integers. lda[i] specifies the leading dimension of A for every matrix in group i. All entries must be positive and at least m[i] if column major layout or at least n[i] if row major layout is used.
- x
- Array of total_batch_count pointers to input vectors X. The array holding each vector in group i must have at least (1 + (len[i] - 1)*abs(incx[i])) elements, where len[i] is n[i] if the diagonal matrix is on the right of the product or m[i] otherwise. See Matrix Storage for more details.
- incx
- Array of group_count integers. incx[i] specifies the stride of X for every vector in group i. All entries must be positive.
- c
- Array of total_batch_count pointers to input/output matrices C. The array holding each matrix in group i must have at least ldc[i]*n[i] elements if column major layout or at least ldc[i]*m[i] elements if row major layout is used. See Matrix Storage for more details.
- ldc
- Array of group_count integers. ldc[i] specifies the leading dimension of C for every matrix in group i. All entries must be positive and at least m[i] if column major layout or at least n[i] if row major layout is used.
- group_count
- Specifies the number of groups. Must be at least zero.
- group_size
- Array of group_count integers. group_size[i] specifies the number of diagonal matrix-matrix product operations in group i. All entries must be at least zero.
- dependencies
- List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- c
- Array of pointers to output matrices C, overwritten by total_batch_count dgmm operations.
Return Values
Output event to wait on to ensure computation is complete.
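The group API semantics described in this section can be illustrated with a host-side reference. This is only a sketch: dgmm_batch_group_ref, total_batch_count (as a function), and the local side enum are hypothetical names, the pointers are ordinary host pointers rather than USM allocations, and there is no queue or event. It assumes column major layout and indexes left_right per group, as the parameter description states.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

// Hypothetical stand-in for oneapi::mkl::side.
enum class side { left, right };

// Sum of all group sizes: the number of pointers expected in a, x, and c.
inline std::int64_t total_batch_count(const std::int64_t *group_size,
                                      std::int64_t group_count) {
    return std::accumulate(group_size, group_size + group_count, std::int64_t{0});
}

// Host reference for the group operation, column major layout only.
template <typename T>
void dgmm_batch_group_ref(const side *left_right,
                          const std::int64_t *m, const std::int64_t *n,
                          const T **a, const std::int64_t *lda,
                          const T **x, const std::int64_t *incx,
                          T **c, const std::int64_t *ldc,
                          std::int64_t group_count, const std::int64_t *group_size) {
    std::int64_t idx = 0;  // running index over all matrices in all groups
    for (std::int64_t i = 0; i < group_count; ++i) {
        assert(m[i] >= 0 && n[i] >= 0 && group_size[i] >= 0);
        assert(incx[i] > 0 && lda[i] >= m[i] && ldc[i] >= m[i]);
        for (std::int64_t j = 0; j < group_size[i]; ++j, ++idx) {
            const T *A = a[idx];
            const T *X = x[idx];
            T *C = c[idx];
            for (std::int64_t col = 0; col < n[i]; ++col) {
                for (std::int64_t r = 0; r < m[i]; ++r) {
                    // Left: scale row r by X[r]. Right: scale column col by X[col].
                    const T d = (left_right[i] == side::left) ? X[r * incx[i]]
                                                              : X[col * incx[i]];
                    C[r + col * ldc[i]] = d * A[r + col * lda[i]];
                }
            }
        }
    }
}
```

Every matrix in a group shares that group's m, n, lda, incx, ldc, and side, while each one has its own pointer in a, x, and c.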
Strided API
Strided API operation is defined as:
for i = 0 … batch_size - 1
    A and C are matrices at offset i * stridea in a, i * stridec in c.
    X is a vector at offset i * stridex in x
    if (left_right == side::left)
        C = diag(X) * A
    else
        C = A * diag(X)
end for
where:
- A is a matrix
- X is a diagonal matrix stored as a vector
For the strided API, all matrices A and C and all vectors X have the same parameters (size, increments) and are stored at constant strides, given by stridea, stridec, and stridex, from each other.
The a and x arrays contain all the input matrices and vectors. The total number of matrices in a and of vectors in x is given by the batch_size parameter.
Syntax
namespace oneapi::mkl::blas::column_major {
    sycl::event dgmm_batch(sycl::queue &queue,
                           oneapi::mkl::side left_right,
                           std::int64_t m,
                           std::int64_t n,
                           const T *a,
                           std::int64_t lda,
                           std::int64_t stridea,
                           const T *x,
                           std::int64_t incx,
                           std::int64_t stridex,
                           T *c,
                           std::int64_t ldc,
                           std::int64_t stridec,
                           std::int64_t batch_size,
                           const std::vector<sycl::event> &dependencies = {});
}
namespace oneapi::mkl::blas::row_major {
    sycl::event dgmm_batch(sycl::queue &queue,
                           oneapi::mkl::side left_right,
                           std::int64_t m,
                           std::int64_t n,
                           const T *a,
                           std::int64_t lda,
                           std::int64_t stridea,
                           const T *x,
                           std::int64_t incx,
                           std::int64_t stridex,
                           T *c,
                           std::int64_t ldc,
                           std::int64_t stridec,
                           std::int64_t batch_size,
                           const std::vector<sycl::event> &dependencies = {});
}
Input Parameters
- queue
- The queue where the routine should be executed.
- left_right
- Specifies the position of the diagonal matrix in the product. See Data Types for more details.
- m
- Number of rows of matrix A and matrix C. Must be at least zero.
- n
- Number of columns of matrix A and matrix C. Must be at least zero.
- a
- Pointer to input matrices A. Size of the array must be at least lda*k + stridea*(batch_size - 1), where k is n if column major layout or m if row major layout is used.
- lda
- Leading dimension of matrices A. Must be at least m if column major layout or n if row major layout is used. Must be positive.
- stridea
- Stride between two consecutive A matrices. Must be at least zero. See Matrix Storage for more details.
- x
- Pointer to input vectors X. Size of the array must be at least (1 + (len - 1)*abs(incx)) + stridex*(batch_size - 1), where len is n if the diagonal matrix is on the right of the product or m otherwise.
- incx
- Stride between two consecutive elements of the X vectors.
- stridex
- Stride between two consecutive X vectors. Must be at least zero. See Matrix Storage for more details.
- c
- Pointer to input/output matrices C. Size of the array must be at least batch_size*stridec.
- ldc
- Leading dimension of matrices C. Must be at least m if column major layout or n if row major layout is used. Must be positive.
- stridec
- Stride between two consecutive C matrices. Must be at least ldc*n if column major layout or ldc*m if row major layout is used. See Matrix Storage for more details.
- batch_size
- Number of dgmm computations to perform. Must be at least zero.
- dependencies
- List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- c
- Pointer to output matrices C, overwritten by batch_size dgmm operations.
Return Values
Output event to wait on to ensure computation is complete.
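The "must be" constraints listed for the strided parameters can be checked on the host before submitting a call. The checker below is a sketch, not library behavior: check_strided_args, strided_args, and the layout enum are hypothetical names, and nothing here says how oneMKL itself reacts to invalid arguments.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in; the real library selects layout by namespace.
enum class layout { column_major, row_major };

// Scalar arguments of the strided API that carry a documented constraint.
struct strided_args {
    std::int64_t m, n, lda, stridea, stridex, ldc, stridec, batch_size;
};

// Returns one message per violated constraint; an empty result means all checks passed.
inline std::vector<std::string> check_strided_args(layout l, const strided_args &p) {
    std::vector<std::string> errors;
    // Leading dimensions are bounded by m (column major) or n (row major).
    const std::int64_t min_ld = (l == layout::column_major) ? p.m : p.n;
    // stridec is bounded by ldc*n (column major) or ldc*m (row major).
    const std::int64_t k = (l == layout::column_major) ? p.n : p.m;

    if (p.m < 0) errors.push_back("m must be at least zero");
    if (p.n < 0) errors.push_back("n must be at least zero");
    if (p.lda < 1 || p.lda < min_ld) errors.push_back("lda is too small");
    if (p.ldc < 1 || p.ldc < min_ld) errors.push_back("ldc is too small");
    if (p.stridea < 0) errors.push_back("stridea must be at least zero");
    if (p.stridex < 0) errors.push_back("stridex must be at least zero");
    if (p.stridec < p.ldc * k) errors.push_back("stridec is too small");
    if (p.batch_size < 0) errors.push_back("batch_size must be at least zero");
    return errors;
}
```

The same set of arguments can be valid for one layout and invalid for the other, because the bounds on lda, ldc, and stridec depend on the layout.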