syrk_batch
Computes a group of syrk operations.
Description
The syrk_batch routines are batched versions of syrk, performing multiple syrk operations in a single call. Each syrk operation performs a rank-k update of a symmetric matrix C using a general matrix A.
syrk_batch supports the following precisions:
T |
---|
float |
double |
std::complex<float> |
std::complex<double> |
syrk_batch (Buffer Version)
The buffer version of syrk_batch supports only the strided API.
Strided API
Strided API operation is defined as:
for i = 0 … batch_size - 1
    A and C are matrices at offset i * stridea and i * stridec in a and c.
    C = alpha * op(A) * op(A)^T + beta * C
end for
where:
- op(X) is one of op(X) = X, op(X) = X^T, or op(X) = X^H
- alpha and beta are scalars
- A is a general matrix and C is a symmetric matrix
- op(A) is n x k and C is n x n
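As a hedged illustration of the loop above, the following plain C++ sketch spells out the strided operation for one special case: column-major storage, op(A) = A, and the lower triangle of C. It uses std::vector in place of SYCL buffers, and `syrk_batch_ref` is a hypothetical helper name introduced here, not a oneMKL routine.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reference loop for the strided operation, restricted to column-major
// storage, op(A) = A, and the lower triangle of C.
// syrk_batch_ref is a hypothetical helper, not part of oneMKL.
template <typename T>
void syrk_batch_ref(std::int64_t n, std::int64_t k, T alpha,
                    const std::vector<T>& a, std::int64_t lda,
                    std::int64_t stridea, T beta,
                    std::vector<T>& c, std::int64_t ldc,
                    std::int64_t stridec, std::int64_t batch_size) {
    assert(lda >= n && ldc >= n);
    for (std::int64_t b = 0; b < batch_size; ++b) {
        const T* A = a.data() + b * stridea;  // n x k matrix of this batch entry
        T* C = c.data() + b * stridec;        // n x n matrix of this batch entry
        for (std::int64_t j = 0; j < n; ++j) {
            for (std::int64_t i = j; i < n; ++i) {  // lower triangle only
                T sum = T(0);
                for (std::int64_t p = 0; p < k; ++p) {
                    sum += A[i + p * lda] * A[j + p * lda];
                }
                C[i + j * ldc] = alpha * sum + beta * C[i + j * ldc];
            }
        }
    }
}
```

Only the selected triangle of each C is written; the other triangle is left untouched, which matches the usual BLAS syrk convention.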
For the strided API, the a and c buffers contain all the input matrices. The stride between consecutive matrices is given by the stridea and stridec parameters. The total number of matrices in the a and c buffers is given by the batch_size parameter.
Syntax
namespace oneapi::mkl::blas::column_major {
void syrk_batch(sycl::queue &queue,
oneapi::mkl::uplo upper_lower,
oneapi::mkl::transpose trans,
std::int64_t n,
std::int64_t k,
T alpha,
sycl::buffer<T,1> &a,
std::int64_t lda,
std::int64_t stridea,
T beta,
sycl::buffer<T,1> &c,
std::int64_t ldc,
std::int64_t stridec,
std::int64_t batch_size)
}
namespace oneapi::mkl::blas::row_major {
void syrk_batch(sycl::queue &queue,
oneapi::mkl::uplo upper_lower,
oneapi::mkl::transpose trans,
std::int64_t n,
std::int64_t k,
T alpha,
sycl::buffer<T,1> &a,
std::int64_t lda,
std::int64_t stridea,
T beta,
sycl::buffer<T,1> &c,
std::int64_t ldc,
std::int64_t stridec,
std::int64_t batch_size)
}
Input Parameters
- queue
- The queue where the routine should be executed.
- upper_lower
- Specifies whether matrices C are upper or lower triangular. See Data Types for more details.
- trans
- Specifies op(A), the transposition operation applied to matrices A. Conjugation is never performed, even if trans = transpose::conjtrans. See Data Types for more details.
- n
- Number of rows and columns of matrices C. Must be at least zero.
- k
- Number of columns of matrices op(A). Must be at least zero.
- alpha
- Scaling factor for rank-k update.
- a
- Buffer holding input matrices A. Size of the buffer must be at least stridea * batch_size.
- lda
- Leading dimension of matrices A. Must be positive.
Layout | trans = transpose::nontrans | trans = transpose::trans or trans = transpose::conjtrans |
---|---|---|
Column major | Must be at least n | Must be at least k |
Row major | Must be at least k | Must be at least n |
- stridea
- Stride between two consecutive A matrices.
Layout | trans = transpose::nontrans | trans = transpose::trans or trans = transpose::conjtrans |
---|---|---|
Column major | Must be at least lda * k | Must be at least lda * n |
Row major | Must be at least lda * n | Must be at least lda * k |
- beta
- Scaling factor for matricesC.
- c
- Buffer holding input/output matrices C. Size of the buffer must be at least stridec * batch_size.
- ldc
- Leading dimension of matrices C. Must be positive and at least n.
- stridec
- Stride between two consecutive C matrices. Must be at least ldc * n.
- batch_size
- Specifies the number of syrk operations to perform.
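The lda, stridea, ldc, and stridec bounds listed above can be collected into one small helper. The sketch below is plain C++ with no oneMKL dependency. `Layout`, `Trans`, `StridedLimits`, and `strided_limits` are hypothetical names introduced for illustration; they stand in for the choice of namespace (column_major or row_major) and for oneapi::mkl::transpose.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-ins for the namespace choice and oneapi::mkl::transpose.
enum class Layout { column_major, row_major };
enum class Trans { nontrans, trans, conjtrans };

struct StridedLimits {
    std::int64_t min_lda;      // smallest valid leading dimension of A
    std::int64_t min_stridea;  // smallest valid stridea for the given lda
    std::int64_t min_ldc;      // smallest valid leading dimension of C
    std::int64_t min_stridec;  // smallest valid stridec for the given ldc
};

inline StridedLimits strided_limits(Layout layout, Trans trans,
                                    std::int64_t n, std::int64_t k,
                                    std::int64_t lda, std::int64_t ldc) {
    // A is stored as n x k when op(A) = A, and as k x n otherwise.
    const bool no_trans = (trans == Trans::nontrans);
    const std::int64_t rows = no_trans ? n : k;
    const std::int64_t cols = no_trans ? k : n;
    const bool col_major = (layout == Layout::column_major);

    StridedLimits lim;
    // Leading dimensions must be positive even when n or k is zero.
    lim.min_lda = std::max<std::int64_t>(1, col_major ? rows : cols);
    lim.min_stridea = lda * (col_major ? cols : rows);
    lim.min_ldc = std::max<std::int64_t>(1, n);
    lim.min_stridec = ldc * n;
    return lim;
}
```

A caller could compare its own lda, stridea, ldc, and stridec against these values before submitting the batch, and size the a and c buffers as stridea * batch_size and stridec * batch_size elements.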
Output Parameters
- c
- Output buffer overwritten by batch_size syrk operations of the form alpha * op(A) * op(A)^T + beta * C.
syrk_batch (USM Version)
The USM version of syrk_batch supports the group API and the strided API.
Group API
Group API operation is defined as:
idx = 0
for i = 0 … group_count - 1
    for j = 0 … group_size[i] - 1
        A and C are matrices in a[idx] and c[idx]
        C = alpha[i] * op(A) * op(A)^T + beta[i] * C
        idx = idx + 1
    end for
end for
where:
- op(X) is one of op(X) = X, op(X) = X^T, or op(X) = X^H
- alpha and beta are scalars
- A is a general matrix and C is a symmetric matrix
- op(A) is n x k and C is n x n
For the group API, the a and c arrays contain the pointers for all the input matrices. The total number of matrices in a and c is given by:
total_batch_count = group_size[0] + group_size[1] + … + group_size[group_count - 1]
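Because idx in the loop above advances once per matrix, the number of pointers that a and c must hold equals the sum of the group_size entries. The following plain C++ sketch computes that count and maps a flat matrix index back to its group. `total_batch_count` and `group_of` are hypothetical helper names used here for illustration, not oneMKL functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Sum of all group sizes: the number of pointers expected in a and c.
inline std::int64_t total_batch_count(const std::vector<std::int64_t>& group_size) {
    return std::accumulate(group_size.begin(), group_size.end(), std::int64_t{0});
}

// Group that owns the matrix at flat index idx, following the nested loop
// of the group API definition. Returns -1 when idx is out of range.
inline std::int64_t group_of(const std::vector<std::int64_t>& group_size,
                             std::int64_t idx) {
    if (idx < 0) return -1;
    std::int64_t first = 0;  // flat index of the first matrix in the current group
    for (std::size_t g = 0; g < group_size.size(); ++g) {
        if (idx < first + group_size[g]) return static_cast<std::int64_t>(g);
        first += group_size[g];
    }
    return -1;
}
```

The group index returned by `group_of` is the one used to look up the per-group parameters (for example n[i], k[i], alpha[i], and beta[i]) for a given matrix pointer. Groups of size zero are skipped naturally.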
Syntax
namespace oneapi::mkl::blas::column_major {
sycl::event syrk_batch(sycl::queue &queue,
oneapi::mkl::uplo *upper_lower,
oneapi::mkl::transpose *trans,
std::int64_t *n,
std::int64_t *k,
T *alpha,
const T **a,
std::int64_t *lda,
T *beta,
T **c,
std::int64_t *ldc,
std::int64_t group_count,
std::int64_t *group_size,
const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
sycl::event syrk_batch(sycl::queue &queue,
oneapi::mkl::uplo *upper_lower,
oneapi::mkl::transpose *trans,
std::int64_t *n,
std::int64_t *k,
T *alpha,
const T **a,
std::int64_t *lda,
T *beta,
T **c,
std::int64_t *ldc,
std::int64_t group_count,
std::int64_t *group_size,
const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
- The queue where the routine should be executed.
- upper_lower
- Array of group_count oneapi::mkl::uplo values. upper_lower[i] specifies whether matrices C are upper or lower triangular in group i. See Data Types for more details.
- trans
- Array of group_count oneapi::mkl::transpose values. trans[i] specifies op(A), the transposition operation applied to matrices A in group i. See Data Types for more details.
- n
- Array of group_count integers. n[i] specifies the number of rows and columns of matrices C in group i. All entries must be at least zero.
- k
- Array of group_count integers. k[i] specifies the number of columns of matrices op(A) in group i. All entries must be at least zero.
- alpha
- Array of group_count scalar elements. alpha[i] specifies the scaling factor for every rank-k update in group i.
- a
- Array of total_batch_count pointers for input matrices A.
Layout | trans = transpose::nontrans | trans = transpose::trans or trans = transpose::conjtrans |
---|---|---|
Column major | Size of array A[i] must be at least lda[i] * k[i] | Size of array A[i] must be at least lda[i] * n[i] |
Row major | Size of array A[i] must be at least lda[i] * n[i] | Size of array A[i] must be at least lda[i] * k[i] |
- lda
- Array of group_count integers. lda[i] specifies the leading dimension of matrices A in group i. Must be positive.
Layout | trans = transpose::nontrans | trans = transpose::trans or trans = transpose::conjtrans |
---|---|---|
Column major | Must be at least n[i] | Must be at least k[i] |
Row major | Must be at least k[i] | Must be at least n[i] |
- beta
- Array of group_count scalar elements. beta[i] specifies the scaling factor for matrices C in group i.
- c
- Array of total_batch_count pointers for input/output matrices C. Size of array C[i] must be at least ldc[i] * n[i]. See Matrix Storage for more details.
- ldc
- Array of group_count integers. ldc[i] specifies the leading dimension of matrices C in group i. Must be positive.
- group_count
- Number of groups. Must be at least zero.
- group_size
- Array of group_count integers. group_size[i] specifies the number of syrk operations in group i. Each element in group_size must be at least zero.
- dependencies
- List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
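The lda bounds listed for a and lda follow from how element (i, p) of op(A) maps to memory under each layout and transposition setting. The sketch below makes that mapping explicit in plain C++. `Layout`, `Trans`, and `op_a_index` are hypothetical names introduced for illustration; they are not part of oneMKL.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the namespace choice and oneapi::mkl::transpose.
enum class Layout { column_major, row_major };
enum class Trans { nontrans, trans, conjtrans };

// Linear index of element (i, p) of op(A), where op(A) is n x k,
// 0 <= i < n and 0 <= p < k.
inline std::int64_t op_a_index(Layout layout, Trans trans, std::int64_t lda,
                               std::int64_t i, std::int64_t p) {
    // Element (i, p) of op(A) is element (r, c) of the stored matrix A.
    const std::int64_t r = (trans == Trans::nontrans) ? i : p;
    const std::int64_t c = (trans == Trans::nontrans) ? p : i;
    // Column major steps by lda between columns, row major between rows.
    return (layout == Layout::column_major) ? r + c * lda : r * lda + c;
}
```

In column-major storage lda spans one stored column, so it must cover the stored row count (n without transposition, k with it); in row-major storage it spans one stored row, which swaps the two bounds.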
Output Parameters
- c
- Array of pointers to output matrices C overwritten by total_batch_count syrk operations of the form alpha * op(A) * op(A)^T + beta * C.
Return Values
Output event to wait on to ensure computation is complete.
Strided API
Strided API operation is defined as:
for i = 0 … batch_size - 1
    A and C are matrices at offset i * stridea and i * stridec in a and c.
    C = alpha * op(A) * op(A)^T + beta * C
end for
where:
- op(X) is one of op(X) = X, op(X) = X^T, or op(X) = X^H
- alpha and beta are scalars
- A is a general matrix and C is a symmetric matrix
- op(A) is n x k and C is n x n
For the strided API, the a and c arrays contain all the input matrices. The stride between consecutive matrices is given by the stridea and stridec parameters. The total number of matrices in the a and c arrays is given by the batch_size parameter.
Syntax
namespace oneapi::mkl::blas::column_major {
sycl::event syrk_batch(sycl::queue &queue,
oneapi::mkl::uplo upper_lower,
oneapi::mkl::transpose trans,
std::int64_t n,
std::int64_t k,
T alpha,
const T *a,
std::int64_t lda,
std::int64_t stridea,
T beta,
T *c,
std::int64_t ldc,
std::int64_t stridec,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
sycl::event syrk_batch(sycl::queue &queue,
oneapi::mkl::uplo upper_lower,
oneapi::mkl::transpose trans,
std::int64_t n,
std::int64_t k,
T alpha,
const T *a,
std::int64_t lda,
std::int64_t stridea,
T beta,
T *c,
std::int64_t ldc,
std::int64_t stridec,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
- The queue where the routine should be executed.
- upper_lower
- Specifies whether matrices C are upper or lower triangular. See Data Types for more details.
- trans
- Specifies op(A), the transposition operation applied to matrices A. Conjugation is never performed, even if trans = transpose::conjtrans. See Data Types for more details.
- n
- Number of rows and columns of matrices C. Must be at least zero.
- k
- Number of columns of matrices op(A). Must be at least zero.
- alpha
- Scaling factor for rank-k update.
- a
- Pointer to input matrices A. Size of the array must be at least stridea * batch_size.
- lda
- Leading dimension of matrices A. Must be positive.
Layout | trans = transpose::nontrans | trans = transpose::trans or trans = transpose::conjtrans |
---|---|---|
Column major | Must be at least n | Must be at least k |
Row major | Must be at least k | Must be at least n |
- stridea
- Stride between two consecutive A matrices.
Layout | trans = transpose::nontrans | trans = transpose::trans or trans = transpose::conjtrans |
---|---|---|
Column major | Must be at least lda * k | Must be at least lda * n |
Row major | Must be at least lda * n | Must be at least lda * k |
- beta
- Scaling factor for matricesC.
- c
- Pointer to input/output matrices C. Size of the array must be at least stridec * batch_size.
- ldc
- Leading dimension of matrices C. Must be positive and at least n.
- stridec
- Stride between two consecutive C matrices. Must be at least ldc * n.
- batch_size
- Specifies the number of syrk operations to perform.
- dependencies
- List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- c
- Pointer to output matrices C overwritten by batch_size syrk operations of the form alpha * op(A) * op(A)^T + beta * C.
Return Values
Output event to wait on to ensure computation is complete.