gemm

Description

The gemm routines compute a scalar-matrix-matrix product and add the result to a scalar-matrix product, with general matrices. The operation is defined as:

C ← alpha * op(A)*op(B) + beta * C

where:

op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H
alpha and beta are scalars
A, B and C are matrices
op(A) is an m x k matrix
op(B) is a k x n matrix
C is an m x n matrix
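To make the definition concrete, the following is a minimal host-only sketch of the update C ← alpha * op(A)*op(B) + beta * C for column-major `float` data. It is not the oneMKL implementation and does not call oneMKL; the names `Op`, `op_at`, and `gemm_reference` are hypothetical and exist only for this illustration. Conjugate transposition is left out because the sketch handles real data only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for oneapi::mkl::transpose so the sketch builds without oneMKL.
enum class Op { nontrans, trans };

// Element (i, j) of op(X), where X is stored column-major with leading dimension ld.
inline float op_at(const std::vector<float> &x, std::int64_t ld, Op op,
                   std::int64_t i, std::int64_t j) {
    const std::int64_t idx = (op == Op::nontrans) ? i + j * ld : j + i * ld;
    return x[static_cast<std::size_t>(idx)];
}

// Reference update C <- alpha * op(A) * op(B) + beta * C (column-major storage).
void gemm_reference(Op transa, Op transb,
                    std::int64_t m, std::int64_t n, std::int64_t k,
                    float alpha, const std::vector<float> &a, std::int64_t lda,
                    const std::vector<float> &b, std::int64_t ldb,
                    float beta, std::vector<float> &c, std::int64_t ldc) {
    assert(m >= 0 && n >= 0 && k >= 0);
    for (std::int64_t j = 0; j < n; ++j) {
        for (std::int64_t i = 0; i < m; ++i) {
            // Dot product of row i of op(A) with column j of op(B).
            float acc = 0.0f;
            for (std::int64_t p = 0; p < k; ++p) {
                acc += op_at(a, lda, transa, i, p) * op_at(b, ldb, transb, p, j);
            }
            float &cij = c[static_cast<std::size_t>(i + j * ldc)];
            cij = alpha * acc + beta * cij;
        }
    }
}
```

A triple loop like this is useful mainly as a correctness oracle for small inputs when testing calls to the library routine.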

gemm supports the following precisions:

Ta (A matrix)	Tb (B matrix)	Tc (C matrix)	Ts (alpha/beta)
`sycl::half`	`sycl::half`	`sycl::half`	`sycl::half`
`sycl::half`	`sycl::half`	`float`	`float`
`oneapi::mkl::bfloat16`	`oneapi::mkl::bfloat16`	`oneapi::mkl::bfloat16`	`float`
`oneapi::mkl::bfloat16`	`oneapi::mkl::bfloat16`	`float`	`float`
`std::int8_t`	`std::int8_t`	`std::int32_t`	`float`
`std::int8_t`	`std::int8_t`	`float`	`float`
`float`	`float`	`float`	`float`
`double`	`double`	`double`	`double`
`std::complex<float>`	`std::complex<float>`	`std::complex<float>`	`std::complex<float>`
`std::complex<double>`	`std::complex<double>`	`std::complex<double>`	`std::complex<double>`

gemm (Buffer Version)

Syntax

namespace oneapi::mkl::blas::column_major {
    void gemm(sycl::queue &queue,
              oneapi::mkl::transpose transa,
              oneapi::mkl::transpose transb,
              std::int64_t m,
              std::int64_t n,
              std::int64_t k,
              Ts alpha,
              sycl::buffer<Ta,1> &a,
              std::int64_t lda,
              sycl::buffer<Tb,1> &b,
              std::int64_t ldb,
              Ts beta,
              sycl::buffer<Tc,1> &c,
              std::int64_t ldc,
              compute_mode mode = compute_mode::unset)
}

namespace oneapi::mkl::blas::row_major {
    void gemm(sycl::queue &queue,
              oneapi::mkl::transpose transa,
              oneapi::mkl::transpose transb,
              std::int64_t m,
              std::int64_t n,
              std::int64_t k,
              Ts alpha,
              sycl::buffer<Ta,1> &a,
              std::int64_t lda,
              sycl::buffer<Tb,1> &b,
              std::int64_t ldb,
              Ts beta,
              sycl::buffer<Tc,1> &c,
              std::int64_t ldc,
              compute_mode mode = compute_mode::unset)
}

Input Parameters

queue

The queue where the routine should be executed.

transa

Specifies op(A), the transposition operation applied to matrix A. See Data Types for more details.

transb

Specifies op(B), the transposition operation applied to matrix B. See Data Types for more details.

m

Number of rows of matrix op(A) and matrix C. Must be at least zero.

n

Number of columns of matrix op(B) and matrix C. Must be at least zero.

k

Number of columns of matrix op(A) and rows of matrix op(B). Must be at least zero.

alpha

Scaling factor for matrix-matrix product.

a

Buffer holding input matrix A. See Matrix Storage for more details.

	`transa` = `transpose::nontrans`	`transa` = `transpose::trans` or `transa` = `transpose::conjtrans`
Column major	`A` is `m` x `k` matrix. Size of array `a` must be at least `lda` * `k`	`A` is `k` x `m` matrix. Size of array `a` must be at least `lda` * `m`
Row major	`A` is `m` x `k` matrix. Size of array `a` must be at least `lda` * `m`	`A` is `k` x `m` matrix. Size of array `a` must be at least `lda` * `k`

lda

Leading dimension of matrix A. Must be positive.

	`transa` = `transpose::nontrans`	`transa` = `transpose::trans` or `transa` = `transpose::conjtrans`
Column major	Must be at least `m`	Must be at least `k`
Row major	Must be at least `k`	Must be at least `m`
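The two tables for `a` and `lda` can be folded into a pair of small helpers. The sketch below is host-only and purely illustrative: `Layout`, `Op`, `min_lda`, and `min_a_size` are hypothetical names, not part of oneMKL. It assumes only the rules stated in the tables above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins: Layout mirrors the column_major / row_major namespaces,
// Op mirrors oneapi::mkl::transpose.
enum class Layout { column_major, row_major };
enum class Op { nontrans, trans, conjtrans };

// Smallest lda accepted for the given layout and transposition of A.
std::int64_t min_lda(Layout layout, Op transa, std::int64_t m, std::int64_t k) {
    assert(m >= 0 && k >= 0);
    const bool transposed = (transa != Op::nontrans);
    // Stored A is m x k when not transposed, k x m otherwise.
    const std::int64_t rows = transposed ? k : m;
    const std::int64_t cols = transposed ? m : k;
    const std::int64_t contiguous = (layout == Layout::column_major) ? rows : cols;
    return contiguous > 1 ? contiguous : 1;  // lda must also be positive
}

// Smallest element count of array a for a given (already valid) lda.
std::int64_t min_a_size(Layout layout, Op transa, std::int64_t m, std::int64_t k,
                        std::int64_t lda) {
    const bool transposed = (transa != Op::nontrans);
    const std::int64_t rows = transposed ? k : m;
    const std::int64_t cols = transposed ? m : k;
    return lda * ((layout == Layout::column_major) ? cols : rows);
}
```

For example, with `m` = 3 and `k` = 5 in column-major layout and no transposition, `min_lda` gives 3, and an `lda` of 4 requires at least 20 elements in `a`.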

b

Buffer holding input matrix B. See Matrix Storage for more details.

	`transb` = `transpose::nontrans`	`transb` = `transpose::trans` or `transb` = `transpose::conjtrans`
Column major	`B` is `k` x `n` matrix. Size of array `b` must be at least `ldb` * `n`	`B` is `n` x `k` matrix. Size of array `b` must be at least `ldb` * `k`
Row major	`B` is `k` x `n` matrix. Size of array `b` must be at least `ldb` * `k`	`B` is `n` x `k` matrix. Size of array `b` must be at least `ldb` * `n`

ldb

Leading dimension of matrix B. Must be positive.

	`transb` = `transpose::nontrans`	`transb` = `transpose::trans` or `transb` = `transpose::conjtrans`
Column major	Must be at least `k`	Must be at least `n`
Row major	Must be at least `n`	Must be at least `k`

beta

Scaling factor for matrix C.

c

Buffer holding input/output matrix C. See Matrix Storage for more details.

Column major	`C` is `m` x `n` matrix. Size of array `c` must be at least `ldc` * `n`
Row major	`C` is `m` x `n` matrix. Size of array `c` must be at least `ldc` * `m`

ldc

Leading dimension of matrix C. Must be positive.

Column major	Must be at least `m`
Row major	Must be at least `n`
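The role of `ldc` is easiest to see as an index calculation. The sketch below assumes the usual leading-dimension convention (the stride between consecutive columns in column-major layout, or between consecutive rows in row-major layout); the authoritative description is in Matrix Storage. `Layout`, `c_index`, and `min_c_size` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in mirroring the column_major / row_major namespaces.
enum class Layout { column_major, row_major };

// Linear offset of element C(i, j) inside array c (assumed convention, see lead-in).
std::int64_t c_index(Layout layout, std::int64_t i, std::int64_t j, std::int64_t ldc) {
    assert(ldc > 0 && i >= 0 && j >= 0);
    return (layout == Layout::column_major) ? i + j * ldc : i * ldc + j;
}

// Smallest element count of array c, following the table for parameter c.
std::int64_t min_c_size(Layout layout, std::int64_t m, std::int64_t n, std::int64_t ldc) {
    return ldc * ((layout == Layout::column_major) ? n : m);
}
```

An `ldc` larger than the minimum leaves padding between columns (or rows), which is why the required array size is expressed in terms of `ldc` rather than `m` * `n`.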

mode

Optional. Compute mode settings. See Compute Modes for more details.

Output Parameters

c: Output buffer overwritten by alpha * op(A)*op(B) + beta * C.

NOTE:

If beta = 0, matrix C does not need to be initialized before calling gemm.
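One way to read this note: when beta is zero, the previous contents of C do not influence the result. The host-only sketch below illustrates that behavior by never reading C in the beta = 0 case, so NaN values standing in for uninitialized memory do not propagate. It is an assumption-laden illustration, not the oneMKL implementation; `gemm_beta_sketch` and `uninitialized_like` are hypothetical names, and the sketch covers only column-major, non-transposed `double` data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// NaN-filled vector standing in for uninitialized memory (illustration only).
std::vector<double> uninitialized_like(std::size_t count) {
    return std::vector<double>(count, std::numeric_limits<double>::quiet_NaN());
}

// C <- alpha * A * B + beta * C, column-major, no transposition.
// When beta == 0 the old value of C is not read at all.
void gemm_beta_sketch(std::int64_t m, std::int64_t n, std::int64_t k,
                      double alpha, const std::vector<double> &a, std::int64_t lda,
                      const std::vector<double> &b, std::int64_t ldb,
                      double beta, std::vector<double> &c, std::int64_t ldc) {
    assert(m >= 0 && n >= 0 && k >= 0);
    for (std::int64_t j = 0; j < n; ++j) {
        for (std::int64_t i = 0; i < m; ++i) {
            double acc = 0.0;
            for (std::int64_t p = 0; p < k; ++p) {
                acc += a[static_cast<std::size_t>(i + p * lda)] *
                       b[static_cast<std::size_t>(p + j * ldb)];
            }
            const std::size_t idx = static_cast<std::size_t>(i + j * ldc);
            // Skip the read of C entirely when beta is zero.
            c[idx] = (beta == 0.0) ? alpha * acc : alpha * acc + beta * c[idx];
        }
    }
}
```

A naive `alpha * acc + beta * c[idx]` would yield NaN here, because 0 * NaN is NaN in IEEE arithmetic; the explicit branch is what makes uninitialized input harmless in this sketch.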

Examples

An example of how to use the buffer version of gemm can be found in the oneMKL installation directory, under:

examples/dpcpp/blas/source/gemm.cpp

gemm (USM Version)

Syntax

namespace oneapi::mkl::blas::column_major {
    sycl::event gemm(sycl::queue &queue,
                     oneapi::mkl::transpose transa,
                     oneapi::mkl::transpose transb,
                     std::int64_t m,
                     std::int64_t n,
                     std::int64_t k,
                     Ts alpha,
                     const Ta *a,
                     std::int64_t lda,
                     const Tb *b,
                     std::int64_t ldb,
                     Ts beta,
                     Tc *c,
                     std::int64_t ldc,
                     compute_mode mode = compute_mode::unset,
                     const std::vector<sycl::event> &dependencies = {})
}

namespace oneapi::mkl::blas::row_major {
    sycl::event gemm(sycl::queue &queue,
                     oneapi::mkl::transpose transa,
                     oneapi::mkl::transpose transb,
                     std::int64_t m,
                     std::int64_t n,
                     std::int64_t k,
                     Ts alpha,
                     const Ta *a,
                     std::int64_t lda,
                     const Tb *b,
                     std::int64_t ldb,
                     Ts beta,
                     Tc *c,
                     std::int64_t ldc,
                     compute_mode mode = compute_mode::unset,
                     const std::vector<sycl::event> &dependencies = {})
}

Input Parameters

queue

The queue where the routine should be executed.

transa

Specifies op(A), the transposition operation applied to matrix A. See Data Types for more details.

transb

Specifies op(B), the transposition operation applied to matrix B. See Data Types for more details.

m

Number of rows of matrix op(A) and matrix C. Must be at least zero.

n

Number of columns of matrix op(B) and matrix C. Must be at least zero.

k

Number of columns of matrix op(A) and rows of matrix op(B). Must be at least zero.

alpha

Scaling factor for matrix-matrix product.

a

Pointer to input matrix A. See Matrix Storage for more details.

	`A` not transposed	`A` transposed
Column major	`A` is `m` x `k` matrix. Size of array `a` must be at least `lda` * `k`	`A` is `k` x `m` matrix. Size of array `a` must be at least `lda` * `m`
Row major	`A` is `m` x `k` matrix. Size of array `a` must be at least `lda` * `m`	`A` is `k` x `m` matrix. Size of array `a` must be at least `lda` * `k`

lda

Leading dimension of matrix A. Must be positive.

	`A` not transposed	`A` transposed
Column major	Must be at least `m`	Must be at least `k`
Row major	Must be at least `k`	Must be at least `m`

b

Pointer to input matrix B. See Matrix Storage for more details.

	`B` not transposed	`B` transposed
Column major	`B` is `k` x `n` matrix. Size of array `b` must be at least `ldb` * `n`	`B` is `n` x `k` matrix. Size of array `b` must be at least `ldb` * `k`
Row major	`B` is `k` x `n` matrix. Size of array `b` must be at least `ldb` * `k`	`B` is `n` x `k` matrix. Size of array `b` must be at least `ldb` * `n`

ldb

Leading dimension of matrix B. Must be positive.

	`B` not transposed	`B` transposed
Column major	Must be at least `k`	Must be at least `n`
Row major	Must be at least `n`	Must be at least `k`

beta

Scaling factor for matrix C.

c

Pointer to input/output matrix C. See Matrix Storage for more details.

Column major	`C` is `m` x `n` matrix. Size of array `c` must be at least `ldc` * `n`
Row major	`C` is `m` x `n` matrix. Size of array `c` must be at least `ldc` * `m`

ldc

Leading dimension of matrix C. Must be positive.

Column major	Must be at least `m`
Row major	Must be at least `n`
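Because the USM version takes raw pointers, it can help to check the dimension arguments on the host before submitting a call. The sketch below encodes the column-major constraints listed above for `m`, `n`, `k`, `lda`, `ldb`, and `ldc`. It is a hypothetical helper, not a oneMKL function: `Op`, `gemm_dims_valid_column_major`, and `check_gemm_dims` are names introduced for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for oneapi::mkl::transpose.
enum class Op { nontrans, trans, conjtrans };

// True when the arguments satisfy the documented column-major constraints.
bool gemm_dims_valid_column_major(Op transa, Op transb,
                                  std::int64_t m, std::int64_t n, std::int64_t k,
                                  std::int64_t lda, std::int64_t ldb, std::int64_t ldc) {
    if (m < 0 || n < 0 || k < 0) return false;           // sizes must be at least zero
    if (lda <= 0 || ldb <= 0 || ldc <= 0) return false;  // leading dimensions must be positive
    // Column-major: the leading dimension bounds the row count of the stored matrix.
    const std::int64_t a_rows = (transa == Op::nontrans) ? m : k;
    const std::int64_t b_rows = (transb == Op::nontrans) ? k : n;
    return lda >= a_rows && ldb >= b_rows && ldc >= m;
}

// Debug-build guard a caller might place just before the gemm call.
inline void check_gemm_dims(Op transa, Op transb,
                            std::int64_t m, std::int64_t n, std::int64_t k,
                            std::int64_t lda, std::int64_t ldb, std::int64_t ldc) {
    assert(gemm_dims_valid_column_major(transa, transb, m, n, k, lda, ldb, ldc));
    (void)transa; (void)transb; (void)m; (void)n; (void)k;
    (void)lda; (void)ldb; (void)ldc;  // silence unused warnings when NDEBUG is set
}
```

A row-major variant would swap the roles of rows and columns in the three comparisons, following the "Row major" rows of the tables.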

mode

Optional. Compute mode settings. See Compute Modes for more details.

dependencies

Optional. List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.

mode and dependencies may be omitted independently; it is not necessary to specify mode in order to provide dependencies.

Output Parameters

c: Pointer to output matrix overwritten by alpha * op(A)*op(B) + beta * C.

NOTE:

If beta = 0, matrix C does not need to be initialized before calling gemm.

Return Values

Output event to wait on to ensure computation is complete.

Examples

An example of how to use the USM version of gemm can be found in the oneMKL installation directory, under:

examples/dpcpp/blas/source/gemm_usm.cpp
