Developer Guide and Reference

  • 2022.1
  • 04/11/2022
  • Public Content

BLAS functions

Overview

A subset of Basic Linear Algebra Subprograms (BLAS) functions that perform matrix-matrix multiplication.
// global functions

status dnnl::sgemm(
    char transa, char transb,
    dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha,
    const float* A, dnnl_dim_t lda,
    const float* B, dnnl_dim_t ldb,
    float beta,
    float* C, dnnl_dim_t ldc
);

status dnnl::gemm_u8s8s32(
    char transa, char transb, char offsetc,
    dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha,
    const uint8_t* A, dnnl_dim_t lda, uint8_t ao,
    const int8_t* B, dnnl_dim_t ldb, int8_t bo,
    float beta,
    int32_t* C, dnnl_dim_t ldc,
    const int32_t* co
);

status dnnl::gemm_s8s8s32(
    char transa, char transb, char offsetc,
    dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha,
    const int8_t* A, dnnl_dim_t lda, int8_t ao,
    const int8_t* B, dnnl_dim_t ldb, int8_t bo,
    float beta,
    int32_t* C, dnnl_dim_t ldc,
    const int32_t* co
);

dnnl_status_t DNNL_API dnnl_sgemm(
    char transa, char transb,
    dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha,
    const float* A, dnnl_dim_t lda,
    const float* B, dnnl_dim_t ldb,
    float beta,
    float* C, dnnl_dim_t ldc
);

dnnl_status_t DNNL_API dnnl_gemm_u8s8s32(
    char transa, char transb, char offsetc,
    dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha,
    const uint8_t* A, dnnl_dim_t lda, uint8_t ao,
    const int8_t* B, dnnl_dim_t ldb, int8_t bo,
    float beta,
    int32_t* C, dnnl_dim_t ldc,
    const int32_t* co
);

dnnl_status_t DNNL_API dnnl_gemm_s8s8s32(
    char transa, char transb, char offsetc,
    dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha,
    const int8_t* A, dnnl_dim_t lda, int8_t ao,
    const int8_t* B, dnnl_dim_t ldb, int8_t bo,
    float beta,
    int32_t* C, dnnl_dim_t ldc,
    const int32_t* co
);

Detailed Documentation

A subset of Basic Linear Algebra Subprograms (BLAS) functions that perform matrix-matrix multiplication.
Global Functions
status dnnl::sgemm( char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float* A, dnnl_dim_t lda, const float* B, dnnl_dim_t ldb, float beta, float* C, dnnl_dim_t ldc )
Performs single-precision matrix-matrix multiply.
The operation is defined as:
C := alpha * op( A ) * op( B ) + beta * C
where
  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix.
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
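The definition above can be made concrete with a naive reference loop. The following is a minimal sketch, not the library's implementation: ref_sgemm is a hypothetical name, std::int64_t stands in for dnnl_dim_t, and the only behavior modeled is the formula, the transposition flags, and row-major indexing through the leading dimensions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical reference helper, NOT the oneDNN implementation.
// Computes C := alpha * op(A) * op(B) + beta * C for row-major storage.
// Returns false when a transposition flag is not one of N, n, T, t.
inline bool ref_sgemm(char transa, char transb,
                      std::int64_t M, std::int64_t N, std::int64_t K,
                      float alpha, const float* A, std::int64_t lda,
                      const float* B, std::int64_t ldb,
                      float beta, float* C, std::int64_t ldc) {
    auto is_n = [](char c) { return c == 'N' || c == 'n'; };
    auto is_t = [](char c) { return c == 'T' || c == 't'; };
    if (!(is_n(transa) || is_t(transa))) return false;
    if (!(is_n(transb) || is_t(transb))) return false;
    const bool ta = is_t(transa);
    const bool tb = is_t(transb);
    for (std::int64_t i = 0; i < M; ++i) {
        for (std::int64_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::int64_t k = 0; k < K; ++k) {
                // A is stored as MxK when not transposed and as KxM otherwise.
                const float a = ta ? A[k * lda + i] : A[i * lda + k];
                // B is stored as KxN when not transposed and as NxK otherwise.
                const float b = tb ? B[j * ldb + k] : B[k * ldb + j];
                acc += a * b;
            }
            C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
        }
    }
    return true;
}
```

With a 2x3 matrix A = {1, 2, 3, 4, 5, 6} and a 3x2 matrix B = {7, 8, 9, 10, 11, 12}, calling the sketch with alpha = 1 and beta = 0 fills C with {58, 64, 139, 154}. Passing the transposed storage of A with transa = 'T' and lda = 2 gives the same product.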
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Parameters:
transa
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed.
transb
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed.
M
The M dimension.
N
The N dimension.
K
The K dimension.
alpha
The alpha parameter that is used to scale the product of matrices A and B.
A
A pointer to the A matrix data.
lda
The leading dimension for the matrix A.
B
A pointer to the B matrix data.
ldb
The leading dimension for the matrix B.
beta
The beta parameter that is used to scale the matrix C.
C
A pointer to the C matrix data.
ldc
The leading dimension for the matrix C.
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
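The leading dimensions lda, ldb, and ldc in the parameter list above are the row strides of the stored matrices. The sketch below shows one way to sanity-check them before a call. It assumes, from the row-major storage rule, that each leading dimension must be at least the number of columns of the matrix as stored (and at least 1). The helper names are hypothetical and this is not the library's own validation code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest valid row stride for a row-major matrix
// that has `cols` columns as stored in memory.
inline std::int64_t min_ld(std::int64_t cols) {
    return std::max<std::int64_t>(1, cols);
}

// Hypothetical pre-call check for sgemm-style arguments (a sketch under the
// assumptions stated in the lead-in, not oneDNN's validation logic).
inline bool sgemm_args_ok(char transa, char transb,
                          std::int64_t M, std::int64_t N, std::int64_t K,
                          std::int64_t lda, std::int64_t ldb, std::int64_t ldc) {
    const bool ta = (transa == 'T' || transa == 't');
    const bool tb = (transb == 'T' || transb == 't');
    if (!ta && transa != 'N' && transa != 'n') return false;
    if (!tb && transb != 'N' && transb != 'n') return false;
    if (M < 0 || N < 0 || K < 0) return false;
    // A is stored as MxK (rows of length K) when not transposed,
    // and as KxM (rows of length M) when transposed.
    const std::int64_t cols_a = ta ? M : K;
    // B is stored as KxN when not transposed, and as NxK when transposed.
    const std::int64_t cols_b = tb ? K : N;
    return lda >= min_ld(cols_a) && ldb >= min_ld(cols_b) && ldc >= min_ld(N);
}
```

A leading dimension larger than the column count is useful when the operand is a sub-block of a bigger row-major buffer: the pointer addresses the first element of the block and the leading dimension is the row length of the enclosing buffer.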
status dnnl::gemm_u8s8s32( char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t* A, dnnl_dim_t lda, uint8_t ao, const int8_t* B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co )
Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
The operation is defined as:
C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
where
  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix.
  • A_offset is an MxK matrix with every element equal to the ao value,
  • B_offset is a KxN matrix with every element equal to the bo value,
  • C_offset is an MxN matrix defined by the co array of size len:
    • if offsetc = F: len must be at least 1,
    • if offsetc = C: len must be at least max(1, M),
    • if offsetc = R: len must be at least max(1, N).
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
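The integer formula, including the three offsets, can be written out as a naive reference loop. This is a sketch under stated assumptions, not the library's kernel: ref_gemm_u8s8s32 is a hypothetical name, std::int64_t stands in for dnnl_dim_t, only the uppercase offsetc flags listed above are accepted, offsetc = C is taken to select co[i] for row index i and offsetc = R to select co[j] for column index j (inferred from the required lengths of co), and the final clamp-then-round step is an assumption about how the floating-point result is converted to int32_t.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical reference loop, NOT the oneDNN kernel.
// C := alpha * (op(A) - ao) * (op(B) - bo) + beta * C + C_offset, row-major.
// Returns false for an offsetc flag other than 'F', 'C', or 'R'.
inline bool ref_gemm_u8s8s32(char transa, char transb, char offsetc,
        std::int64_t M, std::int64_t N, std::int64_t K, float alpha,
        const std::uint8_t* A, std::int64_t lda, std::uint8_t ao,
        const std::int8_t* B, std::int64_t ldb, std::int8_t bo,
        float beta, std::int32_t* C, std::int64_t ldc,
        const std::int32_t* co) {
    assert(A && B && C && co);
    if (offsetc != 'F' && offsetc != 'C' && offsetc != 'R') return false;
    // Any flag other than 'T'/'t' is treated as "not transposed" in this sketch.
    const bool ta = (transa == 'T' || transa == 't');
    const bool tb = (transb == 'T' || transb == 't');
    const double lo = std::numeric_limits<std::int32_t>::min();
    const double hi = std::numeric_limits<std::int32_t>::max();
    for (std::int64_t i = 0; i < M; ++i) {
        for (std::int64_t j = 0; j < N; ++j) {
            std::int64_t acc = 0;
            for (std::int64_t k = 0; k < K; ++k) {
                const std::int32_t a = (ta ? A[k * lda + i] : A[i * lda + k]) - ao;
                const std::int32_t b = (tb ? B[j * ldb + k] : B[k * ldb + j]) - bo;
                acc += static_cast<std::int64_t>(a) * b;
            }
            // Assumption: 'C' uses co[i] (M entries), 'R' uses co[j] (N entries).
            const std::int32_t off =
                    offsetc == 'F' ? co[0] : (offsetc == 'C' ? co[i] : co[j]);
            double val = static_cast<double>(alpha) * static_cast<double>(acc)
                    + static_cast<double>(beta) * static_cast<double>(C[i * ldc + j])
                    + static_cast<double>(off);
            // Assumption: clamp to the int32_t range, then round to nearest.
            val = std::max(lo, std::min(hi, val));
            C[i * ldc + j] = static_cast<std::int32_t>(std::nearbyint(val));
        }
    }
    return true;
}
```

For A = {1, 2, 3, 4} with ao = 1 and B = {1, -1, 2, 0} with bo = 0 (both 2x2), alpha = 1 and beta = 0, the product before the C offset is {2, 0, 8, -2}. With offsetc = F and co = {10} the sketch yields {12, 10, 18, 8}.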
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
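To give a feel for what intermediate saturation means, the sketch below emulates one possible mechanism. The assumption, which is not stated on this page, is an instruction that adds two adjacent u8 x s8 products into a signed 16-bit lane and saturates the sum. The function name is hypothetical and the code is plain scalar C++, not an intrinsic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical scalar emulation of a pairwise u8*s8 multiply-add whose
// 16-bit intermediate result saturates instead of wrapping or widening.
inline std::int16_t pair_madd_sat16(std::uint8_t a0, std::int8_t b0,
                                    std::uint8_t a1, std::int8_t b1) {
    // Exact sum of the two products, computed in 32 bits.
    const std::int32_t exact = static_cast<std::int32_t>(a0) * b0
            + static_cast<std::int32_t>(a1) * b1;
    const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
    const std::int32_t hi = std::numeric_limits<std::int16_t>::max();
    // Saturate to the int16_t range.
    return static_cast<std::int16_t>(std::max(lo, std::min(hi, exact)));
}
```

With a0 = a1 = 255 and b0 = b1 = 127 the exact sum is 64770, but the saturated intermediate is 32767, so the final int32 accumulator would be wrong even though it has room for the exact value. Small inputs such as (10, 5, 20, -3) are unaffected and give -10.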
Parameters:
transa
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed.
transb
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed.
offsetc
Flag specifying how offsets should be applied to matrix C:
  • ‘F’ means that the same offset is applied to every element of the matrix C,
  • ‘C’ means that an individual offset is applied to each element within each column,
  • ‘R’ means that an individual offset is applied to each element within each row.
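Because the required size of the co array depends on offsetc, it can help to compute that size in one place when allocating the array. The helper below is a hypothetical sketch that encodes the length rule given in the operation definition. It accepts only the uppercase flags listed here and returns -1 for anything else.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: minimum number of elements the `co` array must hold
// for a given offsetc flag, or -1 if the flag is not 'F', 'C', or 'R'.
inline std::int64_t required_co_len(char offsetc, std::int64_t M, std::int64_t N) {
    switch (offsetc) {
        case 'F': return 1;                            // one shared offset
        case 'C': return std::max<std::int64_t>(1, M); // at least max(1, M)
        case 'R': return std::max<std::int64_t>(1, N); // at least max(1, N)
        default: return -1;                            // unknown flag
    }
}
```

For a 4x5 result, required_co_len returns 1, 4, and 5 for the flags F, C, and R respectively, and never less than 1 even when a dimension is zero.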
M
The M dimension.
N
The N dimension.
K
The K dimension.
alpha
The alpha parameter that is used to scale the product of matrices A and B.
A
A pointer to the A matrix data.
lda
The leading dimension for the matrix A.
ao
The offset value for the matrix A.
B
A pointer to the B matrix data.
ldb
The leading dimension for the matrix B.
bo
The offset value for the matrix B.
beta
The beta parameter that is used to scale the matrix C.
C
A pointer to the C matrix data.
ldc
The leading dimension for the matrix C.
co
An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
status dnnl::gemm_s8s8s32( char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t* A, dnnl_dim_t lda, int8_t ao, const int8_t* B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co )
Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
The operation is defined as:
C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
where
  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix.
  • A_offset is an MxK matrix with every element equal to the ao value,
  • B_offset is a KxN matrix with every element equal to the bo value,
  • C_offset is an MxN matrix defined by the co array of size len:
    • if offsetc = F: len must be at least 1,
    • if offsetc = C: len must be at least max(1, M),
    • if offsetc = R: len must be at least max(1, N).
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
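Since A_offset and B_offset are constant matrices, each element of (op(A) - A_offset) * (op(B) - B_offset) can be expanded as sum(a*b) - ao*sum(b) - bo*sum(a) + K*ao*bo, where the sums run over the K dimension. The sketch below checks this arithmetic identity for a single row-column pair. The function names are hypothetical, and nothing here claims that the library evaluates the expression in this form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Direct evaluation of sum_k (a[k] - ao) * (b[k] - bo).
inline std::int64_t dot_with_offsets(const std::vector<std::int8_t>& a, std::int8_t ao,
                                     const std::vector<std::int8_t>& b, std::int8_t bo) {
    assert(a.size() == b.size());
    std::int64_t acc = 0;
    for (std::size_t k = 0; k < a.size(); ++k) {
        acc += (static_cast<std::int64_t>(a[k]) - ao)
                * (static_cast<std::int64_t>(b[k]) - bo);
    }
    return acc;
}

// Same value from the expanded form:
// sum(a*b) - ao * sum(b) - bo * sum(a) + K * ao * bo.
inline std::int64_t dot_expanded(const std::vector<std::int8_t>& a, std::int8_t ao,
                                 const std::vector<std::int8_t>& b, std::int8_t bo) {
    assert(a.size() == b.size());
    std::int64_t ab = 0, sum_a = 0, sum_b = 0;
    for (std::size_t k = 0; k < a.size(); ++k) {
        ab += static_cast<std::int64_t>(a[k]) * b[k];
        sum_a += a[k];
        sum_b += b[k];
    }
    const std::int64_t K = static_cast<std::int64_t>(a.size());
    return ab - static_cast<std::int64_t>(ao) * sum_b
            - static_cast<std::int64_t>(bo) * sum_a
            + K * static_cast<std::int64_t>(ao) * bo;
}
```

For a = {-128, 5, 127} with ao = 3 and b = {7, -9, 100} with bo = -2, both functions return 11455.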
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters:
transa
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed.
transb
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed.
offsetc
Flag specifying how offsets should be applied to matrix C:
  • ‘F’ means that the same offset is applied to every element of the matrix C,
  • ‘C’ means that an individual offset is applied to each element within each column,
  • ‘R’ means that an individual offset is applied to each element within each row.
M
The M dimension.
N
The N dimension.
K
The K dimension.
alpha
The alpha parameter that is used to scale the product of matrices A and B.
A
A pointer to the A matrix data.
lda
The leading dimension for the matrix A.
ao
The offset value for the matrix A.
B
A pointer to the B matrix data.
ldb
The leading dimension for the matrix B.
bo
The offset value for the matrix B.
beta
The beta parameter that is used to scale the matrix C.
C
A pointer to the C matrix data.
ldc
The leading dimension for the matrix C.
co
An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
dnnl_status_t DNNL_API dnnl_sgemm( char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float* A, dnnl_dim_t lda, const float* B, dnnl_dim_t ldb, float beta, float* C, dnnl_dim_t ldc )
Performs single-precision matrix-matrix multiply.
The operation is defined as:
C := alpha * op( A ) * op( B ) + beta * C
where
  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix.
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Parameters:
transa
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed.
transb
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed.
M
The M dimension.
N
The N dimension.
K
The K dimension.
alpha
The alpha parameter that is used to scale the product of matrices A and B.
A
A pointer to the A matrix data.
lda
The leading dimension for the matrix A.
B
A pointer to the B matrix data.
ldb
The leading dimension for the matrix B.
beta
The beta parameter that is used to scale the matrix C.
C
A pointer to the C matrix data.
ldc
The leading dimension for the matrix C.
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
dnnl_status_t DNNL_API dnnl_gemm_u8s8s32( char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t* A, dnnl_dim_t lda, uint8_t ao, const int8_t* B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co )
Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
The operation is defined as:
C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
where
  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix.
  • A_offset is an MxK matrix with every element equal to the ao value,
  • B_offset is a KxN matrix with every element equal to the bo value,
  • C_offset is an MxN matrix defined by the co array of size len:
    • if offsetc = F: len must be at least 1,
    • if offsetc = C: len must be at least max(1, M),
    • if offsetc = R: len must be at least max(1, N).
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters:
transa
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed.
transb
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed.
offsetc
Flag specifying how offsets should be applied to matrix C:
  • ‘F’ means that the same offset is applied to every element of the matrix C,
  • ‘C’ means that an individual offset is applied to each element within each column,
  • ‘R’ means that an individual offset is applied to each element within each row.
M
The M dimension.
N
The N dimension.
K
The K dimension.
alpha
The alpha parameter that is used to scale the product of matrices A and B.
A
A pointer to the A matrix data.
lda
The leading dimension for the matrix A.
ao
The offset value for the matrix A.
B
A pointer to the B matrix data.
ldb
The leading dimension for the matrix B.
bo
The offset value for the matrix B.
beta
The beta parameter that is used to scale the matrix C.
C
A pointer to the C matrix data.
ldc
The leading dimension for the matrix C.
co
An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
dnnl_status_t DNNL_API dnnl_gemm_s8s8s32( char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t* A, dnnl_dim_t lda, int8_t ao, const int8_t* B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co )
Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
The operation is defined as:
C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
where
  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix.
  • A_offset is an MxK matrix with every element equal to the ao value,
  • B_offset is a KxN matrix with every element equal to the bo value,
  • C_offset is an MxN matrix defined by the co array of size len:
    • if offsetc = F: len must be at least 1,
    • if offsetc = C: len must be at least max(1, M),
    • if offsetc = R: len must be at least max(1, N).
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters:
transa
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed.
transb
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed.
offsetc
Flag specifying how offsets should be applied to matrix C:
  • ‘F’ means that the same offset is applied to every element of the matrix C,
  • ‘C’ means that an individual offset is applied to each element within each column,
  • ‘R’ means that an individual offset is applied to each element within each row.
M
The M dimension.
N
The N dimension.
K
The K dimension.
alpha
The alpha parameter that is used to scale the product of matrices A and B.
A
A pointer to the A matrix data.
lda
The leading dimension for the matrix A.
ao
The offset value for the matrix A.
B
A pointer to the B matrix data.
ldb
The leading dimension for the matrix B.
bo
The offset value for the matrix B.
beta
The beta parameter that is used to scale the matrix C.
C
A pointer to the C matrix data.
ldc
The leading dimension for the matrix C.
co
An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
