cblas_?gemm_compute
Computes a matrix-matrix product with general matrices where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product.
void cblas_hgemm_compute (const CBLAS_LAYOUT Layout, const MKL_INT transa, const MKL_INT transb, const MKL_INT m, const MKL_INT n, const MKL_INT k, const MKL_F16 *a, const MKL_INT lda, const MKL_F16 *b, const MKL_INT ldb, const MKL_F16 beta, MKL_F16 *c, const MKL_INT ldc);
void cblas_sgemm_compute (const CBLAS_LAYOUT Layout, const MKL_INT transa, const MKL_INT transb, const MKL_INT m, const MKL_INT n, const MKL_INT k, const float *a, const MKL_INT lda, const float *b, const MKL_INT ldb, const float beta, float *c, const MKL_INT ldc);
void cblas_dgemm_compute (const CBLAS_LAYOUT Layout, const MKL_INT transa, const MKL_INT transb, const MKL_INT m, const MKL_INT n, const MKL_INT k, const double *a, const MKL_INT lda, const double *b, const MKL_INT ldb, const double beta, double *c, const MKL_INT ldc);
- mkl.h
The cblas_?gemm_compute routine is one of a set of related routines that enable the use of an internal packed storage format. After calling cblas_?gemm_pack, call cblas_?gemm_compute to compute
C := op(A)*op(B) + beta*C,
where:
- op(X) is one of the operations op(X) = X, op(X) = X^T, or op(X) = X^H,
- beta is a scalar,
- A, B, and C are matrices:
- op(A) is an m-by-k matrix,
- op(B) is a k-by-n matrix,
- C is an m-by-n matrix.
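To make the operation concrete, here is a minimal reference sketch in plain C. It is not the oneMKL implementation. It assumes column-major storage, unpacked single-precision real inputs, and op(X) equal to either X or X^T. The function name ref_sgemm_colmajor and its int flags are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not part of oneMKL): computes
 * C := op(A)*op(B) + beta*C for column-major, unpacked float matrices.
 * trans_a / trans_b: 0 means op(X) = X, nonzero means op(X) = X^T. */
void ref_sgemm_colmajor(int trans_a, int trans_b,
                        int m, int n, int k,
                        const float *a, int lda,
                        const float *b, int ldb,
                        float beta, float *c, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                /* Element (i,p) of op(A): A(i,p), or A(p,i) when transposed. */
                float a_ip = trans_a ? a[(size_t)p + (size_t)i * lda]
                                     : a[(size_t)i + (size_t)p * lda];
                /* Element (p,j) of op(B): B(p,j), or B(j,p) when transposed. */
                float b_pj = trans_b ? b[(size_t)j + (size_t)p * ldb]
                                     : b[(size_t)p + (size_t)j * ldb];
                acc += a_ip * b_pj;
            }
            float *cij = &c[(size_t)i + (size_t)j * ldc];
            /* When beta is zero, C is never read, so it may be uninitialized. */
            *cij = (beta == 0.0f) ? acc : acc + beta * (*cij);
        }
    }
}
```

A routine like this can serve as a slow cross-check for results obtained through the packed path on small inputs.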
You must use the same value of the Layout parameter for the entire sequence of related cblas_?gemm_pack and cblas_?gemm_compute calls.
For best performance, use the same number of threads for packing and for computing.
If packing for both A and B matrices, you must use the same number of threads for packing A as for packing B.
- Layout
-
Specifies whether two-dimensional array storage is row-major (CblasRowMajor) or column-major (CblasColMajor).
- transa
-
Specifies the form of op(A) used in the matrix multiplication, one of the CBLAS_TRANSPOSE or CBLAS_STORAGE enumerated types:
If transa = CblasNoTrans, op(A) = A.
If transa = CblasTrans, op(A) = A^T.
If transa = CblasConjTrans, op(A) = A^H.
If transa = CblasPacked, the matrix in array a is packed and lda is ignored.
- transb
-
Specifies the form of op(B) used in the matrix multiplication, one of the CBLAS_TRANSPOSE or CBLAS_STORAGE enumerated types:
If transb = CblasNoTrans, op(B) = B.
If transb = CblasTrans, op(B) = B^T.
If transb = CblasConjTrans, op(B) = B^H.
If transb = CblasPacked, the matrix in array b is packed and ldb is ignored.
- m
-
Specifies the number of rows of the matrix op(A) and of the matrix C. The value of m must be at least zero.
- n
-
Specifies the number of columns of the matrix op(B) and the number of columns of the matrix C. The value of n must be at least zero.
- k
-
Specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B). The value of k must be at least zero.
- a
-
Array:
| | transa = CblasNoTrans | transa = CblasTrans or transa = CblasConjTrans | transa = CblasPacked |
|---|---|---|---|
| Layout = CblasColMajor | Size lda*k. Before entry, the leading m-by-k part of the array a must contain the matrix A. | Size lda*m. Before entry, the leading k-by-m part of the array a must contain the matrix A. | Stored in internal packed format. |
| Layout = CblasRowMajor | Size lda*m. Before entry, the leading k-by-m part of the array a must contain the matrix A. | Size lda*k. Before entry, the leading m-by-k part of the array a must contain the matrix A. | Stored in internal packed format. |
- lda
-
Specifies the leading dimension of a as declared in the calling (sub)program.
| | transa = CblasNoTrans | transa = CblasTrans or transa = CblasConjTrans | transa = CblasPacked |
|---|---|---|---|
| Layout = CblasColMajor | lda must be at least max(1, m). | lda must be at least max(1, k). | lda is ignored. |
| Layout = CblasRowMajor | lda must be at least max(1, k). | lda must be at least max(1, m). | lda is ignored. |
- b
-
Array:
| | transb = CblasNoTrans | transb = CblasTrans or transb = CblasConjTrans | transb = CblasPacked |
|---|---|---|---|
| Layout = CblasColMajor | Size ldb*n. Before entry, the leading k-by-n part of the array b must contain the matrix B. | Size ldb*k. Before entry, the leading n-by-k part of the array b must contain the matrix B. | Stored in internal packed format. |
| Layout = CblasRowMajor | Size ldb*k. Before entry, the leading n-by-k part of the array b must contain the matrix B. | Size ldb*n. Before entry, the leading k-by-n part of the array b must contain the matrix B. | Stored in internal packed format. |
- ldb
-
Specifies the leading dimension of b as declared in the calling (sub)program.
| | transb = CblasNoTrans | transb = CblasTrans or transb = CblasConjTrans | transb = CblasPacked |
|---|---|---|---|
| Layout = CblasColMajor | ldb must be at least max(1, k). | ldb must be at least max(1, n). | ldb is ignored. |
| Layout = CblasRowMajor | ldb must be at least max(1, n). | ldb must be at least max(1, k). | ldb is ignored. |
- beta
-
Specifies the scalar beta. When beta is zero, c need not be set on input.
- c
-
Array:
| Layout | Size and contents |
|---|---|
| CblasColMajor | Size ldc*n. Before entry, the leading m-by-n part of the array c must contain the matrix C, except when beta is equal to zero, in which case c need not be set on entry. |
| CblasRowMajor | Size ldc*m. Before entry, the leading n-by-m part of the array c must contain the matrix C, except when beta is equal to zero, in which case c need not be set on entry. |
- ldc
-
Specifies the leading dimension of c as declared in the calling (sub)program.
| Layout | Requirement |
|---|---|
| CblasColMajor | ldc must be at least max(1, m). |
| CblasRowMajor | ldc must be at least max(1, n). |
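The leading-dimension requirements for a, b, and c all follow one pattern, which the sketch below encodes as a check that could run before the compute call. The enum op_kind and the functions min_ld and gemm_compute_lds_ok are hypothetical helpers written for this illustration; they are not part of oneMKL, and they use plain int rather than MKL_INT to stay self-contained.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the transa/transb choices (not a oneMKL type).
 * OP_TRANS covers both CblasTrans and CblasConjTrans. */
enum op_kind { OP_NOTRANS, OP_TRANS, OP_PACKED };

static int max_int(int x, int y) { return x > y ? x : y; }

/* Minimum leading dimension for an unpacked operand whose op(X) is
 * rows-by-cols. Must not be called for a packed operand. */
int min_ld(bool row_major, enum op_kind op, int rows, int cols)
{
    assert(op != OP_PACKED);
    /* Column-major and not transposed: ld >= rows of op(X).
     * Transposing, or switching to row-major, each swaps rows and cols. */
    bool use_rows = (op == OP_NOTRANS) != row_major;
    return max_int(1, use_rows ? rows : cols);
}

/* Returns true when lda, ldb, and ldc satisfy the documented minimums.
 * The leading dimension of a packed operand is ignored. */
bool gemm_compute_lds_ok(bool row_major,
                         enum op_kind op_a, enum op_kind op_b,
                         int m, int n, int k,
                         int lda, int ldb, int ldc)
{
    bool a_ok = (op_a == OP_PACKED) || lda >= min_ld(row_major, op_a, m, k);
    bool b_ok = (op_b == OP_PACKED) || ldb >= min_ld(row_major, op_b, k, n);
    bool c_ok = ldc >= min_ld(row_major, OP_NOTRANS, m, n);
    return a_ok && b_ok && c_ok;
}
```

For example, with Layout = CblasRowMajor and transa = CblasNoTrans, min_ld(true, OP_NOTRANS, m, k) evaluates to max(1, k), matching the lda requirement for that case.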
- c
-
On exit, overwritten by the m-by-n matrix op(A)*op(B) + beta*C.