Developer Reference for Intel® oneAPI Math Kernel Library for C
p?gemm
Computes a scalar-matrix-matrix product and adds the result to a scalar-matrix product for distributed matrices.
Syntax
void psgemm (const char *transa , const char *transb , const MKL_INT *m , const MKL_INT *n , const MKL_INT *k , const float *alpha , const float *a , const MKL_INT *ia , const MKL_INT *ja , const MKL_INT *desca , const float *b , const MKL_INT *ib , const MKL_INT *jb , const MKL_INT *descb , const float *beta , float *c , const MKL_INT *ic , const MKL_INT *jc , const MKL_INT *descc );
void pdgemm (const char *transa , const char *transb , const MKL_INT *m , const MKL_INT *n , const MKL_INT *k , const double *alpha , const double *a , const MKL_INT *ia , const MKL_INT *ja , const MKL_INT *desca , const double *b , const MKL_INT *ib , const MKL_INT *jb , const MKL_INT *descb , const double *beta , double *c , const MKL_INT *ic , const MKL_INT *jc , const MKL_INT *descc );
void pcgemm (const char *transa , const char *transb , const MKL_INT *m , const MKL_INT *n , const MKL_INT *k , const MKL_Complex8 *alpha , const MKL_Complex8 *a , const MKL_INT *ia , const MKL_INT *ja , const MKL_INT *desca , const MKL_Complex8 *b , const MKL_INT *ib , const MKL_INT *jb , const MKL_INT *descb , const MKL_Complex8 *beta , MKL_Complex8 *c , const MKL_INT *ic , const MKL_INT *jc , const MKL_INT *descc );
void pzgemm (const char *transa , const char *transb , const MKL_INT *m , const MKL_INT *n , const MKL_INT *k , const MKL_Complex16 *alpha , const MKL_Complex16 *a , const MKL_INT *ia , const MKL_INT *ja , const MKL_INT *desca , const MKL_Complex16 *b , const MKL_INT *ib , const MKL_INT *jb , const MKL_INT *descb , const MKL_Complex16 *beta , MKL_Complex16 *c , const MKL_INT *ic , const MKL_INT *jc , const MKL_INT *descc );
Include Files
- mkl_pblas.h
 
Description
The p?gemm routines perform a matrix-matrix operation with general distributed matrices. The operation is defined as
sub(C) := alpha*op(sub(A))*op(sub(B)) + beta*sub(C),
where:
op(x) is one of op(x) = x, op(x) = x' (transpose), or, for the complex routines pcgemm and pzgemm, op(x) = conjg(x') (conjugate transpose),
alpha and beta are scalars,
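sub(A) = A(ia:ia+m-1, ja:ja+k-1), sub(B) = B(ib:ib+k-1, jb:jb+n-1), and sub(C) = C(ic:ic+m-1, jc:jc+n-1) are distributed matrices. The index ranges shown for sub(A) and sub(B) are those of the non-transposed case; when transa or transb is not 'N' or 'n', the row and column extents of the corresponding submatrix are swapped, so that op(sub(A)) is always m-by-k and op(sub(B)) is always k-by-n.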
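As an informal illustration of the operation defined above, the following serial C sketch applies the same update to ordinary column-major arrays on a single process. It is not how p?gemm is implemented and involves no data distribution; the names ref_dgemm and op_elem are hypothetical, and only the real double-precision case with plain transposition is covered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element (i, j) of op(X) for a column-major array x
   with leading dimension ldx. trans = 'N' or 'n' means op(X) = X; any other
   value is treated as a transpose in this sketch. */
double op_elem(const double *x, size_t ldx, char trans, size_t i, size_t j)
{
    int no_trans = (trans == 'N' || trans == 'n');
    return no_trans ? x[i + j * ldx] : x[j + i * ldx];
}

/* Serial reference for C := alpha*op(A)*op(B) + beta*C, where op(A) is
   m-by-k, op(B) is k-by-n, and C is m-by-n (all column-major). */
void ref_dgemm(char transa, char transb, size_t m, size_t n, size_t k,
               double alpha, const double *a, size_t lda,
               const double *b, size_t ldb,
               double beta, double *c, size_t ldc)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            double acc = 0.0;
            /* When alpha is zero, a and b are never read. */
            if (alpha != 0.0) {
                for (size_t l = 0; l < k; ++l)
                    acc += op_elem(a, lda, transa, i, l) *
                           op_elem(b, ldb, transb, l, j);
            }
            /* When beta is zero, the old contents of c are never read. */
            double scaled_c = (beta == 0.0) ? 0.0 : beta * c[i + j * ldc];
            c[i + j * ldc] = alpha * acc + scaled_c;
        }
    }
}
```

For example, with A = [1 2; 3 4], B = [5 6; 7 8], C filled with ones, alpha = 2, and beta = 3, calling ref_dgemm with transa = transb = 'N' leaves C = [41 47; 89 103].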
Input Parameters
- transa
(global) Specifies the form of op(sub(A)) used in the matrix multiplication:
if transa = 'N' or 'n', then op(sub(A)) = sub(A);
if transa = 'T' or 't', then op(sub(A)) = sub(A)';
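if transa = 'C' or 'c', then op(sub(A)) = conjg(sub(A)'), the conjugate transpose; for the real routines psgemm and pdgemm this is the same as sub(A)'.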
 - transb
(global) Specifies the form of op(sub(B)) used in the matrix multiplication:
if transb = 'N' or 'n', then op(sub(B)) = sub(B);
if transb = 'T' or 't', then op(sub(B)) = sub(B)';
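if transb = 'C' or 'c', then op(sub(B)) = conjg(sub(B)'), the conjugate transpose; for the real routines psgemm and pdgemm this is the same as sub(B)'.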
 - m
(global) Specifies the number of rows of the distributed matrices op(sub(A)) and sub(C), m ≥ 0.
 - n
(global) Specifies the number of columns of the distributed matrices op(sub(B)) and sub(C), n ≥ 0.
 - k
(global) Specifies the number of columns of the distributed matrix op(sub(A)) and the number of rows of the distributed matrix op(sub(B)), k ≥ 0.
 - alpha
(global) Specifies the scalar alpha.
When alpha is zero, the local entries of the arrays a and b that correspond to sub(A) and sub(B) need not be set on input.
 - a
(local) Array, size lld_a by kla, where kla is LOCq(ja+k-1) when transa = 'N' or 'n', and is LOCq(ja+m-1) otherwise. Before entry, this array must contain the local pieces of the distributed matrix sub(A).
 - ia, ja
(global) The row and column indices in the distributed matrix A indicating the first row and the first column of the submatrix sub(A), respectively.
 - desca
(global and local) Array of size 9. The array descriptor of the distributed matrix A.
 - b
(local) Array, size lld_b by klb, where klb is LOCq(jb+n-1) when transb = 'N' or 'n', and is LOCq(jb+k-1) otherwise. Before entry, this array must contain the local pieces of the distributed matrix sub(B).
 - ib, jb
(global) The row and column indices in the distributed matrix B indicating the first row and the first column of the submatrix sub(B), respectively.
 - descb
(global and local) Array of size 9. The array descriptor of the distributed matrix B.
 - beta
(global) Specifies the scalar beta.
When beta is zero, sub(C) need not be set on input.
 - c
(local) Array, size lld_c by LOCq(jc+n-1). Before entry, this array must contain the local pieces of the distributed matrix sub(C).
 - ic, jc
(global) The row and column indices in the distributed matrix C indicating the first row and the first column of the submatrix sub(C), respectively.
 - descc
(global and local) Array of size 9. The array descriptor of the distributed matrix C.
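The local array sizes and the 9-element descriptors in the parameter list above can be illustrated with a small sketch. It assumes the usual ScaLAPACK conventions: LOCq(x) is read as the number of columns of an x-column block-cyclic matrix owned by the calling process column, and the descriptor entries are ordered as descriptor type, BLACS context, global rows, global columns, row block size, column block size, source process row, source process column, and local leading dimension. The names local_extent, fill_desc, and idx_t are hypothetical stand-ins; a real program would normally call the ScaLAPACK tool routines numroc and descinit instead.

```c
#include <assert.h>

/* Stand-in for MKL_INT so the sketch compiles without the MKL headers. */
typedef int idx_t;

/* Number of rows (or columns) owned by process iproc when a dimension of
   global extent n is distributed block-cyclically with block size nb over
   nprocs processes, the first block living on process isrcproc.
   Follows the logic of the ScaLAPACK NUMROC tool routine. */
idx_t local_extent(idx_t n, idx_t nb, idx_t iproc, idx_t isrcproc, idx_t nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0);
    idx_t mydist = (nprocs + iproc - isrcproc) % nprocs; /* distance from source */
    idx_t nblocks = n / nb;                  /* number of full blocks */
    idx_t count = (nblocks / nprocs) * nb;   /* full rounds every process gets */
    idx_t extra = nblocks % nprocs;          /* leftover full blocks */
    if (mydist < extra)
        count += nb;        /* this process gets one more full block */
    else if (mydist == extra)
        count += n % nb;    /* this process gets the trailing partial block */
    return count;
}

/* Fill a 9-element dense block-cyclic descriptor without any validation.
   lld must be at least max(1, number of local rows). */
void fill_desc(idx_t desc[9], idx_t ictxt, idx_t m, idx_t n,
               idx_t mb, idx_t nb, idx_t rsrc, idx_t csrc, idx_t lld)
{
    desc[0] = 1;      /* descriptor type: dense block-cyclic matrix */
    desc[1] = ictxt;  /* BLACS context handle */
    desc[2] = m;      /* global number of rows */
    desc[3] = n;      /* global number of columns */
    desc[4] = mb;     /* row block size */
    desc[5] = nb;     /* column block size */
    desc[6] = rsrc;   /* process row holding the first row */
    desc[7] = csrc;   /* process column holding the first column */
    desc[8] = lld;    /* leading dimension of the local array */
}
```

For a 9-column matrix with column block size 2 on two process columns, local_extent returns 5 on the source process column and 4 on the other, which under the stated reading of LOCq is the minimum number of local columns each process must allocate.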
 
Output Parameters
- c
Overwritten by the m-by-n distributed matrix alpha*op(sub(A))*op(sub(B)) + beta*sub(C).
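The calling convention in the Syntax section (every argument passed by address, with ia, ja, ib, jb, ic, and jc given as 1-based global indices) can be sketched as follows. So that the sketch compiles without MKL or MPI, it calls through a function pointer whose parameter list is copied from the pdgemm prototype above, and it uses a local typedef in place of the MKL_INT type from the MKL headers. The wrapper multiply_whole, the pointer type pdgemm_fn, and the recording stub record_pdgemm are all hypothetical and not part of the library. In a real program the BLACS grid, the descriptors, and the local arrays would have to be set up first, and pdgemm itself would be passed or called directly.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the MKL type; with MKL, include mkl_pblas.h and drop this. */
typedef int MKL_INT;

/* Pointer type with the same parameter list as pdgemm in the Syntax section. */
typedef void (*pdgemm_fn)(
    const char *transa, const char *transb,
    const MKL_INT *m, const MKL_INT *n, const MKL_INT *k,
    const double *alpha,
    const double *a, const MKL_INT *ia, const MKL_INT *ja, const MKL_INT *desca,
    const double *b, const MKL_INT *ib, const MKL_INT *jb, const MKL_INT *descb,
    const double *beta,
    double *c, const MKL_INT *ic, const MKL_INT *jc, const MKL_INT *descc);

/* Hypothetical wrapper: C := A*B over whole distributed matrices, that is,
   no transposition, alpha = 1, beta = 0, and all submatrices starting at (1, 1). */
void multiply_whole(pdgemm_fn gemm, MKL_INT m, MKL_INT n, MKL_INT k,
                    const double *a, const MKL_INT *desca,
                    const double *b, const MKL_INT *descb,
                    double *c, const MKL_INT *descc)
{
    assert(gemm != NULL && m >= 0 && n >= 0 && k >= 0);
    const char no_trans = 'N';
    const double alpha = 1.0, beta = 0.0;
    const MKL_INT one = 1; /* global indices are 1-based */
    gemm(&no_trans, &no_trans, &m, &n, &k, &alpha,
         a, &one, &one, desca,
         b, &one, &one, descb,
         &beta,
         c, &one, &one, descc);
}

/* Recording stub used only to exercise the wrapper without MKL. */
struct gemm_call {
    char transa, transb;
    MKL_INT m, n, k, ia, ja, ib, jb, ic, jc;
    double alpha, beta;
};
struct gemm_call last_call;

void record_pdgemm(
    const char *transa, const char *transb,
    const MKL_INT *m, const MKL_INT *n, const MKL_INT *k,
    const double *alpha,
    const double *a, const MKL_INT *ia, const MKL_INT *ja, const MKL_INT *desca,
    const double *b, const MKL_INT *ib, const MKL_INT *jb, const MKL_INT *descb,
    const double *beta,
    double *c, const MKL_INT *ic, const MKL_INT *jc, const MKL_INT *descc)
{
    (void)a; (void)desca; (void)b; (void)descb; (void)c; (void)descc;
    last_call.transa = *transa; last_call.transb = *transb;
    last_call.m = *m; last_call.n = *n; last_call.k = *k;
    last_call.ia = *ia; last_call.ja = *ja;
    last_call.ib = *ib; last_call.jb = *jb;
    last_call.ic = *ic; last_call.jc = *jc;
    last_call.alpha = *alpha; last_call.beta = *beta;
}
```

Because beta is zero in this wrapper, the contents of c need not be initialized before the call; only its allocation and descriptor matter.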