Developer Reference for Intel® oneAPI Math Kernel Library for Fortran
ID: 766686
Date: 7/13/2023
Getting Help and Support
What's New
Notational Conventions
Overview
OpenMP* Offload
BLAS and Sparse BLAS Routines
LAPACK Routines
ScaLAPACK Routines
Sparse Solver Routines
Extended Eigensolver Routines
Vector Mathematical Functions
Statistical Functions
Fourier Transform Functions
PBLAS Routines
Partial Differential Equations Support
Nonlinear Optimization Problem Solvers
Support Functions
BLACS Routines
Data Fitting Functions
Appendix A: Linear Solvers Basics
Appendix B: Routine and Function Arguments
Appendix C: Specific Features of Fortran 95 Interfaces for LAPACK Routines
Appendix D: FFTW Interface to Intel® Math Kernel Library
Appendix E: Code Examples
Appendix F: oneMKL Functionality
Bibliography
Glossary
Notices and Disclaimers
mkl_?csrgemv
mkl_?bsrgemv
mkl_?coogemv
mkl_?diagemv
mkl_?csrsymv
mkl_?bsrsymv
mkl_?coosymv
mkl_?diasymv
mkl_?csrtrsv
mkl_?bsrtrsv
mkl_?cootrsv
mkl_?diatrsv
mkl_cspblas_?csrgemv
mkl_cspblas_?bsrgemv
mkl_cspblas_?coogemv
mkl_cspblas_?csrsymv
mkl_cspblas_?bsrsymv
mkl_cspblas_?coosymv
mkl_cspblas_?csrtrsv
mkl_cspblas_?bsrtrsv
mkl_cspblas_?cootrsv
mkl_?csrmv
mkl_?bsrmv
mkl_?cscmv
mkl_?coomv
mkl_?csrsv
mkl_?bsrsv
mkl_?cscsv
mkl_?coosv
mkl_?csrmm
mkl_?bsrmm
mkl_?cscmm
mkl_?coomm
mkl_?csrsm
mkl_?cscsm
mkl_?coosm
mkl_?bsrsm
mkl_?diamv
mkl_?skymv
mkl_?diasv
mkl_?skysv
mkl_?diamm
mkl_?skymm
mkl_?diasm
mkl_?skysm
mkl_?dnscsr
mkl_?csrcoo
mkl_?csrbsr
mkl_?csrcsc
mkl_?csrdia
mkl_?csrsky
mkl_?csradd
mkl_?csrmultcsr
mkl_?csrmultd
Naming Conventions in Inspector-Executor Sparse BLAS Routines
Sparse Matrix Storage Formats for Inspector-Executor Sparse BLAS Routines
Supported Inspector-Executor Sparse BLAS Operations
Two-stage Algorithm in Inspector-Executor Sparse BLAS Routines
Matrix Manipulation Routines
Inspector-Executor Sparse BLAS Analysis Routines
Inspector-Executor Sparse BLAS Execution Routines
mkl_sparse_?_create_csr
mkl_sparse_?_create_csc
mkl_sparse_?_create_coo
mkl_sparse_?_create_bsr
mkl_sparse_copy
mkl_sparse_destroy
mkl_sparse_convert_csr
mkl_sparse_convert_bsr
mkl_sparse_?_export_csr
mkl_sparse_?_export_csc
mkl_sparse_?_export_bsr
mkl_sparse_?_set_value
mkl_sparse_?_update_values
mkl_sparse_order
mkl_sparse_?_lu_smoother
mkl_sparse_?_mv
mkl_sparse_?_trsv
mkl_sparse_?_mm
mkl_sparse_?_trsm
mkl_sparse_?_add
mkl_sparse_spmm
mkl_sparse_?_spmmd
mkl_sparse_sp2m
mkl_sparse_?_sp2md
mkl_sparse_sypr
mkl_sparse_?_syprd
mkl_sparse_?_symgs
mkl_sparse_?_symgs_mv
mkl_sparse_syrk
mkl_sparse_?_syrkd
mkl_sparse_?_dotmv
mkl_sparse_?_sorv
?axpy_batch
?axpy_batch_strided
?axpby
?gem2vu
?gem2vc
?gemmt
?gemm3m
?gemm_batch
?gemm_batch_strided
?gemm3m_batch
?trsm_batch
?trsm_batch_strided
mkl_?imatcopy
mkl_?imatcopy_batch
mkl_?imatcopy_batch_strided
mkl_?omatadd_batch_strided
mkl_?omatcopy
mkl_?omatcopy_batch
mkl_?omatcopy_batch_strided
mkl_?omatcopy2
mkl_?omatadd
?gemm_pack_get_size, gemm_*_pack_get_size
?gemm_pack
gemm_*_pack
?gemm_compute
gemm_*_compute
?gemm_free
gemm_*
?gemv_batch_strided
?gemv_batch
?dgmm_batch_strided
?dgmm_batch
mkl_jit_create_?gemm
mkl_jit_get_?gemm_ptr
mkl_jit_destroy
Naming Conventions for LAPACK Routines
Fortran 95 Interface Conventions for LAPACK Routines
Matrix Storage Schemes for LAPACK Routines
Mathematical Notation for LAPACK Routines
Error Analysis
LAPACK Linear Equation Routines
LAPACK Least Squares and Eigenvalue Problem Routines
LAPACK Auxiliary Routines
LAPACK Utility Functions and Routines
LAPACK Test Functions and Routines
Additional LAPACK Routines (Included for Compatibility with Netlib LAPACK)
Matrix Factorization: LAPACK Computational Routines
Solving Systems of Linear Equations: LAPACK Computational Routines
Estimating the Condition Number: LAPACK Computational Routines
Refining the Solution and Estimating Its Error: LAPACK Computational Routines
Matrix Inversion: LAPACK Computational Routines
Matrix Equilibration: LAPACK Computational Routines
?getrf
?getrf_batch
?getrf_batch_strided
mkl_?getrfnp
?getrfnp_batch_strided
mkl_?getrfnpi
?getrf2
?getri_oop_batch
?getri_oop_batch_strided
?gbtrf
?gttrf
?dttrfb
?potrf
?potrf2
?pstrf
?pftrf
?pptrf
?pbtrf
?pttrf
?sytrf
?sytrf_aa
?sytrf_rook
?sytrf_rk
?hetrf
?hetrf_aa
?hetrf_rook
?hetrf_rk
?sptrf
?hptrf
mkl_?spffrt2, mkl_?spffrtx
Orthogonal Factorizations: LAPACK Computational Routines
Singular Value Decomposition: LAPACK Computational Routines
Symmetric Eigenvalue Problems: LAPACK Computational Routines
Generalized Symmetric-Definite Eigenvalue Problems: LAPACK Computational Routines
Nonsymmetric Eigenvalue Problems: LAPACK Computational Routines
Generalized Nonsymmetric Eigenvalue Problems: LAPACK Computational Routines
Generalized Singular Value Decomposition: LAPACK Computational Routines
Cosine-Sine Decomposition: LAPACK Computational Routines
Linear Least Squares (LLS) Problems: LAPACK Driver Routines
Generalized Linear Least Squares (LLS) Problems: LAPACK Driver Routines
Symmetric Eigenvalue Problems: LAPACK Driver Routines
Nonsymmetric Eigenvalue Problems: LAPACK Driver Routines
Singular Value Decomposition: LAPACK Driver Routines
Cosine-Sine Decomposition: LAPACK Driver Routines
Generalized Symmetric Definite Eigenvalue Problems: LAPACK Driver Routines
Generalized Nonsymmetric Eigenvalue Problems: LAPACK Driver Routines
?lacgv
?lacrm
?lacrt
?laesy
?rot
?spmv
?spr
?syconv
?symv
?syr
i?max1
?sum1
?gbtf2
?gebd2
?gehd2
?gelq2
?gelqt3
?geql2
?geqr2
?geqr2p
?geqrt2
?geqrt3
?gerq2
?gesc2
?getc2
?getf2
?gtts2
?isnan
?laisnan
?labrd
?lacn2
?lacon
?lacpy
?ladiv
?lae2
?laebz
?laed0
?laed1
?laed2
?laed3
?laed4
?laed5
?laed6
?laed7
?laed8
?laed9
?laeda
?laein
?laev2
?laexc
?lag2
?lags2
?lagtf
?lagtm
?lagts
?lagv2
?lahqr
?lahrd
?lahr2
?laic1
?lakf2
?laln2
?lals0
?lalsa
?lalsd
?lamrg
?lamswlq
?lamtsqr
?laneg
?langb
?lange
?langt
?lanhs
?lansb
?lanhb
?lansp
?lanhp
?lanst/?lanht
?lansy
?lanhe
?lantb
?lantp
?lantr
?lanv2
?lapll
?lapmr
?lapmt
?lapy2
?lapy3
?laqgb
?laqge
?laqhb
?laqp2
?laqps
?laqr0
?laqr1
?laqr2
?laqr3
?laqr4
?laqr5
?laqsb
?laqsp
?laqsy
?laqtr
?laqz0
?lar1v
?lar2v
?laran
?larf
?larfb
?larfg
?larfgp
?larft
?larfx
?larfy
?large
?largv
?larnd
?larnv
?laror
?larot
?larra
?larrb
?larrc
?larrd
?larre
?larrf
?larrj
?larrk
?larrr
?larrv
?lartg
?lartgp
?lartgs
?lartv
?laruv
?larz
?larzb
?larzt
?las2
?lascl
?lasd0
?lasd1
?lasd2
?lasd3
?lasd4
?lasd5
?lasd6
?lasd7
?lasd8
?lasd9
?lasda
?lasdq
?lasdt
?laset
?lasq1
?lasq2
?lasq3
?lasq4
?lasq5
?lasq6
?lasr
?lasrt
?lassq
?lasv2
?laswlq
?laswp
?lasy2
?lasyf
?lasyf_aa
?lasyf_rook
?lahef
?lahef_aa
?lahef_rook
?latbs
?latm1
?latm2
?latm3
?latm5
?latm6
?latme
?latmr
?latdf
?latps
?latrd
?latrs
?latrz
?latsqr
?lauu2
?lauum
?orbdb1/?unbdb1
?orbdb2/?unbdb2
?orbdb3/?unbdb3
?orbdb4/?unbdb4
?orbdb5/?unbdb5
?orbdb6/?unbdb6
?org2l/?ung2l
?org2r/?ung2r
?orgl2/?ungl2
?orgr2/?ungr2
?orm2l/?unm2l
?orm2r/?unm2r
?orml2/?unml2
?ormr2/?unmr2
?ormr3/?unmr3
?pbtf2
?potf2
?ptts2
?rscl
?syswapr
?heswapr
?syswapr1
?sygs2/?hegs2
?sytd2/?hetd2
?sytf2
?sytf2_rook
?hetf2
?hetf2_rook
?tgex2
?tgsy2
?trti2
clag2z
dlag2s
slag2d
zlag2c
?larfp
ila?lc
ila?lr
?gsvj0
?gsvj1
?sfrk
?hfrk
?tfsm
?lansf
?lanhf
?tfttp
?tfttr
?tplqt2
?tpqrt2
?tprfb
?tpttf
?tpttr
?trttf
?trttp
?pstf2
dlat2s
zlat2c
?lacp2
?la_gbamv
?la_gbrcond
?la_gbrcond_c
?la_gbrcond_x
?la_gbrfsx_extended
?la_gbrpvgrw
?la_geamv
?la_gercond
?la_gercond_c
?la_gercond_x
?la_gerfsx_extended
?la_heamv
?la_hercond_c
?la_hercond_x
?la_herfsx_extended
?la_herpvgrw
?la_lin_berr
?la_porcond
?la_porcond_c
?la_porcond_x
?la_porfsx_extended
?la_porpvgrw
?laqhe
?laqhp
?larcm
?la_gerpvgrw
?larscl2
?lascl2
?la_syamv
?la_syrcond
?la_syrcond_c
?la_syrcond_x
?la_syrfsx_extended
?la_syrpvgrw
?la_wwaddw
mkl_?tppack
mkl_?tpunpack
Additional LAPACK Routines
Systems of Linear Equations: ScaLAPACK Computational Routines
Matrix Factorization: ScaLAPACK Computational Routines
Solving Systems of Linear Equations: ScaLAPACK Computational Routines
Estimating the Condition Number: ScaLAPACK Computational Routines
Refining the Solution and Estimating Its Error: ScaLAPACK Computational Routines
Matrix Inversion: ScaLAPACK Computational Routines
Matrix Equilibration: ScaLAPACK Computational Routines
Orthogonal Factorizations: ScaLAPACK Computational Routines
Symmetric Eigenvalue Problems: ScaLAPACK Computational Routines
Nonsymmetric Eigenvalue Problems: ScaLAPACK Computational Routines
Singular Value Decomposition: ScaLAPACK Driver Routines
Generalized Symmetric-Definite Eigenvalue Problems: ScaLAPACK Computational Routines
b?laapp
b?laexc
b?trexc
p?lacgv
p?max1
pilaver
pmpcol
pmpim2
?combamax1
p?sum1
p?dbtrsv
p?dttrsv
p?gebal
p?gebd2
p?gehd2
p?gelq2
p?geql2
p?geqr2
p?gerq2
p?getf2
p?labrd
p?lacon
p?laconsb
p?lacp2
p?lacp3
p?lacpy
p?laevswp
p?lahrd
p?laiect
p?lamve
p?lange
p?lanhs
p?lansy, p?lanhe
p?lantr
p?lapiv
p?lapv2
p?laqge
p?laqr0
p?laqr1
p?laqr2
p?laqr3
p?laqr4
p?laqr5
p?laqsy
p?lared1d
p?lared2d
p?larf
p?larfb
p?larfc
p?larfg
p?larft
p?larz
p?larzb
p?larzc
p?larzt
p?lascl
p?lase2
p?laset
p?lasmsub
p?lasrt
p?lassq
p?laswp
p?latra
p?latrd
p?latrs
p?latrz
p?lauu2
p?lauum
p?lawil
p?org2l/p?ung2l
p?org2r/p?ung2r
p?orgl2/p?ungl2
p?orgr2/p?ungr2
p?orm2l/p?unm2l
p?orm2r/p?unm2r
p?orml2/p?unml2
p?ormr2/p?unmr2
p?pbtrsv
p?pttrsv
p?potf2
p?rot
p?rscl
p?sygs2/p?hegs2
p?sytd2/p?hetd2
p?trord
p?trsen
p?trti2
?lahqr2
?lamsh
?lapst
?laqr6
?lar1va
?laref
?larrb2
?larrd2
?larre2
?larre2a
?larrf2
?larrv2
?lasorte
?lasrt2
?stegr2
?stegr2a
?stegr2b
?stein2
?dbtf2
?dbtrf
?dttrf
?dttrsv
?pttrsv
?steqr2
?trmvt
pilaenv
pilaenvx
pjlaenv
Additional ScaLAPACK Routines
oneMKL PARDISO - Parallel Direct Sparse Solver Interface
Parallel Direct Sparse Solver for Clusters Interface
Direct Sparse Solver (DSS) Interface Routines
Iterative Sparse Solvers based on Reverse Communication Interface (RCI ISS)
Preconditioners based on Incomplete LU Factorization Technique
Sparse Matrix Checker Routines
pardiso
pardisoinit
pardiso_64
mkl_pardiso_pivot
pardiso_getdiag
pardiso_export
pardiso_handle_store
pardiso_handle_restore
pardiso_handle_delete
pardiso_handle_store_64
pardiso_handle_restore_64
pardiso_handle_delete_64
oneMKL PARDISO Parameters in Tabular Form
pardiso iparm Parameter
PARDISO_DATA_TYPE
vslNewStream
vslNewStreamEx
vsliNewAbstractStream
vsldNewAbstractStream
vslsNewAbstractStream
vslDeleteStream
vslCopyStream
vslCopyStreamState
vslSaveStreamF
vslLoadStreamF
vslSaveStreamM
vslLoadStreamM
vslGetStreamSize
vslLeapfrogStream
vslSkipAheadStream
vslSkipAheadStreamEx
vslGetStreamStateBrng
vslGetNumRegBrngs
Convolution and Correlation Naming Conventions
Convolution and Correlation Data Types
Convolution and Correlation Parameters
Convolution and Correlation Task Status and Error Reporting
Convolution and Correlation Task Constructors
Convolution and Correlation Task Editors
Task Execution Routines
Convolution and Correlation Task Destructors
Convolution and Correlation Task Copiers
Convolution and Correlation Usage Examples
Convolution and Correlation Mathematical Notation and Definitions
Convolution and Correlation Data Allocation
Summary Statistics Naming Conventions
Summary Statistics Data Types
Summary Statistics Parameters
Summary Statistics Task Status and Error Reporting
Summary Statistics Task Constructors
Summary Statistics Task Editors
Summary Statistics Task Computation Routines
Summary Statistics Task Destructor
Summary Statistics Usage Examples
Summary Statistics Mathematical Notation and Definitions
DFTI_PRECISION
DFTI_FORWARD_DOMAIN
DFTI_DIMENSION, DFTI_LENGTHS
DFTI_PLACEMENT
DFTI_FORWARD_SCALE, DFTI_BACKWARD_SCALE
DFTI_NUMBER_OF_USER_THREADS
DFTI_THREAD_LIMIT
DFTI_INPUT_STRIDES, DFTI_OUTPUT_STRIDES
DFTI_NUMBER_OF_TRANSFORMS
DFTI_INPUT_DISTANCE, DFTI_OUTPUT_DISTANCE
DFTI_COMPLEX_STORAGE, DFTI_REAL_STORAGE, DFTI_CONJUGATE_EVEN_STORAGE
DFTI_PACKED_FORMAT
DFTI_WORKSPACE
DFTI_COMMIT_STATUS
DFTI_ORDERING
Matrix Shapes
Repeatability and Coherence
BLACS Combine Operations
BLACS Point To Point Communication
BLACS Broadcast Routines
BLACS Support Routines
Examples of BLACS Routines Usage
Example. BLACS Usage. Hello World
Example. BLACS Usage. PROCMAP
Example. BLACS Usage. PARALLEL DOT PRODUCT
Example. BLACS Usage. PARALLEL MATRIX INFINITY NORM
Data Fitting Function Naming Conventions
Data Fitting Function Data Types
Mathematical Conventions for Data Fitting Functions
Data Fitting Usage Model
Data Fitting Usage Examples
Data Fitting Function Task Status and Error Reporting
Data Fitting Task Creation and Initialization Routines
Task Configuration Routines
Data Fitting Computational Routines
Data Fitting Task Destructors
DSS Symmetric Matrix Storage
DSS Nonsymmetric Matrix Storage
DSS Structurally Symmetric Matrix Storage
DSS Distributed Symmetric Matrix Storage
Sparse BLAS CSR Matrix Storage Format
Sparse BLAS CSC Matrix Storage Format
Sparse BLAS Coordinate Matrix Storage Format
Sparse BLAS Diagonal Matrix Storage Format
Sparse BLAS Skyline Matrix Storage Format
Sparse BLAS BSR Matrix Storage Format
Examples of BLACS Routines Usage
Example. BLACS Usage. Hello World
The following routine takes the available processes, forms them into a process grid, and then has each process check in with the process at grid position {0,0}.
      PROGRAM HELLO
*     -- BLACS example code --
*     Written by Clint Whaley 7/26/94
*     Performs a simple check-in type hello world
*     ..
*     .. External Functions ..
      INTEGER BLACS_PNUM
      EXTERNAL BLACS_PNUM
*     ..
*     .. Variable Declaration ..
      INTEGER CONTXT, IAM, NPROCS, NPROW, NPCOL, MYPROW, MYPCOL
      INTEGER ICALLER, I, J, HISROW, HISCOL
*
*     Determine my process number and the number of processes in
*     machine
*
      CALL BLACS_PINFO(IAM, NPROCS)
*
*     If in PVM, create virtual machine if it doesn't exist
*
      IF (NPROCS .LT. 1) THEN
         IF (IAM .EQ. 0) THEN
            WRITE(*, 1000)
            READ(*, 2000) NPROCS
         END IF
         CALL BLACS_SETUP(IAM, NPROCS)
      END IF
*
*     Set up process grid that is as close to square as possible
*
      NPROW = INT( SQRT( REAL(NPROCS) ) )
      NPCOL = NPROCS / NPROW
*
*     Get default system context, and define grid
*
      CALL BLACS_GET(0, 0, CONTXT)
      CALL BLACS_GRIDINIT(CONTXT, 'Row', NPROW, NPCOL)
      CALL BLACS_GRIDINFO(CONTXT, NPROW, NPCOL, MYPROW, MYPCOL)
*
*     If I'm not in grid, go to end of program
*
      IF ( (MYPROW.GE.NPROW) .OR. (MYPCOL.GE.NPCOL) ) GOTO 30
*
*     Get my process ID from my grid coordinates
*
      ICALLER = BLACS_PNUM(CONTXT, MYPROW, MYPCOL)
*
*     If I am process {0,0}, receive check-in messages from
*     all nodes
*
      IF ( (MYPROW.EQ.0) .AND. (MYPCOL.EQ.0) ) THEN
         WRITE(*,*) ' '
         DO 20 I = 0, NPROW-1
            DO 10 J = 0, NPCOL-1
               IF ( (I.NE.0) .OR. (J.NE.0) ) THEN
                  CALL IGERV2D(CONTXT, 1, 1, ICALLER, 1, I, J)
               END IF
*
*              Make sure ICALLER is where we think in process grid
*
               CALL BLACS_PCOORD(CONTXT, ICALLER, HISROW, HISCOL)
               IF ( (HISROW.NE.I) .OR. (HISCOL.NE.J) ) THEN
                  WRITE(*,*) 'Grid error!  Halting . . .'
                  STOP
               END IF
               WRITE(*, 3000) I, J, ICALLER
   10       CONTINUE
   20    CONTINUE
         WRITE(*,*) ' '
         WRITE(*,*) 'All processes checked in.  Run finished.'
*
*     All processes but {0,0} send process ID as a check-in
*
      ELSE
         CALL IGESD2D(CONTXT, 1, 1, ICALLER, 1, 0, 0)
      END IF
   30 CONTINUE
      CALL BLACS_EXIT(0)
 1000 FORMAT('How many processes in machine?')
 2000 FORMAT(I4)
 3000 FORMAT('Process {',I2,',',I2,'} (node number =',I4,
     $       ') has checked in.')
      STOP
      END
Example. BLACS Usage. PROCMAP
This routine maps processes to a grid using blacs_gridmap.
      SUBROUTINE PROCMAP(CONTEXT, MAPPING, BEGPROC, NPROW, NPCOL, IMAP)
*
*     -- BLACS example code --
*     Written by Clint Whaley 7/26/94
*     ..
*     .. Scalar Arguments ..
      INTEGER CONTEXT, MAPPING, BEGPROC, NPROW, NPCOL
*     ..
*     .. Array Arguments ..
      INTEGER IMAP(NPROW, *)
*     ..
*
*  Purpose
*  =======
*  PROCMAP maps NPROW*NPCOL processes starting from process BEGPROC to
*  the grid in a variety of ways depending on the parameter MAPPING.
*
*  Arguments
*  =========
*
*  CONTEXT  (output) INTEGER
*           This integer is used by the BLACS to indicate a context.
*           A context is a universe where messages exist and do not
*           interact with other context's messages.  The context
*           includes the definition of a grid, and each process's
*           coordinates in it.
*
*  MAPPING  (input) INTEGER
*           Way to map processes to grid.  Choices are:
*           1 : row-major natural ordering
*           2 : column-major natural ordering
*
*  BEGPROC  (input) INTEGER
*           The process number (between 0 and NPROCS-1) to use as
*           {0,0}.  From this process, processes will be assigned
*           to the grid as indicated by MAPPING.
*
*  NPROW    (input) INTEGER
*           The number of process rows the created grid
*           should have.
*
*  NPCOL    (input) INTEGER
*           The number of process columns the created grid
*           should have.
*
*  IMAP     (workspace) INTEGER array of dimension (NPROW, NPCOL)
*           Workspace, where the array which maps the
*           processes to the grid will be stored for the
*           call to GRIDMAP.
*
*  ===============================================================
*
*     ..
*     .. External Functions ..
      INTEGER BLACS_PNUM
      EXTERNAL BLACS_PNUM
*     ..
*     .. External Subroutines ..
      EXTERNAL BLACS_PINFO, BLACS_GRIDINIT, BLACS_GRIDMAP
*     ..
*     .. Local Scalars ..
      INTEGER TMPCONTXT, NPROCS, I, J, K
*     ..
*     .. Executable Statements ..
*
*     See how many processes there are in the system
*
      CALL BLACS_PINFO( I, NPROCS )
      IF (NPROCS-BEGPROC .LT. NPROW*NPCOL) THEN
         WRITE(*,*) 'Not enough processes for grid'
         STOP
      END IF
*
*     Temporarily map all processes into 1 x NPROCS grid
*
      CALL BLACS_GET( 0, 0, TMPCONTXT )
      CALL BLACS_GRIDINIT( TMPCONTXT, 'Row', 1, NPROCS )
      K = BEGPROC
*
*     If we want a row-major natural ordering
*
      IF (MAPPING .EQ. 1) THEN
         DO I = 1, NPROW
            DO J = 1, NPCOL
               IMAP(I, J) = BLACS_PNUM(TMPCONTXT, 0, K)
               K = K + 1
            END DO
         END DO
*
*     If we want a column-major natural ordering
*
      ELSE IF (MAPPING .EQ. 2) THEN
         DO J = 1, NPCOL
            DO I = 1, NPROW
               IMAP(I, J) = BLACS_PNUM(TMPCONTXT, 0, K)
               K = K + 1
            END DO
         END DO
      ELSE
         WRITE(*,*) 'Unknown mapping.'
         STOP
      END IF
*
*     Free temporary context
*
      CALL BLACS_GRIDEXIT(TMPCONTXT)
*
*     Apply the new mapping to form desired context
*
      CALL BLACS_GET( 0, 0, CONTEXT )
      CALL BLACS_GRIDMAP( CONTEXT, IMAP, NPROW, NPROW, NPCOL )
      RETURN
      END
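The short driver below is a minimal calling sketch, not part of the original example. It assumes at least NPROW*NPCOL processes are launched and that PROCMAP is compiled into the same executable; the program name MAPDRV and the grid dimensions are illustrative only.

*     Illustrative driver (hypothetical): build a 2 x 2 grid in
*     column-major order (MAPPING = 2) starting at process 0, then
*     report each grid member's coordinates.  Run with >= 4 processes.
      PROGRAM MAPDRV
      INTEGER NPROW, NPCOL
      PARAMETER (NPROW = 2, NPCOL = 2)
      INTEGER CONTEXT, NR, NC, MYROW, MYCOL
      INTEGER IMAP(NPROW, NPCOL)
      EXTERNAL PROCMAP, BLACS_GRIDINFO, BLACS_EXIT
*
      CALL PROCMAP(CONTEXT, 2, 0, NPROW, NPCOL, IMAP)
      CALL BLACS_GRIDINFO(CONTEXT, NR, NC, MYROW, MYCOL)
*
*     Processes left out of the grid receive invalid coordinates
*
      IF ((MYROW.GE.0) .AND. (MYROW.LT.NR)) THEN
         WRITE(*,*) 'Grid coordinates: (', MYROW, ',', MYCOL, ')'
      END IF
      CALL BLACS_EXIT(0)
      END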
Example. BLACS Usage. PARALLEL DOT PRODUCT
This routine does a deliberately simple ("bone-headed") parallel double precision dot product of two vectors: the full vectors are broadcast from process {0,0} to every process. Arguments are input on process {0,0} and output everywhere else.
      DOUBLE PRECISION FUNCTION PDDOT( CONTEXT, N, X, Y )
*
*     -- BLACS example code --
*     Written by Clint Whaley 7/26/94
*     ..
*     .. Scalar Arguments ..
      INTEGER CONTEXT, N
*     ..
*     .. Array Arguments ..
      DOUBLE PRECISION X(*), Y(*)
*     ..
*
*  Purpose
*  =======
*  PDDOT is a restricted parallel version of the BLAS routine
*  DDOT.  It assumes that the increment on both vectors is one,
*  and that process {0,0} starts out owning the vectors and
*  has N.  It returns the dot product of the two N-length vectors
*  X and Y, that is, PDDOT = X' Y.
*
*  Arguments
*  =========
*
*  CONTEXT  (input) INTEGER
*           This integer is used by the BLACS to indicate a context.
*           A context is a universe where messages exist and do not
*           interact with other context's messages.  The context
*           includes the definition of a grid, and each process's
*           coordinates in it.
*
*  N        (input/output) INTEGER
*           The length of the vectors X and Y.  Input
*           for {0,0}, output for everyone else.
*
*  X        (input/output) DOUBLE PRECISION array of dimension (N)
*           The vector X of PDDOT = X' Y.  Input for {0,0},
*           output for everyone else.
*
*  Y        (input/output) DOUBLE PRECISION array of dimension (N)
*           The vector Y of PDDOT = X' Y.  Input for {0,0},
*           output for everyone else.
*
*  ===============================================================
*
*     ..
*     .. External Functions ..
      DOUBLE PRECISION DDOT
      EXTERNAL DDOT
*     ..
*     .. External Subroutines ..
      EXTERNAL BLACS_GRIDINFO, DGEBS2D, DGEBR2D, DGSUM2D
*     ..
*     .. Local Scalars ..
      INTEGER IAM, NPROCS, NPROW, NPCOL, MYPROW, MYPCOL, I, LN
      DOUBLE PRECISION LDDOT
*     ..
*     .. Executable Statements ..
*
*     Find out what grid has been set up, and pretend it is 1-D
*
      CALL BLACS_GRIDINFO( CONTEXT, NPROW, NPCOL, MYPROW, MYPCOL )
      IAM = MYPROW*NPCOL + MYPCOL
      NPROCS = NPROW * NPCOL
*
*     Do bone-headed thing, and just send entire X and Y to
*     everyone
*
      IF ( (MYPROW.EQ.0) .AND. (MYPCOL.EQ.0) ) THEN
         CALL IGEBS2D(CONTEXT, 'All', 'i-ring', 1, 1, N, 1 )
         CALL DGEBS2D(CONTEXT, 'All', 'i-ring', N, 1, X, N )
         CALL DGEBS2D(CONTEXT, 'All', 'i-ring', N, 1, Y, N )
      ELSE
         CALL IGEBR2D(CONTEXT, 'All', 'i-ring', 1, 1, N, 1, 0, 0 )
         CALL DGEBR2D(CONTEXT, 'All', 'i-ring', N, 1, X, N, 0, 0 )
         CALL DGEBR2D(CONTEXT, 'All', 'i-ring', N, 1, Y, N, 0, 0 )
      END IF
*
*     Find out the number of local rows to multiply (LN), and
*     where in vectors to start (I)
*
      LN = N / NPROCS
      I = 1 + IAM * LN
*
*     Last process does any extra rows
*
      IF (IAM .EQ. NPROCS-1) LN = LN + MOD(N, NPROCS)
*
*     Figure dot product of my piece of X and Y
*
      LDDOT = DDOT( LN, X(I), 1, Y(I), 1 )
*
*     Add local dot products to get global dot product;
*     give all procs the answer
*
      CALL DGSUM2D( CONTEXT, 'All', '1-tree', 1, 1, LDDOT, 1, -1, 0 )
      PDDOT = LDDOT
      RETURN
      END
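A minimal usage sketch for PDDOT follows; it is not part of the original example. Every process must reserve room for the full vectors, since PDDOT broadcasts them from {0,0}; the program name DOTDRV and the size MAXN are hypothetical.

*     Illustrative driver (hypothetical): process {0,0} defines two
*     MAXN-length vectors with X(I) = 1 and Y(I) = 2, so PDDOT should
*     return 2*MAXN on every grid process.
      PROGRAM DOTDRV
      INTEGER MAXN
      PARAMETER (MAXN = 100)
      INTEGER IAM, NPROCS, CONTEXT, NPROW, NPCOL, MYROW, MYCOL, N, I
      DOUBLE PRECISION X(MAXN), Y(MAXN), DOT
      DOUBLE PRECISION PDDOT
      EXTERNAL PDDOT
*
      CALL BLACS_PINFO(IAM, NPROCS)
      NPROW = INT( SQRT( REAL(NPROCS) ) )
      NPCOL = NPROCS / NPROW
      CALL BLACS_GET(0, 0, CONTEXT)
      CALL BLACS_GRIDINIT(CONTEXT, 'Row', NPROW, NPCOL)
      CALL BLACS_GRIDINFO(CONTEXT, NPROW, NPCOL, MYROW, MYCOL)
      IF ((MYROW.GE.0) .AND. (MYROW.LT.NPROW)) THEN
*
*        Only {0,0} initializes N, X, and Y; PDDOT broadcasts them
*
         IF ((MYROW.EQ.0) .AND. (MYCOL.EQ.0)) THEN
            N = MAXN
            DO 10 I = 1, N
               X(I) = 1.0D0
               Y(I) = 2.0D0
   10       CONTINUE
         END IF
         DOT = PDDOT(CONTEXT, N, X, Y)
         IF ((MYROW.EQ.0) .AND. (MYCOL.EQ.0))
     $      WRITE(*,*) 'Dot product =', DOT
      END IF
      CALL BLACS_EXIT(0)
      END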
Example. BLACS Usage. PARALLEL MATRIX INFINITY NORM
This routine computes the infinity norm (the maximum absolute row sum, max_i sum_j |a(i,j)|) of a distributed double precision matrix in parallel. Unlike the PDDOT example, this routine assumes the matrix has already been distributed across the process grid.
      DOUBLE PRECISION FUNCTION PDINFNRM(CONTEXT, LM, LN, A, LDA, WORK)
*
*     -- BLACS example code --
*     Written by Clint Whaley.
*     ..
*     .. Scalar Arguments ..
      INTEGER CONTEXT, LM, LN, LDA
*     ..
*     .. Array Arguments ..
      DOUBLE PRECISION A(LDA, *), WORK(*)
*     ..
*
*  Purpose
*  =======
*  Compute the infinity norm of a distributed matrix, where
*  the matrix is spread across a 2D process grid.  The result is
*  left on all processes.
*
*  Arguments
*  =========
*
*  CONTEXT  (input) INTEGER
*           This integer is used by the BLACS to indicate a context.
*           A context is a universe where messages exist and do not
*           interact with other context's messages.  The context
*           includes the definition of a grid, and each process's
*           coordinates in it.
*
*  LM       (input) INTEGER
*           Number of rows of the global matrix owned by this
*           process.
*
*  LN       (input) INTEGER
*           Number of columns of the global matrix owned by this
*           process.
*
*  A        (input) DOUBLE PRECISION, dimension (LDA,LN)
*           The matrix whose norm you wish to compute.
*
*  LDA      (input) INTEGER
*           Leading Dimension of A.
*
*  WORK     (temporary) DOUBLE PRECISION array, dimension (LM)
*           Temporary work space used for summing rows.
*
*     .. External Subroutines ..
      EXTERNAL BLACS_GRIDINFO, DGEBS2D, DGEBR2D, DGSUM2D, DGAMX2D
*     ..
*     .. External Functions ..
      INTEGER IDAMAX
      DOUBLE PRECISION DASUM
*
*     .. Local Scalars ..
      INTEGER NPROW, NPCOL, MYROW, MYCOL, I, J
      DOUBLE PRECISION MAX
*
*     .. Executable Statements ..
*
*     Get process grid information
*
      CALL BLACS_GRIDINFO( CONTEXT, NPROW, NPCOL, MYROW, MYCOL )
*
*     Add all local rows together
*
      DO 20 I = 1, LM
         WORK(I) = DASUM(LN, A(I,1), LDA)
   20 CONTINUE
*
*     Find sum of global matrix rows and store on column 0 of
*     process grid
*
      CALL DGSUM2D(CONTEXT, 'Row', '1-tree', LM, 1, WORK, LM, MYROW, 0)
*
*     Find maximum sum of rows for supnorm
*
      IF (MYCOL .EQ. 0) THEN
         MAX = WORK(IDAMAX(LM,WORK,1))
         IF (LM .LT. 1) MAX = 0.0D0
         CALL DGAMX2D(CONTEXT, 'Col', 'h', 1, 1, MAX, 1, I, I,
     $                -1, -1, 0)
      END IF
*
*     Process column 0 has answer; send answer to all nodes
*
      IF (MYCOL .EQ. 0) THEN
         CALL DGEBS2D(CONTEXT, 'Row', ' ', 1, 1, MAX, 1)
      ELSE
         CALL DGEBR2D(CONTEXT, 'Row', ' ', 1, 1, MAX, 1, 0, 0)
      END IF
*
      PDINFNRM = MAX
*
      RETURN
*
*     End of PDINFNRM
*
      END
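To illustrate a call, the hypothetical driver below (not part of the original example) gives every grid process a local LM x LN block of ones; for that global matrix of ones, the infinity norm equals the global column count, LN*NPCOL. The program name NRMDRV and the block sizes are illustrative only.

*     Illustrative driver (hypothetical): every grid process owns a
*     4 x 5 block of ones, so the infinity norm of the global matrix
*     is 5*NPCOL.
      PROGRAM NRMDRV
      INTEGER LM, LN
      PARAMETER (LM = 4, LN = 5)
      INTEGER IAM, NPROCS, CONTEXT, NPROW, NPCOL, MYROW, MYCOL, I, J
      DOUBLE PRECISION A(LM, LN), WORK(LM), ANORM
      DOUBLE PRECISION PDINFNRM
      EXTERNAL PDINFNRM
*
      CALL BLACS_PINFO(IAM, NPROCS)
      NPROW = INT( SQRT( REAL(NPROCS) ) )
      NPCOL = NPROCS / NPROW
      CALL BLACS_GET(0, 0, CONTEXT)
      CALL BLACS_GRIDINIT(CONTEXT, 'Row', NPROW, NPCOL)
      CALL BLACS_GRIDINFO(CONTEXT, NPROW, NPCOL, MYROW, MYCOL)
      IF ((MYROW.GE.0) .AND. (MYROW.LT.NPROW)) THEN
         DO 10 J = 1, LN
            DO 5 I = 1, LM
               A(I, J) = 1.0D0
    5       CONTINUE
   10    CONTINUE
         ANORM = PDINFNRM(CONTEXT, LM, LN, A, LM, WORK)
*
*        Result is left on all grid processes; print it once
*
         IF ((MYROW.EQ.0) .AND. (MYCOL.EQ.0))
     $      WRITE(*,*) 'Infinity norm =', ANORM
      END IF
      CALL BLACS_EXIT(0)
      END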