Developer Reference for Intel® oneAPI Math Kernel Library for Fortran

ID 766686
Date 12/16/2022
Public

Preconditioners based on Incomplete LU Factorization Technique

Preconditioners, or accelerators, are used to accelerate an iterative solution process. In some cases, their use can reduce the number of iterations dramatically and thus lead to better solver performance. Although the terms preconditioner and accelerator are synonyms, hereafter only preconditioner is used.

Intel® oneAPI Math Kernel Library provides two preconditioners, ILU0 and ILUT, for sparse matrices presented in the format accepted by the Intel® oneAPI Math Kernel Library direct sparse solvers (the three-array variation of the CSR storage format described in Sparse Matrix Storage Format). The algorithms used are described in [Saad03].
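To make the three-array CSR layout concrete, the sketch below stores a small 3-by-3 tridiagonal matrix with one-based indexing and expands it back to dense form. The module name csr_layout_demo, the routine csr_to_dense, and the example matrix are hypothetical and were chosen only for this illustration; they are not part of the library.

```fortran
module csr_layout_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Example matrix (an assumption chosen for illustration):
  !   [  4 -1  0 ]
  !   [ -1  4 -1 ]
  !   [  0 -1  4 ]
  integer, parameter :: n = 3
  ! a  : nonzero values, stored row by row
  real(dp), parameter :: a(7) = &
      [4.0_dp, -1.0_dp, -1.0_dp, 4.0_dp, -1.0_dp, -1.0_dp, 4.0_dp]
  ! ja : column index of each value in a (one-based)
  integer, parameter :: ja(7) = [1, 2, 1, 2, 3, 2, 3]
  ! ia : ia(i) is the position in a of the first entry of row i,
  !      and ia(n+1) equals the number of nonzeros plus one
  integer, parameter :: ia(4) = [1, 3, 6, 8]
contains
  ! Expands a CSR matrix into a dense nrow-by-nrow array.
  subroutine csr_to_dense(nrow, val, rowptr, col, dense)
    integer, intent(in) :: nrow, rowptr(nrow+1), col(*)
    real(dp), intent(in) :: val(*)
    real(dp), intent(out) :: dense(nrow, nrow)
    integer :: i, p
    dense = 0.0_dp
    do i = 1, nrow
      do p = rowptr(i), rowptr(i+1) - 1
        dense(i, col(p)) = val(p)
      end do
    end do
  end subroutine csr_to_dense
end module csr_layout_demo
```

Calling csr_to_dense(n, a, ia, ja, dense) from a small driver program recovers the matrix shown in the comment, which is a quick way to check hand-built index arrays before passing them to a solver.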

The ILU0 preconditioner is based on a well-known factorization of the original matrix into a product of two triangular matrices: lower and upper triangular matrices. Usually, such decomposition leads to some fill-in in the resulting matrix structure in comparison with the original matrix. The distinctive feature of the ILU0 preconditioner is that it preserves the structure of the original matrix in the result.
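The idea of a factorization that keeps the sparsity pattern of the original matrix can be sketched in plain Fortran. The routine below is a textbook-style in-pattern elimination, not the library's implementation; the names ilu0_sketch and ilu0_csr and the error-code convention are hypothetical. It assumes one-based CSR input with column indices sorted within each row and a stored diagonal entry in every row.

```fortran
module ilu0_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! In-pattern incomplete LU of a CSR matrix (one-based indices).
  ! On exit, lu has the same pattern as a: entries left of the diagonal
  ! belong to L (unit diagonal implied), the rest belong to U.
  ! ierr = 0 on success, -i if row i has no stored diagonal,
  ! +i if the pivot in row i is zero (hypothetical convention).
  subroutine ilu0_csr(n, a, ia, ja, lu, ierr)
    integer, intent(in) :: n, ia(n+1), ja(*)
    real(dp), intent(in) :: a(*)
    real(dp), intent(out) :: lu(*)
    integer, intent(out) :: ierr
    integer :: i, k, kk, p, q, nnz
    integer :: pos(n), diag(n)
    real(dp) :: factor

    ierr = 0
    nnz = ia(n+1) - 1
    lu(1:nnz) = a(1:nnz)
    pos = 0
    diag = 0
    do i = 1, n
      ! Map each column present in row i to its storage position.
      do p = ia(i), ia(i+1) - 1
        pos(ja(p)) = p
        if (ja(p) == i) diag(i) = p
      end do
      if (diag(i) == 0) then
        ierr = -i
        return
      end if
      ! Eliminate with earlier rows k < i that appear in row i.
      do kk = ia(i), diag(i) - 1
        k = ja(kk)
        factor = lu(kk) / lu(diag(k))
        lu(kk) = factor
        ! Update only positions that already exist in row i,
        ! so any fill-in is discarded.
        do p = diag(k) + 1, ia(k+1) - 1
          q = pos(ja(p))
          if (q /= 0) lu(q) = lu(q) - factor * lu(p)
        end do
      end do
      if (lu(diag(i)) == 0.0_dp) then
        ierr = i
        return
      end if
      ! Clear the column map before moving to the next row.
      do p = ia(i), ia(i+1) - 1
        pos(ja(p)) = 0
      end do
    end do
  end subroutine ilu0_csr
end module ilu0_sketch
```

For a tridiagonal matrix no fill-in occurs, so this sketch reproduces the exact LU factors; for general patterns the product of the factors only approximates the original matrix.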

Unlike the ILU0 preconditioner, the ILUT preconditioner preserves some resulting fill-in in the preconditioner matrix structure. The distinctive feature of the ILUT algorithm is that it calculates each element of the preconditioner and saves each one if it satisfies two conditions simultaneously: its value is greater than the product of the given tolerance and matrix row norm, and its value is in the given bandwidth of the resulting preconditioner matrix.
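The two acceptance conditions can be written as a single predicate. The sketch below is only an illustration of the rule as stated above: the names ilut_drop_sketch and keep_entry are hypothetical, the comparison uses the magnitude of the value, the row norm is supplied by the caller, and the bandwidth test is modeled as |i - j| <= maxfil. All of these are assumptions; the library may compute the norm or enforce the fill-in limit differently.

```fortran
module ilut_drop_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Returns .true. when a candidate entry at position (i, j) with value
  ! val passes both tests: it is large relative to tol * row_norm, and
  ! it lies within maxfil of the diagonal (assumed bandwidth model).
  pure logical function keep_entry(val, i, j, tol, row_norm, maxfil)
    real(dp), intent(in) :: val, tol, row_norm
    integer, intent(in) :: i, j, maxfil
    keep_entry = (abs(val) > tol * row_norm) .and. (abs(i - j) <= maxfil)
  end function keep_entry
end module ilut_drop_sketch
```

With tol = 0.01 and a row norm of 10, for example, an entry of magnitude 0.5 next to the diagonal is kept, while an entry of magnitude 0.05 in the same position is dropped.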

Both ILU0 and ILUT preconditioners can be applied to any non-degenerate matrix. They can be used alone or together with the Intel® oneAPI Math Kernel Library RCI FGMRES solver (see Sparse Solver Routines). Avoid using these preconditioners with the Intel® oneAPI Math Kernel Library RCI CG solver because, in general, they produce a non-symmetric resulting matrix even if the original matrix is symmetric. An iterative solver usually requires the action of the inverse of the preconditioner on a vector. To compute it, call the Intel® oneAPI Math Kernel Library triangular solver routine mkl_dcsrtrsv twice: first for the lower triangular part of the preconditioner, and then for its upper triangular part.
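The effect of the two triangular solves can be sketched in plain Fortran. The routine apply_ilu below is a hypothetical stand-in written for illustration, not a library routine. It assumes that both factors share one CSR array, with the strict lower triangle holding L (unit diagonal implied) and the diagonal and upper triangle holding U, and that every row stores a nonzero diagonal entry.

```fortran
module ilu_apply_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Computes z = U^{-1} L^{-1} r for factors stored together in CSR form
  ! (one-based indices).
  subroutine apply_ilu(n, lu, ia, ja, r, z)
    integer, intent(in) :: n, ia(n+1), ja(*)
    real(dp), intent(in) :: lu(*), r(n)
    real(dp), intent(out) :: z(n)
    real(dp) :: y(n), s, d
    integer :: i, p

    ! Forward substitution with the unit lower triangular part: L y = r.
    do i = 1, n
      s = r(i)
      do p = ia(i), ia(i+1) - 1
        if (ja(p) < i) s = s - lu(p) * y(ja(p))
      end do
      y(i) = s
    end do

    ! Backward substitution with the upper triangular part: U z = y.
    do i = n, 1, -1
      s = y(i)
      d = 0.0_dp
      do p = ia(i), ia(i+1) - 1
        if (ja(p) > i) then
          s = s - lu(p) * z(ja(p))
        else if (ja(p) == i) then
          d = lu(p)
        end if
      end do
      z(i) = s / d   ! assumes a stored, nonzero diagonal entry
    end do
  end subroutine apply_ilu
end module ilu_apply_sketch
```

In production code the two loops would be replaced by the two mkl_dcsrtrsv calls; the sketch is useful mainly for checking a preconditioner on small test matrices.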

NOTE:

Although ILU0 and ILUT preconditioners apply to any non-degenerate matrix, in some cases the algorithm may fail to ensure successful termination and the required result. Whether or not the preconditioner produces an acceptable result can only be determined in practice.

For some systems and initial guesses, a preconditioner may increase the number of iterations or even ruin the convergence. It is your responsibility as a user to choose a suitable preconditioner.

General Scheme of Using ILUT and RCI FGMRES Routines

The general scheme for use is the same for both preconditioners. Some differences exist in the calling parameters of the preconditioners and in the subsequent call of two triangular solvers. You can see all these differences in the preconditioner code examples (dcsrilu*.*) in the examples folder of the Intel® oneAPI Math Kernel Library installation directory:

  • examples/solverf/source

The following pseudocode shows the general scheme of using the ILUT preconditioner in the RCI FGMRES context.

...
c  generate matrix A
c  generate preconditioner C (optional)
      call dfgmres_init(n, x, b, RCI_request, ipar, dpar, tmp)
c  change parameters in ipar, dpar if necessary
      call dcsrilut(n, a, ia, ja, bilut, ibilut, jbilut, tol, maxfil,
     &              ipar, dpar, ierr)
      call dfgmres_check(n, x, b, RCI_request, ipar, dpar, tmp)
1     call dfgmres(n, x, b, RCI_request, ipar, dpar, tmp)
      if (RCI_request.eq.1) then
c  multiply the matrix A by tmp(ipar(22)) and put the result in
c  tmp(ipar(23))
c  proceed with FGMRES iterations
        goto 1
      endif
      if (RCI_request.eq.2) then
c  do the stopping test
        if (test not passed) then
c  proceed with FGMRES iterations
          goto 1
        else
c  stop FGMRES iterations
          goto 2
        endif
      endif
      if (RCI_request.eq.3) then
c  Below, trvec is an intermediate vector of length at least n.
c  Here is the recommended use of the result produced by the ILUT
c  routine via the standard Intel® oneAPI Math Kernel Library Sparse
c  BLAS solver routine mkl_dcsrtrsv.
        call mkl_dcsrtrsv('L','N','U', n, bilut, ibilut, jbilut,
     &                    tmp(ipar(22)), trvec)
        call mkl_dcsrtrsv('U','N','N', n, bilut, ibilut, jbilut,
     &                    trvec, tmp(ipar(23)))
c  proceed with FGMRES iterations
        goto 1
      endif
      if (RCI_request.eq.4) then
c  check the norm of the next orthogonal vector; it is contained in
c  dpar(7)
        if (the norm is not zero up to rounding/computational errors) then
c  proceed with FGMRES iterations
          goto 1
        else
c  stop FGMRES iterations
          goto 2
        endif
      endif
2     call dfgmres_get(n, x, b, RCI_request, ipar, dpar, tmp,
     &                 itercount)
c  current iteration number is in itercount
c  the computed approximation is in the array x
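The matrix-vector step requested by RCI_request = 1 in the scheme above reads its input from one segment of tmp and writes its output to another, with both offsets taken from ipar. The sketch below shows that offset convention in plain Fortran. The names rci_matvec_sketch, csr_matvec, and handle_matvec_request are hypothetical helpers, and the code assumes that the two length-n segments starting at tmp(ipar(22)) and tmp(ipar(23)) do not overlap.

```fortran
module rci_matvec_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! y = A*x for a CSR matrix with one-based indices.
  subroutine csr_matvec(n, a, ia, ja, x, y)
    integer, intent(in) :: n, ia(n+1), ja(*)
    real(dp), intent(in) :: a(*), x(n)
    real(dp), intent(out) :: y(n)
    integer :: i, p
    do i = 1, n
      y(i) = 0.0_dp
      do p = ia(i), ia(i+1) - 1
        y(i) = y(i) + a(p) * x(ja(p))
      end do
    end do
  end subroutine csr_matvec

  ! Services one RCI_request = 1 step: reads the n-vector that starts at
  ! tmp(ipar(22)) and stores A times that vector starting at tmp(ipar(23)).
  subroutine handle_matvec_request(n, a, ia, ja, ipar, tmp)
    integer, intent(in) :: n, ia(n+1), ja(*), ipar(*)
    real(dp), intent(in) :: a(*)
    real(dp), intent(inout) :: tmp(*)
    ! Passing an array element hands the callee the n elements that
    ! start at that position (sequence association).
    call csr_matvec(n, a, ia, ja, tmp(ipar(22)), tmp(ipar(23)))
  end subroutine handle_matvec_request
end module rci_matvec_sketch
```

In the reverse-communication loop, handle_matvec_request would be called inside the RCI_request.eq.1 branch, immediately before control returns to label 1.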