FFTW to Intel® Math Kernel Library Wrappers
Technical User Notes

Contents

Introduction
Wrappers Reference
     1. Basic_Interface_Wrappers
     2. Advanced_Interface_Wrappers
     3. Guru_Interface_Wrappers
     4. Wisdom_Wrappers
     5. Memory_Allocation
Parallel Mode
Calling Wrappers from Fortran
Installation
     Creating_the_Wrapper_Library
     Application Assembling
     Running_Examples
Technical Support

  

Introduction

This document describes a collection of C routines – wrappers that allow the FFTW interface to call the Intel® Math Kernel Library (Intel® MKL) discrete Fourier transform interface (DFTI) . These wrappers correspond to the FFTW version 3.0.1 and the Intel® MKL versions 7.0 and later.

The purpose of this set of wrappers is to allow developers whose programs currently use FFTW to access the performance available in the Intel® MKL Fourier transforms without the necessity of changing the program source code. The only change that is required is to modify the header file fftw3.h (see Creating the Wrapper Library section below). There are differences between FFTW and Intel® MKL DFTI functionalities, so there are a lot of restrictions on using wrappers instead of the FTTW functions. However many typical DFTs functions can be performed through the use of these wrappers.

Please refer to the Intel® MKL DFTI documentation for a better understanding of the effects of the use of the wrappers.

Additional wrappers may be added in the future to extend the available FFTW functionality from the Intel MKL.

Wrappers Reference

Each FFTW function has its own wrapper.  Some of them are empty and do nothing, but they are still needed to avoid errors through the compilation and satisfy the function calls.
Note that
Intel® MKL DFTI operates on float and double precisions data types, and does not support the long-double data type that is used in the FFTW functions.

1. Basic Interface Wrappers

1.1 Complex DFTs

fftw_plan fftw_plan_dft_1d(int n, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);

fftw_plan fftw_plan_dft_2d(int nx, int ny, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);

fftw_plan fftw_plan_dft_3d(int nx, int ny, int nz, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);

fftw_plan fftw_plan_dft(int rank, const int *n, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);

fftwf_plan fftwf_plan_dft_1d(int n, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);

fftwf_plan fftwf_plan_dft_2d(int nx, int ny, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);

fftwf_plan fftwf_plan_dft_3d(int nx, int ny, int nz, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);

fftwf_plan fftwf_plan_dft(int rank, const int *n, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);

Argument restrictions.  The same algorithm corresponds to all values of the flags parameter.

1.2 Real-Data DFTs

fftw_plan fftw_plan_dft_r2c(int rank, const int *n, double *in, fftw_complex *out, unsigned flags);

fftw_plan fftw_plan_dft_r2c_1d(int n, double *in, fftw_complex *out, unsigned flags);

fftw_plan fftw_plan_dft_r2c_2d(int nx, int ny, double *in, fftw_complex *out, unsigned flags);

fftw_plan fftw_plan_dft_r2c_3d(int nx, int ny, int nz, double *in, fftw_complex *out, unsigned flags);

 

fftw_plan fftw_plan_dft_c2r(int rank, const int *n, fftw_complex *in, double *out, unsigned flags);

fftw_plan fftw_plan_dft_c2r_1d(int n, fftw_complex *in, double *out, unsigned flags);

fftw_plan fftw_plan_dft_c2r_2d(int nx, int ny, fftw_complex *in, double *out, unsigned flags);

fftw_plan fftw_plan_dft_c2r_3d(int nx, int ny, int nz, fftw_complex *in, double *out, unsigned flags);

 

fftwf_plan fftwf_plan_dft_r2c(int rank, const int *n, float *in, fftwf_complex *out, unsigned flags);

fftwf_plan fftwf_plan_dft_r2c_1d(int n, float *in, fftwf_complex *out, unsigned flags);

fftwf_plan fftwf_plan_dft_r2c_2d(int nx, int ny, float *in, fftwf_complex *out, unsigned flags);

fftwf_plan fftwf_plan_dft_r2c_3d(int nx, int ny, int nz, float *in, fftwf_complex *out, unsigned flags);

 

fftwf_plan fftwf_plan_dft_c2r(int rank, const int *n, fftwf_complex *in, float *out, unsigned flags);

fftwf_plan fftwf_plan_dft_c2r_1d(int n, fftwf_complex *in, float *out, unsigned flags);

fftwf_plan fftwf_plan_dft_c2r_2d(int nx, int ny, fftwf_complex *in, float *out, unsigned flags);

fftwf_plan fftwf_plan_dft_c2r_3d(int nx, int ny, int nz, fftwf_complex *in, float *out, unsigned flags);

Argument restrictions.  The same algorithm corresponds to all values of the flags parameter.
In-place transforms for 2D and 3D DFT are not supported.

1.3 Real-to-Real Transforms

All wrappers are empty and do nothing since the Intel® MKL DFTI does not currently support this functionality.

 

2. Advanced Interface Wrappers

2.1 Advanced Complex DFTs

fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany, fftw_complex *in, const int *inembed, int istride, int idist, fftw_complex *out, const int *onembed, int ostride, int odist, int sign, unsigned flags);

fftwf_plan fftwf_plan_many_dft(int rank, const int *n, int howmany, fftwf_complex *in, const int *inembed, int istride, int idist, fftwf_complex *out, const int *onembed, int ostride, int odist, int sign, unsigned flags);

Argument restrictions.  The same algorithm corresponds to all values of the flags parameter.

2.2 Advanced Real-Data DFTs

fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany, double* in, const int *inembed, int istride, int idist, fftw_complex *out, const int *onembed, int ostride, int odist, unsigned flags);

fftwf_plan fftwf_plan_many_dft_r2c(int rank, const int *n, int howmany, float* in, const int *inembed, int istride, int idist, fftwf_complex *out, const int *onembed, int ostride, int odist, unsigned flags);

fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany, fftw_complex * in, const int *inembed, int istride, int idist, double *out, const int *onembed, int ostride, int odist, unsigned flags);

fftwf_plan fftwf_plan_many_dft_c2r(int rank, const int *n, int howmany, fftwf_complex* in, const int *inembed, int istride, int idist, float *out, const int *onembed, int ostride, int odist, unsigned flags);

In-place transforms are not supported.

2.3 Advanced Real-to-Real Transforms

All wrappers are empty and do nothing since the Intel® MKL DFTI does not currently support this functionality.

3. Guru Interface Wrappers

3.1 Guru Complex DFTs

fftw_plan fftw_plan_guru_dft(int rank, const fftw_iodim *dims, int howmany_rank, const fftw_iodim *howmany_dims, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);

fftwf_plan fftwf_plan_guru_dft(int rank, const fftwf_iodim *dims, int howmany_rank, const fftwf_iodim *howmany_dims, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);

Argument restrictions.  The same algorithm corresponds to all values of the flags parameter.

The rest of the wrappers are empty and do nothing since the Intel® MKL DFTI currently does not support split arrays.

3.2 Guru Real-Data DFTs

All wrappers are empty and do nothing.

Real-data wrappers (without split arrays supporting) will be added in later versions of the Intel® MKL

3.3 Guru Real-to-Real Transforms

All wrappers are empty and do nothing since the Intel® MKL DFTI currently does not support these functionalities.

3.4 Guru Execution of Plans

void fftw_execute_dft(const fftw_plan p, fftw_complex *in, fftw_complex *out);

void fftw_execute_dft_r2c(const fftw_plan p, double *in, fftw_complex *out);

void fftw_execute_dft_c2r(const fftw_plan p, fftw_complex *in, double *out);

void fftwf_execute_dft(const fftwf_plan p, fftwf_complex *in, fftwf_complex *out);

void fftwf_execute_dft_r2c(const fftwf_plan p, float *in, fftwf_complex *out);

void fftwf_execute_dft_c2r(const fftwf_plan p, fftwf_complex *in, float *out);

The rest of the wrappers are empty and do nothing.

Real-data wrappers (without split arrays supporting) will be added in later versions of the Intel® MKL

4. Wisdom Wrappers

All wrappers are empty and do nothing since the Intel® MKL DFTI currently does not support these functionalities.

5. Memory Allocation

All wrappers in this section are empty and do nothing. The Intel® MKL allocation function can not align the allocable array. We recommend allocate memory using functions of malloc type and align your array yourself. To do that it is necessary to allocate extra memory, and to shift the array address for DFT data. See also the Performance section in the Intel® MKL documentation file mkluse.htm.

Parallel Mode

All wrappers in this section are empty and do nothing since the Intel® MKL DFTI implements different mechanism of parallelization. If you want to use the parallel mode in the Intel® MKL DFTI routines or call wrappers from a multi-threaded application, please refer to the Intel® MKL documentation to learn about  managing the number of threads.

Calling Wrappers from Fortran

There are no wrappers in this section yet.

Installation

FFTW2MKL wrappers are delivered as the source code that must be compiled by the user to build the wrapper library. Then the FFTW library can be substituted by the wrappers and Intel® MKL libraries. The source code for the wrapper and the corresponding makefiles with the wrapper list files are located in the \examples\fftw2mkl sub-directory of the Intel® MKL directory.

Creating the Wrapper Library

Two header files are used to compile the wrapper library: fftw3_mkl.h and  fftw3.h.
fftw3_mkl.h file is located in the \examples\fftw2mkl\wrappers  sub-directory in the Intel® MKL directory. The original FFTW (www.fftw.org) header file fftw3.h file is slightly modified (all rows containing calls to the fftw3.lib are commented) and placed in the \include sub-directory in the Intel® MKL directory.

Makefiles contain the following parameters: platform (required), compiler, function, target. Description of these parameters can be found in the makefile comment heading. For example, the command

   make lib64

builds the wrapper library for the Intel® Itanium® processor family applications using the Intel® C++ and Fortran Compiler version 8.0 and later (by default).

The command

   make lib32 F=intel7

builds the wrapper library for the 32-bit platform using an Intel® C++ Compiler (version 7.1 and earlier).

 

As a result, the following files containing the wrapper library will be created: libfftw2mkl.a  (for Linux*),  and fftw2mkl.lib  (for Windows*).

Application Assembling

The adapted fftw3.h header file (see above) should be used when you build applications.

Running Examples

There are some examples that demonstrate how to use the wrapper library. They are located in the \examples\fftw2mkl\source in the Intel® MKL directory. Examples can be run by the same makefile file with the parameter target=examples. If parameter function=example_name is defined, then only specified example will run. Otherwise all examples from the source sub-directory will run.
The sub-directory \_results will be created, and the results will be stored here in the example_name.res files.

Technical Support

Please see the Intel® MKL support website at http://www.intel.com/support/performancetools/libraries/mkl/.  

Intel, the Intel logo, Intel SpeedStep, Intel NetBurst, Intel NetStructure, MMX, Intel386, Intel486, Celeron, Intel Centrino, Intel Xeon, Intel XScale, Itanium, Pentium, Pentium II Xeon, Pentium III Xeon, Pentium M, and VTune are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
 
* Other names and brands may be claimed as the property of others.
 
Copyright © 2005, Intel Corporation.