Introduction
Wrappers Reference
1.
Basic_Interface_Wrappers
2.
Advanced_Interface_Wrappers
3.
Guru_Interface_Wrappers
4. Wisdom_Wrappers
5. Memory_Allocation
Parallel Mode
Calling Wrappers from Fortran
Installation
Creating_the_Wrapper_Library
Application
Assembling
Running_Examples
Technical Support
This document describes a collection of C routines – wrappers that allow the FFTW interface to call the Intel® Math Kernel Library (Intel® MKL) discrete Fourier transform interface (DFTI) . These wrappers correspond to the FFTW version 3.0.1 and the Intel® MKL versions 7.0 and later.
The purpose of this set of wrappers is to allow developers whose programs currently use FFTW to access the performance available in the Intel® MKL Fourier transforms without the necessity of changing the program source code. The only change that is required is to modify the header file fftw3.h (see Creating the Wrapper Library section below). There are differences between FFTW and Intel® MKL DFTI functionalities, so there are a lot of restrictions on using wrappers instead of the FTTW functions. However many typical DFTs functions can be performed through the use of these wrappers.
Please refer to the Intel® MKL DFTI documentation for a better understanding of the effects of the use of the wrappers.
Additional wrappers may be added in the future to extend the available FFTW functionality from the Intel MKL.
Each FFTW function has its own wrapper.
Some of them are empty and do nothing, but they are still needed to avoid errors
through the compilation and satisfy the
function calls.
Note that Intel® MKL DFTI
operates on float and double precisions data types, and does not support
the long-double data type that is used in the FFTW functions.
fftw_plan fftw_plan_dft_1d(int n, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);
fftw_plan fftw_plan_dft_2d(int nx, int ny, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);
fftw_plan fftw_plan_dft_3d(int nx, int ny, int nz, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);
fftw_plan fftw_plan_dft(int rank, const int *n, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);
fftwf_plan fftwf_plan_dft_1d(int n, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);
fftwf_plan fftwf_plan_dft_2d(int nx, int ny, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);
fftwf_plan fftwf_plan_dft_3d(int nx, int ny, int nz, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);
fftwf_plan fftwf_plan_dft(int rank, const int *n, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);
Argument restrictions. The same algorithm corresponds to all values of the flags parameter.
fftw_plan fftw_plan_dft_r2c(int rank, const int *n, double *in, fftw_complex *out, unsigned flags);
fftw_plan fftw_plan_dft_r2c_1d(int n, double *in, fftw_complex *out, unsigned flags);
fftw_plan fftw_plan_dft_r2c_2d(int nx, int ny, double *in, fftw_complex *out, unsigned flags);
fftw_plan fftw_plan_dft_r2c_3d(int nx, int ny, int nz, double *in, fftw_complex *out, unsigned flags);
fftw_plan fftw_plan_dft_c2r(int rank, const int *n, fftw_complex *in, double *out, unsigned flags);
fftw_plan fftw_plan_dft_c2r_1d(int n, fftw_complex *in, double *out, unsigned flags);
fftw_plan fftw_plan_dft_c2r_2d(int nx, int ny, fftw_complex *in, double *out, unsigned flags);
fftw_plan fftw_plan_dft_c2r_3d(int nx, int ny, int nz, fftw_complex *in, double *out, unsigned flags);
fftwf_plan fftwf_plan_dft_r2c(int rank, const int *n, float *in, fftwf_complex *out, unsigned flags);
fftwf_plan fftwf_plan_dft_r2c_1d(int n, float *in, fftwf_complex *out, unsigned flags);
fftwf_plan fftwf_plan_dft_r2c_2d(int nx, int ny, float *in, fftwf_complex *out, unsigned flags);
fftwf_plan fftwf_plan_dft_r2c_3d(int nx, int ny, int nz, float *in, fftwf_complex *out, unsigned flags);
fftwf_plan fftwf_plan_dft_c2r(int rank, const int *n, fftwf_complex *in, float *out, unsigned flags);
fftwf_plan fftwf_plan_dft_c2r_1d(int n, fftwf_complex *in, float *out, unsigned flags);
fftwf_plan fftwf_plan_dft_c2r_2d(int nx, int ny, fftwf_complex *in, float *out, unsigned flags);
fftwf_plan fftwf_plan_dft_c2r_3d(int nx, int ny, int nz, fftwf_complex *in, float *out, unsigned flags);
Argument
restrictions.
The same algorithm corresponds to all values of the
flags
parameter.
In-place transforms for 2D and 3D DFT are not supported.
All wrappers are empty and do nothing since the Intel® MKL DFTI does not currently support this functionality.
fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany, fftw_complex *in, const int *inembed, int istride, int idist, fftw_complex *out, const int *onembed, int ostride, int odist, int sign, unsigned flags);
fftwf_plan fftwf_plan_many_dft(int rank, const int *n, int howmany, fftwf_complex *in, const int *inembed, int istride, int idist, fftwf_complex *out, const int *onembed, int ostride, int odist, int sign, unsigned flags);
Argument restrictions. The same algorithm corresponds to all values of the flags parameter.
fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany, double* in, const int *inembed, int istride, int idist, fftw_complex *out, const int *onembed, int ostride, int odist, unsigned flags);
fftwf_plan fftwf_plan_many_dft_r2c(int rank, const int *n, int howmany, float* in, const int *inembed, int istride, int idist, fftwf_complex *out, const int *onembed, int ostride, int odist, unsigned flags);
fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany, fftw_complex * in, const int *inembed, int istride, int idist, double *out, const int *onembed, int ostride, int odist, unsigned flags);
fftwf_plan fftwf_plan_many_dft_c2r(int rank, const int *n, int howmany, fftwf_complex* in, const int *inembed, int istride, int idist, float *out, const int *onembed, int ostride, int odist, unsigned flags);
In-place transforms are not supported.
All wrappers are empty and do nothing since the Intel® MKL DFTI does not currently support this functionality.
fftw_plan fftw_plan_guru_dft(int rank, const fftw_iodim *dims, int howmany_rank, const fftw_iodim *howmany_dims, fftw_complex *in, fftw_complex *out, int sign, unsigned flags);
fftwf_plan fftwf_plan_guru_dft(int rank, const fftwf_iodim *dims, int howmany_rank, const fftwf_iodim *howmany_dims, fftwf_complex *in, fftwf_complex *out, int sign, unsigned flags);
Argument restrictions. The same algorithm corresponds to all values of the flags parameter.
The rest of the wrappers are empty and do nothing since the Intel® MKL DFTI currently does not support split arrays.
All wrappers are empty and do nothing.
Real-data wrappers (without split arrays supporting) will be added in later versions of the Intel® MKL.
All wrappers are empty and do nothing since the Intel® MKL DFTI currently does not support these functionalities.
void fftw_execute_dft(const fftw_plan p, fftw_complex *in, fftw_complex *out);
void fftw_execute_dft_r2c(const fftw_plan p, double *in, fftw_complex *out);
void fftw_execute_dft_c2r(const fftw_plan p, fftw_complex *in, double *out);
void fftwf_execute_dft(const fftwf_plan p, fftwf_complex *in, fftwf_complex *out);
void fftwf_execute_dft_r2c(const fftwf_plan p, float *in, fftwf_complex *out);
void fftwf_execute_dft_c2r(const fftwf_plan p, fftwf_complex *in, float *out);
The rest of the wrappers are empty and do nothing.
Real-data wrappers (without split arrays supporting) will be added in later versions of the Intel® MKL.
All wrappers are empty and do nothing since the Intel® MKL DFTI currently does not support these functionalities.
All wrappers in this section are empty and do nothing. The Intel® MKL allocation
function can not align the allocable array. We
recommend allocate memory using functions of malloc
type and align your array
yourself. To do that it is necessary to allocate extra memory, and to shift the
array address for DFT data. See also the Performance
section in the
Intel® MKL documentation file
mkluse.htm.
All wrappers in this section are empty and do nothing since the Intel® MKL DFTI implements different mechanism of parallelization. If you want to use the parallel mode in the Intel® MKL DFTI routines or call wrappers from a multi-threaded application, please refer to the Intel® MKL documentation to learn about managing the number of threads.
There are no wrappers in this section yet.
FFTW2MKL wrappers are delivered as the source code that must be compiled by the user to build the wrapper library. Then the FFTW library can be substituted by the wrappers and Intel® MKL libraries. The source code for the wrapper and the corresponding makefiles with the wrapper list files are located in the \examples\fftw2mkl sub-directory of the Intel® MKL directory.
Two header files are
used to compile the wrapper library: fftw3_mkl.h
and fftw3.h.
fftw3_mkl.h
file
is located in the \examples\fftw2mkl\wrappers sub-directory in the
Intel® MKL
directory. The original
FFTW (www.fftw.org) header file fftw3.h
file is slightly modified
(all rows containing calls to the
fftw3.lib
are commented) and
placed in the \include sub-directory in the
Intel® MKL
directory.
Makefiles contain the following parameters: platform (required), compiler, function, target. Description of these parameters can be found in the makefile comment heading. For example, the command
make lib64
builds the wrapper library for the Intel® Itanium® processor family applications using the Intel® C++ and Fortran Compiler version 8.0 and later (by default).
The command
make lib32 F=intel7
builds the wrapper library for the 32-bit platform using an Intel® C++ Compiler (version 7.1 and earlier).
As a result, the following files containing the wrapper library
will be created: libfftw2mkl.a
(for Linux*),
and fftw2mkl.lib
(for Windows*).
The adapted
fftw3.h
header file
(see above)
should be used when you build applications.
There are some examples that demonstrate how to
use the wrapper library. They are located in the \examples\fftw2mkl\source in
the Intel®
MKL directory. Examples can be run by the same makefile file with the parameter
target=examples. If parameter function=example_name is defined, then only
specified example will run. Otherwise all examples from the source sub-directory will
run.
The sub-directory \_results will be created, and the results will be stored here in
the example_name.res files.
Intel, the Intel logo, Intel SpeedStep, Intel NetBurst, Intel NetStructure,
MMX, Intel386, Intel486, Celeron, Intel Centrino, Intel Xeon, Intel XScale, Itanium, Pentium, Pentium II Xeon,
Pentium III Xeon, Pentium M, and VTune are trademarks or registered trademarks of Intel Corporation or its
subsidiaries in the United States and other countries.
* Other names and brands may be claimed as the property of others.
Copyright © 2005, Intel Corporation.