Disclaimer

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to: http://www.intel.com/products/processor_number/

BlueMoon, BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino Inside, Core Inside, i960, Intel, the Intel logo, Intel Atom, Intel Atom Inside, Intel Core, Intel Inside, Intel Inside logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel Sponsors of Tomorrow., the Intel Sponsors of Tomorrow. logo, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, InTru, the InTru logo, InTru soundmark, Itanium, Itanium Inside, MCS, MMX, Moblin, Pentium, Pentium Inside, skoool, the skoool logo, Sound Mark, The Journey Inside, vPro Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries.
* Other names and brands may be claimed as the property of others.

Microsoft, Windows, Visual Studio, Visual C++, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation in the United States and/or other countries.

Java is a registered trademark of Oracle and/or its affiliates.

©Intel Corporation. All rights reserved.

Optimization Notice

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.

Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804

License Definitions

By downloading and installing this sample, you hereby agree that the accompanying materials are being provided to you under the terms and conditions of the End User License Agreement for the Intel® Integrated Performance Primitives (Intel® IPP) product previously accepted by you.

System Requirements

Recommended hardware:

Hardware requirements:

Software requirements:

Product specific requirements:

For more information please see Intel® IPP System Requirements.

Introducing Integration Wrappers for Intel® Integrated Performance Primitives

Intel® Integrated Performance Primitives (Intel® IPP) Integration Wrappers aggregate Intel IPP functionality in easy-to-use functions and help reduce the effort required to integrate Intel IPP into your code. The wrappers are designed to simplify threading and tiling of Intel IPP functions.

Integration Wrappers consist of C and C++ interfaces:

Integration Wrappers simplify usage of Intel IPP functions and address some of the advanced use cases of Intel IPP.

This document covers building Intel IPP Integration Wrappers from sources and building the accompanying examples. To learn more about the Intel IPP Integration Wrappers API and its features, refer to the Developer Guide: https://software.intel.com/en-us/articles/intel-integrated-performance-primitives-documentation

Getting Started

Checking Your Installation

To use Intel IPP Integration Wrappers, you need to have the Intel IPP library installed on your machine and the IPPROOT environment variable set correctly.

For information on how to install Intel IPP, refer to the Intel IPP Installation Guide available at https://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-install-guide/.

For instructions on how to set environment variables, refer to the Intel IPP Developer Guide https://software.intel.com/en-us/ipp-dev-guide.

Building Intel® Integrated Performance Primitives Integration Wrappers and Examples

Note

Make sure that the IPPROOT environment variable is set and points to the Intel IPP library location.

Note

To use Intel® Threading Building Blocks (Intel® TBB) in examples, make sure that TBBROOT is set and points to the Intel TBB library location. This is optional if your system supports OpenMP* threading.

  • Windows* OS:

    • Prerequisites:

      Intel IPP Integration Wrappers examples use OpenGL* rendering to output results. This may require Windows* SDK to be installed on your system. To find Windows* SDK for your version of Windows*, refer to the Microsoft* web site at https://www.microsoft.com.

    • Examples with prebuilt libraries:

      • Open iw_examples_prebuild.sln in Microsoft* Visual Studio* 2017 or higher.

      • Build the solution using the Build command. Examples link with the prebuilt ipp_iw library from the %IPPROOT%/lib/<ARCH> folder.

    • Examples and manual build of libraries:

      • Open iw.sln in Microsoft* Visual Studio* 2017 or higher. Examples and Integration Wrappers sources can be found in the solution explorer.

      • Build the solution using the Build command. Examples link with the newly built Intel IPP Integration Wrappers libraries from the same solution.

  • Linux* OS:

    • Prerequisites:

      Intel IPP Integration Wrappers examples use OpenGL* rendering to show results. This requires the following packages to be installed:

      • libx11-dev

      • libgl1-mesa-dev

    • Compatible compilers:

      • gcc 4.4 or higher (at least 4.9 recommended for atomic operations support)

      • clang 3.9 or higher

    • Examples with prebuilt libraries:

      • make prebuilt [ARCH=ia32|intel64] [CONF=release|release_tbb|debug|debug_tbb]

    • Examples and manual build of libraries:

      • make [ARCH=ia32|intel64] [CONF=release|release_tbb|debug|debug_tbb]

  • macOS*:

    You can use XCode* or makefiles to build libraries and examples. XCode* workspace is located in the root Integration Wrappers folder.

    Note

    Intel® IPP Integration Wrappers examples do not support rendering on macOS*.

    • Compatible compilers:

      • gcc 4.4 or higher (at least 4.9 recommended for atomic operations support)

      • Apple* clang 8 or higher

    • Examples with prebuilt libraries:

      • make prebuilt [ARCH=ia32|intel64] [CONF=release|release_tbb|debug|debug_tbb]

    • Examples and manual build of libraries:

      • make [ARCH=ia32|intel64] [CONF=release|release_tbb|debug|debug_tbb]

Intel IPP Integration Wrappers build-time configuration

The Intel IPP Integration Wrappers library uses a set of configuration directives to make library behavior more flexible and to decrease the memory footprint of the user application. The C code is provided as a compilable library, and the C++ code as a set of headers with no binary part. Therefore, the C layer must be configured when the library is built, while the C++ layer can be configured when the application is built.

  • C API (iw/include/iw_config.h):

    Configuration directives are located in a separate configuration file and can be used to switch features on or off or to disable unused parts of the API. For example, you can disable all Intel IPP calls for the 16u data type, and so decrease binary size, by changing the IW_ENABLE_DATA_TYPE_16U define from 1 to 0.

    Switches cover data types, channels, and some specific features (for example, IW_ENABLE_iwiResize_Lanczos can be used to disable a specific branch of the resize functionality). This is useful if the application binary becomes too large because of the many code branches and optimizations contained in the Intel IPP library.

    • IW_ENABLE_THREADING_LAYER - Enables Intel IPP Threading Layer calls inside Integration Wrappers if possible (requires OpenMP* support). Disabled by default.

      The parallel version of a function is used if:

      • There is a parallel implementation for the particular function.

      • A call to iwGetThreadsNum() made before the target function call, or before the spec initialization call, returns a value greater than 1.

      To disable threading at run time, call iwSetThreadsNum(1) before the function call.

      To check whether a function supports internal threading, see its description in the header (the "Internal threading" field).

  • C++ API (iw/include/iw++/iw_core.hpp):

    • IW_ENABLE_EXCEPTIONS - Enables error handling through exceptions. Enabled by default.

      If enabled, C++ Integration Wrappers functions return status only for values >= 0 (success and warnings) and will throw any value < 0 as an ipp::IwException object.

      If disabled, all C++ IW functions will provide error codes as return values, same as for Integration Wrappers C API or Intel IPP.
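The status convention described for IW_ENABLE_EXCEPTIONS can be illustrated with a minimal, self-contained sketch. It does not use the Integration Wrappers headers: DEMO_ENABLE_EXCEPTIONS, DemoStatusException, demoCheckStatus and demoThrows are hypothetical names invented for this illustration, and the real library code may be organized differently.

```cpp
#include <cassert>

// Hypothetical switch for this sketch only. It mirrors the 0/1 style of the
// IW_ENABLE_* directives but is not a library macro.
#ifndef DEMO_ENABLE_EXCEPTIONS
#define DEMO_ENABLE_EXCEPTIONS 1
#endif

// Stand-in for an exception type that carries a status code.
struct DemoStatusException {
    int status;
};

// Applies the convention to a raw status code: with exceptions enabled,
// negative codes are thrown and codes >= 0 (success and warnings) are
// returned; with exceptions disabled, every code is returned unchanged.
inline int demoCheckStatus(int status) {
#if DEMO_ENABLE_EXCEPTIONS
    if (status < 0) throw DemoStatusException{status};
#endif
    return status;
}

// Helper for experimenting: reports whether a status code would be thrown.
inline bool demoThrows(int status) {
    try {
        demoCheckStatus(status);
    } catch (const DemoStatusException& e) {
        assert(e.status == status);  // the thrown object keeps the code
        return true;
    }
    return false;
}
```

With the default setting, demoCheckStatus(3) returns 3 and demoCheckStatus(-5) throws. Compiling the same file with -DDEMO_ENABLE_EXCEPTIONS=0 turns the second call into a plain return of -5, which shows why a header-only layer can be configured when the application is built.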

Using Intel IPP Integration Wrappers Examples

The Intel IPP Integration Wrappers package provides examples that can help you understand how to use Integration Wrappers features. The examples work with BMP files: they read BMP images and write results to BMP.

You can find examples in <components_folder>/interfaces/iw/examples.

Sample images are located at <components_folder>/common/data.

Integration Wrappers are accompanied by the following examples:

Resize Example

The iw_resize example compares implementations of the resize operation based on the Intel IPP API, the Integration Wrappers C API, and the Integration Wrappers C++ API. Running the example from the build folder with no parameters takes the default image and performs a 2x downscale using the Integration Wrappers C++ interface.

Intel IPP version of the resize pipeline in the example supports only 8u C1 linear resize. Integration Wrappers C and C++ pipelines support most of the resize modes available in Intel IPP APIs.

The iw_resize example supports Intel TBB and OpenMP threading.

To run the example, execute the following command:

iw_resize [[-i] InputFile] [[-o] OutputFile] [Options]

The example supports the following options:

-i <1 arg>  Input file name.
-o <1 arg>  Output file name.
-m <1 arg>  Interface mode: ipp, iw (for IW C interface), and iw++ (for IW C++ interface).
-r <2 args> Destination resolution (width and height).
-k          Do not keep aspect ratio.
-t <1 arg>  Number of threads.
-s          Suppress window output.
-w <1 arg>  Minimum test time in milliseconds.
-l <1 arg>  Number of loops (overrides the test time).
-T <1 arg>  Target Intel IPP optimization. Possible values: SSE3, SSSE3, SSE41, SSE42, AVX, AVX2, AVX512
-h          Print help and exit.

Advanced Tiling Demo

Note

This example requires a supported window rendering interface for demonstration purposes.

The iw_advanced_tiling_demo example is a step-by-step explanation of pipeline tiling to help you understand and adopt the concept. The example implements the following pipeline:

  • RGB to Grayscale color conversion

  • 8u to 32f upscale

  • Gaussian filter

  • Sobel filter

  • Sharp filter

  • 32f to 8u downscale

The iw_advanced_tiling_demo example demonstrates:

  • How to initialize and use pipeline tiling

  • How to handle borders in pipeline tiling (Sobel)

  • How to use non-IW APIs (Intel IPP ippiFilterSharpenBorder function) with the tiling API

At the end, the example validates tiled result by comparing it with the non-tiled reference.
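The idea of tiling a pipeline with borders and then validating it against a non-tiled reference can be sketched in one dimension. This is a simplified illustration, not code from the example and not the Integration Wrappers tiling API: the two stages (a pointwise step and a 3-tap sum with replicated borders) and all function names are assumptions chosen to keep the sketch short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stage 1 (pointwise): stands in for a conversion step.
float stage1(float v) { return 2.0f * v + 1.0f; }

// Non-tiled reference: stage 1 over the whole signal, then a 3-tap sum
// whose out-of-range neighbours are replicated from the edge sample.
std::vector<float> runReference(const std::vector<float>& src) {
    const std::size_t n = src.size();
    std::vector<float> tmp(n), dst(n);
    for (std::size_t i = 0; i < n; ++i) tmp[i] = stage1(src[i]);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t l = (i == 0) ? 0 : i - 1;
        const std::size_t r = (i + 1 == n) ? n - 1 : i + 1;
        dst[i] = tmp[l] + tmp[i] + tmp[r];
    }
    return dst;
}

// Tiled pipeline: every tile runs both stages before the next tile starts.
// Stage 1 also covers a one-sample halo on each side that lies inside the
// signal, so stage 2 reads real neighbours at tile edges. `extra` receives
// the number of halo samples, that is, work done more than once.
std::vector<float> runTiled(const std::vector<float>& src, std::size_t tile,
                            std::size_t* extra) {
    assert(tile > 0);
    const std::size_t n = src.size();
    std::vector<float> dst(n), tmp;
    std::size_t redundant = 0;
    for (std::size_t lo = 0; lo < n; lo += tile) {
        const std::size_t hi  = std::min(lo + tile, n);
        const std::size_t hlo = (lo == 0) ? 0 : lo - 1;  // halo start
        const std::size_t hhi = (hi == n) ? n : hi + 1;  // halo end
        tmp.resize(hhi - hlo);                           // tile-local buffer
        for (std::size_t i = hlo; i < hhi; ++i) tmp[i - hlo] = stage1(src[i]);
        redundant += (hhi - hlo) - (hi - lo);
        for (std::size_t i = lo; i < hi; ++i) {
            const std::size_t l = (i == 0) ? 0 : i - 1;
            const std::size_t r = (i + 1 == n) ? n - 1 : i + 1;
            dst[i] = tmp[l - hlo] + tmp[i - hlo] + tmp[r - hlo];
        }
    }
    if (extra) *extra = redundant;
    return dst;
}
```

For a 10-sample signal and a tile size of 4, runTiled produces the same output as runReference while processing 4 halo samples twice. The intermediate buffer only ever holds one tile plus its halo, which is the memory benefit that pipeline tiling trades against this redundant work.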

On the rendered image you can see the tile window position and size at each stage of processing (figure: images/iw/iw_doc_sample_1.png).

To run the example, execute the following command:

iw_advanced_tiling_demo [[-i] InputFile] [Options]

The example supports the following options:

-i <1 arg>  Input file name.
-b <2 args> Tile size for processing.
-s          Suppress window output.
-T <1 arg>  Target Intel IPP optimization. Possible values: SSE3, SSSE3, SSE41, SSE42, AVX, AVX2, AVX512
-h          Print help and exit.

Advanced Tiling Benchmark

Note

This example requires Intel TBB or OpenMP parallel interface to function properly.

The iw_advanced_tiling_benchmark example can be used to measure performance with different tile sizes and different number of threads between:

  • Pipeline tiled with the Intel IPP IW pipeline tiling API, which processes the whole pipeline at once for a given tile

  • Pipeline tiled in the classic way, where functions are tiled one by one

  • Non-tiled pipeline for reference results

The example implements the following pipeline:

  • RGB to Grayscale color conversion

  • 8u to 32f upscale

  • Gaussian filter

  • Sobel filter

  • Sharp filter

  • 32f to 8u downscale

To run the example, execute the following command:

iw_advanced_tiling_benchmark [[-i] InputFile] [[-o] OutputFile] [Options]

The example supports the following options:

-i                <1 arg>  Input file name.
-o                <1 arg>  Output file name.
-t --threads      <1 arg>  Manual number of threads (threads iterations will be ignored)
   --threads-min  <1 arg>  Start number of threads range
   --threads-max  <1 arg>  End number of threads range
   --threads-step <1 arg>  Step between start and end of the threads range
-b --tile         <2 args> Manual tile size (tile iterations will be ignored)
   --tile-min     <2 args> Minimal tile size for tiling iterations
   --tile-divider <2 args> Tile divider for tiling iterations. 2 by default
-r                         Re-initialize data for each loop and include initialization time in timings
   --csv          <1 arg>  CSV file name to save performance data
-l                <1 arg>  Number of loops per iteration
-T                <1 arg>  Target Intel IPP optimization. Possible values: SSE3, SSSE3, SSE41, SSE42,
                           AVX, AVX2, AVX512
-h --help                  Print help and exit
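The tile heights in the sample output (1080, 540, 270, 135, 67, 33, 16, 8) are consistent with repeatedly dividing a starting size by the tile divider using integer division. The sketch below reproduces that sequence for one dimension. It is an assumption about how the iteration behaves, not the benchmark's source code: the function name tileSizes and the minimum size of 8 are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Generates candidate tile sizes for one dimension: start, start / divider,
// start / divider^2, ... (integer division), stopping before the size drops
// below minSize.
std::vector<int> tileSizes(int start, int minSize, int divider) {
    assert(start > 0 && minSize > 0 && divider > 1);
    std::vector<int> sizes;
    for (int s = start; s >= minSize; s /= divider) {
        sizes.push_back(s);
    }
    return sizes;
}
```

Calling tileSizes(1080, 8, 2) yields the eight heights listed above. In the sample output the 4-thread rows start at 270, which suggests that the starting size may also depend on the number of threads; the sketch does not model that.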

Example of console output:

Threads | Tile Width | Tile Height | Tile Mem. (KB) | Ref. (ms) | Pipe (ms) | R/P   | NPipe (ms) | R/NP  | NP/P
1       | 1920       | 1080        | 26325          | 6.401     | 6.078     | 1.1   | 6.010      | 1.1   | 1.0
1       | 1920       | 540         | 13297          | 6.401     | 5.214     | 1.2   | 6.347      | 1.0   | 1.2
1       | 1920       | 270         | 6716           | 6.401     | 4.134     | 1.5   | 6.431      | 1.0   | 1.6
1       | 1920       | 135         | 3425           | 6.401     | 4.186     | 1.5   | 6.589      | 1.0   | 1.6
1       | 1920       | 67          | 1768           | 6.401     | 4.295     | 1.5   | 6.602      | 1.0   | 1.5
1       | 1920       | 33          | 939            | 6.401     | 4.629     | 1.4   | 6.629      | 1.0   | 1.4
1       | 1920       | 16          | 525            | 6.401     | 5.213     | 1.2   | 6.753      | 0.9   | 1.3
1       | 1920       | 8           | 330            | 6.401     | 6.208     | 1.0   | 7.050      | 0.9   | 1.1
4       | 1920       | 270         | 6716           | 6.401     | 4.950     | 1.3   | 4.453      | 1.4   | 0.9
4       | 1920       | 135         | 3425           | 6.401     | 3.338     | 1.9   | 5.047      | 1.3   | 1.5
4       | 1920       | 67          | 1768           | 6.401     | 1.648     | 3.9   | 4.555      | 1.4   | 2.8
4       | 1920       | 33          | 939            | 6.401     | 1.403     | 4.6   | 4.558      | 1.4   | 3.2
4       | 1920       | 16          | 525            | 6.401     | 1.433     | 4.5   | 4.420      | 1.4   | 3.1
4       | 1920       | 8           | 330            | 6.401     | 1.726     | 3.7   | 4.410      | 1.5   | 2.6

  • Tile Mem - tile footprint in memory: the amount of memory required to process a tile through the entire pipeline, including all intermediate buffers. This value is based on the Pipeline Tiling API implementation.

  • R/P - ratio between Reference and Pipeline Tiling API.

  • NPipe - Implementation tiled function by function.

  • R/NP - ratio between Reference and Non-Pipe Tiled implementation.

  • NP/P - ratio between Non-Pipe Tiled and Pipeline Tiling API implementations.
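The three ratio columns can be recomputed from the timing columns of any row. The sketch below shows the arithmetic; the struct and function names are hypothetical, and rounding to one decimal place is left to the caller.

```cpp
#include <cassert>

// Ratios as they appear in the console table. A value above 1 means the
// implementation in the denominator was faster.
struct TimingRatios {
    double refOverPipe;      // R/P:  Ref. (ms) / Pipe (ms)
    double refOverNonPipe;   // R/NP: Ref. (ms) / NPipe (ms)
    double nonPipeOverPipe;  // NP/P: NPipe (ms) / Pipe (ms)
};

TimingRatios computeRatios(double refMs, double pipeMs, double nonPipeMs) {
    assert(pipeMs > 0.0 && nonPipeMs > 0.0);
    return TimingRatios{refMs / pipeMs, refMs / nonPipeMs, nonPipeMs / pipeMs};
}
```

For the row with 4 threads and tile height 67 (Ref. 6.401, Pipe 1.648, NPipe 4.555), computeRatios gives approximately 3.88, 1.41 and 2.76, which round to the 3.9, 1.4 and 2.8 shown in the table.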

Example of CSV output:

Threads Tile Width  Tile Height Tile Footprint (KB) Tile Overhead   Reference (ms)  Pipe (ms)   Non-Pipe (ms)   Pipe Karp-Flatt Non-Pipe Karp-Flatt
1       1920        1080        26325               0.008           6.365109        6.124229    6.207126        0               0
1       1920        540         13297               0.013           6.365109        5.368208    6.361129        0               0
1       1920        270         6716                0.023           6.365109        4.157982    6.40985         0               0
1       1920        135         3425                0.042           6.365109        4.186971    6.534762        0               0
1       1920        67          1768                0.082           6.365109        4.277384    6.515132        0               0
1       1920        33          939                 0.164           6.365109        4.67605     6.60685         0               0
1       1920        16          525                 0.336           6.365109        5.236525    6.637589        0               0
1       1920        8           330                 0.669           6.365109        6.377097    7.009519        0               0
4       1920        270         6716                0.023           6.365109        4.866554    4.450894        0.686           0.599
4       1920        135         3425                0.042           6.365109        2.723196    4.447217        0.237           0.598
4       1920        67          1768                0.082           6.365109        1.58232     4.53905         -0.002          0.617
4       1920        33          939                 0.164           6.365109        1.393994    4.553258        -0.041          0.62
4       1920        16          525                 0.336           6.365109        1.454381    4.414833        -0.029          0.591
4       1920        8           330                 0.669           6.365109        1.868798    4.465601        0.058           0.602

  • Tile Footprint - same as the "Tile Mem" column in the console output.

  • Tile Overhead - the amount of redundant processing done by tiles relative to the total destination image size. Each tile must also process borders for the filters, so some portions of the image are processed several times. For example, 0.336 means that about 34% of the image was processed twice.

  • Karp-Flatt - nonlinear parallelization efficiency metric. The lower the better.
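The Karp-Flatt metric is commonly defined as e = (1/S - 1/p) / (1 - 1/p), where S is the measured speedup and p is the number of threads. The sketch below applies that standard definition. Taking the Reference time as the serial baseline is an assumption; it reproduces the sample CSV values, but the benchmark source is not shown here. The function name is hypothetical.

```cpp
#include <cassert>

// Karp-Flatt metric e = (1/S - 1/p) / (1 - 1/p) with S = serialMs / parallelMs.
// e estimates the serial fraction of the run; values near 0 indicate
// near-linear scaling, and negative values indicate super-linear speedup.
double karpFlatt(double serialMs, double parallelMs, int threads) {
    assert(serialMs > 0.0 && parallelMs > 0.0 && threads >= 1);
    if (threads == 1) {
        return 0.0;  // formula is undefined for p = 1; the sample CSV shows 0
    }
    const double invSpeedup = parallelMs / serialMs;  // 1/S
    const double invThreads = 1.0 / threads;          // 1/p
    return (invSpeedup - invThreads) / (1.0 - invThreads);
}
```

With the sample data for 4 threads and tile height 270, karpFlatt(6.365109, 4.866554, 4) is about 0.686, matching the Pipe Karp-Flatt column. For tile height 67, karpFlatt(6.365109, 1.58232, 4) is about -0.002, which means the tiled pipeline ran slightly more than 4 times faster than the reference.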

Technical Support

If you did not register your Intel® software product during installation, please do so now at the Intel® Software Development Products Registration Center. Registration entitles you to free technical support, product updates and upgrades for the duration of the support term.

For general information about Intel technical support, product updates, user forums, FAQs, tips and tricks, and other support questions, please visit http://www.intel.com/software/products/support.

Note
If your distributor provides technical support for this product, please contact them rather than Intel.

For technical information about the Intel® IPP library, including FAQs, tips and tricks, and other support information, please visit the Intel® IPP forum at http://software.intel.com/en-us/forums/intel-integrated-performance-primitives and browse the Intel® IPP support page: https://software.intel.com/en-us/intel-ipp-support/.