Volume 11, Issue 04

Multi-Core Software


Intel Technology Journal - Featuring Intel's recent research and development

ISSN 1535-864X DOI 10.1535/itj.1104.04

  • Volume 11
  • Issue 04
  • Published November 15, 2007

  Section 7 of 12  

Intel® Performance Libraries: Multi-Core-Ready Software for Numeric-Intensive Computation

INTEGRATED PERFORMANCE PRIMITIVES (IPP)

IPP is a multi-functional library highly optimized for Intel® architecture. It covers 15 functional domains, each identified by a suffix in the library file names. For example, names containing "ipps" denote signal processing functions (suffix "s"). This domain has more than 2,000 functions for processing 1D signals and data of different types: real and complex, signed and unsigned, floating point and integer. The other domains are image processing ("i"), JPEG primitives ("j"), audio coding ("ac"), color conversion ("cc"), string processing ("ch"), cryptography ("cp"), computer vision ("cv"), data compression ("dc"), small matrix operations ("m"), realistic rendering ("r"), speech coding ("sc"), speech recognition primitives ("sr"), video coding ("vc"), and vector math ("vm").

IPP is optimized for several Intel architectures: IA32, IA64, Intel® 64, and IXP. Within each architecture are optimizations for specific processors. For instance, within IA32 architecture there are specific optimizations for the Pentium® 4 and Intel® Core™2 Duo processors, among others.

IPP is optimized at three levels: algorithmic optimization, effective use of SIMD instructions (SSE2, SSE3), and parallelization at both the primitive and component levels. Primitive-level threading is threading implemented inside IPP functions. Not every IPP function is parallelized, because threading adds overhead. However, IPP is by design a set of building blocks, so application developers can easily thread their applications by calling the primitives on different threads.
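The idea of calling a single-threaded primitive on different threads can be sketched as follows. This is only an illustration under stated assumptions: `add_f32` is a hypothetical stand-in for a 1D primitive and is not an IPP function, and `threaded_add_f32` is a hypothetical helper that splits the data into contiguous chunks and runs the primitive on each chunk in its own thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a single-threaded 1D primitive (NOT an IPP call):
// adds two float arrays element by element.
void add_f32(const float* a, const float* b, float* dst, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        dst[i] = a[i] + b[i];
    }
}

// Hypothetical helper: splits the arrays into contiguous chunks and calls
// the primitive on each chunk from a separate thread.
void threaded_add_f32(const float* a, const float* b, float* dst,
                      std::size_t len, unsigned num_threads) {
    assert(len == 0 || (a != nullptr && b != nullptr && dst != nullptr));
    if (num_threads == 0) {
        num_threads = 1;
    }
    // Ceiling division so that every element is covered by some chunk.
    const std::size_t chunk = (len + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = static_cast<std::size_t>(t) * chunk;
        if (begin >= len) {
            break;  // fewer elements than threads: nothing left to assign
        }
        const std::size_t n = std::min(chunk, len - begin);
        workers.emplace_back(add_f32, a + begin, b + begin, dst + begin, n);
    }
    for (auto& w : workers) {
        w.join();  // wait until every chunk has been processed
    }
}
```

Because the chunks do not overlap, the threads need no synchronization beyond the final join. For short arrays the cost of creating threads can outweigh the gain, which is consistent with the point above that threading adds overhead.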

Component-level threading is threading provided in components such as video codecs (for example, the H264 encoder and decoder), the JPEG viewer, and the IPP implementations of the well-known data compression libraries ZLIB and GZIP. These components, along with others, ship with IPP as samples in source-code form.

An example of algorithm optimization is the median filter in the signal and image processing domains. Table 1 shows the results, in clocks per element, of the IPP-optimized median filter compared with the LEADtools library.
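For readers unfamiliar with the operation being measured, here is a minimal sketch of a straightforward 1D median filter. It is a baseline for illustration only, not the algorithm used by IPP or LEADtools. The function name `median_filter_1d` is hypothetical, and the border handling (replicating the edge samples) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Naive 1D median filter: each output sample is the median of an odd-sized
// window centered on the corresponding input sample. Borders are handled by
// replicating the first and last samples (an assumption of this sketch).
std::vector<int> median_filter_1d(const std::vector<int>& src, std::size_t window) {
    assert(window % 2 == 1);  // a centered window needs an odd size
    const std::size_t half = window / 2;
    std::vector<int> dst(src.size());
    std::vector<int> buf(window);

    for (std::size_t i = 0; i < src.size(); ++i) {
        // Gather the window around sample i, clamping indices to the array.
        for (std::size_t k = 0; k < window; ++k) {
            std::size_t j = 0;
            if (i + k >= half) {
                j = std::min(i + k - half, src.size() - 1);
            }
            buf[k] = src[j];
        }
        // Partially sort so that the median lands in the middle position.
        auto mid = buf.begin() + static_cast<std::ptrdiff_t>(half);
        std::nth_element(buf.begin(), mid, buf.end());
        dst[i] = *mid;
    }
    return dst;
}
```

This version rebuilds and partially sorts the whole window for every sample, so its cost per element grows with the window size. That per-element cost is what a clocks-per-element comparison measures.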



Table 1: IPP compared to LEADtools

CPU-specific optimization with SIMD instruction sets, which is done for many IPP functions, also yields a performance gain. The gain can be measured as the ratio of the performance of the generic C version of the library to that of a CPU-specific version, such as the one optimized for the Intel Core 2 Duo processor. Table 2 illustrates the performance advantage of multi-core threading on MPEG4 decoding.



Table 2: Speedup on threaded MPEG4