- Home›
- Technology and Research›
- Intel Technology Journal›
- Multi-Core Software
Multi-Core Software
Intel® Performance Libraries: Multi-Core-Ready Software for Numeric-Intensive Computation
INTEGRATED PERFORMANCE PRIMITIVES (IPP)
IPP is a multi-functional library highly optimized for Intel® architecture. IPP covers 15 functional domains that can be recognized by a suffix in the library file names. For example, functions with IPPs in their names are signal processing functions, note suffix "s." There are more than two-thousand functions processing 1D signals/data of different data types: real and complex, signed and unsigned, floating point, and integer. The other libraries in IPP are image processing "i," JPEG primitives "j," audio coding "ac," color conversion "cc," string processing "ch," cryptography "cp," computer vision "cv," data compression "dc," small matrix operations "m," realistic rendering "r," speech coding "sc," speech recognition primitives "sr," video coding "vc," vector math "vm."
IPP is optimized for several Intel architectures: IA32, IA64, Intel® 64, and IXP. Within each architecture are optimizations for specific processors. For instance, within IA32 architecture there are specific optimizations for the Pentium® 4 and Intel® Core™2 Duo processors, among others.
IPP is optimized at three levels: algorithmic, effective use of SIMD instructions (SSE2, SSE3), and parallelization at both the primitive and component levels. Primitive-level threading is the threading implemented in IPP functions. Not every function in IPP is parallelized because of the overhead added by threading. However, the good news here is that IPP is by design a set of build blocks and applications that developers can easily use to thread their application by calling the primitives on different threads.
Component-level threading is threading provided in such components as video codecs, the H264 encoder and decoder; the jpeg viewer, and the IPP implementation of well-known data compression libraries, ZLIB and GZIP. These components, as well as others, are shipped with IPP as IPP samples given in their source codes.
An example of algorithm optimization is the median filter in the Signal and Image processing domains. Table 1, for instance, illustrates the results, in clocks-per-element, of IPP optimization of the median filter compared with the LEADtools library.

Table 1: IPP compared to LEADtools
click image for larger view
CPU optimization with the SIMD instruction set, which is done for many functions in IPP, also gives a performance gain that can be measured by comparing the performance ratio numbers of the C version of the library to the CPU specific library, such as optimizing for the Intel Core 2 Duo processor. Table 2 illustrates the performance advantage of multi-core threading on MPEG4 decoding.

Table 2: Speedup on threaded MPEG4
click image for larger view
