Multi-Core Software
Parallel Software Development with Intel® Threading Analysis Tools
INTRODUCTION
As multi-core processors become mainstream in the marketplace, software must be parallel to take advantage of multiple cores. However, no general framework is available that lets different applications implement parallelism and achieve the highest performance gain. Parallel programming, generally implemented with multi-threading, is notoriously difficult for developers to design, implement, and debug. To make life easier for developers, Intel provides a set of threading tools that target various phases of the development cycle. A generic development cycle can be divided into four phases [1]:
- Analysis phase: Profiling the serial version of the program to determine the areas that are suitable for parallel decomposition.
- Design/implementation phase: Examining identified threading candidates, determining the changes that have to be made to the serial version, and converting them to the actual code.
- Debug phase: Ensuring the correctness of the program by detecting and resolving common threading errors such as data races and deadlocks.
- Testing/tuning phase: Validating the correctness of the program and testing its performance. Detecting performance issues and fixing them by improved design or by eliminating bottlenecks.
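The data race named under the debug phase can be illustrated with a minimal sketch. It uses standard C++ threads (`std::thread`, `std::atomic`) rather than the Win32 API or Intel® TBB, and the function name `count_with_atomic` is hypothetical, introduced only for this illustration. Several threads increment one shared counter. With a plain `long` the increments could interleave and be lost; an atomic counter is one common way to remove the race.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: num_threads threads each add 1 to a shared
// counter `iterations` times, and the final value is returned.
long count_with_atomic(int num_threads, int iterations) {
    // A plain `long counter = 0;` updated with `++counter` from several
    // threads would be a data race: each increment is a read-modify-write
    // sequence, so concurrent updates can overwrite one another.
    std::atomic<long> counter{0};

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                // fetch_add performs the read-modify-write as one
                // indivisible step, so no increment is lost.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();  // wait for every worker before reading the result
    }
    return counter.load();
}
```

Called as `count_with_atomic(4, 10000)`, the function returns exactly 40000 on every run, whereas the racy variant described in the comment has no guaranteed result.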
Intel's threading tools assist developers from performance analysis through implementation and debugging:
- Intel® VTune™ Performance Analyzer [4]. This tool helps developers tune an application to perform better on Intel® architectures. It locates performance bottlenecks and program hotspots by collecting, sampling, and displaying system-wide data down to specific functions, modules, or instructions. It is usually used during the analysis and tuning phases of the development cycle.
- Intel® Thread Profiler [5]. This tool helps to identify performance bottlenecks in Win32* and OpenMP* threaded software. It detects threading performance issues such as thread overhead and synchronization cost. The profiler is usually used in the tuning phase.
- Intel® Thread Checker [6]. This tool helps to find bugs in Win32 and OpenMP threaded software. It locates threading issues such as race conditions, thread stalls, and potential thread deadlocks. The Intel Thread Checker is usually used during the design and debugging phases.
- Intel® Threading Building Blocks (Intel® TBB) [7]. This is a threading abstraction library that provides high-level, generic implementations of parallel patterns and concurrent data structures [2]. Intel® TBB is usually used in the design, implementation, and tuning phases.
In the sections that follow, we first introduce the principles of parallel application design; then we show how to parallelize an application with the help of threading tools during each phase of the development cycle. A multiple pattern matching algorithm is used as an example. We use the Win32 threading API and Intel® TBB to implement the parallelism, and we compare the performance of the two.
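As a rough illustration of the kind of parallel decomposition involved, the sketch below splits a text into contiguous chunks and lets each thread count pattern occurrences in its own chunk. It rests on several assumptions: it uses a naive search via `std::search`, not the multiple pattern matching algorithm used in this article; it uses `std::thread` rather than the Win32 API or Intel® TBB; and the names `count_in_range` and `parallel_match_count` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Count matches of every pattern that START in [begin, end) of text.
// The search window extends past `end` by (pattern length - 1) so that
// matches straddling a chunk boundary are counted exactly once.
static std::size_t count_in_range(const std::string& text,
                                  const std::vector<std::string>& patterns,
                                  std::size_t begin, std::size_t end) {
    std::size_t hits = 0;
    for (const std::string& p : patterns) {
        if (p.empty()) continue;
        auto first = text.begin() + begin;
        auto last = text.begin() + std::min(text.size(), end + p.size() - 1);
        for (auto it = std::search(first, last, p.begin(), p.end());
             it != last;
             it = std::search(it + 1, last, p.begin(), p.end())) {
            ++hits;  // overlapping matches are counted too
        }
    }
    return hits;
}

// Hypothetical driver: data decomposition of the text over num_threads.
std::size_t parallel_match_count(const std::string& text,
                                 const std::vector<std::string>& patterns,
                                 unsigned num_threads) {
    num_threads = std::max(1u, num_threads);
    const std::size_t chunk = (text.size() + num_threads - 1) / num_threads;

    // One result slot per thread: no shared counter, so no data race.
    std::vector<std::size_t> partial(num_threads, 0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(text.size(), t * chunk);
        const std::size_t end = std::min(text.size(), begin + chunk);
        workers.emplace_back([&, t, begin, end] {
            partial[t] = count_in_range(text, patterns, begin, end);
        });
    }
    for (std::thread& w : workers) w.join();

    std::size_t total = 0;
    for (std::size_t v : partial) total += v;  // serial reduction
    return total;
}
```

Giving each thread its own result slot and summing after the joins avoids synchronization inside the search loop. Adjacent slots may still share a cache line, which is the kind of effect a tuning phase would look for.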
