Technology & Research

Intel® Technology Journal Home

Volume 11, Issue 04

Multi-Core Software


Intel Technology Journal - Featuring Intel's recent research and development

ISSN 1535-864X DOI 10.1535/itj.1104.03

  • Volume 11
  • Issue 04
  • Published November 15, 2007

Multi-Core Software

  Section 3 of 12  

Parallel Software Development with Intel® Threading Analysis Tools

PRINCIPLES OF PARALLEL APPLICATION DESIGN

Decomposition Techniques

Dividing a computation into smaller computations and assigning these to different processors for execution are two key steps in parallel design. Two of the most common decomposition techniques [3] are functional decomposition and data decomposition [2].

  • Functional decomposition is used to introduce concurrency in the problems that can be solved by different independent tasks. All these tasks can run concurrently.
  • Data decomposition works best on an application that has a large data structure. By partitioning the data on which the computations are performed, a task is decomposed into smaller tasks to perform computations on each data partition. The tasks performed on the data partitions are usually similar. There are different ways to perform data partitioning: partitioning input/output data or partitioning intermediate data.

Parallel Models

These are some of the commonly used parallel models [3].

  • Data parallel model. This is one of the simplest parallel models. In this model, the same or similar computations are performed on different data repeatedly. Image processing algorithms that apply a filter to each pixel are a common example of data parallelism. OpenMP* is an API that is based on compiler directives that can express a data parallel model.
  • Task parallel model. In this model, independent works are encapsulated in functions to be mapped to individual threads, which execute asynchronously. Thread libraries (e.g., the Win32 thread API or POSIX* threads) are designed to express task-level concurrency.
  • Hybrid models. Sometimes, more than one model may be applied to solve one problem, resulting in a hybrid algorithm model. A database is a good example of hybrid models. Tasks like inserting records, sorting, or indexing can be expressed in a task-parallel model, while a database query uses the data-parallel model to perform the same operation on different data.

Amdahl's Law

Amdahl's law provides the theoretical basis to assess the potential benefits of converting a serial application to a parallel one. It predicts the speedup limit of parallelizing an application:

Tparallel = {( 1 – P ) + P / N} * Tserial + ON
Tserial: time to run an application in serial version
P: parallel portion of the process
N: number of processors
ON: parallel overhead in using N threads

We can predict the speedup by looking at the scalability:

Scalability = Tserial / Tparallel

We can get the theoretical limit of scalability, assuming there is no parallel overhead:
Scalability = 1/ {(1– P) + P/N}
When N -> ∞, Scalability -> 1/ (1– P)

  Section 3 of 12  

Back to Top

In this article

Download a PDF of this article.