Volume 11, Issue 04

Multi-Core Software


Intel Technology Journal - Featuring Intel's recent research and development

ISSN 1535-864X DOI 10.1535/itj.1104.03

  • Volume 11
  • Issue 04
  • Published November 15, 2007

  Section 8 of 12  

Parallel Software Development with Intel® Threading Analysis Tools

RESULTS

Figure 20 shows the performance of the serial, Win32*, and Intel® TBB implementations of the AC_BM search algorithm when the test file is 32 MB. Both the Win32 and Intel TBB versions outperform the serial version.



Figure 20: Performance data for different implementations

To evaluate the performance gain from parallelizing with Win32 and Intel TBB, we tested the two implementations with different file sizes and calculated the corresponding scalability. Figure 21 shows that both Win32 and Intel TBB attain better scalability as the file size grows, and that Intel TBB outperforms Win32 at every file size tested. When the file size is 256 MB, Scalability_tbb = 1.714 and Scalability_win32 = 1.580. The average Scalability_tbb is 1.655, and the average Scalability_win32 is 1.549. Given that the ideal scalability is 1.867, Intel TBB demonstrates good scaling performance on a dual-core system. Since the AC_BM algorithm can be used in the IDS detection engine, which is the core module of an IDS, parallelizing the algorithm can significantly improve the performance of the IDS.
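To put the reported figures in proportion, the short sketch below expresses each measured scalability as a fraction of the stated ideal of 1.867 and computes the relative advantage of one implementation over the other. The function names are hypothetical helpers introduced only for this illustration; the numbers in the usage note are the ones quoted above.

```cpp
#include <cassert>

// Hypothetical helper: fraction of the ideal scalability that a measured
// value reaches. Both arguments are dimensionless scalability figures.
double fraction_of_ideal(double measured, double ideal) {
    assert(ideal > 0.0);
    return measured / ideal;
}

// Hypothetical helper: relative advantage of figure a over figure b,
// for example 0.08 means a is 8% higher than b.
double relative_gain(double a, double b) {
    assert(b > 0.0);
    return a / b - 1.0;
}
```

With the 256 MB figures, fraction_of_ideal(1.714, 1.867) is roughly 0.92 for Intel TBB and fraction_of_ideal(1.580, 1.867) is roughly 0.85 for Win32, while relative_gain(1.714, 1.580) is roughly 0.085, that is, Intel TBB scales about 8% better than Win32 at that file size.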



Figure 21: Scalability comparison between Win32 and Intel TBB

From the test results, we can see that Intel TBB performs better than Win32 for the AC_BM algorithm. The reason is that with Intel TBB we specify tasks instead of threads. Tasks are assigned to threads dynamically: the Intel TBB task scheduler selects a thread for each task, so a thread that runs faster is assigned more tasks. With the Win32 threading library, in contrast, each thread is bound to a fixed task and cannot be reassigned to other work even when it is idle.
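The difference between fixed and dynamic assignment can be sketched with the C++ standard library alone. The code below is not the Intel TBB scheduler and not the authors' Win32 implementation; it is a simplified illustration in which a shared atomic counter stands in for dynamic task assignment, and process_chunk is a hypothetical stand-in for searching one piece of the file (here it only counts one byte value).

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for "search one chunk": counts a byte value in [begin, end).
std::size_t process_chunk(const std::vector<unsigned char>& data,
                          std::size_t begin, std::size_t end,
                          unsigned char needle) {
    std::size_t hits = 0;
    for (std::size_t i = begin; i < end; ++i)
        if (data[i] == needle) ++hits;
    return hits;
}

// Fixed assignment: each thread gets one contiguous range decided up front.
// A thread that finishes early simply waits for the others.
std::size_t count_static(const std::vector<unsigned char>& data,
                         unsigned char needle, unsigned num_workers) {
    assert(num_workers > 0);
    std::vector<std::size_t> partial(num_workers, 0);
    std::vector<std::thread> threads;
    const std::size_t n = data.size();
    for (unsigned id = 0; id < num_workers; ++id) {
        const std::size_t b = n * id / num_workers;
        const std::size_t e = n * (id + 1) / num_workers;
        threads.emplace_back([&, id, b, e] {
            partial[id] = process_chunk(data, b, e, needle);
        });
    }
    for (auto& t : threads) t.join();
    return std::accumulate(partial.begin(), partial.end(), std::size_t{0});
}

// Dynamic assignment: workers repeatedly claim the next unprocessed chunk,
// so a worker that runs faster ends up processing more chunks.
// If per_worker_chunks is non-null, it receives the chunk count per worker.
std::size_t count_dynamic(const std::vector<unsigned char>& data,
                          unsigned char needle, std::size_t chunk_size,
                          unsigned num_workers,
                          std::vector<std::size_t>* per_worker_chunks = nullptr) {
    assert(chunk_size > 0 && num_workers > 0);
    const std::size_t num_chunks = (data.size() + chunk_size - 1) / chunk_size;
    std::atomic<std::size_t> next_chunk{0};
    std::vector<std::size_t> partial(num_workers, 0);
    std::vector<std::size_t> claimed(num_workers, 0);

    auto worker = [&](unsigned id) {
        std::size_t local_hits = 0, local_chunks = 0;
        for (;;) {
            const std::size_t c = next_chunk.fetch_add(1);  // claim a chunk
            if (c >= num_chunks) break;                     // nothing left
            const std::size_t b = c * chunk_size;
            const std::size_t e = std::min(b + chunk_size, data.size());
            local_hits += process_chunk(data, b, e, needle);
            ++local_chunks;
        }
        partial[id] = local_hits;    // each worker writes only its own slot
        claimed[id] = local_chunks;
    };

    std::vector<std::thread> threads;
    for (unsigned id = 0; id < num_workers; ++id) threads.emplace_back(worker, id);
    for (auto& t : threads) t.join();
    if (per_worker_chunks) *per_worker_chunks = claimed;
    return std::accumulate(partial.begin(), partial.end(), std::size_t{0});
}
```

Both functions return the same total. The per-worker chunk counts reported by count_dynamic vary from run to run, which is the point: the split of work follows whichever worker is free, rather than being fixed in advance.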

From a developer's point of view, Intel TBB is also easier to use. Intel TBB dynamically decides the most suitable number of threads according to the platform and workload, so developers do not need to work out the task granularity themselves. Further, when the application is migrated to another platform, Intel TBB automatically selects the number of threads and assigns the tasks to threads to get the best performance. Another advantage is that developers do not need to handle threading issues such as data races and improper synchronization themselves when they use the templates provided by Intel TBB, such as parallel_reduce and parallel_while.
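One reason a reduction template can spare the developer from data races is its structure: each task accumulates into a private copy of the body, and the copies are merged only after the tasks finish. The sketch below imitates that split/accumulate/join shape using only the C++ standard library. It is loosely modeled on the body concept used with parallel_reduce, but SplitTag, CountMatches, and reduce_range are hypothetical names for this illustration, not Intel TBB APIs, and the driver starts one thread per split instead of scheduling tasks onto a pool of threads.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>

struct SplitTag {};  // hypothetical marker for the "splitting" constructor

// Body with private state: counts occurrences of one character.
struct CountMatches {
    const std::string& text;
    char needle;
    std::size_t count = 0;

    CountMatches(const std::string& t, char n) : text(t), needle(n) {}

    // Splitting constructor: same inputs, fresh (empty) accumulator.
    CountMatches(const CountMatches& other, SplitTag)
        : text(other.text), needle(other.needle) {}

    // Process the half-open range [begin, end) into the private counter.
    void operator()(std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i)
            if (text[i] == needle) ++count;
    }

    // Merge the result of another body into this one.
    void join(const CountMatches& rhs) { count += rhs.count; }
};

// Recursive driver: split the range until it is no larger than grain,
// run the two halves concurrently on separate bodies, then join them.
template <typename Body>
void reduce_range(std::size_t begin, std::size_t end,
                  std::size_t grain, Body& body) {
    assert(grain > 0 && begin <= end);
    if (end - begin <= grain) {
        body(begin, end);
        return;
    }
    const std::size_t mid = begin + (end - begin) / 2;
    Body right(body, SplitTag{});  // private state for the right half
    auto fut = std::async(std::launch::async, [&] {
        reduce_range(mid, end, grain, right);
    });
    reduce_range(begin, mid, grain, body);  // left half on this thread
    fut.get();                              // wait before reading right
    body.join(right);                       // no shared writes until here
}
```

A caller constructs one CountMatches for the text, calls reduce_range over the index range 0 to text.size() with a chosen grain size, and reads the count member afterward. No lock or atomic is needed because no two tasks ever write to the same counter.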
