Multi-Core Software
Process Scheduling Challenges in the Era of Multi-Core Processors
MULTI-CORE TOPOLOGIES
In most multi-core implementations, the cores in a physical package share some resources, both to make the best use of those resources and to make inter-core communication efficient. For example, the Intel® Core™2 Duo processor has two CPU cores sharing the Level 2 (L2) cache (Intel® Advanced Smart Cache), as shown in Figure 2. The Intel® Core™2 Quad processor has four cores in a physical package with two last-level (L2) caches, each shared by two cores. Going forward, as more and more logic gets integrated into the processor package, more resources will be shared between the cores on the die.
If only one of the cores in the package is active, a thread running on that core gets to use all the shared resources, resulting in maximum resource utilization and peak performance for that single thread. If multiple threads or processes run on different cores of the same physical package and share data that fits in the cache, the last-level cache shared between those cores minimizes data duplication. This sharing therefore results in more efficient inter-thread communication.
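As an illustration, the quad-core layout described above can be modeled as a simple map from cores to caches. This is a minimal sketch, not a description of how any operating system discovers topology: the core numbering, the cache labels, and the helper names `shares_last_level_cache` and `cache_siblings` are all assumptions made for this example.

```python
# Hypothetical model of a four-core package with two L2 caches,
# each shared by a pair of cores. Which core IDs are paired is an
# assumption; real hardware enumeration may differ.
L2_CACHE_OF_CORE = {0: "L2-A", 1: "L2-A", 2: "L2-B", 3: "L2-B"}


def shares_last_level_cache(core_a, core_b, cache_of_core=L2_CACHE_OF_CORE):
    """Return True if both cores sit behind the same last-level cache."""
    return cache_of_core[core_a] == cache_of_core[core_b]


def cache_siblings(core, cache_of_core=L2_CACHE_OF_CORE):
    """Return the other cores that share this core's last-level cache."""
    own_cache = cache_of_core[core]
    return sorted(
        other
        for other, cache in cache_of_core.items()
        if cache == own_cache and other != core
    )


if __name__ == "__main__":
    print(shares_last_level_cache(0, 1))  # cores paired on one L2 in this model
    print(shares_last_level_cache(1, 2))  # cores on different L2 caches
    print(cache_siblings(2))
```

In this model, two threads that share data would avoid duplicating it across caches only when they are placed on cores for which `shares_last_level_cache` returns true.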
Multi-Core Power Management
In typical multi-core configurations, all cores in one physical package reside in the same power domain (voltage and frequency). As a result, the processor performance state (P-state) transitions for all the cores need to happen at the same time. If one core is busy running a task at P0, this coordination ensures that the other cores in that package cannot enter lower-power P-states, so the complete package stays in the highest-power P0 state for optimal performance.
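The coordination rule can be sketched as taking the most demanding request among the cores. The function below is a toy model under one assumption that goes beyond the text: the text only describes the P0 case, and the sketch generalizes it so that the package always runs at the numerically lowest (highest-performance) P-state any core requests. The name `package_p_state` is hypothetical.

```python
def package_p_state(requested_p_states):
    """Return the P-state index the whole package runs at.

    requested_p_states holds one integer per core, where 0 means P0
    (highest performance, highest power) and larger numbers mean
    lower-power states. Assumption: the busiest core wins, so the
    package uses the smallest requested index.
    """
    if not requested_p_states:
        raise ValueError("at least one core request is required")
    return min(requested_p_states)


if __name__ == "__main__":
    # One core wants P0, the other would accept P3: the package stays at P0.
    print(package_p_state([0, 3]))
```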

Figure 2: Dual-core package with shared resources
Since each execution core operates independently, each core block can independently enter a processor power state (C-state). For example, one core can enter the lower-power C1 or C2 state while the other executes code in the active power state C0. The common block always resides in the numerically lowest (highest-power) C-state of all the cores. For example, if one core is in C2 and another core is in C0, the shared block resides in C0.
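The rule for the shared block can be written down directly. The following is a minimal sketch of that rule only; the function name `shared_block_c_state` and the use of plain integers for C-states (0 for C0, 2 for C2, and so on) are choices made for this example.

```python
def shared_block_c_state(core_c_states):
    """Return the C-state of the shared block in a package.

    core_c_states holds one integer per core (0 = C0, active).
    The shared block resides in the numerically lowest, that is
    highest-power, C-state among the cores.
    """
    if not core_c_states:
        raise ValueError("at least one core C-state is required")
    return min(core_c_states)


if __name__ == "__main__":
    # One core in C2, the other in C0: the shared block stays in C0.
    print(shared_block_c_state([2, 0]))
    # Both cores idle in C1 and C2: the shared block can drop to C1.
    print(shared_block_c_state([1, 2]))
```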
Intel Dynamic Acceleration Technology
Intel® Dynamic Acceleration Technology [7], available in the current Intel Core 2 processor family, increases the performance of single-threaded applications. If one core is in a deep C-state, some of the power normally available to that idle core can be applied to the active core while still staying within the thermal design power specification for the processor. This increases the speed at which a single-threaded application executes, thereby improving its performance.
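The power-budget reasoning can be illustrated with a deliberately simplified two-core model. This is not how the hardware decides anything: the function name `active_core_budget` and the wattage figures in the usage lines are invented for illustration, and the model ignores the power drawn by the shared block.

```python
def active_core_budget(tdp_watts, other_core_draw_watts):
    """Return the power the active core may draw in a two-core package.

    Toy model: the package must stay within its thermal design power
    (TDP), so whatever the other core does not draw is available to
    the active core. All values are illustrative, not measured.
    """
    budget = tdp_watts - other_core_draw_watts
    if budget < 0:
        raise ValueError("other core alone exceeds the package TDP")
    return budget


if __name__ == "__main__":
    # Hypothetical numbers: the other core busy versus in a deep C-state.
    print(active_core_budget(35.0, 15.0))
    print(active_core_budget(35.0, 1.0))
```

The larger budget in the second case is the headroom that, in this model, could be spent on running the active core faster.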
