Control Thread Allocation
Control Thread Distribution
- Core type, specified as intel_core or intel_atom.
- Core efficiency, specified as effnum, where num is a non-negative integer from zero to the number of core efficiencies detected minus one. The larger the efficiency, the more performant the core. For example, KMP_HW_SUBSET=4c:eff0,5c:eff1 selects all sockets, four cores of efficiency 0, five cores of efficiency 1, and all threads on those cores.
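To illustrate the efficiency syntax, the following Python sketch composes a KMP_HW_SUBSET value from (core count, efficiency) pairs. The helper name hw_subset_by_efficiency is hypothetical. It only formats the string and does not check which core efficiencies the OpenMP runtime actually detects on a given machine.

```python
def hw_subset_by_efficiency(core_counts):
    """Format (num_cores, efficiency) pairs as a KMP_HW_SUBSET value.

    Hypothetical helper for illustration; it only builds the string.
    """
    parts = []
    for num_cores, efficiency in core_counts:
        if num_cores < 1 or efficiency < 0:
            raise ValueError("core count must be positive, efficiency non-negative")
        parts.append(f"{num_cores}c:eff{efficiency}")
    return ",".join(parts)


# Four cores of efficiency 0 and five cores of efficiency 1.
value = hw_subset_by_efficiency([(4, 0), (5, 1)])
print(f"KMP_HW_SUBSET={value}")
```

The printed line reproduces the example setting, KMP_HW_SUBSET=4c:eff0,5c:eff1.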
To Assign This Number of Threads ...
... Use This Setting
Control Thread Bindings
- compact: Distribute the threads sequentially among the cores.
- scatter: Distribute the threads among the cores in a round-robin manner. Each core first receives one thread, and the distribution then repeats across the cores until all threads are placed.
| Setting | OpenMP Threads on Core 0 | OpenMP Threads on Core 1 |
| --- | --- | --- |
| compact | 0, 1, 2 | 3, 4, 5 |
| scatter | 0, 2, 4 | 1, 3, 5 |
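The two distributions can be reproduced with a short Python sketch. This is a simplified model, not the runtime's placement algorithm: the function names are illustrative, and the compact version assumes the threads are split as evenly as possible across the cores, ignoring the hardware topology the runtime actually consults.

```python
def compact(num_threads, num_cores):
    """Fill each core with a consecutive block of thread IDs."""
    per_core, extra = divmod(num_threads, num_cores)
    placement, next_id = [], 0
    for core in range(num_cores):
        # Assumption: any leftover threads go to the lowest-numbered cores.
        count = per_core + (1 if core < extra else 0)
        placement.append(list(range(next_id, next_id + count)))
        next_id += count
    return placement


def scatter(num_threads, num_cores):
    """Deal thread IDs to the cores one at a time, round-robin."""
    placement = [[] for _ in range(num_cores)]
    for thread_id in range(num_threads):
        placement[thread_id % num_cores].append(thread_id)
    return placement


# Six OpenMP threads on two cores.
print("compact:", compact(6, 2))
print("scatter:", scatter(6, 2))
```

For six threads on two cores, compact yields [0, 1, 2] and [3, 4, 5], and scatter yields [0, 2, 4] and [1, 3, 5].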
Determine the Best Setting
- Ensure that your OpenMP code is working properly before using these environment variables.
- Establish a baseline with your current OpenMP code so that you can compare it against the performance you get after allocating the threads to processors.
- Measure the performance of distributing one, two, three, or four threads per core by using the KMP_HW_SUBSET variable.
- Measure the performance of binding the threads to the cores by using the KMP_AFFINITY variable.
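The measurement steps can be scripted. The sketch below is one possible harness, not a prescribed procedure: time_command and candidate_settings are hypothetical names, the command is a placeholder to replace with your OpenMP executable, and the value form 1t through 4t (a count followed by t for threads per core) is an assumption to confirm against your runtime's documentation.

```python
import os
import subprocess
import sys
import time


def time_command(command, extra_env):
    """Run command once with extra_env added; return elapsed seconds."""
    env = dict(os.environ)
    env.update(extra_env)
    start = time.perf_counter()
    subprocess.run(command, env=env, check=True)
    return time.perf_counter() - start


def candidate_settings():
    """Baseline first, then the settings named in the steps above."""
    settings = [{}]  # baseline: no extra variables
    for threads_per_core in (1, 2, 3, 4):
        # Assumed value form: "<n>t" means n threads per core.
        settings.append({"KMP_HW_SUBSET": f"{threads_per_core}t"})
    for binding in ("compact", "scatter"):
        settings.append({"KMP_AFFINITY": binding})
    return settings


# Placeholder workload; replace with the path to your OpenMP executable.
command = [sys.executable, "-c", "pass"]

for extra_env in candidate_settings():
    elapsed = time_command(command, extra_env)
    print(f"{extra_env or 'baseline'}: {elapsed:.3f} s")
```

A single run per setting is noisy. Repeating each measurement several times and comparing medians gives a more reliable picture.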