OpenMP* Memory Spaces and Allocators
OpenMP* provides memory, known as memory spaces, for the storage and retrieval of variables. Different memory spaces have different traits. How a variable is to be used and accessed determines which memory space is appropriate for allocating it.
Each memory space has a unique allocator that is used to allocate and deallocate memory in that space. The allocators allocate variables in contiguous space that does not overlap any other allocation in the memory space. Multiple memory spaces with different traits may map to a single memory resource.
The behavior of the allocator is affected by the allocator traits that you specify. The allocator traits, their possible values, and their default values are shown in the following table:
Allocator Trait | Values That Can Be Specified | Default Value |
---|---|---|
access | all, cgroup, pteam, thread | all |
alignment | A positive integer value that is a power of 2, specifying a number of bytes | 1 byte |
fallback | abort_fb, allocator_fb, default_mem_fb, null_fb | default_mem_fb |
fb_data | An allocator handle | None |
partition | blocked, environment, interleaved, nearest | environment |
pinned | true, false | false |
pool_size | A positive integer value | Implementation defined |
sync_hint | contended, uncontended, private, serialized | contended |
The access trait specifies the accessibility of the allocated memory. The following are values you can specify for access:
- all: This value indicates that the allocated memory must be accessible by all threads in the device where the memory allocation occurs. This is the default setting.
- cgroup: This value indicates that the allocated memory must be accessible by all threads of the same contention group as the thread that requested the allocation. Accessing the allocated memory from a thread that is not part of the same contention group results in undefined behavior.
- pteam: This value indicates that the allocated memory is accessible by all threads that bind to the same parallel region as the thread that requested the allocation. Access to the memory by a thread that does not bind to the same parallel region as the thread that allocated the memory results in undefined behavior.
- thread: This value indicates that the allocated memory is accessible only by the thread that allocated it. Attempts to access the memory from another thread result in undefined behavior.
The alignment trait specifies how allocated variables will be aligned. Variables will be byte-aligned to at least the value specified for this trait. The default setting is 1 byte. Alignment can also be affected by directives and OpenMP runtime allocator routines that specify alignment requirements.
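The alignment rules above can be modeled in a few lines. The following is a minimal sketch in plain C11 that does not call the OpenMP runtime. All helper names (`is_valid_alignment`, `effective_alignment`, `is_aligned_to`, `model_aligned_alloc`) are hypothetical and exist only for illustration. It assumes that when a routine or directive also requests an alignment, the larger of the two values is the one that must be honored, and it uses the C11 `aligned_alloc` function as a stand-in for an aligned allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* A valid alignment trait value is a positive power of 2 (in bytes). */
int is_valid_alignment(size_t a) {
    return a != 0 && (a & (a - 1)) == 0;
}

/* Model assumption: the stricter (larger) of the allocator's trait value
   and a per-request alignment is the one that must be satisfied. */
size_t effective_alignment(size_t trait_align, size_t requested_align) {
    assert(is_valid_alignment(trait_align));
    assert(is_valid_alignment(requested_align));
    return trait_align > requested_align ? trait_align : requested_align;
}

/* True if p is aligned to at least `a` bytes (a must be a power of 2). */
int is_aligned_to(const void *p, size_t a) {
    return ((uintptr_t)p & (a - 1)) == 0;
}

/* Stand-in for an aligned allocation. C11 aligned_alloc expects the size
   to be a multiple of the alignment, so the size is rounded up first. */
void *model_aligned_alloc(size_t align, size_t size) {
    size_t rounded = (size + align - 1) / align * align;
    return aligned_alloc(align, rounded);
}
```

For example, an allocator whose alignment trait is left at the default of 1 byte, combined with a request for 64-byte alignment, would be modeled as `effective_alignment(1, 64)`, and the returned pointer can be checked with `is_aligned_to(p, 64)`.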
The fallback trait indicates how an allocator behaves if it is unable to satisfy an allocation request. The following are values you can specify for fallback:
- abort_fb: This value indicates that the program terminates if the allocation request fails.
- allocator_fb: If this value is specified and the allocation request fails, the allocation is retried with the allocator specified by the fb_data trait.
- default_mem_fb: This value indicates that a failed allocation request is retried in the omp_default_mem_space memory space. The retry behaves as if it used an allocator whose traits all have their default values, except that the fallback trait is set to null_fb. This is the default setting.
- null_fb: This value indicates that the allocator returns the value zero (a null pointer) when an allocation request fails.
The fb_data trait lets you specify a fallback allocator to be used if the requested allocator fails to satisfy the allocation request. The fallback trait of the failing allocator must be set to allocator_fb in order for the allocator specified by the fb_data trait to be used.
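The way the fallback and fb_data traits interact can be traced with a small model. The sketch below is plain C and does not use the OpenMP runtime. The types and functions (`model_fallback`, `model_allocator`, `model_default_mem`, `model_serve`) are hypothetical, and each allocator is reduced to a byte budget so that failures are easy to provoke. Returning NULL when allocator_fb is set without an fb_data allocator is a choice made for this model, not behavior described in the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mirror of the four fallback trait values. */
typedef enum { FB_DEFAULT_MEM, FB_NULL, FB_ABORT, FB_ALLOCATOR } model_fallback;

typedef struct model_allocator {
    const char *name;
    size_t free_bytes;               /* bytes this allocator can still serve */
    model_fallback fallback;         /* what to do when a request fails */
    struct model_allocator *fb_data; /* consulted only for FB_ALLOCATOR */
} model_allocator;

/* Stand-in for a retry in omp_default_mem_space: default traits,
   except that the fallback is null_fb. */
model_allocator model_default_mem = { "default_mem", SIZE_MAX, FB_NULL, NULL };

/* Returns the allocator that ends up serving the request,
   or NULL if the request fails everywhere along the chain. */
model_allocator *model_serve(model_allocator *a, size_t size) {
    if (size <= a->free_bytes) {
        a->free_bytes -= size;
        return a;
    }
    switch (a->fallback) {
    case FB_ALLOCATOR:
        /* Model choice: a missing fb_data allocator is treated as failure. */
        return a->fb_data ? model_serve(a->fb_data, size) : NULL;
    case FB_DEFAULT_MEM:
        return model_serve(&model_default_mem, size);
    case FB_ABORT:
        abort();                     /* the program terminates */
    case FB_NULL:
    default:
        return NULL;
    }
}
```

In this model, a request that does not fit in an allocator with `FB_ALLOCATOR` is passed to its `fb_data` allocator, and that allocator's own fallback value decides what happens if it fails as well.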
The partition trait describes the partitioning of allocated memory over the storage resources represented by the memory space of the allocator. The following are values you can specify for partition:
- blocked: This value indicates that the allocated memory is partitioned into blocks of approximately equal size, with one block per storage resource.
- environment: This value indicates that the placement of the allocated memory is determined by the runtime execution environment. This is the default setting.
- interleaved: This value indicates that the allocated memory is distributed in a round-robin fashion across the storage resources.
- nearest: This value indicates that the allocated memory is placed in the storage resource nearest to the thread that requested the allocation.
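The difference between blocked and interleaved placement can be illustrated by mapping pieces of an allocation to storage resources. The following plain C sketch is only a model: the function names are hypothetical, and the idea of dividing an allocation into equal-sized chunks is an assumption, since the text above does not state the granularity at which memory is distributed.

```c
#include <assert.h>
#include <stddef.h>

/* interleaved: chunks are dealt out round-robin, so chunk i goes to
   resource i mod n_resources. */
size_t interleaved_resource(size_t chunk, size_t n_resources) {
    assert(n_resources > 0);
    return chunk % n_resources;
}

/* blocked: the allocation is cut into n_resources contiguous blocks of
   approximately equal size, one block per resource. */
size_t blocked_resource(size_t chunk, size_t n_chunks, size_t n_resources) {
    assert(n_resources > 0 && chunk < n_chunks);
    return chunk * n_resources / n_chunks;
}
```

With 10 chunks and 4 resources, the blocked mapping yields contiguous blocks of 3, 2, 3, and 2 chunks, while the interleaved mapping cycles through resources 0, 1, 2, 3, 0, 1, and so on.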
If the pinned trait has the value true, the allocator ensures that each allocation it makes remains in the same storage resource, at the same location where it was allocated, until it is deallocated. The default setting is false.
The value of pool_size is the total number of bytes of storage available to an allocator when there have been no allocations. The following affect pool_size:
- If the access trait has the value all, the value of pool_size is the limit for all allocations for all threads having access to the allocator.
- If the access trait has the value cgroup, the value of pool_size is the limit for allocations made from the threads within the same contention group.
- If the access trait has the value pteam, the value of pool_size is the limit for allocations made within the same parallel team.
- If the access trait has the value thread, the value of pool_size is the limit for allocations made from each thread using the allocator.
- An allocation request for more space than the value of pool_size results in the allocator not fulfilling the allocation request.
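The interaction between pool_size and access amounts to a choice of which budget a request is charged against. The sketch below models that bookkeeping in plain C without the OpenMP runtime. Every name in it (`model_access`, `model_requester`, `model_pool`, `budget_key`, `pool_try_charge`, `MAX_BUDGETS`) is hypothetical, and the integer ids for contention groups, teams, and threads are assumed to be small non-negative numbers supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the access trait values. */
typedef enum { ACCESS_ALL, ACCESS_CGROUP, ACCESS_PTEAM, ACCESS_THREAD } model_access;

/* Where a request comes from (illustrative ids, each below MAX_BUDGETS). */
typedef struct { int cgroup; int team; int thread; } model_requester;

/* Selects the budget a request counts against: one shared budget for
   access=all, otherwise one per contention group, team, or thread. */
int budget_key(model_access access, model_requester r) {
    switch (access) {
    case ACCESS_CGROUP: return r.cgroup;
    case ACCESS_PTEAM:  return r.team;
    case ACCESS_THREAD: return r.thread;
    case ACCESS_ALL:
    default:            return 0;
    }
}

#define MAX_BUDGETS 8

typedef struct {
    model_access access;
    size_t pool_size;            /* limit that applies to each budget */
    size_t used[MAX_BUDGETS];    /* bytes already charged per budget */
} model_pool;

/* Returns 1 and charges the budget if the request fits, otherwise 0. */
int pool_try_charge(model_pool *p, model_requester r, size_t size) {
    int k = budget_key(p->access, r);
    assert(k >= 0 && k < MAX_BUDGETS);
    if (size > p->pool_size - p->used[k]) {
        return 0;                /* request is not fulfilled */
    }
    p->used[k] += size;
    return 1;
}
```

With a pool_size of 100 bytes, two threads that each request 60 bytes both succeed when access is thread, because each thread has its own limit, but the second request fails when access is all, because both threads share a single limit.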
The sync_hint trait describes the way that multiple threads can access an allocator. The following are values you can specify for sync_hint:
- contended or uncontended: The value contended indicates that many threads are anticipated to make simultaneous allocation requests, while the value uncontended indicates that few threads are anticipated to make simultaneous allocation requests. The default setting is contended.
- private: This value indicates that all allocation requests will come from the same thread. Specifying private when this is not the case, so that two or more threads make allocation requests through the same allocator, results in undefined behavior.
- serialized: This value indicates that only one thread will request an allocation at a given time. The behavior is undefined if two threads simultaneously request an allocation from an allocator whose sync_hint value is serialized.
There are five predefined memory spaces in OpenMP:
- The system default memory is referred to as omp_default_mem_space.
- Large capacity memory is referred to as omp_large_cap_mem_space.
- High bandwidth memory is referred to as omp_high_bw_mem_space.
- Low latency memory is referred to as omp_low_lat_mem_space.
- Memory designed for optimal storage of constant values is referred to as omp_const_mem_space. It can be initialized with compile-time constant expressions or by using a firstprivate clause. Writing to variables in omp_const_mem_space results in undefined behavior.
There are three additional predefined memory spaces that are extensions to the OpenMP standard:
- omp_target_host_mem_space is host memory that is accessible by the device.
- omp_target_shared_mem_space is memory that can migrate between the host and the device.
- omp_target_device_mem_space is memory that is accessible to the device.
The following table shows the predefined memory allocators, the memory space they are associated with, and the non-default memory trait values they possess.
Allocator Name | Associated Memory Space | Non-Default Trait Values |
---|---|---|
omp_default_mem_alloc | omp_default_mem_space | fallback=null_fb |
omp_large_cap_mem_alloc | omp_large_cap_mem_space | none |
omp_low_lat_mem_alloc | omp_low_lat_mem_space | none |
omp_high_bw_mem_alloc | omp_high_bw_mem_space | none |
omp_const_mem_alloc | omp_const_mem_space | none |
omp_cgroup_mem_alloc | implementation/system defined | access=cgroup |
omp_pteam_mem_alloc | implementation/system defined | access=pteam |
omp_thread_mem_alloc | implementation/system defined | access=thread |
omp_target_host_mem_alloc | omp_target_host_mem_space | none |
omp_target_shared_mem_alloc | omp_target_shared_mem_space | none |
omp_target_device_mem_alloc | omp_target_device_mem_space | none |