Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 7/13/2023
Public

Automatically-Aligned Dynamic Allocation

Background

It is possible to tell the compiler that a data structure requires greater alignment than its individual elements do. For example:

C++ standard syntax

class alignas(64) X {
  double elem[8];
};

GNU-compatible syntax

class __attribute__((aligned(64))) X {
  double elem[8];
};

Microsoft-compatible syntax

class __declspec(align(64)) X {
  double elem[8];
};

This is especially important for a structure that will be used with SIMD instructions, which typically require greater alignment than the individual data elements. The compiler will ensure that variables declared with such a type, either statically or on the stack, will be allocated with the appropriate alignment.
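The guarantee for static and stack variables can be checked with a short sketch like the one below. It uses the C++ standard syntax; the helper `is_aligned` and the function `check_static_and_stack` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same shape as the class X above, written as a struct for brevity.
struct alignas(64) X {
  double elem[8];
};

// The alignment requirement is part of the type itself.
static_assert(alignof(X) == 64, "X requires 64-byte alignment");
static_assert(alignof(double) < alignof(X), "stricter than its elements");

// Hypothetical helper: true if p is a multiple of the given alignment.
inline bool is_aligned(const void *p, std::size_t alignment) {
  assert(alignment != 0);
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

X static_x;  // static storage duration

// Returns true if a static object and a stack object are both aligned.
inline bool check_static_and_stack() {
  X stack_x;  // automatic storage duration
  return is_aligned(&static_x, alignof(X)) &&
         is_aligned(&stack_x, alignof(X));
}
```

Calling `check_static_and_stack()` is expected to return true, since the compiler places both objects on 64-byte boundaries.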

However, when an object of such a type is allocated dynamically with a new-expression, the compiler previously could do nothing to ensure the appropriate alignment. The C++ language requires a new-expression to use only very specific allocation functions, which the programmer can replace if necessary, and none of those functions accepts an alignment request. They all assume that a single fixed alignment is sufficient for every type, and guarantee that alignment and nothing more.

In the past, to ensure a greater alignment for a given type, a programmer had to take control of its allocation. One way to do that is by always allocating the memory separately with the appropriate alignment, and using a non-allocating placement new-expression. For example:

Incorrect alignment

new X

Correct alignment

new (_mm_malloc(sizeof(X), alignof(X))) X

However, this method is verbose, tedious, and error-prone.
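Part of the burden is that the object must also be torn down by hand. The following is a minimal sketch of the full lifecycle, assuming an x86 target where `<immintrin.h>` provides `_mm_malloc` and `_mm_free`; the helpers `make_aligned_x` and `destroy_aligned_x` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <immintrin.h>  // _mm_malloc, _mm_free (assumed x86 target)
#include <new>          // placement new, std::bad_alloc

struct alignas(64) X {
  double elem[8];
};

// Hypothetical helper: allocate and construct an X with correct alignment.
inline X *make_aligned_x() {
  void *raw = _mm_malloc(sizeof(X), alignof(X));
  if (raw == nullptr) throw std::bad_alloc();
  return new (raw) X();  // non-allocating placement new: constructs only
}

// Hypothetical helper: the matching teardown. A plain delete-expression
// would be wrong here, because the memory did not come from operator new.
inline void destroy_aligned_x(X *p) {
  if (p == nullptr) return;
  p->~X();      // run the destructor explicitly
  _mm_free(p);  // release with the function that pairs with _mm_malloc
}
```

Every allocation site has to use both halves of this pairing consistently, which is where mistakes tend to creep in.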

Another way is to write class-specific allocation and deallocation functions—operator new and operator delete. For example:

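#include <cstddef>      // std::size_t
#include <immintrin.h>  // _mm_malloc, _mm_free
#include <new>          // std::bad_alloc

class alignas(64) X {
  double elem[8];

public:
  // Allocate storage for a single object with the alignment of X.
  void *operator new(std::size_t size) {
    void *p = _mm_malloc(size, alignof(X));
    if (p == nullptr) throw std::bad_alloc();  // must not return null
    return p;
  }

  // Release storage that was obtained from _mm_malloc.
  void operator delete(void *p) {
    _mm_free(p);
  }
};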

This method is easier, because the changes are centralized in the class, instead of being distributed over the uses of the class. But to get it right in general is still fairly involved, because it requires defining several more functions, in case arrays of the class are dynamically allocated or nothrow allocation is used.
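The additional functions mentioned above might look roughly like the following sketch. It assumes an x86 target where `<immintrin.h>` provides `_mm_malloc` and `_mm_free`; the private helpers `allocate` and `allocate_or_throw` are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <immintrin.h>  // _mm_malloc, _mm_free (assumed x86 target)
#include <new>          // std::bad_alloc, std::nothrow_t

class alignas(64) X {
  double elem[8];

  // Hypothetical shared helpers.
  static void *allocate(std::size_t size) noexcept {
    return _mm_malloc(size, alignof(X));
  }
  static void *allocate_or_throw(std::size_t size) {
    void *p = allocate(size);
    if (p == nullptr) throw std::bad_alloc();
    return p;
  }

public:
  // Single objects.
  static void *operator new(std::size_t size) {
    return allocate_or_throw(size);
  }
  static void operator delete(void *p) noexcept { _mm_free(p); }

  // Arrays.
  static void *operator new[](std::size_t size) {
    return allocate_or_throw(size);
  }
  static void operator delete[](void *p) noexcept { _mm_free(p); }

  // nothrow forms: failure is reported by returning a null pointer.
  static void *operator new(std::size_t size,
                            const std::nothrow_t &) noexcept {
    return allocate(size);
  }
  static void *operator new[](std::size_t size,
                              const std::nothrow_t &) noexcept {
    return allocate(size);
  }

  // Matching deallocation functions, called if a constructor throws
  // during a nothrow new-expression.
  static void operator delete(void *p, const std::nothrow_t &) noexcept {
    _mm_free(p);
  }
  static void operator delete[](void *p, const std::nothrow_t &) noexcept {
    _mm_free(p);
  }
};
```

With these in place, `new X`, `new X[n]`, and `new (std::nothrow) X` all go through `_mm_malloc`, at the cost of eight functions per aligned class.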

Automatically-Aligned Dynamic Allocation

In this release of the compiler, including a new header is all that is needed to get correctly aligned dynamic allocation for aligned data:

#include <aligned_new>

After this header is included, a new-expression for any aligned type will automatically allocate memory with the alignment of that type.
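A minimal usage sketch follows. The `__has_include` guard is an assumption of this example, added only so that it also builds with compilers that do not ship `<aligned_new>`; on such compilers the alignment check is expected to hold only if they implement aligned allocation themselves, as C++17 compilers do. The names `is_aligned` and `new_x_is_aligned` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// <aligned_new> ships with the Intel compiler. The guard below is not
// required by the header; it only keeps this sketch portable.
#if defined(__has_include)
#  if __has_include(<aligned_new>)
#    include <aligned_new>
#  endif
#endif

struct alignas(64) X {
  double elem[8];
};

// Hypothetical helper: true if p is a multiple of the given alignment.
inline bool is_aligned(const void *p, std::size_t alignment) {
  assert(alignment != 0);
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocates an X with an ordinary new-expression and reports whether the
// result is aligned for X.
inline bool new_x_is_aligned() {
  X *p = new X;  // no placement arguments, no class-specific functions
  bool ok = is_aligned(p, alignof(X));
  delete p;
  return ok;
}
```

The point of the sketch is that the allocation site stays a plain `new X`; nothing in the class or at the call site mentions alignment.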

On Windows*, it is possible to direct the compiler to include a file at the beginning of the primary source file, without modifying the source, using the /FI command-line option.

Implementation Details

This section explains the language rules for the new feature. If a program needs to take control of dynamic allocation and deallocation of aligned data for some reason other than alignment, this section explains how it can be done.

Header <aligned_new> defines several new alignment-aware allocation and deallocation functions, each of which takes an alignment argument:

void *operator new    (size_t, align_val_t); 
void *operator new    (size_t, align_val_t, nothrow_t const &); 
void operator delete  (void *, align_val_t); 
void operator delete  (void *, align_val_t, nothrow_t const &); 
void *operator new[]  (size_t, align_val_t);
void *operator new[]  (size_t, align_val_t, nothrow_t const &);
void operator delete[](void *, align_val_t); 
void operator delete[](void *, align_val_t, nothrow_t const &);

The type align_val_t is declared internally by the compiler as if by a declaration like this:

namespace std {
  enum class align_val_t: size_t;
}

In other words, std::align_val_t is a scoped enumeration type, which cannot be implicitly converted to an integer type, but has the same range and representation as std::size_t.
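These properties can be illustrated with a stand-in type. The sketch below declares an equivalent enumeration in a hypothetical namespace `sketch`, so that it does not depend on the compiler's internal declaration; `to_align_val` and `to_size` are likewise hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in that mirrors the declaration shown above.
namespace sketch {
enum class align_val_t : std::size_t {};
}

// Same representation as std::size_t.
static_assert(sizeof(sketch::align_val_t) == sizeof(std::size_t),
              "same size as size_t");
static_assert(
    std::is_same<std::underlying_type<sketch::align_val_t>::type,
                 std::size_t>::value,
    "underlying type is size_t");

// No implicit conversion in either direction.
static_assert(!std::is_convertible<sketch::align_val_t, std::size_t>::value,
              "not implicitly convertible to an integer");
static_assert(!std::is_convertible<std::size_t, sketch::align_val_t>::value,
              "not implicitly convertible from an integer");

// Conversions have to be spelled out with static_cast.
inline sketch::align_val_t to_align_val(std::size_t n) {
  return static_cast<sketch::align_val_t>(n);
}
inline std::size_t to_size(sketch::align_val_t a) {
  return static_cast<std::size_t>(a);
}
```

Because the conversion must be explicit, an alignment argument cannot be confused with a size argument during overload resolution.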

When the compiler processes a new-expression for a type whose alignment is greater than (2 * sizeof(void *)), it builds an argument list according to the normal C++ rules, but with an additional alignment argument of type align_val_t following the size argument (followed by the placement arguments from the new-expression, if any). It then uses overload resolution to try to find an alignment-aware operator new or operator new[] function that can be called with those arguments. If no alignment-aware function is found, the alignment argument is removed from the argument list, and overload resolution is attempted again. An error is reported if this second attempt fails.
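The threshold in this rule can be expressed as a small compile-time predicate. This is only a sketch of the condition described above; the name `gets_alignment_argument` is hypothetical, and the comments restate the two argument lists the compiler is described as trying.

```cpp
#include <cassert>
#include <cstddef>

// True if a new-expression for T would first be tried with an additional
// alignment argument, following the rule described above.
template <typename T>
constexpr bool gets_alignment_argument() {
  return alignof(T) > 2 * sizeof(void *);
}

struct alignas(64) X {
  double elem[8];
};

struct Plain {
  double elem[8];
};

// For "new X" the compiler first looks for a function callable as
//   operator new(sizeof(X), align_val_t(alignof(X)))
// and, if none is found, falls back to
//   operator new(sizeof(X))
static_assert(gets_alignment_argument<X>(), "X is above the threshold");

// For "new Plain" only the ordinary argument list is built.
static_assert(!gets_alignment_argument<Plain>(), "Plain is below it");
```

On a typical 64-bit target `2 * sizeof(void *)` is 16, so a type aligned to 16 bytes or less is handled exactly as before.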

Class-specific Allocation and Deallocation Functions

If a program already provides class-specific allocation and deallocation functions for an aligned class, including <aligned_new> will not change the behavior, because class-specific functions take precedence over global functions, and <aligned_new> defines only global functions.

Unless class-specific allocation and deallocation functions are written for a base class of a class hierarchy containing classes with different alignments, it is probably not necessary to write alignment-aware allocation and deallocation functions that take an alignment argument; the appropriate alignment can instead be built into the class-specific allocation and deallocation functions.
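The base-class case might be sketched as follows. The example assumes a compiler that passes the alignment argument as described above (the sketch is written against the C++17 `<new>` header, which declares `std::align_val_t`) and an x86 target providing `_mm_malloc` and `_mm_free`. The classes `Base`, `Vec8f`, and `Vec8d` and the bookkeeping function `last_alignment` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <immintrin.h>  // _mm_malloc, _mm_free (assumed x86 target)
#include <new>          // std::align_val_t (C++17), std::bad_alloc

struct Base {
  // Hypothetical bookkeeping: the alignment most recently requested.
  static std::size_t &last_alignment() {
    static std::size_t value = 0;
    return value;
  }

  // Alignment-aware forms: used for derived classes above the threshold.
  static void *operator new(std::size_t size, std::align_val_t al) {
    return allocate(size, static_cast<std::size_t>(al));
  }
  static void operator delete(void *p, std::align_val_t) noexcept {
    _mm_free(p);
  }

  // Plain forms: used when no alignment argument is passed.
  static void *operator new(std::size_t size) {
    return allocate(size, 2 * sizeof(void *));
  }
  static void operator delete(void *p) noexcept { _mm_free(p); }

private:
  static void *allocate(std::size_t size, std::size_t alignment) {
    last_alignment() = alignment;
    void *p = _mm_malloc(size, alignment);
    if (p == nullptr) throw std::bad_alloc();
    return p;
  }
};

// Derived classes with different alignments share the functions above.
struct alignas(32) Vec8f : Base { float v[8]; };
struct alignas(64) Vec8d : Base { double v[8]; };
```

A single pair of alignment-aware functions in `Base` serves both derived classes, because the alignment arrives as an argument instead of being hard-coded.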

Replacing Global Allocation and Deallocation Functions

NOTE:
If a program defines its own global allocation and deallocation functions, replacing the ones from the standard library, including <aligned_new> changes the behavior of any non-placement new-expression that allocates aligned data after the point of inclusion. Such an allocation no longer uses the program's replacement allocation functions; it uses the alignment-aware allocation functions that Intel provides instead. A program that replaces the global allocation and deallocation functions must therefore decide carefully whether to include <aligned_new>.

If a program wants to replace the global allocation and deallocation functions, and also wants to take advantage of the compiler's ability to provide an alignment argument to such functions, <aligned_new> should not be included, because it provides inline definitions of the alignment-aware functions, which will conflict with or take precedence over the program's definitions. Instead, <aligned_new> should be used as a guide to write program-specific declarations and definitions of the alignment-aware functions that need to be replaced.
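A program-specific replacement might start from a sketch like the one below. It does not include `<aligned_new>`. It assumes a compiler that declares `std::align_val_t` (here via the C++17 `<new>` header) and an x86 target providing `_mm_malloc` and `_mm_free`. Only the single-object pair is shown; the array and nothrow forms listed earlier would follow the same pattern. The counter `aligned_new_calls` is a hypothetical piece of instrumentation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <immintrin.h>  // _mm_malloc, _mm_free (assumed x86 target)
#include <new>          // std::align_val_t (C++17), std::bad_alloc

// Hypothetical instrumentation: counts calls to the replacement below.
static int aligned_new_calls = 0;

// Program-specific replacement for the global alignment-aware operator new.
void *operator new(std::size_t size, std::align_val_t al) {
  ++aligned_new_calls;
  void *p = _mm_malloc(size, static_cast<std::size_t>(al));
  if (p == nullptr) throw std::bad_alloc();
  return p;
}

// Matching replacement for the global alignment-aware operator delete.
void operator delete(void *p, std::align_val_t) noexcept {
  _mm_free(p);
}

struct alignas(64) X {
  double elem[8];
};
```

With these definitions, a plain `new X` is routed to the program's own function, which is where logging, pooling, or other program-specific behavior could be added.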