Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 12/16/2022
Public

Details about Intrinsics

All instructions use the following features:

  • Registers

  • Data Types

Registers

Intel® processors provide special register sets for different instructions.

  • Intel® MMX™ instructions use eight 64-bit registers (mm0 to mm7) which are aliased on the floating-point stack registers.

  • Intel® Streaming SIMD Extensions (Intel® SSE) and the Advanced Encryption Standard (AES) instructions use eight 128-bit registers (xmm0 to xmm7) on IA-32 architecture; Intel® 64 architecture provides sixteen (xmm0 to xmm15).

  • Intel® Advanced Vector Extensions (Intel® AVX) instructions use 256-bit registers which are extensions of the 128-bit SIMD registers.

  • Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions use 512-bit registers.

Because each of these registers can hold more than one data element, the processor can operate on several data elements simultaneously. This capability is known as single instruction, multiple data (SIMD) processing.

For each computational and data manipulation instruction in the new extension sets, there is a corresponding C intrinsic that implements that instruction directly. This frees you from managing registers and assembly programming. Further, the compiler optimizes the instruction scheduling so that your executable runs faster.
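As an illustration of the one-intrinsic-per-instruction idea, the following minimal sketch adds two groups of four floats with a single SSE addition. The function name add4 is hypothetical and used only for this example; the unaligned load and store intrinsics are chosen so the caller's arrays need no particular alignment.

```c
#include <xmmintrin.h>  /* Intel SSE intrinsics */

/* Hypothetical helper: out[i] = a[i] + b[i] for i = 0..3. */
void add4(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);      /* load four floats, no alignment required */
    __m128 vb = _mm_loadu_ps(b);
    __m128 sum = _mm_add_ps(va, vb);  /* one packed add covers all four lanes */
    _mm_storeu_ps(out, sum);          /* write the four results back to memory */
}
```

The compiler, not the programmer, decides which XMM registers hold va, vb, and sum.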

Data Types

Intrinsic functions take new C data types as operands. These types represent the registers on which the corresponding instructions operate.

The following table shows which data types are available for each group of intrinsics. 'Yes' indicates that the data type is available for that group; 'NA' indicates that it is not.

Technology | __m64 | __m128 | __m128d | __m128i | __m256 | __m256d | __m256i | __m512 | __m512d | __m512i
Intel® MMX™ Technology Intrinsics | Yes | NA | NA | NA | NA | NA | NA | NA | NA | NA
Intel® Streaming SIMD Extensions Intrinsics | Yes | Yes | NA | NA | NA | NA | NA | NA | NA | NA
Intel® Streaming SIMD Extensions 2 Intrinsics | Yes | Yes | Yes | Yes | NA | NA | NA | NA | NA | NA
Intel® Streaming SIMD Extensions 3 Intrinsics | Yes | Yes | Yes | Yes | NA | NA | NA | NA | NA | NA
Advanced Encryption Standard Intrinsics + Carry-less Multiplication Intrinsic | Yes | Yes | Yes | Yes | NA | NA | NA | NA | NA | NA
Half-Float Intrinsics | Yes | Yes | Yes | Yes | NA | NA | NA | NA | NA | NA
Intel® Advanced Vector Extensions Intrinsics | Yes | Yes | Yes | Yes | Yes | Yes | Yes | NA | NA | NA
Intel® Advanced Vector Extensions 512 Intrinsics | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes

__m64 Data Type

The __m64 data type represents the contents of an MMX register, the register used by the MMX™ technology intrinsics. The __m64 data type can hold eight 8-bit values, four 16-bit values, two 32-bit values, or one 64-bit value.

__m128 Data Types

The __m128 data type represents the contents of an SSE register used by the Intel® Streaming SIMD Extensions (Intel® SSE) intrinsics.

Conventionally, the __m128 data type can hold four 32-bit floating-point values, while the __m128d data type can hold two 64-bit floating-point values, and the __m128i data type can hold sixteen 8-bit, eight 16-bit, four 32-bit, or two 64-bit integer values.
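The element counts above follow from dividing the 16-byte vector size by the element size. The sketch below checks this with sizeof and then copies the two lanes of a __m128d out to memory. The LANES macro and the function names are hypothetical, introduced only for this example.

```c
#include <emmintrin.h>  /* Intel SSE2: __m128, __m128d, __m128i */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: how many elements of type elem fit in vector type vec. */
#define LANES(vec, elem) (sizeof(vec) / sizeof(elem))

size_t float_lanes(void)  { return LANES(__m128,  float);   }
size_t double_lanes(void) { return LANES(__m128d, double);  }
size_t int8_lanes(void)   { return LANES(__m128i, int8_t);  }
size_t int16_lanes(void)  { return LANES(__m128i, int16_t); }
size_t int32_lanes(void)  { return LANES(__m128i, int32_t); }
size_t int64_lanes(void)  { return LANES(__m128i, int64_t); }

/* Copy both double-precision lanes to memory, lowest lane first. */
void unpack_doubles(__m128d v, double out[2])
{
    _mm_storeu_pd(out, v);
}
```

Note that __m128i does not record an element width; the intrinsic applied to it (for example an epi8 or epi32 variant) determines how the 128 bits are interpreted.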

The compiler aligns __m128d and __m128i local and global data to 16-byte boundaries on the stack. To align integer, float, or double arrays, use __declspec(align(n)), for example __declspec(align(16)).
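The following sketch shows why array alignment matters: _mm_load_ps requires a 16-byte-aligned address. It assumes a hypothetical ALIGN16 macro that expands to __declspec(align(16)) where that extension is available and falls back to the C11 _Alignas specifier elsewhere; the fallback is an assumption of this example, not something this guide prescribes.

```c
#include <stdint.h>
#include <xmmintrin.h>  /* Intel SSE intrinsics */

/* Hypothetical portability macro for this example only. */
#if defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))
#else
#define ALIGN16 _Alignas(16)   /* C11 substitute, assumed for non-Windows builds */
#endif

/* A float array forced onto a 16-byte boundary. */
static ALIGN16 float samples[4] = {1.0f, 2.0f, 3.0f, 4.0f};

/* Returns 1 if the array address is a multiple of 16. */
int samples_are_aligned(void)
{
    return ((uintptr_t)samples % 16) == 0;
}

/* Doubles every element with one packed add and returns the lowest lane. */
float first_doubled(void)
{
    __m128 v = _mm_load_ps(samples);  /* aligned load: address must be 16-byte aligned */
    v = _mm_add_ps(v, v);
    return _mm_cvtss_f32(v);
}
```

If an array cannot be aligned, the unaligned variant _mm_loadu_ps accepts any address.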

Accessing __m128i Data

To access 8-bit data on IA-32 and Intel® 64 architecture-based systems, use the _mm_extract intrinsics as follows:

#define _mm_extract_epi8(x, imm) \
    ((((imm) & 0x1) == 0) ? \
        _mm_extract_epi16((x), (imm) >> 1) & 0xff : \
        _mm_extract_epi16(_mm_srli_epi16((x), 8), (imm) >> 1))

To access 16-bit data, use:

int _mm_extract_epi16(__m128i a, int imm)

To access 32-bit data, use:

#define _mm_extract_epi32(x, imm) \
    _mm_cvtsi128_si32(_mm_srli_si128((x), 4 * (imm)))

To access 64-bit data (Intel® 64 architecture only), use:

#define _mm_extract_epi64(x, imm) \
    _mm_cvtsi128_si64(_mm_srli_si128((x), 8 * (imm)))
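A short usage sketch of the shift-then-convert technique follows. It uses only SSE2 intrinsics and defines the macros under hypothetical names (EXTRACT_EPI8, EXTRACT_EPI32) so that they cannot collide with identically named intrinsics a header may already provide. The index must be a compile-time constant because the underlying shift and extract instructions take immediate operands.

```c
#include <emmintrin.h>  /* Intel SSE2 intrinsics */

/* Hypothetical names; same technique as the macros above. */
#define EXTRACT_EPI32(x, imm) \
    _mm_cvtsi128_si32(_mm_srli_si128((x), 4 * (imm)))

#define EXTRACT_EPI8(x, imm) \
    ((((imm) & 0x1) == 0) ? \
        (_mm_extract_epi16((x), (imm) >> 1) & 0xff) : \
        _mm_extract_epi16(_mm_srli_epi16((x), 8), (imm) >> 1))

/* Returns 32-bit element 2 of a vector whose elements 0..3 are 10, 20, 30, 40. */
int third_dword(void)
{
    __m128i v = _mm_setr_epi32(10, 20, 30, 40);
    return EXTRACT_EPI32(v, 2);  /* shift right by 8 bytes, then read the low 32 bits */
}

/* Returns byte 5 of a vector whose bytes 0..15 hold the values 0..15. */
int sixth_byte(void)
{
    __m128i v = _mm_setr_epi8(0, 1, 2, 3, 4, 5, 6, 7,
                              8, 9, 10, 11, 12, 13, 14, 15);
    return EXTRACT_EPI8(v, 5);   /* odd index: take the high byte of 16-bit word 2 */
}
```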

__m256 Data Types

The __m256 data type represents the contents of the YMM register, the 256-bit extension of the SSE register, which is used by the Intel® AVX intrinsics.

The __m256 data type can hold eight 32-bit floating-point values, while the __m256d data type can hold four 64-bit double precision floating-point values, and the __m256i data type can hold thirty-two 8-bit, sixteen 16-bit, eight 32-bit, or four 64-bit integer values. See Details for Intel® AVX Intrinsics for more information.

__m512 Data Types

The __m512 data type represents the contents of the ZMM register, the 512-bit extension of the SSE register, which is used by the Intel® AVX-512 intrinsics.

The __m512 data type can hold sixteen 32-bit floating-point values, while the __m512d data type can hold eight 64-bit double precision floating-point values, and the __m512i data type can hold sixty-four 8-bit, thirty-two 16-bit, sixteen 32-bit, or eight 64-bit integer values. See Overview: Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Instructions for more information.

Data Types Usage Guidelines

These data types are not basic ANSI C data types. You must observe the following usage restrictions:

  • Use data types as objects in aggregates, such as unions, to access the byte elements and structures.
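The union guideline can be sketched as follows. The type name vec128_view and the function names are hypothetical; the union overlays a __m128i with byte and 32-bit arrays so individual elements can be read with ordinary C indexing.

```c
#include <emmintrin.h>  /* Intel SSE2: __m128i */
#include <stdint.h>

/* Hypothetical aggregate: three views of the same 16 bytes. */
typedef union {
    __m128i vec;        /* the value as the intrinsics see it */
    uint8_t bytes[16];  /* byte-wise view */
    int32_t dwords[4];  /* 32-bit element view */
} vec128_view;

/* Sums the four 32-bit elements by reading them through the union. */
int32_t sum_dwords(__m128i v)
{
    vec128_view view;
    view.vec = v;
    return view.dwords[0] + view.dwords[1] + view.dwords[2] + view.dwords[3];
}

/* Returns the lowest-addressed byte of the vector. */
uint8_t lowest_byte(__m128i v)
{
    vec128_view view;
    view.vec = v;
    return view.bytes[0];
}
```

Reading through the union typically goes through memory, so it suits inspection and debugging more than inner loops.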