Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 7/13/2023
Public

Intrinsics for Integer Insert and Extract Operations

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.

To use these intrinsics, include the immintrin.h file as follows:

#include <immintrin.h>


| Intrinsic Name | Operation | Corresponding Intel® AVX-512 Instruction |
|---|---|---|
| _mm512_extracti32x4_epi32, _mm512_mask_extracti32x4_epi32, _mm512_maskz_extracti32x4_epi32 | Extracts int32 values. | VEXTRACTI32X4 |
| _mm512_extracti64x4_epi64, _mm512_mask_extracti64x4_epi64, _mm512_maskz_extracti64x4_epi64 | Extracts int64 values. | VEXTRACTI64X4 |
| _mm512_inserti32x4, _mm512_mask_inserti32x4, _mm512_maskz_inserti32x4 | Inserts int32 values. | VINSERTI32X4 |
| _mm512_inserti64x4, _mm512_mask_inserti64x4, _mm512_maskz_inserti64x4 | Inserts int64 values. | VINSERTI64X4 |


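| Variable | Definition |
|---|---|
| k | writemask used as a selector |
| a | first source vector |
| b | second source vector; supplies the elements to insert |
| mem_addr | pointer to base address in memory |
| src | source vector whose elements are copied to the result where the corresponding writemask bit is not set |
| tmp | temporary vector that holds the intermediate result before the mask is applied |
| imm | immediate value that selects which 128-bit or 256-bit block of the 512-bit vector is extracted or replaced |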


_mm512_extracti32x4_epi32

extern __m128i __cdecl _mm512_extracti32x4_epi32(__m512i a, int imm);

Extracts 128 bits (composed of four packed 32-bit integers) from a, selected with imm, and stores the result.
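
The lane selection described above can be illustrated with a scalar model. This is only a sketch: model_extracti32x4 is a hypothetical helper, not an Intel API, and it represents the __m512i operand as an array of sixteen 32-bit values and the __m128i result as an array of four. It assumes imm is in the range 0 to 3, since a 512-bit vector holds four 128-bit blocks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of _mm512_extracti32x4_epi32(a, imm).
 * Hypothetical helper for illustration only.
 * a[16]  : the sixteen 32-bit elements of the 512-bit source
 * dst[4] : the four 32-bit elements of the 128-bit result */
static void model_extracti32x4(uint32_t dst[4], const uint32_t a[16], int imm)
{
    assert(imm >= 0 && imm <= 3);   /* four 128-bit blocks in 512 bits */
    /* Block imm starts at element 4 * imm. */
    memcpy(dst, a + 4 * imm, 4 * sizeof(uint32_t));
}
```

With a holding the values 100 through 115, a call with imm equal to 2 copies elements 8 through 11 (108, 109, 110, 111) into dst.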



_mm512_mask_extracti32x4_epi32

extern __m128i __cdecl _mm512_mask_extracti32x4_epi32(__m128i src, __mmask8 k, __m512i a, int imm);

Extracts 128 bits (composed of four packed 32-bit integers) from a, selected with imm, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_maskz_extracti32x4_epi32

extern __m128i __cdecl _mm512_maskz_extracti32x4_epi32(__mmask8 k, __m512i a, int imm);

Extracts 128 bits (composed of four packed 32-bit integers) from a, selected with imm, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
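
The difference between the writemask and zeromask forms may be easier to see in a scalar model. The helpers below are hypothetical (they are not Intel APIs) and use plain arrays in place of __m512i and __m128i. The sketch assumes that bit j of k controls element j of the 128-bit result, so only the low four bits of the __mmask8 value matter.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of _mm512_mask_extracti32x4_epi32(src, k, a, imm).
 * Hypothetical helper for illustration only. */
static void model_mask_extracti32x4(uint32_t dst[4], const uint32_t src[4],
                                    uint8_t k, const uint32_t a[16], int imm)
{
    assert(imm >= 0 && imm <= 3);
    for (int j = 0; j < 4; j++) {
        uint32_t picked = a[4 * imm + j];          /* element from block imm */
        dst[j] = ((k >> j) & 1) ? picked : src[j]; /* mask bit clear: keep src */
    }
}

/* Scalar model of _mm512_maskz_extracti32x4_epi32(k, a, imm):
 * the same operation with zeros standing in for src. */
static void model_maskz_extracti32x4(uint32_t dst[4], uint8_t k,
                                     const uint32_t a[16], int imm)
{
    static const uint32_t zeros[4] = {0, 0, 0, 0};
    model_mask_extracti32x4(dst, zeros, k, a, imm);
}
```

For example, with a holding 1 through 16, src holding {91, 92, 93, 94}, k equal to 0x5, and imm equal to 1, the writemask form yields {5, 92, 7, 94} and the zeromask form yields {5, 0, 7, 0}.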



_mm512_extracti64x4_epi64

extern __m256i __cdecl _mm512_extracti64x4_epi64(__m512i a, int imm);

Extracts 256 bits (composed of four packed int64 elements) from a, selected with imm, and stores the result.



_mm512_mask_extracti64x4_epi64

extern __m256i __cdecl _mm512_mask_extracti64x4_epi64(__m256i src, __mmask8 k, __m512i a, int imm);

Extracts 256 bits (composed of four packed int64 elements) from a, selected with imm, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_maskz_extracti64x4_epi64

extern __m256i __cdecl _mm512_maskz_extracti64x4_epi64(__mmask8 k, __m512i a, int imm);

Extracts 256 bits (composed of four packed int64 elements) from a, selected with imm, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
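
The 256-bit extract forms follow the same pattern with wider elements. The sketch below is a scalar model, not an Intel API: model_masked_extracti64x4 is a hypothetical helper that covers both masked forms by treating a null src pointer as the zeromask case. It assumes imm is 0 or 1 (a 512-bit vector holds two 256-bit blocks) and that only the low four bits of k are consulted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the masked 256-bit extract.
 * Hypothetical helper for illustration only.
 * src != NULL models _mm512_mask_extracti64x4_epi64(src, k, a, imm);
 * src == NULL models _mm512_maskz_extracti64x4_epi64(k, a, imm). */
static void model_masked_extracti64x4(uint64_t dst[4], const uint64_t *src,
                                      uint8_t k, const uint64_t a[8], int imm)
{
    assert(imm == 0 || imm == 1);   /* two 256-bit blocks in 512 bits */
    for (int j = 0; j < 4; j++) {
        if ((k >> j) & 1)
            dst[j] = a[4 * imm + j];       /* mask bit set: take extracted value */
        else
            dst[j] = src ? src[j] : 0;     /* mask bit clear: merge or zero */
    }
}
```

With a holding {10, 20, ..., 80}, src holding {1, 2, 3, 4}, k equal to 0x0A, and imm equal to 1, the writemask case yields {1, 60, 3, 80}; passing NULL for src yields {0, 60, 0, 80}.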



_mm512_inserti32x4

extern __m512i __cdecl _mm512_inserti32x4(__m512i a, __m128i b, int imm);

Copies a to destination, then inserts 128 bits (composed of four packed 32-bit integers) from b into destination at the location specified by imm.
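
As a rough illustration of the copy-then-overwrite behavior, the following scalar model uses arrays in place of the vector types. model_inserti32x4 is a hypothetical helper, not an Intel API, and the sketch assumes imm is in the range 0 to 3.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of _mm512_inserti32x4(a, b, imm).
 * Hypothetical helper for illustration only.
 * a[16], dst[16] : sixteen 32-bit elements of a 512-bit vector
 * b[4]           : four 32-bit elements of the 128-bit vector to insert */
static void model_inserti32x4(uint32_t dst[16], const uint32_t a[16],
                              const uint32_t b[4], int imm)
{
    assert(imm >= 0 && imm <= 3);
    memcpy(dst, a, 16 * sizeof(uint32_t));          /* copy a to destination */
    memcpy(dst + 4 * imm, b, 4 * sizeof(uint32_t)); /* overwrite block imm */
}
```

With a holding 0 through 15 and b holding {100, 101, 102, 103}, a call with imm equal to 3 leaves elements 0 through 11 unchanged and replaces elements 12 through 15 with the contents of b.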



_mm512_mask_inserti32x4

extern __m512i __cdecl _mm512_mask_inserti32x4(__m512i src, __mmask16 k, __m512i a, __m128i b, int imm);

Copies a to tmp, then inserts 128 bits (composed of four packed 32-bit integers) from b into tmp at the location specified by imm. Stores tmp to the destination using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_maskz_inserti32x4

extern __m512i __cdecl _mm512_maskz_inserti32x4(__mmask16 k, __m512i a, __m128i b, int imm);

Copies a to tmp, then inserts 128 bits (composed of four packed 32-bit integers) from b into tmp at the location specified by imm. Stores tmp to the destination using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
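
The two-step behavior of the masked 128-bit inserts (build tmp, then apply the mask across all sixteen elements) can be sketched with a scalar model. model_masked_inserti32x4 is a hypothetical helper, not an Intel API. It treats a null src pointer as the zeromask case and assumes that bit j of the 16-bit mask controls element j of the 512-bit result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the masked 128-bit insert.
 * Hypothetical helper for illustration only.
 * src != NULL models _mm512_mask_inserti32x4(src, k, a, b, imm);
 * src == NULL models _mm512_maskz_inserti32x4(k, a, b, imm). */
static void model_masked_inserti32x4(uint32_t dst[16], const uint32_t *src,
                                     uint16_t k, const uint32_t a[16],
                                     const uint32_t b[4], int imm)
{
    uint32_t tmp[16];
    assert(imm >= 0 && imm <= 3);
    memcpy(tmp, a, sizeof tmp);                      /* step 1: tmp = a */
    memcpy(tmp + 4 * imm, b, 4 * sizeof(uint32_t));  /* step 2: insert b */
    for (int j = 0; j < 16; j++) {                   /* step 3: apply mask */
        if ((k >> j) & 1)
            dst[j] = tmp[j];
        else
            dst[j] = src ? src[j] : 0;
    }
}
```

Note that the mask covers the whole 512-bit result, not just the inserted block, so an element taken from a is also replaced by src (or zero) when its mask bit is clear.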



_mm512_inserti64x4

extern __m512i __cdecl _mm512_inserti64x4(__m512i a, __m256i b, int imm);

Copies a to destination, then inserts 256 bits (composed of four packed int64 elements) from b into destination at the location specified by imm.
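
One way to check an understanding of the 256-bit insert is to pair it with the matching extract: inserting b at block imm and then extracting block imm should return b. The scalar model below sketches that round trip. Both helpers are hypothetical (not Intel APIs), use arrays in place of the vector types, and assume imm is 0 or 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of _mm512_inserti64x4(a, b, imm).
 * Hypothetical helper for illustration only. */
static void model_inserti64x4(uint64_t dst[8], const uint64_t a[8],
                              const uint64_t b[4], int imm)
{
    assert(imm == 0 || imm == 1);
    memcpy(dst, a, 8 * sizeof(uint64_t));            /* copy a */
    memcpy(dst + 4 * imm, b, 4 * sizeof(uint64_t));  /* overwrite half imm */
}

/* Scalar model of _mm512_extracti64x4_epi64(a, imm).
 * Hypothetical helper for illustration only. */
static void model_extracti64x4(uint64_t dst[4], const uint64_t a[8], int imm)
{
    assert(imm == 0 || imm == 1);
    memcpy(dst, a + 4 * imm, 4 * sizeof(uint64_t));
}
```

With a holding 0 through 7 and b holding {40, 41, 42, 43}, inserting at imm equal to 1 gives {0, 1, 2, 3, 40, 41, 42, 43}; extracting with imm equal to 1 then returns b, while extracting with imm equal to 0 returns the untouched lower half of a.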



_mm512_mask_inserti64x4

extern __m512i __cdecl _mm512_mask_inserti64x4(__m512i src, __mmask8 k, __m512i a, __m256i b, int imm);

Copies a to tmp, then inserts 256 bits (composed of four packed int64 elements) from b into tmp at the location specified by imm. Stores tmp to the destination using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_maskz_inserti64x4

extern __m512i __cdecl _mm512_maskz_inserti64x4(__mmask8 k, __m512i a, __m256i b, int imm);

Copies a to tmp, then inserts 256 bits (composed of four packed int64 elements) from b into tmp at the location specified by imm. Stores tmp to the destination using zeromask k (elements are zeroed out when the corresponding mask bit is not set).