Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 7/13/2023
Public

Intrinsics for Integer Expand and Load Operations

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.

To use these intrinsics, include the immintrin.h file as follows:

#include <immintrin.h>


| Intrinsic Name | Operation | Corresponding Intel® AVX-512 Instruction |
| --- | --- | --- |
| _mm512_mask_expandloadu_epi32, _mm512_maskz_expandloadu_epi32, _mm512_mask_expand_epi32, _mm512_maskz_expand_epi32 | Expand packed int32 values from dense memory or a register into the element positions selected by the mask. | VPEXPANDD |
| _mm512_mask_expandloadu_epi64, _mm512_maskz_expandloadu_epi64, _mm512_mask_expand_epi64, _mm512_maskz_expand_epi64 | Expand packed int64 values from dense memory or a register into the element positions selected by the mask. | VPEXPANDQ |


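| Variable | Definition |
| --- | --- |
| k | writemask used as a selector |
| a | source vector holding the contiguous elements to expand |
| src | source vector whose elements are used where the corresponding writemask bit is not set |
| mem_addr | pointer to base address in memory |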


_mm512_mask_expand_epi32

extern __m512i __cdecl _mm512_mask_expand_epi32(__m512i src, __mmask16 k, __m512i a);

Reads contiguous int32 elements from a, starting at the lowest element, and places them in order into the result elements whose bit is set in mask k (the active elements). Result elements whose mask bit is not set are copied from src.
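The data movement described above can be modelled in portable scalar C. The following is a minimal sketch for illustration only, not Intel's implementation: the function name ref_mask_expand_epi32 is hypothetical, and arrays of 16 int32_t values stand in for the __m512i operands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of _mm512_mask_expand_epi32 (illustration only).
   dst must not alias a; dst may alias src. */
void ref_mask_expand_epi32(int32_t dst[16], const int32_t src[16],
                           uint16_t k, const int32_t a[16])
{
    assert(dst != NULL && src != NULL && a != NULL);

    size_t next = 0; /* index of the next unread element of a */
    for (size_t j = 0; j < 16; j++) {
        if ((k >> j) & 1u)
            dst[j] = a[next++]; /* mask bit set: take the next contiguous element of a */
        else
            dst[j] = src[j];    /* mask bit clear: copy the element from src */
    }
}
```

For example, with k = 0x0016 (bits 1, 2, and 4 set), a starting with 10, 20, 30, and every element of src equal to -1, the model produces -1, 10, 20, -1, 30 in elements 0 through 4 and -1 in all remaining elements.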



_mm512_maskz_expand_epi32

extern __m512i __cdecl _mm512_maskz_expand_epi32(__mmask16 k, __m512i a);

Reads contiguous int32 elements from a, starting at the lowest element, and places them in order into the result elements whose bit is set in mask k (the active elements). Result elements whose mask bit is not set are zeroed.
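A possible way to call this intrinsic is sketched below. The sketch assumes a GCC- or Clang-compatible compiler on x86-64: the target attribute and __builtin_cpu_supports belong to those compilers, not to this reference, and other compilers would need their own option or feature check. The names zero_expand_i32 and try_zero_expand_i32 are hypothetical.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC/Clang. The target attribute enables AVX-512F for this
   function only, so the file does not need to be built with -mavx512f. */
__attribute__((target("avx512f")))
static void zero_expand_i32(const int32_t in[16], uint16_t mask, int32_t out[16])
{
    __m512i a = _mm512_loadu_si512(in);                        /* unaligned load of 16 int32 */
    __m512i r = _mm512_maskz_expand_epi32((__mmask16)mask, a); /* inactive elements become 0 */
    _mm512_storeu_si512(out, r);
}

/* Hypothetical wrapper: returns 1 if the AVX-512 path ran, 0 if the CPU
   does not report AVX-512F support (out is then left untouched). */
int try_zero_expand_i32(const int32_t in[16], uint16_t mask, int32_t out[16])
{
    assert(in != NULL && out != NULL);
    if (!__builtin_cpu_supports("avx512f"))
        return 0;
    zero_expand_i32(in, mask, out);
    return 1;
}
```

With in holding 100, 101, 102, ... and mask 0x00A5 (bits 0, 2, 5, and 7 set), elements 0, 2, 5, and 7 of out receive 100, 101, 102, and 103, and every other element is zero.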



_mm512_mask_expandloadu_epi32

extern __m512i __cdecl _mm512_mask_expandloadu_epi32(__m512i src, __mmask16 k, void * mem_addr);

Loads contiguous int32 elements from unaligned memory at mem_addr and places them in order into the result elements whose bit is set in mask k (the active elements). Result elements whose mask bit is not set are copied from src.
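One use of the memory form is to scatter a densely packed run of values into selected elements while filling the rest with a default. The sketch below assumes a GCC- or Clang-compatible compiler on x86-64 (the target attribute and __builtin_cpu_supports are features of those compilers). The names expandload_fill_i32 and try_expandload_fill_i32 and the fill parameter are hypothetical.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC/Clang; AVX-512F is enabled for this function only. */
__attribute__((target("avx512f")))
static void expandload_fill_i32(const int32_t *packed, uint16_t mask,
                                int32_t fill, int32_t out[16])
{
    __m512i src = _mm512_set1_epi32(fill); /* default value for inactive elements */
    /* The cast matches the void * parameter in the prototype above. */
    __m512i r = _mm512_mask_expandloadu_epi32(src, (__mmask16)mask, (void *)packed);
    _mm512_storeu_si512(out, r);
}

/* Hypothetical wrapper: returns 1 if the AVX-512 path ran, 0 otherwise. */
int try_expandload_fill_i32(const int32_t *packed, uint16_t mask,
                            int32_t fill, int32_t out[16])
{
    assert(packed != NULL && out != NULL);
    if (!__builtin_cpu_supports("avx512f"))
        return 0;
    expandload_fill_i32(packed, mask, fill, out);
    return 1;
}
```

With packed starting with 7, 8, mask 0x8001 (bits 0 and 15 set), and fill equal to -1, element 0 of out receives 7, element 15 receives 8, and elements 1 through 14 hold -1.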



_mm512_maskz_expandloadu_epi32

extern __m512i __cdecl _mm512_maskz_expandloadu_epi32(__mmask16 k, void * mem_addr);

Loads contiguous int32 elements from unaligned memory at mem_addr and places them in order into the result elements whose bit is set in mask k (the active elements). Result elements whose mask bit is not set are zeroed.



_mm512_mask_expandloadu_epi64

extern __m512i __cdecl _mm512_mask_expandloadu_epi64(__m512i src, __mmask8 k, void * mem_addr);

Loads contiguous int64 elements from unaligned memory at mem_addr and places them in order into the result elements whose bit is set in mask k (the active elements). Result elements whose mask bit is not set are copied from src.



_mm512_maskz_expandloadu_epi64

extern __m512i __cdecl _mm512_maskz_expandloadu_epi64(__mmask8 k, void * mem_addr);

Loads contiguous int64 elements from unaligned memory at mem_addr and places them in order into the result elements whose bit is set in mask k (the active elements). Result elements whose mask bit is not set are zeroed.
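A typical application is decoding a sparse stream stored as a densely packed value array plus one 8-bit mask per block of eight elements. The sketch below is one way this might look, assuming a GCC- or Clang-compatible compiler on x86-64 (target attribute, __builtin_cpu_supports, and __builtin_popcount are features of those compilers). The names decode_i64_avx512 and try_decode_sparse_i64 and the stream layout are hypothetical. As a conservative assumption of this sketch, the caller pads packed so that eight elements are readable from every position the loop visits.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC/Clang; AVX-512F is enabled for this function only. */
__attribute__((target("avx512f")))
static void decode_i64_avx512(const int64_t *packed, const uint8_t *masks,
                              size_t nblocks, int64_t *out)
{
    for (size_t b = 0; b < nblocks; b++) {
        __mmask8 k = (__mmask8)masks[b];
        /* Active elements take consecutive values from packed; the rest become 0. */
        __m512i v = _mm512_maskz_expandloadu_epi64(k, (void *)packed);
        _mm512_storeu_si512(out + 8 * b, v);
        /* This block consumed one packed value per set mask bit. */
        packed += __builtin_popcount(masks[b]);
    }
}

/* Hypothetical wrapper: returns 1 if the AVX-512 path ran, 0 otherwise.
   out must have room for 8 * nblocks elements. */
int try_decode_sparse_i64(const int64_t *packed, const uint8_t *masks,
                          size_t nblocks, int64_t *out)
{
    assert(packed != NULL && masks != NULL && out != NULL);
    if (!__builtin_cpu_supports("avx512f"))
        return 0;
    decode_i64_avx512(packed, masks, nblocks, out);
    return 1;
}
```

With packed starting with 1, 2, 3 and masks 0x05 and 0x80, the first block places 1 and 2 in elements 0 and 2, and the second block places 3 in element 15; all other output elements are zero.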



_mm512_mask_expand_epi64

extern __m512i __cdecl _mm512_mask_expand_epi64(__m512i src, __mmask8 k, __m512i a);

Reads contiguous int64 elements from a, starting at the lowest element, and places them in order into the result elements whose bit is set in mask k (the active elements). Result elements whose mask bit is not set are copied from src.
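The 64-bit form operates on eight elements, so the mask is an __mmask8. The following sketch shows one way it might be called, assuming a GCC- or Clang-compatible compiler on x86-64 (the target attribute and __builtin_cpu_supports are features of those compilers). The names merge_expand_i64 and try_merge_expand_i64 are hypothetical.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC/Clang; AVX-512F is enabled for this function only. */
__attribute__((target("avx512f")))
static void merge_expand_i64(const int64_t src[8], uint8_t mask,
                             const int64_t a[8], int64_t out[8])
{
    __m512i vsrc = _mm512_loadu_si512(src); /* values kept where the mask bit is clear */
    __m512i va   = _mm512_loadu_si512(a);   /* contiguous values to expand */
    __m512i r    = _mm512_mask_expand_epi64(vsrc, (__mmask8)mask, va);
    _mm512_storeu_si512(out, r);
}

/* Hypothetical wrapper: returns 1 if the AVX-512 path ran, 0 otherwise. */
int try_merge_expand_i64(const int64_t src[8], uint8_t mask,
                         const int64_t a[8], int64_t out[8])
{
    assert(src != NULL && a != NULL && out != NULL);
    if (!__builtin_cpu_supports("avx512f"))
        return 0;
    merge_expand_i64(src, mask, a, out);
    return 1;
}
```

With every element of src equal to -1, a starting with 1, 2, and mask 0xC0 (bits 6 and 7 set), elements 6 and 7 of out receive 1 and 2, and elements 0 through 5 keep -1.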



_mm512_maskz_expand_epi64

extern __m512i __cdecl _mm512_maskz_expand_epi64(__mmask8 k, __m512i a);

Reads contiguous int64 elements from a, starting at the lowest element, and places them in order into the result elements whose bit is set in mask k (the active elements). Result elements whose mask bit is not set are zeroed.