Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 12/16/2022
Public

Intrinsics for Compression Operations

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.

To use these intrinsics, include the immintrin.h file as follows:

#include <immintrin.h>


Intrinsic Name                        Operation                                     Corresponding Intel® AVX-512 Instruction

_mm512_mask_compress_pd,              Contiguously store active float64 elements.   VCOMPRESSPD
_mm512_maskz_compress_pd,
_mm512_mask_compressstoreu_pd

_mm512_mask_compress_ps,              Contiguously store active float32 elements.   VCOMPRESSPS
_mm512_maskz_compress_ps,
_mm512_mask_compressstoreu_ps

_mm512_mask_compress_epi32,           Contiguously store active int32 elements.     VPCOMPRESSD
_mm512_maskz_compress_epi32,
_mm512_mask_compressstoreu_epi32

_mm512_mask_compress_epi64,           Contiguously store active int64 elements.     VPCOMPRESSQ
_mm512_maskz_compress_epi64,
_mm512_mask_compressstoreu_epi64


Variable      Definition

k             writemask used as a selector
a             first source vector element
src           source element to use based on writemask result
base_addr     pointer to base address in memory to begin load or store operation


_mm512_mask_compress_pd

extern __m512d __cdecl _mm512_mask_compress_pd(__m512d src, __mmask8 k, __m512d a);

Contiguously stores the active float64 elements in a (those with their respective bit set in writemask k) to destination, and passes through the remaining elements from src.
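The pass-through behavior described above can be illustrated with a minimal sketch. It compares the intrinsic against a scalar model of the same operation. The helper names (mask_compress_pd_ref, mask_compress_pd_avx512, mask_compress_pd_any) are hypothetical and not part of any Intel API. The sketch assumes a GCC- or Clang-compatible compiler that provides __attribute__((target)) and __builtin_cpu_supports, and it falls back to the scalar model when the CPU does not report AVX-512F. The call passes the pass-through vector first, then the mask, then the vector to compress, which is the argument order given in the Intel Intrinsics Guide.

```c
#include <assert.h>
#include <immintrin.h>

/* Scalar model (hypothetical helper): active elements of a are packed into
   the low lanes of dst; the remaining upper lanes are copied from src. */
static void mask_compress_pd_ref(const double src[8], unsigned k,
                                 const double a[8], double dst[8])
{
    int n = 0;
    for (int i = 0; i < 8; ++i)
        if (k & (1u << i))
            dst[n++] = a[i];
    for (int i = n; i < 8; ++i)
        dst[i] = src[i];
}

/* Same operation using the intrinsic. The target attribute (an assumption
   about the compiler) enables AVX-512F for this function only. */
__attribute__((target("avx512f")))
static void mask_compress_pd_avx512(const double src[8], unsigned k,
                                    const double a[8], double dst[8])
{
    __m512d vsrc = _mm512_loadu_pd(src);
    __m512d va   = _mm512_loadu_pd(a);
    __m512d r    = _mm512_mask_compress_pd(vsrc, (__mmask8)k, va);
    _mm512_storeu_pd(dst, r);
}

/* Runtime dispatch: use the intrinsic only when the CPU supports it. */
static void mask_compress_pd_any(const double src[8], unsigned k,
                                 const double a[8], double dst[8])
{
    assert(k <= 0xFFu); /* only 8 mask bits are meaningful for float64 */
    if (__builtin_cpu_supports("avx512f"))
        mask_compress_pd_avx512(src, k, a, dst);
    else
        mask_compress_pd_ref(src, k, a, dst);
}
```

With a = {10, 11, 12, 13, 14, 15, 16, 17}, src filled with -1, and k = 0xA4 (bits 2, 5, and 7 set), the result holds 12, 15, 17 in the three lowest lanes and -1 in the other five.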



_mm512_maskz_compress_pd

extern __m512d __cdecl _mm512_maskz_compress_pd(__mmask8 k, __m512d a);

Contiguously stores the active float64 elements in a (those with their respective bit set in zeromask k) to destination, and sets the remaining elements to zero.



_mm512_mask_compress_ps

extern __m512 __cdecl _mm512_mask_compress_ps(__m512 src, __mmask16 k, __m512 a);

Contiguously stores the active float32 elements in a (those with their respective bit set in writemask k) to destination, and passes through the remaining elements from src.



_mm512_maskz_compress_ps

extern __m512 __cdecl _mm512_maskz_compress_ps(__mmask16 k, __m512 a);

Contiguously stores the active float32 elements in a (those with their respective bit set in zeromask k) to destination, and sets the remaining elements to zero.
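In practice the zeromask often comes from a vector comparison. The following sketch keeps only the positive values of a 16-element float array, packs them at the front, and zeroes the rest. The function names are hypothetical. It assumes a GCC- or Clang-compatible compiler (for __attribute__((target)), __builtin_cpu_supports, and __builtin_popcount) and uses a scalar fallback when AVX-512F is not reported by the CPU.

```c
#include <assert.h>
#include <immintrin.h>

/* Scalar fallback (hypothetical helper): pack positive values, zero the rest. */
static int keep_positive_ref(const float in[16], float out[16])
{
    int n = 0;
    for (int i = 0; i < 16; ++i)
        if (in[i] > 0.0f)
            out[n++] = in[i];
    for (int i = n; i < 16; ++i)
        out[i] = 0.0f;
    return n;
}

/* AVX-512F version: build the mask with a compare, then compress with zeroing. */
__attribute__((target("avx512f")))
static int keep_positive_avx512(const float in[16], float out[16])
{
    __m512 v = _mm512_loadu_ps(in);
    __mmask16 k = _mm512_cmp_ps_mask(v, _mm512_setzero_ps(), _CMP_GT_OQ);
    _mm512_storeu_ps(out, _mm512_maskz_compress_ps(k, v));
    return __builtin_popcount((unsigned)k); /* number of packed elements */
}

/* Returns how many positive values were packed into the front of out. */
static int keep_positive(const float in[16], float out[16])
{
    assert(in != 0 && out != 0);
    if (__builtin_cpu_supports("avx512f"))
        return keep_positive_avx512(in, out);
    return keep_positive_ref(in, out);
}
```

The population count of the mask gives the number of valid elements at the front of the result, which the caller needs because the zeroed tail is indistinguishable from genuine zero inputs.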



_mm512_mask_compressstoreu_pd

extern void __cdecl _mm512_mask_compressstoreu_pd(void* base_addr, __mmask8 k, __m512d a);

Contiguously stores the active float64 elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
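Unlike the register forms, the store form writes only as many elements as there are bits set in k, so memory past the packed elements is left untouched. A minimal sketch of that property follows. The helper names are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler with a scalar fallback for CPUs that do not report AVX-512F.

```c
#include <assert.h>
#include <immintrin.h>

/* Scalar fallback (hypothetical helper): writes only the active elements. */
static int compress_store_pd_ref(double *dst, unsigned k, const double a[8])
{
    int n = 0;
    for (int i = 0; i < 8; ++i)
        if (k & (1u << i))
            dst[n++] = a[i];
    return n;
}

/* AVX-512F version: dst needs no particular alignment. */
__attribute__((target("avx512f")))
static int compress_store_pd_avx512(double *dst, unsigned k, const double a[8])
{
    _mm512_mask_compressstoreu_pd(dst, (__mmask8)k, _mm512_loadu_pd(a));
    return __builtin_popcount(k & 0xFFu);
}

/* Returns the number of doubles written starting at dst. */
static int compress_store_pd(double *dst, unsigned k, const double a[8])
{
    assert(dst != 0);
    if (__builtin_cpu_supports("avx512f"))
        return compress_store_pd_avx512(dst, k, a);
    return compress_store_pd_ref(dst, k, a);
}
```

For example, storing a = {1, 2, 3, 4, 5, 6, 7, 8} with k = 0x81 writes the two values 1 and 8 at dst and dst + 1, and leaves dst + 2 onward unchanged.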



_mm512_mask_compressstoreu_ps

extern void __cdecl _mm512_mask_compressstoreu_ps(void* base_addr, __mmask16 k, __m512 a);

Contiguously stores the active float32 elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.



_mm512_mask_compress_epi32

extern __m512i __cdecl _mm512_mask_compress_epi32(__m512i src, __mmask16 k, __m512i a);

Contiguously stores the active int32 elements in a (those with their respective bit set in writemask k) to destination, and passes through the remaining elements from src.



_mm512_maskz_compress_epi32

extern __m512i __cdecl _mm512_maskz_compress_epi32(__mmask16 k, __m512i a);

Contiguously stores the active int32 elements in a (those with their respective bit set in zeromask k) to destination, and sets the remaining elements to zero.



_mm512_mask_compress_epi64

extern __m512i __cdecl _mm512_mask_compress_epi64(__m512i src, __mmask8 k, __m512i a);

Contiguously stores the active int64 elements in a (those with their respective bit set in writemask k) to destination, and passes through the remaining elements from src.



_mm512_maskz_compress_epi64

extern __m512i __cdecl _mm512_maskz_compress_epi64(__mmask8 k, __m512i a);

Contiguously stores the active int64 elements in a (those with their respective bit set in zeromask k) to destination, and sets the remaining elements to zero.



_mm512_mask_compressstoreu_epi32

extern void __cdecl _mm512_mask_compressstoreu_epi32(void* base_addr, __mmask16 k, __m512i a);

Contiguously stores the active int32 elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
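A common use of the compress-store form is filtering an array (stream compaction): each 16-element chunk is compared against a threshold, the surviving elements are appended to the output, and the output position advances by the number of set mask bits. The sketch below is one possible way to write such a loop, not a reference implementation. The function names are hypothetical, the tail is handled with scalar code for simplicity, and a GCC- or Clang-compatible compiler is assumed.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar fallback (hypothetical helper): copy values greater than thr. */
static size_t filter_gt_ref(const int32_t *in, size_t n, int32_t thr,
                            int32_t *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        if (in[i] > thr)
            out[count++] = in[i];
    return count;
}

/* AVX-512F version: 16 elements per iteration, scalar loop for the tail. */
__attribute__((target("avx512f")))
static size_t filter_gt_avx512(const int32_t *in, size_t n, int32_t thr,
                               int32_t *out)
{
    const __m512i vthr = _mm512_set1_epi32(thr);
    size_t count = 0, i = 0;
    for (; i + 16 <= n; i += 16) {
        __m512i v = _mm512_loadu_si512(in + i);
        __mmask16 k = _mm512_cmpgt_epi32_mask(v, vthr);
        /* Appends only the active elements at the current output position. */
        _mm512_mask_compressstoreu_epi32(out + count, k, v);
        count += (size_t)__builtin_popcount((unsigned)k);
    }
    for (; i < n; ++i)
        if (in[i] > thr)
            out[count++] = in[i];
    return count;
}

/* out must have room for up to n elements; returns the number written. */
static size_t filter_gt(const int32_t *in, size_t n, int32_t thr, int32_t *out)
{
    assert(in != 0 && out != 0);
    if (__builtin_cpu_supports("avx512f"))
        return filter_gt_avx512(in, n, thr, out);
    return filter_gt_ref(in, n, thr, out);
}
```

Because the store writes only the active elements, the output buffer never needs more than n elements even though each iteration operates on a full 512-bit vector.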



_mm512_mask_compressstoreu_epi64

extern void __cdecl _mm512_mask_compressstoreu_epi64(void* base_addr, __mmask8 k, __m512i a);

Contiguously stores the active int64 elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.