Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 7/13/2023
Public

Intrinsics for Integer Bit Manipulation and Conflict Detection Operations

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.

To use these intrinsics, include the immintrin.h file as follows:

#include <immintrin.h>


| Intrinsic Name | Operation | Corresponding Intel® AVX-512 Instruction |
|---|---|---|
| _mm512_lzcnt_epi32, _mm512_mask_lzcnt_epi32, _mm512_maskz_lzcnt_epi32 | Counts the leading zero bits in source int32 elements. | VPLZCNTD |
| _mm512_lzcnt_epi64, _mm512_mask_lzcnt_epi64, _mm512_maskz_lzcnt_epi64 | Counts the leading zero bits in source int64 elements. | VPLZCNTQ |
| _mm512_ternarylogic_epi32, _mm512_mask_ternarylogic_epi32, _mm512_maskz_ternarylogic_epi32 | Implements a three-operand binary function specified by an immediate value. | VPTERNLOGD |
| _mm512_ternarylogic_epi64, _mm512_mask_ternarylogic_epi64, _mm512_maskz_ternarylogic_epi64 | Implements a three-operand binary function specified by an immediate value. | VPTERNLOGQ |


| variable | definition |
|---|---|
| k | writemask used as a selector |
| a | first source vector element |
| b | second source vector element |
| c | third source vector element |
| imm8 | binary function specifier |
| src | source element to use based on writemask result |


_mm512_lzcnt_epi32

extern __m512i __cdecl _mm512_lzcnt_epi32(__m512i a);

Counts the number of leading zero bits in each packed 32-bit integer in a, and stores the results in destination.
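The per-element operation can be illustrated with a scalar reference model. The sketch below is not the intrinsic itself: it represents the 512-bit vector as a plain array of 16 values, and the names lzcnt32_model and lzcnt_epi32_model are hypothetical helpers introduced only for illustration. It assumes that an all-zero element yields a count of 32, which matches the behavior of the VPLZCNTD instruction.

```c
#include <stdint.h>

/* Scalar model of the per-element operation: count the leading zero
   bits of one 32-bit value. An all-zero input yields 32. */
static unsigned lzcnt32_model(uint32_t x)
{
    unsigned n = 0;
    /* Walk from the most significant bit down until a set bit is found. */
    for (uint32_t bit = UINT32_C(1) << 31; bit != 0 && (x & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

/* Hypothetical helper: apply the model to all 16 lanes of a 512-bit
   vector, represented here as a plain array. */
static void lzcnt_epi32_model(uint32_t dst[16], const uint32_t a[16])
{
    for (int j = 0; j < 16; ++j)
        dst[j] = lzcnt32_model(a[j]);
}
```

On hardware that supports the instruction, the equivalent call would be `__m512i r = _mm512_lzcnt_epi32(a);`, which processes all 16 lanes at once.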



_mm512_mask_lzcnt_epi32

extern __m512i __cdecl _mm512_mask_lzcnt_epi32(__m512i src, __mmask16 k, __m512i a);

Counts the number of leading zero bits in each packed 32-bit integer in a, and stores the results in destination using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_maskz_lzcnt_epi32

extern __m512i __cdecl _mm512_maskz_lzcnt_epi32(__mmask16 k, __m512i a);

Counts the number of leading zero bits in each packed 32-bit integer in a, and stores the results in destination using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
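The difference between the writemask and zeromask forms can be sketched with a scalar model. This is an illustration under stated assumptions, not the intrinsic implementation: vectors are modeled as 16-element arrays, bit j of k controls element j, and all function names are hypothetical.

```c
#include <stdint.h>

/* Scalar leading-zero count for one 32-bit element (32 for zero input). */
static uint32_t lzcnt32_model(uint32_t x)
{
    uint32_t n = 0;
    for (uint32_t bit = UINT32_C(1) << 31; bit != 0 && (x & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

/* Writemask (merge) form: unselected elements are copied from src. */
static void mask_lzcnt_epi32_model(uint32_t dst[16], const uint32_t src[16],
                                   uint16_t k, const uint32_t a[16])
{
    for (int j = 0; j < 16; ++j)
        dst[j] = ((k >> j) & 1u) ? lzcnt32_model(a[j]) : src[j];
}

/* Zeromask form: unselected elements are set to zero. */
static void maskz_lzcnt_epi32_model(uint32_t dst[16], uint16_t k,
                                    const uint32_t a[16])
{
    for (int j = 0; j < 16; ++j)
        dst[j] = ((k >> j) & 1u) ? lzcnt32_model(a[j]) : 0u;
}
```

With k = 0x0001, for example, only element 0 receives a computed count; every other element keeps its src value in the writemask form and becomes 0 in the zeromask form.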



_mm512_lzcnt_epi64

extern __m512i __cdecl _mm512_lzcnt_epi64(__m512i a);

Counts the number of leading zero bits in each packed 64-bit integer in a, and stores the results in destination.



_mm512_mask_lzcnt_epi64

extern __m512i __cdecl _mm512_mask_lzcnt_epi64(__m512i src, __mmask8 k, __m512i a);

Counts the number of leading zero bits in each packed 64-bit integer in a, and stores the results in destination using writemask k.

Elements are copied from src when the corresponding mask bit is not set.



_mm512_maskz_lzcnt_epi64

extern __m512i __cdecl _mm512_maskz_lzcnt_epi64(__mmask8 k, __m512i a);

Counts the number of leading zero bits in each packed 64-bit integer in a, and stores the results in destination using zeromask k.

Elements are zeroed out when the corresponding mask bit is not set.
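One common use of a leading-zero count is computing the integer base-2 logarithm of a nonzero value. The scalar sketch below models the 64-bit per-element operation and derives floor(log2(x)) as 63 minus the count. The function names are hypothetical, and the model assumes that a zero element yields a count of 64, as the VPLZCNTQ instruction does.

```c
#include <stdint.h>

/* Scalar model of the 64-bit per-element operation: count leading
   zero bits. An all-zero input yields 64. */
static unsigned lzcnt64_model(uint64_t x)
{
    unsigned n = 0;
    for (uint64_t bit = UINT64_C(1) << 63; bit != 0 && (x & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

/* Hypothetical helper: floor(log2(x)) for x != 0.
   The caller must ensure x is nonzero; the result is undefined otherwise. */
static unsigned floor_log2_u64(uint64_t x)
{
    return 63u - lzcnt64_model(x);
}
```

In vector code, the same idea would subtract the result of _mm512_lzcnt_epi64 from a vector of 63s, with a mask excluding zero elements.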



_mm512_ternarylogic_epi32

extern __m512i __cdecl _mm512_ternarylogic_epi32(__m512i a, __m512i b, __m512i c, int imm8);

Performs bitwise ternary logic that implements a three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 32-bit integer, the corresponding bits from a, b, and c form a 3-bit index into imm8, and the value of imm8 at that index is written to the corresponding destination bit.
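The index formation described above can be sketched as a scalar reference model for a single 32-bit element. This is an illustration, not the intrinsic: the name ternarylogic32_model is hypothetical, and the model assumes the bit from a is the most significant bit of the index and the bit from c the least significant, following the operand order given in the description.

```c
#include <stdint.h>

/* Scalar model of the ternary logic operation on one 32-bit element.
   For each bit position i, the bits of a, b, and c form a 3-bit index
   (a is the high bit, c the low bit) that selects one bit of imm8. */
static uint32_t ternarylogic32_model(uint32_t a, uint32_t b, uint32_t c,
                                     uint8_t imm8)
{
    uint32_t dst = 0;
    for (int i = 0; i < 32; ++i) {
        unsigned idx = (((a >> i) & 1u) << 2)
                     | (((b >> i) & 1u) << 1)
                     |  ((c >> i) & 1u);
        dst |= (uint32_t)((imm8 >> idx) & 1u) << i;
    }
    return dst;
}
```

Under this model, imm8 acts as an 8-entry truth table. For example, 0x80 gives a AND b AND c, 0xFE gives a OR b OR c, and 0x96 gives a XOR b XOR c.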



_mm512_mask_ternarylogic_epi32

extern __m512i __cdecl _mm512_mask_ternarylogic_epi32(__m512i src, __mmask16 k, __m512i a, __m512i b, int imm8);

Performs bitwise ternary logic that implements a three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 32-bit integer, the corresponding bits from src, a, and b form a 3-bit index into imm8, and the value of imm8 at that index is written to the corresponding destination bit using writemask k at 32-bit granularity (32-bit elements are copied from src when the corresponding mask bit is not set).
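In the writemask form, src plays two roles: it is the first logic operand, and it supplies the value of any element whose mask bit is clear. The scalar sketch below illustrates this under the same assumptions as the description (src provides the high index bit, b the low index bit). Vectors are modeled as 16-element arrays, and all function names are hypothetical.

```c
#include <stdint.h>

/* Per-element ternary logic: bits of x, y, z index into imm8
   (x is the high index bit, z the low index bit). */
static uint32_t ternary32(uint32_t x, uint32_t y, uint32_t z, uint8_t imm8)
{
    uint32_t dst = 0;
    for (int i = 0; i < 32; ++i) {
        unsigned idx = (((x >> i) & 1u) << 2)
                     | (((y >> i) & 1u) << 1)
                     |  ((z >> i) & 1u);
        dst |= (uint32_t)((imm8 >> idx) & 1u) << i;
    }
    return dst;
}

/* Writemask form: src is both the first operand and the fallback value
   for elements whose mask bit is not set. */
static void mask_ternarylogic_epi32_model(uint32_t dst[16],
                                          const uint32_t src[16], uint16_t k,
                                          const uint32_t a[16],
                                          const uint32_t b[16], uint8_t imm8)
{
    for (int j = 0; j < 16; ++j)
        dst[j] = ((k >> j) & 1u) ? ternary32(src[j], a[j], b[j], imm8)
                                 : src[j];
}
```

Masking applies to whole 32-bit elements, not to individual bits: an element is either fully computed or fully copied from src.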



_mm512_maskz_ternarylogic_epi32

extern __m512i __cdecl _mm512_maskz_ternarylogic_epi32(__mmask16 k, __m512i a, __m512i b, __m512i c, int imm8);

Performs bitwise ternary logic that implements a three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 32-bit integer, the corresponding bits from a, b, and c form a 3-bit index into imm8, and the value of imm8 at that index is written to the corresponding destination bit using zeromask k at 32-bit granularity (32-bit elements are zeroed out when the corresponding mask bit is not set).



_mm512_ternarylogic_epi64

extern __m512i __cdecl _mm512_ternarylogic_epi64(__m512i a, __m512i b, __m512i c, int imm8);

Performs bitwise ternary logic that implements a three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 64-bit integer, the corresponding bits from a, b, and c form a 3-bit index into imm8, and the value of imm8 at that index is written to the corresponding destination bit.
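Because imm8 is a truth table indexed by the bits of a, b, and c, a value for any desired function can be derived by evaluating that function on three fixed constants: 0xF0 for a, 0xCC for b, and 0xAA for c. The sketch below shows this derivation together with a scalar model of the 64-bit per-element operation. The macro and function names are hypothetical, and the constants assume a supplies the high index bit and c the low index bit.

```c
#include <stdint.h>

/* Truth-table columns: bit n of each constant is the value that operand
   contributes when the 3-bit index equals n. */
enum { TL_A = 0xF0, TL_B = 0xCC, TL_C = 0xAA };

/* imm8 values derived by applying the desired function to the constants. */
#define IMM8_XOR3     ((TL_A ^ TL_B ^ TL_C) & 0xFF)                       /* 0x96 */
#define IMM8_SELECT   (((TL_A & TL_B) | (~TL_A & TL_C)) & 0xFF)           /* 0xCA */
#define IMM8_MAJORITY (((TL_A & TL_B) | (TL_A & TL_C) | (TL_B & TL_C)) & 0xFF) /* 0xE8 */

/* Scalar model of the ternary logic operation on one 64-bit element. */
static uint64_t ternarylogic64_model(uint64_t a, uint64_t b, uint64_t c,
                                     uint8_t imm8)
{
    uint64_t dst = 0;
    for (int i = 0; i < 64; ++i) {
        unsigned idx = (unsigned)((((a >> i) & 1u) << 2)
                                | (((b >> i) & 1u) << 1)
                                |  ((c >> i) & 1u));
        dst |= (uint64_t)((imm8 >> idx) & 1u) << i;
    }
    return dst;
}
```

IMM8_SELECT implements a bitwise select: where a bit of a is 1 the result takes the bit from b, otherwise the bit from c. With the intrinsic, a derived value would be passed as the final argument, for example `_mm512_ternarylogic_epi64(a, b, c, 0xCA)`.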



_mm512_mask_ternarylogic_epi64

extern __m512i __cdecl _mm512_mask_ternarylogic_epi64(__m512i src, __mmask8 k, __m512i a, __m512i b, int imm8);

Performs bitwise ternary logic that implements a three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 64-bit integer, the corresponding bits from src, a, and b form a 3-bit index into imm8, and the value of imm8 at that index is written to the corresponding destination bit using writemask k at 64-bit granularity (64-bit elements are copied from src when the corresponding mask bit is not set).



_mm512_maskz_ternarylogic_epi64

extern __m512i __cdecl _mm512_maskz_ternarylogic_epi64(__mmask8 k, __m512i a, __m512i b, __m512i c, int imm8);

Performs bitwise ternary logic that implements a three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 64-bit integer, the corresponding bits from a, b, and c form a 3-bit index into imm8, and the value of imm8 at that index is written to the corresponding destination bit using zeromask k at 64-bit granularity (64-bit elements are zeroed out when the corresponding mask bit is not set).