Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 12/16/2022
Public

Intrinsics for Root Function Operations (512-bit)

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.

To use these intrinsics, include the immintrin.h file as follows:

#include <immintrin.h>


| Intrinsic Name | Operation | Corresponding Intel® AVX-512 Instruction |
| --- | --- | --- |
| _mm512_sqrt_pd, _mm512_mask_sqrt_pd | Calculates the square root of float64 vector elements. | None. |
| _mm512_sqrt_ps, _mm512_mask_sqrt_ps | Calculates the square root of float32 vector elements. | None. |
| _mm512_invsqrt_pd, _mm512_mask_invsqrt_pd | Calculates the inverse square root of float64 vector elements. | None. |
| _mm512_invsqrt_ps, _mm512_mask_invsqrt_ps | Calculates the inverse square root of float32 vector elements. | None. |
| _mm512_hypot_pd, _mm512_mask_hypot_pd | Calculates the hypotenuse length from float64 vector elements. | None. |
| _mm512_hypot_ps, _mm512_mask_hypot_ps | Calculates the hypotenuse length from float32 vector elements. | None. |
| _mm512_cbrt_pd, _mm512_mask_cbrt_pd | Calculates the cube root of float64 vector elements. | None. |
| _mm512_cbrt_ps, _mm512_mask_cbrt_ps | Calculates the cube root of float32 vector elements. | None. |

| Variable | Definition |
| --- | --- |
| k | writemask used as a selector |
| a | first source vector |
| b | second source vector |
| src | source vector whose elements are copied to the result where the corresponding writemask bit is not set |


_mm512_sqrt_pd

extern __m512d __cdecl _mm512_sqrt_pd(__m512d a);

Calculates the square root of each float64 element of vector a.


_mm512_mask_sqrt_pd

extern __m512d __cdecl _mm512_mask_sqrt_pd(__m512d src, __mmask8 k, __m512d a);

Calculates the square root of each float64 element of vector a and stores the results using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_sqrt_ps

extern __m512 __cdecl _mm512_sqrt_ps(__m512 a);

Calculates the square root of each float32 element of vector a.


_mm512_mask_sqrt_ps

extern __m512 __cdecl _mm512_mask_sqrt_ps(__m512 src, __mmask16 k, __m512 a);

Calculates the square root of each float32 element of vector a and stores the results using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_invsqrt_pd

extern __m512d __cdecl _mm512_invsqrt_pd(__m512d a);

Calculates the inverse square root (1/sqrt(a)) of each float64 element of vector a.


_mm512_mask_invsqrt_pd

extern __m512d __cdecl _mm512_mask_invsqrt_pd(__m512d src, __mmask8 k, __m512d a);

Calculates the inverse square root (1/sqrt(a)) of each float64 element of vector a and stores the results using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_invsqrt_ps

extern __m512 __cdecl _mm512_invsqrt_ps(__m512 a);

Calculates the inverse square root (1/sqrt(a)) of each float32 element of vector a.


_mm512_mask_invsqrt_ps

extern __m512 __cdecl _mm512_mask_invsqrt_ps(__m512 src, __mmask16 k, __m512 a);

Calculates the inverse square root (1/sqrt(a)) of each float32 element of vector a and stores the results using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_hypot_pd

extern __m512d __cdecl _mm512_hypot_pd(__m512d a, __m512d b);

Computes, for each pair of corresponding float64 elements of vectors a and b, the length of the hypotenuse of a right-angled triangle whose other two sides have those lengths.


_mm512_mask_hypot_pd

extern __m512d __cdecl _mm512_mask_hypot_pd(__m512d src, __mmask8 k, __m512d a, __m512d b);

Computes, for each pair of corresponding float64 elements of vectors a and b, the length of the hypotenuse of a right-angled triangle whose other two sides have those lengths, and stores the results using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_hypot_ps

extern __m512 __cdecl _mm512_hypot_ps(__m512 a, __m512 b);

Computes, for each pair of corresponding float32 elements of vectors a and b, the length of the hypotenuse of a right-angled triangle whose other two sides have those lengths.


_mm512_mask_hypot_ps

extern __m512 __cdecl _mm512_mask_hypot_ps(__m512 src, __mmask16 k, __m512 a, __m512 b);

Computes, for each pair of corresponding float32 elements of vectors a and b, the length of the hypotenuse of a right-angled triangle whose other two sides have those lengths, and stores the results using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_cbrt_pd

extern __m512d __cdecl _mm512_cbrt_pd(__m512d a);

Calculates the cube root of each float64 element of vector a.


_mm512_mask_cbrt_pd

extern __m512d __cdecl _mm512_mask_cbrt_pd(__m512d src, __mmask8 k, __m512d a);

Calculates the cube root of each float64 element of vector a and stores the results using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_cbrt_ps

extern __m512 __cdecl _mm512_cbrt_ps(__m512 a);

Calculates the cube root of each float32 element of vector a.


_mm512_mask_cbrt_ps

extern __m512 __cdecl _mm512_mask_cbrt_ps(__m512 src, __mmask16 k, __m512 a);

Calculates the cube root of each float32 element of vector a and stores the results using writemask k (elements are copied from src when the corresponding mask bit is not set).