Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 7/13/2023
Public

_mm256_permutevar8x32_ps

Permutes single-precision floating-point elements of the source vector into the destination vector. The corresponding Intel® AVX2 instruction is VPERMPS.

Syntax

extern __m256 _mm256_permutevar8x32_ps(__m256 val, __m256i offsets);

Arguments

val

the vector of 32-bit single-precision floating-point elements to be permuted

offsets

the vector of eight 32-bit elements; the low three bits of each element hold an offset in the range [0, 7] that selects an element of the 256-bit vector val

Description

Uses the low three bits of each dword element of the vector offsets to select a single-precision floating-point element from the source vector val. The selected element is copied to the corresponding element of the destination vector. The intrinsic allows the same element of the source vector to be copied to more than one element of the destination vector.

Below is the pseudo-code for the intrinsic:

RESULT[31:0] <- (VAL[255:0] >> (OFFSETS[2:0] * 32))[31:0];
RESULT[63:32] <- (VAL[255:0] >> (OFFSETS[34:32] * 32))[31:0];
RESULT[95:64] <- (VAL[255:0] >> (OFFSETS[66:64] * 32))[31:0];
RESULT[127:96] <- (VAL[255:0] >> (OFFSETS[98:96] * 32))[31:0];
RESULT[159:128] <- (VAL[255:0] >> (OFFSETS[130:128] * 32))[31:0];
RESULT[191:160] <- (VAL[255:0] >> (OFFSETS[162:160] * 32))[31:0];
RESULT[223:192] <- (VAL[255:0] >> (OFFSETS[194:192] * 32))[31:0];
RESULT[255:224] <- (VAL[255:0] >> (OFFSETS[226:224] * 32))[31:0];
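
The eight assignments above follow a single pattern, which a scalar loop can express. The following C sketch is an illustrative model only, not the compiler's implementation; the function name permutevar8x32_ref and its array-based signature are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical scalar model of the pseudo-code above.
   Element i of result receives the element of val selected by
   bits [2:0] of offsets[i]; the upper 29 bits are ignored. */
static void permutevar8x32_ref(const float val[8],
                               const int32_t offsets[8],
                               float result[8])
{
    for (int i = 0; i < 8; ++i) {
        result[i] = val[offsets[i] & 7];
    }
}
```

Such a model can serve as a reference when checking vectorized code against expected results. The result array is assumed not to overlap val.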

Returns

A 256-bit vector containing the permuted single-precision floating-point elements.
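
As a usage sketch, the following C function reverses the order of eight floats with this intrinsic. The helper name reverse8_ps is hypothetical. The example assumes AVX2 code generation is enabled at build time (the __AVX2__ macro is defined) and that the target processor supports AVX2; otherwise it falls back to a plain loop so that it still builds.

```c
#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical helper: writes the eight floats at src to dst in reverse
   order. src and dst must not overlap. */
static void reverse8_ps(const float *src, float *dst)
{
#if defined(__AVX2__)
    /* Unaligned load of eight single-precision values. */
    __m256 val = _mm256_loadu_ps(src);
    /* _mm256_setr_epi32 takes elements from lowest to highest, so
       result element 0 selects source element 7, and so on. */
    __m256i offsets = _mm256_setr_epi32(7, 6, 5, 4, 3, 2, 1, 0);
    __m256 result = _mm256_permutevar8x32_ps(val, offsets);
    _mm256_storeu_ps(dst, result);
#else
    /* Scalar fallback used when AVX2 is not enabled at build time. */
    for (int i = 0; i < 8; ++i) {
        dst[i] = src[7 - i];
    }
#endif
}
```

The unaligned load and store are used here so that the sketch places no alignment requirement on the caller's buffers.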