qopt-streaming-stores, Qopt-streaming-stores

Enables generation of streaming stores for optimization.

Syntax

Linux:
-qopt-streaming-stores=keyword
-qno-opt-streaming-stores
Windows:
/Qopt-streaming-stores:keyword
/Qopt-streaming-stores-
Arguments
keyword
Specifies whether streaming stores are generated. Possible values are:
always
Enables generation of streaming stores for optimization. The compiler optimizes under the assumption that the application is memory bound.
When you specify this setting, you are responsible for inserting any memory barriers (fences) needed to ensure correct memory ordering within a thread or across threads. See the Example section for one way to do this.
never
Disables generation of streaming stores for optimization. Normal stores are performed.
This setting has the same effect as specifying -qno-opt-streaming-stores or /Qopt-streaming-stores-.
auto
Lets the compiler decide which instructions to use.
Default
-qopt-streaming-stores=auto or /Qopt-streaming-stores:auto
The compiler decides whether to use streaming stores or normal stores.
Description
This option enables generation of streaming stores for optimization. This method stores data with instructions that use a non-temporal buffer, which minimizes memory hierarchy pollution.
This option may be useful for memory-bound applications that write large amounts of data they do not read back soon, because such writes gain little from being cached.
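To make the description concrete, the following is a minimal sketch of what a streaming store looks like when written by hand with SSE2 intrinsics. It is not what the compiler necessarily emits under this option; the helper name fill_streaming and its alignment handling are assumptions made for illustration.

```c
#include <emmintrin.h>  /* SSE2: _mm_set1_pd, _mm_stream_pd; also provides _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill dst[0..n) with value, using non-temporal
   (streaming) stores wherever the address is suitably aligned. */
void fill_streaming(double *dst, size_t n, double value)
{
    size_t i = 0;

    /* _mm_stream_pd requires a 16-byte-aligned address, so use normal
       stores until dst + i reaches a 16-byte boundary. */
    while (i < n && ((uintptr_t)(dst + i) & 15u) != 0) {
        dst[i++] = value;
    }

    /* Main loop: each streaming store writes two doubles. */
    __m128d v = _mm_set1_pd(value);
    for (; i + 2 <= n; i += 2) {
        _mm_stream_pd(dst + i, v);
    }

    /* Tail: at most one remaining element, written with a normal store. */
    for (; i < n; i++) {
        dst[i] = value;
    }

    /* Streaming stores are weakly ordered; fence before later code
       (or another thread) relies on the data being visible. */
    _mm_sfence();
}
```

A call such as fill_streaming(buf, n, 0.0) would then initialize a large buffer with stores that bypass the usual cache-fill path for the aligned portion of the buffer.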
IDE Equivalent
None
Alternate Options
None
Example
The following example shows one way to insert fences when specifying -qopt-streaming-stores=always or /Qopt-streaming-stores:always. It inserts a _mm_sfence() intrinsic call just after the loops (such as the initialization loop) where the compiler may insert streaming store instructions.
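#include <immintrin.h>  /* declares _mm_sfence() and _mm_mfence() */

void simple1(double * restrict a, double * restrict b, double * restrict c, double *d, int n)
{
    int i, j;

    /* Initialization loop: the compiler may generate streaming stores here. */
    #pragma omp parallel for
    for (j=0; j<n; j++) {
        a[j] = 1.0;
        b[j] = 2.0;
        c[j] = 0.0;
    }

    /* Order the stores above before the loop below reads a, b, and c. */
    _mm_sfence(); // OR _mm_mfence();

    #pragma omp parallel for
    for (i=0; i<n; i++)
        a[i] = a[i] + c[i]*b[i];
}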
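The same fence placement applies when data written with streaming stores is handed to another thread through a flag. The sketch below illustrates that pattern with explicit intrinsics and C11 atomics. The names shared_buf, shared_ready, produce, and try_consume and the buffer size N are hypothetical, and the streaming stores are written by hand here rather than generated by the compiler.

```c
#include <emmintrin.h>   /* _mm_set1_pd, _mm_stream_pd, _mm_sfence */
#include <stdatomic.h>
#include <stddef.h>

#define N 1024  /* hypothetical buffer size; must be even for the loop below */

/* Hypothetical shared state: a 16-byte-aligned buffer and a "ready" flag. */
static _Alignas(16) double shared_buf[N];
static atomic_int shared_ready;

/* Producer: write the buffer with streaming stores, fence, then publish. */
void produce(double value)
{
    __m128d v = _mm_set1_pd(value);
    for (size_t i = 0; i < N; i += 2) {
        _mm_stream_pd(&shared_buf[i], v);
    }

    /* Without this fence, the weakly ordered streaming stores could become
       visible to another thread after the flag store below. */
    _mm_sfence();

    atomic_store_explicit(&shared_ready, 1, memory_order_release);
}

/* Consumer: returns 1 and copies the first element if the data has been
   published, otherwise returns 0 and leaves *out unchanged. */
int try_consume(double *out)
{
    if (!atomic_load_explicit(&shared_ready, memory_order_acquire)) {
        return 0;
    }
    *out = shared_buf[0];
    return 1;
}
```

In a real program, produce and try_consume would run on different threads. The point of the sketch is only the ordering: streaming stores, then the fence, then the flag.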

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.