Article ID: 000086372 Content Type: Troubleshooting Last Reviewed: 08/04/2017

Why does #pragma ivdep not work correctly in aocl version 17.0?

Environment

    Intel® Quartus® Prime Pro Edition
    Intel® FPGA SDK for OpenCL™ Pro Edition
Description

In version 16.1, the following code behaved as expected: the outer loop was serialized because of its dependencies, and #pragma ivdep removed the assumed dependencies of the inner loop only.

// This loop gets serialized due to true dependencies with the inner loop
for (unsigned char x = 0; x < 4; x++) {

    // Inner loop does not have inter-iteration dependencies, but depends on the outer loop
    #pragma ivdep
    for (unsigned char y = 0; y < 64; y++) {
        // ... loop body ...
    }
}

In version 17.0, #pragma ivdep is applied to both the inner and the outer loop, so the compiler no longer accounts for the dependencies in the outer loop. As a result, similar code may produce incorrect results in hardware even though it works correctly in emulation.
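
To picture why an ignored outer-loop dependency can change results, the following is a minimal plain C sketch, not OpenCL and not a model of how the generated hardware actually schedules iterations. The loop body, the array name acc, and both function names are hypothetical and chosen only for illustration. The first function runs the loop nest in order; the second assumes, purely as a simplification, that every outer iteration reads the initial values instead of the previous iteration's result.

```c
#define STAGES 4
#define WIDTH 64

/* Hypothetical loop body: iteration (x, y) reads the value that
 * iteration (x - 1, y) wrote, so the outer loop carries a true
 * dependency. Different y values never touch the same element,
 * so the inner loop has no inter-iteration dependency. */
void run_in_order(int acc[WIDTH])
{
    for (unsigned char x = 0; x < STAGES; x++) {
        for (unsigned char y = 0; y < WIDTH; y++) {
            acc[y] = acc[y] * 2 + x;
        }
    }
}

/* Simplified illustration (an assumption, not real hardware behavior):
 * the outer dependency is ignored, so every outer iteration reads the
 * initial values rather than the previous iteration's output. */
void run_ignoring_outer_dependency(int acc[WIDTH])
{
    int stale[WIDTH];

    for (int y = 0; y < WIDTH; y++) {
        stale[y] = acc[y];
    }
    for (unsigned char x = 0; x < STAGES; x++) {
        for (unsigned char y = 0; y < WIDTH; y++) {
            acc[y] = stale[y] * 2 + x;
        }
    }
}
```

Starting from an array filled with 1, the in-order version ends with 27 in every element, while the simplified "dependency ignored" version ends with 5. The exact wrong values in real hardware will differ, but the sketch shows why the outer loop has to stay serialized.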

Resolution

Workaround:

1. Add an extra argument "dummy" to the kernel.  On the host side, always pass 1 for this dummy argument.

BEFORE

__kernel void my_kernel(
    __global cpx_t* restrict input,
    __global cpx_t* restrict result)

AFTER

__kernel void my_kernel(
    __global cpx_t* restrict input,
    __global cpx_t* restrict result,
    int dummy)

2. In the loop nest, wrap the inner loop in "if (dummy)":

// This loop gets serialized due to true dependencies
for (unsigned char x = 0; x < 4; x++) {

    if (dummy) {

        // No dependencies within each set of 64 iterations
        #pragma ivdep
        for (unsigned char y = 0; y < 64; y++) {
            // ... loop body ...
        }
    }
}
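
The two workaround steps can be sketched in plain C to show why the host must always pass 1. This is a host-side stand-in rather than kernel code: the function name loop_nest_with_guard, the array acc, and the loop body are hypothetical, and the pragma is shown only as a comment because a standard C compiler does not recognize it.

```c
#define STAGES 4
#define WIDTH 64

/* Plain C stand-in for the kernel's loop nest. The dummy parameter
 * mirrors the extra kernel argument from step 1. */
void loop_nest_with_guard(int acc[WIDTH], int dummy)
{
    /* Outer loop: serialized, each pass depends on the previous one. */
    for (unsigned char x = 0; x < STAGES; x++) {
        if (dummy) {
            /* In the OpenCL kernel, #pragma ivdep goes here,
             * directly above the inner loop. */
            for (unsigned char y = 0; y < WIDTH; y++) {
                acc[y] = acc[y] * 2 + x; /* hypothetical body */
            }
        }
    }
}
```

With dummy set to 1 the guarded nest computes exactly what the unguarded nest would. With dummy set to 0 the inner loop never runs and the data is left untouched, which is why the host side must always pass 1.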

 

This issue is scheduled to be fixed in a future version of the Intel® FPGA SDK for OpenCL™.

Related Products

This article applies to 5 products

Intel® Arria® 10 FPGAs and SoC FPGAs
Intel® Stratix® 10 FPGAs and SoC FPGAs
Cyclone® V FPGAs and SoC FPGAs
Arria® V FPGAs and SoC FPGAs
Stratix® V FPGAs
