Article ID: 000086372
Content Type: Troubleshooting
Last Reviewed: 08/04/2017

Why does #pragma ivdep not work correctly in aocl version 17.0?

Environment

  • Intel® Quartus® Prime Pro Edition
  • Intel® FPGA SDK for OpenCL™ Pro Edition
    Description

    In version 16.1, the following code behaved as expected: the outer loop was serialized because of its dependencies, and #pragma ivdep removed the dependencies assumed for the inner loop.

    // This loop gets serialized due to true dependencies with inner loop
    for (unsigned char x = 0; x < 4; x++) {
        // Inner loop does not have inter-iteration dependencies, but depends on outer loop
        #pragma ivdep
        for (unsigned char y = 0; y < 64; y++) {
            // ... loop body ...
        }
    }

    In version 17.0, #pragma ivdep is applied to both the inner and the outer loop, so the compiler no longer accounts for the dependencies in the outer loop. As a result, similar code may not work correctly in hardware even though it works in emulation.
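To make the dependency pattern concrete, here is a minimal plain-C sketch. It is not the original kernel: the loop body, the names run_nest and run_and_sum, and the constants are assumptions chosen for illustration. Each element is updated from its own previous value, so inner iterations are independent of one another, while every outer pass consumes what the previous pass wrote.

```c
#define STAGES 4
#define WIDTH 64

/* Hypothetical loop body: element y depends only on its own previous value,
   so inner iterations never touch each other's data, but each outer pass
   reads what the previous outer pass wrote. */
static void run_nest(unsigned int *data, int reverse_outer, int reverse_inner)
{
    for (int i = 0; i < STAGES; i++) {
        int x = reverse_outer ? STAGES - 1 - i : i;
        for (int j = 0; j < WIDTH; j++) {
            int y = reverse_inner ? WIDTH - 1 - j : j;
            data[y] = data[y] * 3u + (unsigned int)x;
        }
    }
}

/* Sums all elements so two runs can be compared with one number. */
static unsigned int checksum(const unsigned int *data)
{
    unsigned int sum = 0;
    for (int y = 0; y < WIDTH; y++) {
        sum += data[y];
    }
    return sum;
}

/* Initializes data[y] = y, runs the nest in the requested order,
   and returns the checksum of the result. */
static unsigned int run_and_sum(int reverse_outer, int reverse_inner)
{
    unsigned int data[WIDTH];
    for (int y = 0; y < WIDTH; y++) {
        data[y] = (unsigned int)y;
    }
    run_nest(data, reverse_outer, reverse_inner);
    return checksum(data);
}
```

In this sketch, run_and_sum(0, 0) and run_and_sum(0, 1) return the same checksum because reordering the inner loop is harmless, while run_and_sum(1, 0) returns a different one because the outer loop order matters. That is why only the inner loop is a safe target for #pragma ivdep here.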

    Resolution

    Workaround:

    1. Add an extra argument "dummy" to the kernel.  On the host side, always pass 1 for this dummy argument.

    BEFORE

    __kernel void my_kernel(
        __global cpx_t* restrict input,
        __global cpx_t* restrict result)

    AFTER

    __kernel void my_kernel(
        __global cpx_t* restrict input,
        __global cpx_t* restrict result,
        int dummy)

    2.  In the loop nest, wrap the inner loop in "if (dummy)":

    // This loop gets serialized due to true dependencies
    for (unsigned char x = 0; x < 4; x++) {
        if (dummy) {
            // No dependencies within each set of 64 iterations
            #pragma ivdep
            for (unsigned char y = 0; y < 64; y++) {
                // ... loop body ...
            }
        }
    }
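The guard does not change the results as long as the host passes 1. The following plain-C sketch models that equivalence only; it says nothing about how the offline compiler schedules the loops. The loop body and the names update, nest_original, nest_guarded, and guarded_matches_original are hypothetical, and the pragma is shown as a comment because a standard C compiler does not recognize it.

```c
#include <string.h>

#define STAGES 4
#define WIDTH 64

/* Hypothetical loop body standing in for the real kernel work. */
static void update(unsigned int *result, unsigned char x, unsigned char y)
{
    result[y] = result[y] * 3u + x;
}

/* Loop nest as written before the workaround. */
static void nest_original(unsigned int *result)
{
    for (unsigned char x = 0; x < STAGES; x++) {
        /* In the OpenCL kernel, #pragma ivdep precedes this inner loop. */
        for (unsigned char y = 0; y < WIDTH; y++) {
            update(result, x, y);
        }
    }
}

/* Loop nest with the inner loop wrapped in "if (dummy)". */
static void nest_guarded(unsigned int *result, int dummy)
{
    for (unsigned char x = 0; x < STAGES; x++) {
        if (dummy) {
            /* In the OpenCL kernel, #pragma ivdep precedes this inner loop. */
            for (unsigned char y = 0; y < WIDTH; y++) {
                update(result, x, y);
            }
        }
    }
}

/* Returns 1 when both nests produce identical results for dummy == 1. */
static int guarded_matches_original(void)
{
    unsigned int a[WIDTH];
    unsigned int b[WIDTH];
    for (int y = 0; y < WIDTH; y++) {
        a[y] = (unsigned int)y;
        b[y] = (unsigned int)y;
    }
    nest_original(a);
    nest_guarded(b, 1);
    return memcmp(a, b, sizeof a) == 0;
}
```

Because dummy is a run-time kernel argument, its value is not known at compile time, which is presumably why the article asks the host to always pass 1 rather than hard-coding the condition.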

     

    This issue is scheduled to be fixed in a future version of the Intel® FPGA SDK for OpenCL™.

    Related Products

    This article applies to 5 products

    Intel® Arria® 10 FPGAs and SoC FPGAs
    Intel® Stratix® 10 FPGAs and SoC FPGAs
    Cyclone® V FPGAs and SoC FPGAs
    Arria® V FPGAs and SoC FPGAs
    Stratix® V FPGAs