Article ID: 000073889 Content Type: Troubleshooting Last Reviewed: 06/20/2019

Why do I get poor performance when compiling the vector_add example design with Intel® FPGA SDK for OpenCL™?

Environment

  • Intel® Arria® 10 FPGAs and SoC FPGAs
  • Intel® Stratix® 10 FPGAs and SoC FPGAs
  • Intel® Quartus® Prime Pro Edition
  • Intel® FPGA SDK for OpenCL™
    Description

    Due to a problem in the Intel® FPGA SDK for OpenCL™ version 18.1 and later, the same vector_add example design code may perform much worse than it does when compiled with earlier versions. The performance for each version is as follows.

    Intel® FPGA SDK for OpenCL™ version    Performance
    V16.1                                  ~3 ms
    V18.0                                  ~3 ms
    V18.1                                  ~170 ms
    V19.1                                  ~170 ms

    Resolution

    To work around this problem, add the reqd_work_group_size attribute to the kernel in vector_add.cl to set the required work-group size:

      __attribute__((reqd_work_group_size(1, 1, 1)))
      __kernel void vector_add(__global const float *x, 
                               __global const float *y, 
                               __global float *restrict z)
      {
          // get index of the work item
          int index = get_global_id(0);
          // add the vector elements
          z[index] = x[index] + y[index];
      }
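
    Once the kernel declares reqd_work_group_size(1, 1, 1), the OpenCL specification makes clEnqueueNDRangeKernel fail with CL_INVALID_WORK_GROUP_SIZE if the host passes an explicit local_work_size that does not match. The plain-C sketch below models that rule so a host program can check its launch parameters before enqueueing. The helper name local_size_is_valid and the REQD_WG_* constants are hypothetical, introduced here for illustration only; they are not part of the SDK or of the example design's host code.

```c
#include <stddef.h>

/* Assumption: these constants mirror the attribute added to vector_add.cl,
 * __attribute__((reqd_work_group_size(1, 1, 1))). They are illustrative
 * names, not SDK definitions. */
enum { REQD_WG_X = 1, REQD_WG_Y = 1, REQD_WG_Z = 1 };

/* Hypothetical helper: returns 1 if an explicitly specified local work size
 * would be accepted for a kernel with the required work-group size above,
 * and 0 otherwise. It checks, per dimension, that the local size equals the
 * required size and divides the global size evenly. */
static int local_size_is_valid(size_t dims,
                               const size_t *global,
                               const size_t *local)
{
    const size_t reqd[3] = { REQD_WG_X, REQD_WG_Y, REQD_WG_Z };

    if (dims == 0 || dims > 3) {
        return 0;                       /* OpenCL allows 1 to 3 dimensions */
    }
    for (size_t d = 0; d < dims; ++d) {
        if (local[d] != reqd[d]) {
            return 0;                   /* must match the kernel attribute */
        }
        if (global[d] % local[d] != 0) {
            return 0;                   /* global size must divide evenly */
        }
    }
    return 1;
}
```

    For a one-dimensional launch of vector_add, a host would pass a local size of 1 as the local_work_size argument of clEnqueueNDRangeKernel; a value such as 64 would be rejected by this check.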

    This problem is scheduled to be fixed in a future release of the Intel® FPGA SDK for OpenCL™.

    Disclaimer

    All postings and use of the content on this site are subject to Intel.com Terms of Use.