Intel® FPGA SDK for OpenCL™ Standard Edition: Best Practices Guide
2.4.1. Area Analysis by Source
OpenCL kernel example that includes four loops:
 1  // ND-Range kernel with unrolled loops
 2  __attribute__((reqd_work_group_size(1024,1,1)))
 3  kernel void t (global int * out, int N) {
 4    int i = get_global_id(0);
 5    int j = 1;
 6    for (int k = 0; k < 4; k++) {
 7      #pragma unroll
 8      for (int n = 0; n < 4; n++) {
 9        j += out[k+n];
10      }
11    }
12    out[i] = j;
13
14    int m = 0;
15    #pragma unroll 1
16    for (int k = 0; k < N; k++) {
17      m += out[k/3];
18    }
19    #pragma unroll
20    for (int k = 0; k < 6; k++) {
21      m += out[k];
22    }
23    #pragma unroll 2
24    for (int k = 0; k < 6; k++) {
25      m += out[k];
26    }
27    out[2] = m;
28  }

The area report below lists the area usage for the kernel system, board interface, and global interconnects. These elements are system-level IP and are dependent on the Custom or Reference Platform that your design targets. The kernel t is within the hierarchy of the kernel system and is where the source code begins. The report specifies all the variables declared within the source code under kernel t and sorts the remaining area information by line number.
 
 
In this example, for the code line j += out[k+n] (line 9), the offline compiler calculates the estimated area usage based on the area required to perform the addition and to load data from global memory. For the code line out[i] = j (line 12), the offline compiler calculates the estimated area usage based on the area required to compute the pointer value and then store it back to global memory.
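
To see why line 9 is charged for both addition and global-memory load area, it can help to write out by hand what a full unroll of the inner loop looks like. The sketch below is plain C rather than OpenCL, and the function names sum_rolled and sum_unrolled are hypothetical, not part of the guide. It assumes the fully unrolled loop is equivalent to four copies of the loop body, each with its own read of out and its own add. The hardware the offline compiler actually generates may differ, for example if it merges or reorders the loads.

```c
/* Rolled form of kernel lines 6-11, written as ordinary C.
 * Hypothetical helper for illustration only. */
int sum_rolled(const int *out) {
    int j = 1;
    for (int k = 0; k < 4; k++) {
        for (int n = 0; n < 4; n++) {
            j += out[k + n];   /* one load and one add per iteration */
        }
    }
    return j;
}

/* The same computation with the inner loop unrolled by hand.
 * Each of the four statements corresponds to one copy of line 9,
 * so the loop body contains four loads and four adds. */
int sum_unrolled(const int *out) {
    int j = 1;
    for (int k = 0; k < 4; k++) {
        j += out[k + 0];
        j += out[k + 1];
        j += out[k + 2];
        j += out[k + 3];
    }
    return j;
}
```

Both functions return the same value for any input array of at least seven elements, which is the property that lets the unroll trade area (replicated load and add logic) for fewer loop iterations.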