10.2. Component Gets Poor Quality of Results
The information in this section describes some common sources of stallable arbitration nodes or excess RAM utilization.
Component Uses More FPGA Resources Than Expected
By default, the Intel® HLS Compiler Pro Edition optimizes your component for the best throughput by trying to achieve the highest possible maximum operating frequency (fMAX).
One way to reduce area consumption is to relax the fMAX requirement by setting a target fMAX value with the --clock option of the i++ command or with the hls_scheduler_target_fmax_mhz component attribute. The HLS compiler can often achieve a higher fMAX than you specify, so if you set the target fMAX lower than the value you need, your design might still achieve an acceptable fMAX while consuming less area.
To learn more about how the target fMAX value affects your design, see the following tutorial: <quartus_installdir>/hls/examples/tutorials/best_practices/set_component_target_fmax
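As a minimal sketch of the attribute-based approach, the following hypothetical component requests a relaxed target fMAX. The component name, its arithmetic, and the 200 MHz value are illustrative assumptions, and the fallback macros are only there so that the file also builds with an ordinary C++ compiler when HLS/hls.h is not available.

```cpp
#if __has_include("HLS/hls.h")
#include "HLS/hls.h"  // supplied by the Intel HLS Compiler installation
#else
// Assumption: empty fallbacks so this sketch compiles without the HLS headers.
#define component
#define hls_scheduler_target_fmax_mhz(x)
#endif

// Hypothetical component. The 200 MHz target is an illustrative value only;
// a lower target gives the scheduler more freedom to save area.
hls_scheduler_target_fmax_mhz(200)
component int scale_and_add(int a, int b, int c) {
  return a * b + c;
}
```

After compiling, compare the area estimates and the achieved fMAX in the high-level design report against a build without the attribute to judge whether the trade-off is worthwhile.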
Loops Do Not Achieve II=1
If you specify a target fMAX, the compiler might conservatively increase II to achieve your target fMAX.
If you specify a target fMAX and require II=1, use #pragma ii 1 on the loops that require II=1. For more details, refer to Balancing Target fMAX and Target II.
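The combination of a target fMAX and an explicit II request might look like the following sketch. The component name, the loop body, and the 300 MHz value are hypothetical; the fallback macro definitions are an assumption that lets the file build with a plain C++ compiler, which ignores the unknown pragma.

```cpp
#if __has_include("HLS/hls.h")
#include "HLS/hls.h"  // supplied by the Intel HLS Compiler installation
#else
// Assumption: empty fallbacks so this sketch compiles without the HLS headers.
#define component
#define hls_scheduler_target_fmax_mhz(x)
#endif

// Hypothetical component with an illustrative 300 MHz target.
hls_scheduler_target_fmax_mhz(300)
component int accumulate_scaled(int n, int step) {
  int acc = 0;
  // Ask for II=1 on this loop even though a target fMAX is set,
  // so the compiler does not raise II to meet the frequency target.
  #pragma ii 1
  for (int i = 0; i < n; i++) {
    acc += i * step;
  }
  return acc;
}
```

Check the loop analysis in the high-level design report to confirm the II that the compiler actually scheduled for the loop.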
Incorrect Bank Bits
If you access parts of an array in parallel (either a single- or multidimensional array), you might need to configure the memory bank selection bits.
See Memory Architecture Best Practices for details about how to configure efficient memory systems.
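As an illustration of bank selection bits, the sketch below writes four rows of a two-dimensional array in parallel and uses the hls_bankbits and hls_bankwidth memory attributes to place each row in its own bank. The array shape, the component name, and the choice of bits 8 and 7 are assumptions made for this example; verify the resulting memory system in the Function Memory Viewer. The fallback macros only let the file build with a plain C++ compiler.

```cpp
#if __has_include("HLS/hls.h")
#include "HLS/hls.h"  // supplied by the Intel HLS Compiler installation
#else
// Assumption: empty fallbacks so this sketch compiles without the HLS headers.
#define component
#define hls_bankbits(...)
#define hls_bankwidth(x)
#endif

// Hypothetical component: writes one column of all four rows in parallel,
// then reads back a single element.
component int write_rows_read_one(int col, int base, int read_row) {
  // 4 rows x 128 columns of int. With a bank width of one int, the word
  // address is row * 128 + col: bits 6..0 select the column and bits 8..7
  // select the row, so banking on bits 8 and 7 gives one bank per row.
  hls_bankbits(8, 7) hls_bankwidth(sizeof(int)) int a[4][128];

  #pragma unroll
  for (int row = 0; row < 4; row++) {
    a[row][col & 0x7f] = base + row;  // four parallel stores, one per bank
  }
  return a[read_row & 0x3][col & 0x7f];
}
```

If the bank bits instead covered the column index, the four unrolled stores would land in the same bank and could require arbitration.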
Conditional Operator Accessing Two Different Arrays of struct Variables
In some cases, if you try to access different arrays of struct variables with a conditional operator, the Intel® HLS Compiler Pro Edition merges the arrays into the same RAM block. You might see stallable arbitration in the Function Memory Viewer because there are not enough load/store sites on the memory system.
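For example, consider the following struct type and arrays:
    struct MyStruct {
      float a;
      float b;
    };
    MyStruct array1[64];
    MyStruct array2[64];
The following conditional operator, which selects between the two arrays, can cause stallable arbitration:
    MyStruct value = (shouldChooseArray1) ? array1[idx] : array2[idx];
You can avoid the stallable arbitration by replacing the conditional operator with an explicit if statement:
    MyStruct value;
    if (shouldChooseArray1) {
      value = array1[idx];
    } else {
      value = array2[idx];
    }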
Cluster Logic
Your design might consume more RAM blocks than you expect, especially if you store many array variables in large registers.
You can use the hls_use_stall_enable_clusters component attribute to prevent the compiler from inserting stall-free cluster exit FIFOs.
The Area Analysis of System report in the high-level design report (report.html) can help find this issue.
In the example report for this issue, three matrices are stored intentionally in RAM blocks, but those RAM blocks account for less than half of the RAM blocks that the component consumes.
If you look further down the report, you might see that many RAM blocks are consumed by Cluster logic or State variable. You might also see that some of your array values that you intended to be stored in registers were instead stored in large numbers of RAM blocks.
Notice the number of RAM blocks that are consumed by Cluster Logic and State.
The following techniques might reduce the number of RAM blocks that cluster logic consumes:
- Pipeline loops instead of unrolling them.
- Store local variables in local RAM blocks (hls_memory memory attribute) instead of large registers (hls_register memory attribute).
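The techniques in this section can be combined as in the following sketch, which applies the hls_use_stall_enable_clusters component attribute, keeps a local array in RAM with hls_memory, and leaves the loops pipelined instead of unrolling them. The component name, array size, and computation are hypothetical, and the fallback macros are an assumption so that the file also builds with a plain C++ compiler.

```cpp
#if __has_include("HLS/hls.h")
#include "HLS/hls.h"  // supplied by the Intel HLS Compiler installation
#else
// Assumption: empty fallbacks so this sketch compiles without the HLS headers.
#define component
#define hls_use_stall_enable_clusters
#define hls_memory
#endif

// Hypothetical component. The attribute asks the compiler not to insert
// stall-free cluster exit FIFOs for this component.
hls_use_stall_enable_clusters
component int fill_and_sum(int seed) {
  // Keep the buffer in a local RAM block instead of a large register.
  hls_memory int buf[64];

  // No unroll pragma: both loops stay pipelined.
  for (int i = 0; i < 64; i++) {
    buf[i] = seed + i;
  }

  int total = 0;
  for (int i = 0; i < 64; i++) {
    total += buf[i];
  }
  return total;
}
```

Compare the Cluster Logic and State rows of the Area Analysis of System report before and after such changes to see whether RAM block usage drops for your design.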
Component with a System of Tasks Hangs or Has Poor Throughput
If your component contains a system of tasks, you might need to add launch/collect capacity.
Incorrectly specifying launch/collect capacity can result in hangs or poor throughput.
For details, refer to Balancing Capacity in a System of Tasks.