Intel® High Level Synthesis Compiler Pro Edition: Reference Manual
8.2. Concurrency Control (hls_max_concurrency Attribute)
You can use the hls_max_concurrency component attribute to increase or limit the maximum concurrency of your component. The concurrency of a component is the number of invocations of the component that can be in progress at one time. By default, the Intel® HLS Compiler tries to maximize concurrency so that the component runs at peak throughput.
You can control the maximum concurrency of your component by adding the hls_max_concurrency component attribute immediately before you declare your component, as shown in the following example:
#include "HLS/hls.h"
hls_max_concurrency(3)
component void foo ( /* arguments */ ){
  // Component code
} 
To meet the concurrency that you set with this attribute, the compiler might change the configuration of your component memories. Use memory attributes to control the geometry of your component memory configuration.
The hls_max_concurrency attribute can increase concurrency in the following cases:
- You have a component memory system.
     At the component level, the Intel® HLS Compiler does not automatically create private copies of component memory to increase the throughput. If your component invocation uses a non-static component memory system, the next invocation cannot start until the previous invocation has finished all its accesses to and from that component memory.
     This limitation is shown in the Loop Analysis report as load-store dependencies on the component memory. Adding the hls_max_concurrency(N) attribute to the component creates private copies of the component memory so that you can have multiple pipelined invocations of your component in progress at the same time. To create as many private copies as necessary for maximal performance, use hls_max_concurrency(0). For finer-grained control of which component memories to create private copies of, use the hls_private_copies memory attribute. For details, see hls_private_copies Memory Attribute.
- The compiler determines that reducing concurrency saves FPGA area.
     In some cases, the compiler reduces concurrency to save FPGA area. In these cases, the hls_max_concurrency(N) component attribute can increase the concurrency from 1.
     The Loop Analysis report displays the concurrency for a function in the Details pane of the report when you click the function marked with (Component invocation) in the Loop Analysis pane. If your design concurrency is limited, the Details pane shows a line like the following line:
     Maximum concurrent iterations: 1 is the default for component invocation loop.
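The component memory case described above can be sketched as follows. This is a minimal illustration, not an example from the manual: the component name sum_of_squares, the constant kSize, and the fallback macro definitions are assumptions. The fallback macros only let the file build with an ordinary C++ compiler when HLS/hls.h is not available. Whether the local array is implemented as component memory rather than registers is decided by the compiler.

```cpp
// Use the real Intel HLS header when it is available.
#if defined(__has_include)
#  if __has_include("HLS/hls.h")
#    include "HLS/hls.h"
#    define HAVE_HLS_HEADER 1
#  endif
#endif
#ifndef HAVE_HLS_HEADER
// Assumption: empty stand-ins so that this sketch compiles without the HLS toolchain.
#define component
#define hls_max_concurrency(N)
#endif

constexpr int kSize = 8;  // hypothetical buffer depth

// Request up to two invocations in flight. Per the text above, this asks the
// compiler to create private copies of the non-static component memory.
hls_max_concurrency(2)
component int sum_of_squares(int seed) {
  int scratch[kSize];  // non-static local array (candidate component memory)
  for (int i = 0; i < kSize; ++i) {
    scratch[i] = (seed + i) * (seed + i);
  }
  int total = 0;
  for (int i = 0; i < kSize; ++i) {
    total += scratch[i];
  }
  return total;
}
```

After compiling, the Loop Analysis report is the place to check whether the load-store dependency on the array still limits the component invocation loop.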
The hls_max_concurrency attribute can also accept a value of 0. When this attribute is set to 0, the component should be able to accept new invocations as soon as the downstream datapath frees up. Use this value only when you see loop initiation interval (II) issues in your component, because this setting can increase the component area. You can find loop II issues by examining the Loop Viewer in the High-Level Design Reports or by looking for extra bubbles in a simulation waveform.
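A value of 0 is applied in the same position as any other value. The sketch below is illustrative only: the component name reverse_dot, the constant kLen, and the fallback macros (used when HLS/hls.h is not found, so that the file still builds with a standard C++ compiler) are assumptions, not part of the manual.

```cpp
// Use the real Intel HLS header when it is available.
#if defined(__has_include)
#  if __has_include("HLS/hls.h")
#    include "HLS/hls.h"
#    define HAVE_HLS_HEADER 1
#  endif
#endif
#ifndef HAVE_HLS_HEADER
// Assumption: empty stand-ins so that this sketch compiles without the HLS toolchain.
#define component
#define hls_max_concurrency(N)
#endif

constexpr int kLen = 4;  // hypothetical buffer depth

// 0 lets the compiler choose the concurrency, possibly at the cost of extra area.
hls_max_concurrency(0)
component int reverse_dot(int scale) {
  int buf[kLen];  // non-static local array (candidate component memory)
  for (int i = 0; i < kLen; ++i) {
    buf[i] = scale * i;
  }
  int acc = 0;
  for (int i = 0; i < kLen; ++i) {
    acc += buf[i] * buf[kLen - 1 - i];  // dot product with the reversed buffer
  }
  return acc;
}
```

Comparing the area estimates with and without the attribute is a reasonable way to judge whether the setting is worth keeping.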
You can also control the concurrency of loops in components with the max_concurrency(N) pragma. For more information about the max_concurrency(N) pragma, see Loop Concurrency (max_concurrency Pragma).
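For comparison, a loop-level limit might look like the following sketch. It assumes the pragma is written as #pragma max_concurrency N on the line before the loop; confirm the exact spelling in Loop Concurrency (max_concurrency Pragma). The component name sum_rows, the constants kRows and kCols, and the fallback macro are hypothetical. A standard C++ compiler ignores the unknown pragma, possibly with a warning.

```cpp
// Use the real Intel HLS header when it is available.
#if defined(__has_include)
#  if __has_include("HLS/hls.h")
#    include "HLS/hls.h"
#    define HAVE_HLS_HEADER 1
#  endif
#endif
#ifndef HAVE_HLS_HEADER
// Assumption: empty stand-in so that this sketch compiles without the HLS toolchain.
#define component
#endif

constexpr int kRows = 4;  // hypothetical outer trip count
constexpr int kCols = 4;  // hypothetical inner trip count

component int sum_rows(int base) {
  int total = 0;
  // Allow only one outer-loop iteration in flight, so that a single copy of
  // the loop-local array row[] is enough.
  #pragma max_concurrency 1
  for (int r = 0; r < kRows; ++r) {
    int row[kCols];  // memory local to one outer-loop iteration
    for (int c = 0; c < kCols; ++c) {
      row[c] = base + r * kCols + c;
    }
    for (int c = 0; c < kCols; ++c) {
      total += row[c];
    }
  }
  return total;
}
```

The component attribute governs how many invocations of the whole component overlap, while the pragma governs how many iterations of one loop overlap, so the two can be used together.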