Intel® High Level Synthesis Compiler Pro Edition: Reference Manual

ID 683349
Date 3/28/2022
Public

A newer version of this document is available. Customers should click here to go to the newest version.

Document Table of Contents

5. Component Memories (Memory Attributes)

The Intel® High Level Synthesis (HLS) Compiler can build a hardware memory system using FPGA memory resources (such as block RAMs) for local, constant, and static variables as well as agent memory declared in your code.

In some cases, the Intel® HLS Compiler implements a local, constant, and static variable using registers in the component datapath. However, you can override that implementation by using memory attributes.

Memory accesses are mapped to load-store units (LSUs), which transact with the hardware memory through its ports. The Intel® HLS Compiler sometimes statically coalesces multiple memory accesses to a component memory into one wider memory access in order to save on the number of LSUs instantiated. LSUs for component memory are always pipelined LSUs.

If two or more LSUs need to be scheduled during the same cycle, the compiler might create stallable arbitration logic. Stallable arbitration logic appears red in the Component Viewer (in the High-Level Design Reports).

For more details about LSUs instantiated by the Intel® HLS Compiler, see Load-Store Unit Types. For details about coalescing memory accesses to save on instantiated LSUs, see Memory-Access Coalescing and Load-Store Units.

Figure 7. A Basic Memory Configuration Inferred by the Intel® HLS Compiler

The following diagram shows a basic memory configuration:



Figure 8. A Memory System With Two Memory Banks

The contents of a memory system can be partitioned into one or more memory banks, such that each bank contains a subset of data contained in the hardware memory:



Figure 9. A Memory System With Two Replicates

A memory bank can contain one or more memory replicates. The compiler might create memory replicates to create more read ports. Having more read ports allows concurrent access to your memory system if you have many read operations.

The replicates in a memory bank contain identical data and you can read from the replicates simultaneously. A replicate can have two or four access ports, depending on whether the replicate is clocked at the same frequency (single pumped) or twice the frequency (double pumped) of the component. All ports in replicates can be accessed concurrently. The number of ports in a memory bank depends on the number of replicates that the bank contains.



A replicate can also contain one or more private copies to support multiple concurrent loop iterations.
Figure 10. A Memory System With Two Private Copies


The Intel® HLS Compiler can control the geometry and configuration parameters of the hardware memories that it builds. The compiler tries to create stall-free memory accesses. That is, the compiler tries to give memory reads and writes contention-free access to a memory port. A memory system is stall-free if all reads and writes in the memory system are contention-free.

The compiler tries to create a minimum-area stall-free memory system. If you want a different area-performance trade off, use the component memory attributes to specify your own memory system configuration and override the memory system inferred by the compiler.

Component Memory Attributes

Apply the component memory attributes to local, constant, and static variables in your component to customize the on-chip memory architecture of the component memory system and lower the FPGA area utilization of your component. You can also apply memory attributes to agent memories and struct data members.

These component memory attributes are defined in the "HLS/hls.h" header file, which you can include in your code.

Table 16.   Intel® HLS Compiler Pro Edition Component Memory Attributes Summary
Memory Attribute Description
hls_force_pow2_depth Specifies that the memory implementing the variable or array has power-of-2 depth.
hls_register Forces a variable or array to be carried through the pipeline in registers.

A register variable can be implemented either exclusively in flip-flops (FFs) or in a mix of FFs and RAM-based FIFOs.

hls_memory Forces a variable or array to be implemented as embedded memory.
hls_memory_impl Forces a variable or array to be implemented as embedded memory of a specified type.
hls_singlepump Specifies that the memory implementing the variable or array must be clocked at the same rate as the component accessing the memory.
hls_doublepump Specifies that the memory implementing the variable or array must be clocked at twice the rate as the component accessing the memory.
hls_numbanks Specifies that the memory implementing the variable or array must have a defined number of memory banks.
hls_bankwidth Specifies that the memory implementing the variable or array must have memory banks of a defined width.
hls_bankbits Forces the memory system to split into a defined number of memory banks and defines the bits used to select a memory bank.
hls_simple_dual_port_memory Specifies that the memory implementing the variable or array should have no port that services both reads and writes.
hls_merge (depthwise) Allows merging two or more local variables to be implemented in component memory as a single merged memory system in a depth-wise manner.
hls_merge (widthwise) Allows merging two or more local variables to be implemented in component memory as a single merged memory system in a width-wise manner.
hls_init_on_reset Forces the static variables inside the component to be initialized when the component reset signal is asserted.
hls_init_on_powerup Sets the component memory implementing the static variable to initialize on power-up when the FPGA is programmed.
hls_max_concurrency
Deprecated: This attribute is deprecated and will be removed in a future release. Use the hls_private_copies memory attribute instead.
Specifies that the memory implementing the variable or array has a defined number of private copies to allow concurrent iterations of a loop at any given time.
hls_max_replicates Specifies that the memory implementing the variable or array has no more than the specified number of replicates to enable simultaneous reads from the datapath
hls_private_copies Specifies that the memory implementing the variable or array has a defined number of private copies to allow concurrent iterations of a loop at any given time.

Struct Datatypes and Memory Attributes

You can apply memory attributes to struct member variables in the struct declaration. If you also apply memory attributes to the object instantiation of a struct variable, the attributes on the instantiation override the attributes from the declaration.

The following code example applies memory attributes to both a declaration and an instantiation:
struct State {
 int array[100] hls_memory;
 int reg[4] hls_register;
};
component int test(..) {
 struct State S1;
 struct State S2 hls_memory;
  // some uses
}

For this example code, the compiler splits S1 into two variables, S1.array[100] (implemented in memory) and S1.reg[4] (implemented in registers). However, the compiler ignores the attributes applied at the struct declaration for object S2 because the S2 object has the hls_memory attribute applied at instantiation.

Constraints on Attributes for Memory Banks

The properties of memory banks constrain how you can divide component memory into banks with the memory bank attributes.

The relationship between the following properties is constrained:
  • The number of bytes in your array that you want to access at one time (S). If you are accessing a local variable, this value represents the size (in bytes) of the local variable.
  • The number of memory banks specified by hls_numbanks attribute ().
  • The width (in bytes) of the memory banks specified by hls_bankwidth attribute ().
  • The number of memory bank-select bits specified by hls_bankbits attribute. That is, n+1 when you specify b 0 , b 1 , ..., b n as the bank-select bits ().
These attributes are subject to the following constraints:
  • The number of bytes accessed concurrently (or size of a local variable) is equal to the number of memory banks it uses times the width of the memory banks.

  • must be a power of 2 value.
  • bank-selection bits that are required to address number of memory banks.

Values that you specify for the hls_numbanks, hls_bankwidth, and hls_bankbits attributes must meet these constraints. For attributes that you do not specify, the Intel® HLS Compiler infers values for the attributes following these constraints.

To learn more about these attributes, review the following tutorial:
<quartus_installdir>/hls/examples/tutorials/component_memories/memory_geometry