2.4.2.3. Parameter Group: pe_array
This parameter group configures the PE Array. The PE Array is used to calculate dot products.
Parameter: pe_array/dsp_limit
Use this parameter to cap the number of DSP blocks that the PE Array uses. Multipliers that do not fit within the cap are implemented in ALM logic on the FPGA.
The number of multipliers that the PE Array requires is determined by the k_vector and c_vector global parameters. Together with the value of the arch_precision global parameter and the target architecture (for example, Arria® 10 or Agilex™ 7), the number of multipliers determines the number of DSPs that the PE Array tries to use. If this number exceeds the value of the dsp_limit parameter, then some multipliers are implemented in ALM logic so that the PE Array DSP usage stays within the limit.
If this parameter is omitted, then the FPGA AI Suite IP implements all multipliers as DSPs.
Typically, this parameter is set by the architecture optimizer.
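The effect of dsp_limit can be illustrated with a minimal sketch. The function below is not part of the FPGA AI Suite. The names split_multipliers, num_multipliers, and mults_per_dsp are hypothetical, and the number of multipliers that fit in one DSP block is passed in as an assumption because it depends on arch_precision and the target architecture.

```python
import math
from typing import Optional, Tuple


def split_multipliers(num_multipliers: int,
                      mults_per_dsp: int,
                      dsp_limit: Optional[int] = None) -> Tuple[int, int]:
    """Return (dsps_used, multipliers_in_alm) for a given DSP cap.

    num_multipliers: multipliers the PE Array needs (derived from k_vector
        and c_vector; the exact formula is not modeled here).
    mults_per_dsp: ASSUMED number of multipliers packed into one DSP block.
    dsp_limit: value of pe_array/dsp_limit, or None if the parameter is omitted.
    """
    if num_multipliers < 0 or mults_per_dsp < 1:
        raise ValueError("num_multipliers must be >= 0 and mults_per_dsp >= 1")
    if dsp_limit is not None and dsp_limit < 0:
        raise ValueError("dsp_limit must be >= 0")

    dsps_wanted = math.ceil(num_multipliers / mults_per_dsp)

    # No limit, or the limit is not reached: everything goes into DSPs.
    if dsp_limit is None or dsps_wanted <= dsp_limit:
        return dsps_wanted, 0

    # Limit reached: fill the allowed DSPs, spill the rest into ALM logic.
    mults_in_dsp = dsp_limit * mults_per_dsp
    return dsp_limit, num_multipliers - mults_in_dsp
```

For example, under the assumption of two multipliers per DSP block, 1024 multipliers with dsp_limit set to 400 would use 400 DSPs and leave 224 multipliers in ALM logic.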
Parameters: pe_array/num_interleaved_features, pe_array/num_interleaved_filters
To support layers with bias values, the PE array uses a threaded accumulator that is time-multiplexed to handle multiple accumulations. Each accumulation corresponds to an output filter and feature.
- Common values (num_interleaved_features x num_interleaved_filters):
  - Agilex™ 5 devices: 12x1
  - Agilex™ 7 devices: 5x1, 3x2
  - Arria® 10 devices: 4x1, 2x2
  - Cyclone® 10 GX devices, Stratix® 10 devices: 5x1, 3x2
All architectures support a 1x1 interleave. Selecting a 1x1 interleave typically reduces ALM consumption, but the IP associated with this architecture does not support layers with bias. Because most deep learning graphs include bias, the 1x1 interleave is typically not used.
The architecture optimizer does not modify the num_interleaved_features and num_interleaved_filters values. You must set them manually.
The filter interleave multiplies the effective KVEC, which means that graphs with a depthwise convolution (such as certain versions of MobileNet) might perform best when using num_interleaved_filters=1. Multilayer perceptron graphs might perform best when using num_interleaved_features=1.
- Agilex™ 5 devices: The value of num_interleaved_features must be greater than or equal to 12.
- Agilex™ 7 devices: The value of num_interleaved_features multiplied by num_interleaved_filters must be greater than or equal to 5.
- Arria® 10 devices: The value of num_interleaved_features multiplied by num_interleaved_filters must be greater than or equal to 4.
- Cyclone® 10 GX devices, Stratix® 10 devices: The value of num_interleaved_features multiplied by num_interleaved_filters must be greater than or equal to 5.
There is no advantage in choosing interleave factors larger than the minimum required.
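The interleave rules above can be checked with a short script before you edit an architecture. This is a minimal sketch, not an FPGA AI Suite tool. The family keys, check_interleave, and effective_kvec are hypothetical names. The Cyclone® 10 GX entry assumes that it shares the Stratix® 10 minimum, as the grouping in the list suggests.

```python
from typing import Tuple

# Minimum value of num_interleaved_features * num_interleaved_filters.
# "cyclone10gx" is ASSUMED to share the Stratix 10 entry.
MIN_PRODUCT = {"agilex7": 5, "arria10": 4, "cyclone10gx": 5, "stratix10": 5}

# Families whose rule is stated on num_interleaved_features alone.
MIN_FEATURES = {"agilex5": 12}


def check_interleave(family: str, features: int, filters: int) -> Tuple[bool, str]:
    """Return (legal, note) for an interleave setting on a device family."""
    if features < 1 or filters < 1:
        return False, "interleave factors must be positive"

    # 1x1 is supported everywhere, but the IP then cannot run layers with bias.
    if features == 1 and filters == 1:
        return True, "1x1: layers with bias are not supported"

    if family in MIN_FEATURES:
        minimum = MIN_FEATURES[family]
        if features < minimum:
            return False, f"num_interleaved_features must be >= {minimum}"
        return True, ""

    if family in MIN_PRODUCT:
        minimum = MIN_PRODUCT[family]
        if features * filters < minimum:
            return False, f"features * filters must be >= {minimum}"
        return True, ""

    return False, f"unknown device family: {family}"


def effective_kvec(k_vector: int, filters: int) -> int:
    """The filter interleave multiplies the effective KVEC."""
    return k_vector * filters
```

For instance, check_interleave("agilex7", 3, 2) reports a legal setting, while check_interleave("agilex7", 2, 2) does not, because the product 4 is below the minimum of 5. With k_vector=16 and num_interleaved_filters=2, the effective KVEC is 32, which is why depthwise convolutions might perform best with num_interleaved_filters=1.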
Parameter: pe_array/exit_fifo_depth
This parameter controls the depth of the PE Array exit FIFO. Larger values might reduce the incidence of stalling, but at the cost of area.
Typically, this parameter is not modified.
Parameter: pe_array/enable_scale
This parameter controls whether the IP supports scaling feature values by a per-channel weight. This is used to support batch normalization.
In most graphs, the graph compiler (dla_compiler command) adjusts the convolution weights to account for scale, so this option is usually not required. (Similarly, if a shift is required, then the convolution bias values are adjusted.)
- Legal values: true, false
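The reason the compiler can usually absorb scale and shift into the convolution follows from a simple algebraic identity. The sketch below illustrates that identity only. It is not the dla_compiler implementation, and it assumes that the scale and shift are applied per output channel directly after the convolution. The function names are hypothetical, and a 1x1 convolution on a single pixel stands in for the general case.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def conv1x1(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """1x1 convolution on one pixel: one dot product per output channel."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def fold_scale_shift(weights: Matrix, bias: Vector,
                     scale: Vector, shift: Vector) -> Tuple[Matrix, Vector]:
    """Fold y = scale * (W x + b) + shift into new weights and bias.

    scale * (W x + b) + shift == (scale * W) x + (scale * b + shift)
    """
    folded_w = [[s * w for w in row] for row, s in zip(weights, scale)]
    folded_b = [s * b + t for b, s, t in zip(bias, scale, shift)]
    return folded_w, folded_b


# Two output channels, two input channels.
weights = [[1.0, 2.0], [3.0, 4.0]]
bias = [0.5, -1.0]
scale = [2.0, 0.5]
shift = [1.0, 3.0]
x = [1.0, -2.0]

# Reference: convolution followed by an explicit per-channel scale and shift.
reference = [s * y + t for y, s, t in zip(conv1x1(weights, bias, x), scale, shift)]

# Folded: a single convolution with adjusted weights and bias.
folded_w, folded_b = fold_scale_shift(weights, bias, scale, shift)
folded = conv1x1(folded_w, folded_b, x)
```

Both paths produce the same per-channel outputs, so a separate scaling stage in hardware is only needed when the scale cannot be folded this way.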