FPGA AI Suite: IP Reference Manual

ID 768974
Date 3/29/2024

2.4.2.1. Parameter Group: Global Parameters

Parameter: family

This parameter specifies the target FPGA device family for the architecture.

Legal values:
Table 3.  Valid Values for family Global Parameter

Value   Description
A10     Target Arria® 10 devices.
AGX5    Target Agilex™ 5 devices.
AGX7    Target Agilex™ 7 devices.
C10     Target Cyclone® 10 devices.
S10     Target Stratix® 10 devices.

Parameter: k_vector

This parameter, also called KVEC, describes the number of filters that the PE Array is able to process simultaneously.

Typically the architecture optimizer is used to set this parameter.

Legal values:
[1-128]
  • The k_vector value must be a multiple of the c_vector value.
  • The k_vector value must be divisible by the xbar_k_vector and auxiliary k_vector values.
  • When you use the depthwise module, the k_vector value must equal the c_vector value.

Parameter: c_vector

This parameter, also called CVEC, describes the size of the dot product within each PE in the PE Array.

Typically the architecture optimizer is used to set this parameter.

Legal values:
[4,8,16,32,64]
  • When you use the depthwise module, the c_vector value must equal the k_vector value.
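
Taken together, the constraints on k_vector and c_vector above can be encoded as a simple check. The following Python sketch is hypothetical (it is not part of the FPGA AI Suite tooling, and the xbar_k_vector and aux_k_vector argument names are illustrative labels for the crossbar and auxiliary k_vector values mentioned above):

    # Hypothetical validity check for the k_vector/c_vector rules in this
    # section; not part of the FPGA AI Suite tooling.
    def check_vectors(k_vector, c_vector, xbar_k_vector, aux_k_vector,
                      depthwise_enabled=False):
        assert 1 <= k_vector <= 128, "k_vector must be in [1-128]"
        assert c_vector in (4, 8, 16, 32, 64), "c_vector must be in [4,8,16,32,64]"
        assert k_vector % c_vector == 0, "k_vector must be a multiple of c_vector"
        assert k_vector % xbar_k_vector == 0, "k_vector must be divisible by xbar_k_vector"
        assert k_vector % aux_k_vector == 0, "k_vector must be divisible by the auxiliary k_vector"
        if depthwise_enabled:
            assert k_vector == c_vector, "depthwise requires k_vector == c_vector"

    # Example: a valid combination.
    check_vectors(k_vector=32, c_vector=16, xbar_k_vector=8, aux_k_vector=4)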

Parameter: arch_precision

This parameter sets the precision (in bits) of the internal numeric representation used by the FPGA AI Suite IP. Lower values increase fps and reduce area, at the cost of inference accuracy.

Each internal precision option corresponds to a different number of sign and mantissa bits, and uses either two's complement or sign+magnitude. For details, refer to the table in FPGA AI Suite IP Block Configuration.

The FP16 precision significantly increases the size of the resulting IP, but can improve accuracy (particularly in models that have not been retrained for low precision).

All numeric options, except INT8AGX, use block floating point format. In block floating point format, each block of size CVEC shares a common exponent. Both CVEC (c_vector) and arch_precision affect the accuracy of the inference. However, the impact of c_vector is generally small, while the impact of the arch_precision setting is relatively large.
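
To make the shared-exponent behavior concrete, the following Python sketch (hypothetical, not FPGA AI Suite code) quantizes one CVEC-sized block to a single shared exponent with a few mantissa bits. Small values in a block dominated by a large value lose precision, which is why the mantissa width set by arch_precision matters more to accuracy than the block size:

    import math

    # Hypothetical block floating point (BFP) quantizer: every value in the
    # block shares one exponent; only sign+mantissa bits differ per element.
    def to_bfp(block, mantissa_bits):
        max_mag = max(abs(v) for v in block) or 1.0
        shared_exp = math.floor(math.log2(max_mag))   # exponent of largest value
        scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
        mantissas = [round(v / scale) for v in block] # per-element mantissas
        return [m * scale for m in mantissas]         # lossy reconstruction

    cvec_block = [0.0123, -0.5, 0.75, 0.001]          # one block, CVEC = 4
    print(to_bfp(cvec_block, mantissa_bits=5))        # -> [0.0, -0.5, 0.75, 0.0]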

The INT8AGX option uses signed INT8 values without block floating point format. The INT8AGX option supports graphs with symmetric quantization, which requires that the quantize (floating point to integer) and dequantize (integer to floating point) operations are simple multiplications without any offset bias (that is, without a value that is added or subtracted during the quantize/dequantize step).
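
As an illustration of what "symmetric" means here, the following Python sketch (hypothetical, for illustration only) quantizes and dequantizes with a scale factor alone, with no zero-point offset:

    # Hypothetical symmetric INT8 quantization: scale only, no zero point.
    def quantize_symmetric(x, scale):
        q = round(x / scale)            # quantize: plain multiplication by 1/scale
        return max(-128, min(127, q))   # clamp to the signed INT8 range

    def dequantize_symmetric(q, scale):
        return q * scale                # dequantize: plain multiplication by scale

    scale = 0.02
    for x in (0.5, -1.3, 2.56):
        q = quantize_symmetric(x, scale)
        print(x, "->", q, "->", dequantize_symmetric(q, scale))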

The INT8AGX option does not significantly affect either the inference speed or the FPGA resource consumption. Therefore, when a graph has been retrained to support symmetric INT8 quantization, the choice between block floating point (with the FP32 or FP16 version of the graph) and the quantized version of the graph is driven by convenience, accuracy, and possibly a desire to maintain behavior that maximally resembles inference on other devices that use the quantized graph.

The example architectures that are included with the FPGA AI Suite are already set to the recommended arch_precision values for their supported FPGA family. In some cases, it is useful to select a different arch_precision value. FP11 is the lowest-precision option; it requires the fewest RAM blocks and slightly reduces the amount of external memory traffic. The FP12AGX option significantly reduces the number of DSPs required to implement the PE array, although logic utilization may increase.

For more details about the block floating point format, refer to the Low-Precision Networks for Efficient Inference on FPGAs white paper.

Legal values (by FPGA device family):

Agilex™ 5:
  • FP11
  • FP12AGX
  • FP13AGX
  • FP16 (less common)
  • INT8AGX

Agilex™ 7:
  • FP11
  • FP13AGX
  • FP16 (less common)
  • INT8AGX

Arria® 10:
  • FP11
  • FP16 (less common)

Cyclone® 10 GX:
  • FP11
  • FP16 (less common)

Stratix® 10:
  • FP11
  • FP16 (less common)

Parameter: stream_buffer_depth

This parameter controls the depth of the stream buffer. The stream buffer is used as the on-chip cache for feature (image) data. Larger values increase area (logic and block RAM) but also increase performance.

Typically the architecture optimizer is used to set this parameter.

Legal values:
[2048-262144]

Parameter: enable_eltwise_mult

This parameter enables the Elementwise multiplication layer, which is required for MobileNetV3.

Parameters: filter_size_width_max, filter_size_height_max

These parameters determine the maximum size of a convolution filter, which in turn bounds the maximum window size for Average Pool.

The Average Pool window size may also be limited by the filter_scratchpad and filter_depth parameters.

Legal values:
[14,28]

Parameter: enable_debug

This parameter enables the FPGA AI Suite debug network, which allows forwarding of read requests from the CSR to one of many externally attached debug-capable modules.

Generally not required for production architectures.

Legal values:
[true,false]

(Early Access only) Parameter: enable_layout_transform

This parameter enables the dedicated input tensor transform module in the FPGA AI Suite IP. When enabled, the dedicated layout transform hardware transforms the input tensor format and folds the inputs into channels.

When this parameter is enabled, you must configure the transform as described in Input Layout Transform Hardware.

The hardware layout transform is not supported in SoC designs in streaming-to-memory (S2M) mode.

Early Access Only:

This feature has early access support only for FPGA AI Suite 2024.1. Full support for this feature is planned for a future release.

As an early access feature, the hardware layout transform has the following limitations:
  • Multiple inference requests with the OpenVINO™ Async API are unsupported.
  • Data conversion from int8 to FP16 is not enabled.
  • Accuracy may be degraded when using the layout transform.
  • Some inference networks may not run.

To use this feature, you must enable early access features in the FPGA AI Suite compiler and runtime.