FPGA AI Suite Handbook

9.4.2. Improving Layer Accuracy by using Mixed Precision

For ML tasks that are sensitive to precision, using a lower precision saves area but can reduce inference accuracy. The mixed precision feature allows designated layers in the ML graph to run at a higher precision to achieve better accuracy.

Figure 25. Conversion to high-precision block floating point

The following diagram illustrates the conversion from floating point (fp16) to high-precision block floating point (in this example, 2 x INT9-BFP).

Because the block-alignment step of the block floating point conversion uses a mantissa width larger than the fp16 mantissa, there is little to no loss of precision.

The only situation in which the high-precision blocked values lose mantissa precision relative to the fp16 inputs occurs when values of very different magnitude (i.e. having very different exponents) are blocked together. In this situation, a large bit shift is required to block align the mantissas, which can cause some low-precision bits of smaller values in the block to be lost. The 7th blocked value in the diagram illustrates this case.
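The following Python sketch mimics this block-alignment behavior under simplified assumptions: a 17-bit wide mantissa (matching INT17-BFP), a shared exponent taken from the largest-magnitude value in the block, and illustrative helper names (block_align, reconstruct) that are not part of the FPGA AI Suite tooling.

```python
# Minimal sketch of block floating point alignment with a wide mantissa.
# Widths and helper names are assumptions for illustration only; this is
# not the FPGA AI Suite hardware implementation.
import numpy as np

MANTISSA_BITS = 17  # wide (INT17-BFP) mantissa, wider than fp16's 11-bit mantissa

def block_align(values):
    """Represent a block of fp16 values as signed integer mantissas
    that share one block exponent."""
    v = np.asarray(values, dtype=np.float16).astype(np.float64)
    nonzero = np.abs(v[v != 0.0])
    # Shared exponent of the block: taken from the largest-magnitude value.
    block_exp = int(np.floor(np.log2(nonzero.max()))) if nonzero.size else 0
    # Scale so the largest value fills the wide mantissa, then round.
    scale = 2.0 ** (block_exp - (MANTISSA_BITS - 2))
    mantissas = np.round(v / scale).astype(np.int64)
    return mantissas, block_exp

def reconstruct(mantissas, block_exp):
    scale = 2.0 ** (block_exp - (MANTISSA_BITS - 2))
    return mantissas.astype(np.float64) * scale

# Values of similar magnitude: the wide mantissa preserves every fp16 bit.
m, e = block_align([1.5, -0.75, 2.25, 1.0])
print(reconstruct(m, e))   # [ 1.5  -0.75  2.25  1. ]  (exact)

# A value much smaller than the rest of its block needs a large shift to
# align, so its low-order bits are rounded away (the case described above).
m, e = block_align([3000.0, 0.333251953125])
print(reconstruct(m, e))   # small value comes back as 0.3125, not 0.3332...
```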

Using the high precision block floating point numerical decomposition, a PE array parameterized to handle INT9-BFP can perform convolutions at INT17-BFP precision on select layers. The table below summarizes what "high precision BFP" entails for different FPGA AI Suite IP arch_precision parameter values.

Table 28.  High-Precision BFP vs. Default Precision BFP

arch_precision    Default precision BFP    High precision BFP
FP11              INT7-BFP                 INT13-BFP (not supported)
FP12AGX           INT8-BFP                 INT15-BFP
FP13AGX           INT9-BFP                 INT17-BFP
FP16              INT12-BFP                INT23-BFP (not supported)

A high-precision convolution layer has 4x the computational cost of a default precision convolution layer. With the high precision BFP decomposition, both features and filters are represented as the sum of two terms, so each feature-filter product expands into a sum of four terms, each computed at default precision.
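As a sketch of where the four terms come from (the symbols x, w, and k below are illustrative, not notation from the handbook): write each wide feature mantissa x and filter mantissa w as a high part and a low part, each narrow enough for the default-precision PE array. The product of the two sums then expands into four default-precision partial products:

```latex
% Illustrative decomposition; the symbols x, w, k are assumptions,
% not notation from the handbook.
\begin{align*}
  x &= x_{\mathrm{hi}}\,2^{k} + x_{\mathrm{lo}}, &
  w &= w_{\mathrm{hi}}\,2^{k} + w_{\mathrm{lo}} \\
  x\,w &= x_{\mathrm{hi}}w_{\mathrm{hi}}\,2^{2k}
        + x_{\mathrm{hi}}w_{\mathrm{lo}}\,2^{k}
        + x_{\mathrm{lo}}w_{\mathrm{hi}}\,2^{k}
        + x_{\mathrm{lo}}w_{\mathrm{lo}}
\end{align*}
```

Each of the four partial products is a default-precision multiply, which accounts for the 4x cost.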

The computational cost can be reduced by using high precision BFP for only the features, leaving the filters at default precision. Such a layer has 2x the computational cost of a default precision layer.
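With the same illustrative notation as above, keeping the filters at default precision leaves only two partial products per multiply, hence the 2x cost:

```latex
% Features split as before; filters w stay at default precision
% (illustrative notation).
\begin{equation*}
  x\,w = \bigl(x_{\mathrm{hi}}\,2^{k} + x_{\mathrm{lo}}\bigr)\,w
       = x_{\mathrm{hi}}w\,2^{k} + x_{\mathrm{lo}}w
\end{equation*}
```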