Variable Precision DSP Blocks User Guide: Agilex™ 5 FPGAs and SoCs
ID: 813968 | Date: 8/06/2025 | Public
1. Agilex™ 5 Variable Precision DSP Blocks Overview
2. Agilex™ 5 Variable Precision DSP Blocks Architecture
3. Agilex™ 5 Variable Precision DSP Blocks Operational Modes
4. Agilex™ 5 Variable Precision DSP Blocks Design Considerations
5. Native Fixed Point DSP Agilex FPGA IP References
6. Native Floating Point DSP Agilex FPGA IP References
7. Native AI Optimized DSP Agilex™ FPGA IP References
8. Multiply Adder FPGA IP References
9. ALTMULT_COMPLEX FPGA IP References
10. LPM_MULT FPGA IP References
11. LPM_DIVIDE (Divider) FPGA IP References
12. Document Revision History for the Variable Precision DSP Blocks User Guide: Agilex™ 5 FPGAs and SoCs
3.3.4. Tensor Fixed-point Mode
In tensor fixed-point mode, you can preload two columns of 80-bit weights into the ping-pong buffers by using one of the following methods:
- Data input feed
- Side input feed
A signed 20-bit fixed-point DOT product vector is calculated using the preloaded weights and data_in_{1..10} inputs. The DOT product performs 10 signed 8x8 multiplications.
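As a behavioral illustration only (not RTL and not the IP's interface), the following Python sketch models one column's DOT product: ten signed 8x8 multiplications whose sum fits in a signed 20-bit result. The function and argument names are hypothetical stand-ins for the preloaded column weights (B1Cx..B10Cx) and the data_in_{1..10} inputs.

```python
def dot10_s8(data_in, weights):
    """Behavioral sketch of one tensor column's DOT product.

    data_in -- 10 values treated as signed 8-bit (data_in_1..data_in_10)
    weights -- 10 values treated as signed 8-bit (preloaded B1Cx..B10Cx)
    Returns a Python int; the worst-case magnitude (10 * 128 * 128) fits
    within a signed 20-bit result, matching the 20-bit DOT product width.
    """
    def s8(v):
        v &= 0xFF                             # keep the low 8 bits
        return v - 256 if v & 0x80 else v     # sign-extend to a signed value

    assert len(data_in) == len(weights) == 10
    return sum(s8(d) * s8(w) for d, w in zip(data_in, weights))
```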
Next, the CPA adder adds either the cascade_data_in_col_{1..2} input or the previous cycle's accumulation value, depending on the dynamic acc_en and zero_en inputs.
The CPA adder outputs the result in 32-bit fixed-point format to the core fabric on the fxp32_col_{1..2}[31:0] buses and to the next DSP block in the chain through the cascade_data_out_col_{1..2}[31:0] buses.
| zero_en | acc_en | fxp32_col_1[31:0] | fxp32_col_2[31:0] |
| --- | --- | --- | --- |
| 0 | 0 | data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 + data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 + data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 + data_in_10[7:0]*B10C1 + cascade_data_in_col_1[31:0] | data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 + data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 + data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 + data_in_10[7:0]*B10C2 + cascade_data_in_col_2[31:0] |
| 0 | 1 | data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 + data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 + data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 + data_in_10[7:0]*B10C1 + fxp32_col_1[31:0] | data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 + data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 + data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 + data_in_10[7:0]*B10C2 + fxp32_col_2[31:0] |
| 1 | N/A | data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 + data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 + data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 + data_in_10[7:0]*B10C1 | data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 + data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 + data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 + data_in_10[7:0]*B10C2 |
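The zero_en and acc_en cases in the table can be summarized with the minimal Python sketch below, modeling one column's CPA stage. The names dot, cascade_in, and acc_prev are hypothetical stand-ins for the column's DOT product, cascade_data_in_col_x[31:0], and the previous cycle's fxp32_col_x[31:0]; the wrap to a signed 32-bit value is an assumption for illustration, not a statement of the block's overflow behavior.

```python
def cpa_column(dot, cascade_in, acc_prev, zero_en, acc_en):
    """Behavioral sketch of one column's CPA addend selection (per the table above).

    Returns the new fxp32_col_x value, which is also driven onto
    cascade_data_out_col_x toward the next DSP block in the chain.
    """
    if zero_en:              # zero_en = 1: no addend, acc_en is don't-care
        addend = 0
    elif acc_en:             # zero_en = 0, acc_en = 1: accumulate previous result
        addend = acc_prev
    else:                    # zero_en = 0, acc_en = 0: add the cascade input
        addend = cascade_in
    result = dot + addend
    # Assumption for illustration only: represent the output as signed 32-bit.
    result &= 0xFFFFFFFF
    return result - (1 << 32) if result & 0x80000000 else result
```

For example, asserting zero_en for the first sample of a new accumulation and then holding acc_en high for the remaining samples would model a multi-cycle accumulation that starts from zero, consistent with the table's three cases.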
Figure 57. Tensor Fixed-point Mode One Column Datapath