Intel® Agilex™ Variable Precision DSP Blocks User Guide

ID 683037
Date 11/17/2022
Public

A newer version of this document is available. Customers should click here to go to the newest version.

Document Table of Contents

10.2. Native Floating Point DSP Intel® Agilex™ FPGA IP Core Supported Operational Modes

Table 107.  Operational Modes Supported by Native Floating Point DSP Intel® Agilex™ FPGA IP Core
Operational Modes Description Supported Exception Flags
FP32 multiplication mode

This mode performs single precision multiplication operation.

This mode applies the following equation:
  • fp32_result = fp32_result = fp32_mult_a*fp32_mult_b
  • fp32_mult_overflow
  • fp32_mult_underflow
  • fp32_mult_inexact
  • fp32_mult_invalid
FP32 addition or subtraction mode This mode performs single precision addition or subtraction operation.
This mode applies the following equations:
  • fp32_result = fp32_adder_b+fp32_adder_a
  • fp32_result = fp32_adder_b-fp32_adder_a
  • fp32_adder_overflow
  • fp32_adder_underflow
  • fp32_adder_inexact
  • fp32_adder_invalid
FP32 multiplication with addition or subtraction mode

This mode performs single precision multiplication, followed by addition or subtraction operations.

This mode applies the following equations:
  • When chainin feature is enabled:
    • fp32_result = (fp32_mult_a*fp32_mult_b) + fp32_chainin
    • fp32_result = (fp32_mult_a*fp32_mult_b) - fp32_chainin
  • When chainin feature is disabled:
    • fp32_result = (fp32_mult_a*fp32_mult_b) + fp32_adder_a
    • fp32_result = (fp32_mult_a*fp32_mult_b) - fp32_adder_a
  • fp32_mult_overflow
  • fp32_mult_underflow
  • fp32_mult_inexact
  • fp32_mult_invalid
  • fp32_adder_overflow
  • fp32_adder_underflow
  • fp32_adder_inexact
  • fp32_adder_invalid
FP32 multiplication with accumulation mode

This mode performs floating-point multiplication followed by floating-point addition or subtraction with the previous multiplication result.

This mode applies the following equations:
  • When accumulate signal is driven high:
    • fp32_result(t) = [fp32_mult_a(t)*fp32_mult_b(t)] + fp32_result(t-1)
    • fp32_result(t) = [fp32_mult_a(t)*fp32_mult_b(t) - fp32_result(t-1)
  • When accumulate signal is driven low:
    • fp32_result = fp32_mult_a*fp32_mult_b.
FP32 vector one mode

This mode performs floating-point multiplication followed by floating-point addition or subtraction with the chainin input from the previous variable DSP Block.

This mode applies the following equations:
  • When chainin feature is enabled:
    • fp32_result = (fp32_mult_a * fp32_mult_b) + fp32_chainin, fp32_chainout = fp32_adder_a
    • fp32_result = (fp32_mult_a * fp32_mult_b) - fp32_chainin, fp32_chainout = fp32_adder_a
  • When chainin feature is disabled:
    • fp32_result = fp32_mult_a * fp32_mult_b, fp32_chainout = fp32_adder_a
FP32 vector two mode This mode performs floating-point multiplication where the multiplication result is directly fed to chainout. The chainin input from the previous variable DSP Block is then added or subtracted from input Ax as the output result.

This mode applies the following equations:

  • When chainin feature is enabled:
    • fp32_result = fp32_adder_a + fp32_chainin, fp32_chainout = fp32_mult_a * fp32_mult_b
    • fp32_result = fp32_adder_a - fp32_chainin, fp32_chainout = fp32_mult_a * fp32_mult_b
  • When chainin feature is disabled:
    • fp32_result = fp32_adder_a, fp32_chainout = fp32_mult_a * fp32_mult_b
Sum of two FP16 multiplication mode

This mode performs a summation of two half-precision multiplication and provide a single-precision result.

This mode applies the following equations:
  • fp32_result = (fp16_mult_top_a*fp16_mult_top_b) + (fp16_mult_bot_a*fp16_mult_bot_b)
  • fp32_result = (fp16_mult_top_a*fp16_mult_top_b) - (fp16_mult_bot_a*fp16_mult_bot_b)
Exception flags supported in flushed and bfloat16 formats:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_overflow
  • fp16_mult_top_underflow
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_overflow
  • fp16_mult_bot_underflow
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_overflow
  • fp16_adder_underflow
Exception flags supported in extended format:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_infinite
  • fp16_mult_top_zero
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_infinite
  • fp16_mult_bot_zero
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_infinite
  • fp16_adder_zero
Sum of two FP16 multiplication with FP32 addition mode

This mode performs a summation of two half-precision multiplication and provide a single-precision result.

This mode applies the following equations:
  • fp32_result = (fp16_mult_top_a*fp16_mult_top_b) + (fp16_mult_bot_a*fp16_mult_bot_b) - fp32_adder_a
  • fp32_result = (fp16_mult_top_a*fp16_mult_top_b) - (fp16_mult_bot_a*fp16_mult_bot_b) - fp32_adder_a
  • fp32_result = (fp16_mult_top_a*fp16_mult_top_b) + (fp16_mult_bot_a*fp16_mult_bot_b) + fp32_adder_a
  • fp32_result = (fp16_mult_top_a*fp16_mult_top_b) - (fp16_mult_bot_a*fp16_mult_bot_b) + fp32_adder_a
Exception flags supported in flushed and bfloat16 formats:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_overflow
  • fp16_mult_top_underflow
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_overflow
  • fp16_mult_bot_underflow
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_overflow
  • fp16_adder_underflow
  • fp32_adder_invalid
  • fp32_adder_inexact
  • fp32_adder_overflow
  • fp32_adder_underflow
Exception flags supported in extended format:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_infinite
  • fp16_mult_top_zero
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_infinite
  • fp16_mult_bot_zero
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_infinite
  • fp16_adder_zero
  • fp32_adder_invalid
  • fp32_adder_inexact
  • fp32_adder_overflow
  • fp32_adder_underflow
Sum of two FP16 multiplication with accumulation mode

This mode performs a summation of two half-precision multiplication and accumulate the value into single-precision format.

This mode applies the following equations:
  • When accumulate signal is driven high:
    • fp32_result (t) = [fp16_mult_top_a(t) * fp16_mult_top_b(t)] + [fp16_mult_bot_a(t) * fp16_mult_bot_b(t)] + fp32_result(t-1)
    • fp32_result (t) = [fp16_mult_top_a(t) * fp16_mult_top_b(t)] - [fp16_mult_bot_a(t) * fp16_mult_bot_b(t)] + fp32_result(t-1)
    • fp32_result (t) = [fp16_mult_top_a(t) * fp16_mult_top_b(t)] + [fp16_mult_bot_a(t) * fp16_mult_bot_b(t)] - fp32_result(t-1)
    • fp32_result (t) = [fp16_mult_top_a(t) * fp16_mult_top_b(t)] - [fp16_mult_bot_a(t) * fp16_mult_bot_b(t)] - fp32_result(t-1)
  • When accumulate signal is driven low:
    • fp32_result = [fp16_mult_top_a * fp16_mult_top_b] + [fp16_mult_bot_a * fp16_mult_bot_b]
    • fp32_result = [fp16_mult_top_a * fp16_mult_top_b] - [fp16_mult_bot_a * fp16_mult_bot_b]
Exception flags supported in flushed and bfloat16 formats:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_overflow
  • fp16_mult_top_underflow
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_overflow
  • fp16_mult_bot_underflow
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_overflow
  • fp16_adder_underflow
  • fp32_adder_invalid
  • fp32_adder_inexact
  • fp32_adder_overflow
  • fp32_adder_underflow
Exception flags supported in extended format:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_infinite
  • fp16_mult_top_zero
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_infinite
  • fp16_mult_bot_zero
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_infinite
  • fp16_adder_zero
  • fp32_adder_invalid
  • fp32_adder_inexact
  • fp32_adder_overflow
  • fp32_adder_underflow
FP16 vector one mode

This mode performs a summation of two half-precision multiplications with the chainin input from the previous variable DSP Block. The output is a single-precision floating-point value which is fed into chainout.

This mode applies the following equation:
  • When chainin feature is enabled:
    • fp32_result = (fp16_mult_top_a * fp16_mult_top_b) + (fp16_mult_bot_a * fp16_mult_bot_b) + fp32_chainin, fp32_chainout = fp32_adder_a
    • fp32_result = (fp16_mult_top_a * fp16_mult_top_b) - (fp16_mult_bot_a * fp16_mult_bot_b) + fp32_chainin, fp32_chainout = fp32_adder_a
    • fp32_result = (fp16_mult_top_a * fp16_mult_top_b) + (fp16_mult_bot_a * fp16_mult_bot_b) - fp32_chainin, fp32_chainout = fp32_adder_a
    • fp32_result = (fp16_mult_top_a * fp16_mult_top_b) - (fp16_mult_bot_a * fp16_mult_bot_b) - fp32_chainin, fp32_chainout = fp32_adder_a
  • When chainin feature is disabled:
    • fp32_result = (fp16_mult_top_a * fp16_mult_top_b) + (fp16_mult_bot_a * fp16_mult_bot_b), fp32_chainout = fp32_adder_a
    • fp32_result = (fp16_mult_top_a * fp16_mult_top_b) - (fp16_mult_bot_a * fp16_mult_bot_b), fp32_chainout = fp32_adder_a
Exception flags supported in flushed and bfloat16 formats:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_overflow
  • fp16_mult_top_underflow
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_overflow
  • fp16_mult_bot_underflow
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_overflow
  • fp16_adder_underflow
  • fp32_adder_invalid
  • fp32_adder_inexact
  • fp32_adder_overflow
  • fp32_adder_underflow
Exception flags supported in extended format:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_infinite
  • fp16_mult_top_zero
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_infinite
  • fp16_mult_bot_zero
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_infinite
  • fp16_adder_zero
  • fp32_adder_invalid
  • fp32_adder_inexact
  • fp32_adder_overflow
  • fp32_adder_underflow
FP16 vector two mode

This mode performs a summation of two half precision multiplication and fed to chainout. The chainin input from the previous variable DSP Block is then added or subtracted from input fp32_adder_a as the output result.

This mode applies the following equation:
  • When chainin feature is enabled:
    • fp32_result = fp32_adder_a + fp32_chainin, fp32_chainout = (fp16_mult_top_a * fp16_mult_top_b) + (fp16_mult_bot_a * fp16_mult_bot_b)
    • fp32_result = fp32_adder_a - fp32_chainin, fp32_chainout = (fp16_mult_top_a * fp16_mult_top_b) + (fp16_mult_bot_a * fp16_mult_bot_b)
    • fp32_result = fp32_adder_a + fp32_chainin, fp32_chainout = (fp16_mult_top_a * fp16_mult_top_b) - (fp16_mult_bot_a * fp16_mult_bot_b)
    • fp32_result = fp32_adder_a - fp32_chainin, fp32_chainout = (fp16_mult_top_a * fp16_mult_top_b) - (fp16_mult_bot_a * fp16_mult_bot_b)
  • When chainin feature is disabled:
    • fp32_result = fp32_adder_a, fp32_chainout = (fp16_mult_top_a * fp16_mult_top_b) + (fp16_mult_bot_a * fp16_mult_bot_b)
    • fp32_result = fp32_adder_a, fp32_chainout = (fp16_mult_top_a * fp16_mult_top_b) - (fp16_mult_bot_a * fp16_mult_bot_b)
Exception flags supported in flushed and bfloat16 formats:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_overflow
  • fp16_mult_top_underflow
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_overflow
  • fp16_mult_bot_underflow
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_overflow
  • fp16_adder_underflow
  • fp32_adder_invalid
  • fp32_adder_inexact
  • fp32_adder_overflow
  • fp32_adder_underflow
Exception flags supported in extended format:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_infinite
  • fp16_mult_top_zero
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_infinite
  • fp16_mult_bot_zero
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_infinite
  • fp16_adder_zero
  • fp32_adder_invalid
  • fp32_adder_inexact
  • fp32_adder_overflow
  • fp32_adder_underflow
FP16 vector three

This mode performs a single-precision accumulation and a summation of two half-precision multiplications.

This mode applies the following equation:
  • When accumulate is driven high:
    • fp32_result(t) = fp32_adder_a(t) + fp32_result(t-1), fp32_chainout = {fp16_mult_top_a * fp16_mult_top_b} + {fp16_mult_bot_a * fp16_mult_bot_b}

    • fp32_result(t) = fp32_adder_a(t) - fp32_result(t-1), fp32_chainout = {fp16_mult_top_a * fp16_mult_top_b} + {fp16_mult_bot_a * fp16_mult_bot_b}
    • fp32_result(t) = fp32_adder_a(t) + fp32_result(t-1), fp32_chainout = {fp16_mult_top_a * fp16_mult_top_b} - {fp16_mult_bot_a * fp16_mult_bot_b}

    • fp32_result(t) = fp32_adder_a(t) - fp32_result(t-1), fp32_chainout = {fp16_mult_top_a * fp16_mult_top_b} - {fp16_mult_bot_a * fp16_mult_bot_b}
  • When accumulate is driven low:
    • fp32_result = fp32_adder_a, fp32_chainout = {fp16_mult_top_a * fp16_mult_top_b} + {fp16_mult_bot_a * fp16_mult_bot_b}
    • fp32_result = fp32_adder_a, fp32_chainout = {fp16_mult_top_a * fp16_mult_top_b} - {fp16_mult_bot_a * fp16_mult_bot_b}
Exception flags supported in flushed and bfloat16 formats:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_overflow
  • fp16_mult_top_underflow
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_overflow
  • fp16_mult_bot_underflow
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_overflow
  • fp16_adder_underflow
  • fp32_adder_invalid
  • fp32_adder_inexact
  • fp32_adder_overflow
  • fp32_adder_underflow
Exception flags supported in extended format:
  • fp16_mult_top_invalid
  • fp16_mult_top_inexact
  • fp16_mult_top_infinite
  • fp16_mult_top_zero
  • fp16_mult_bot_invalid
  • fp16_mult_bot_inexact
  • fp16_mult_bot_infinite
  • fp16_mult_bot_zero
  • fp16_adder_invalid
  • fp16_adder_inexact
  • fp16_adder_infinite
  • fp16_adder_zero
  • fp32_adder_invalid
  • fp32_adder_inexact
  • fp32_adder_overflow
  • fp32_adder_underflow