Unified FFT Intel FPGA IPs User Guide

ID 683366
Date 4/05/2021
Public


1.4. Supported Data Types

The Unified FFT IPs support various fixed- and floating-point data types.

Supported Fixed-Point Data Types

The fixed-point input and output data are in signed two’s complement representation. The selected pruning strategy and the position of the binary point of the input determine the position of the binary point of the output. For example, a 16-bit input grows to 25 bits in a 256-point FFT if you apply no pruning. If 8 bits of the input are fractional bits, 8 bits of the output are fractional bits. If you apply mild pruning, the IP removes the three least significant bits (LSBs) and the output is 22 bits, of which the 5 LSBs are fractional bits.
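The bit-width bookkeeping in this example can be sketched in Python. The helper name `fft_output_format` and its parameters are hypothetical and not part of the IP. The growth rule of log2(N) + 1 bits is an assumption chosen because it reproduces the 16-bit to 25-bit example above; the sketch also assumes a power-of-two FFT size.

```python
def fft_output_format(input_width, input_frac_bits, fft_points, pruned_lsbs=0):
    """Return (output_width, output_frac_bits) for a fixed-point FFT.

    Hypothetical helper for illustration only.
    """
    if fft_points < 2 or fft_points & (fft_points - 1):
        raise ValueError("this sketch assumes fft_points is a power of two")
    # Assumption: full-precision growth is log2(N) + 1 bits, which matches
    # the documented example (16-bit input, 256 points, 25-bit output).
    growth = (fft_points.bit_length() - 1) + 1
    full_width = input_width + growth
    # Pruning removes LSBs, so the total width and the number of
    # fractional bits shrink by the same amount.
    return full_width - pruned_lsbs, input_frac_bits - pruned_lsbs


print(fft_output_format(16, 8, 256))                 # no pruning
print(fft_output_format(16, 8, 256, pruned_lsbs=3))  # three LSBs removed
```

With the values from the text, the first call gives a 25-bit output with 8 fractional bits, and the second gives a 22-bit output with 5 fractional bits.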

Supported Floating-Point Data Types

You can independently set the mantissa and exponent widths within an allowed range to specify hundreds of floating-point formats to suit your requirements. The supported floating-point types are either IEEE 754 formats (half, single and double precision) or custom IEEE 754-like formats with user-specified exponent and fraction-field widths.

The Unified FFT IPs represent the special values positive zero, negative zero, infinity, and not a number (NaN) in the standard IEEE 754 manner, namely:

  • Zero is mantissa=0 and exponent=0 with the sign-bit giving the sign.
  • Infinity is mantissa=0 and exponent=all ones with the sign-bit giving the sign.
  • Not a number (NaN) is mantissa != 0 and exponent=all ones.

Subnormal values are flushed to zero.
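As an illustration of these encodings, the following Python sketch classifies a raw bit pattern for a format with user-specified field widths. The function `classify` and its return labels are hypothetical. The field layout (sign bit first, then exponent, then mantissa) is an assumption that follows the usual IEEE 754 ordering.

```python
def classify(bits, exp_width, man_width):
    """Return (sign, kind) for a raw floating-point bit pattern.

    Hypothetical helper; assumes the layout sign | exponent | mantissa.
    """
    mantissa = bits & ((1 << man_width) - 1)
    exp_all_ones = (1 << exp_width) - 1
    exponent = (bits >> man_width) & exp_all_ones
    sign = (bits >> (exp_width + man_width)) & 1

    if exponent == 0:
        # mantissa == 0 is a true zero; any other mantissa is a subnormal,
        # which the text says is flushed to zero.
        kind = "zero" if mantissa == 0 else "subnormal"
    elif exponent == exp_all_ones:
        kind = "infinity" if mantissa == 0 else "nan"
    else:
        kind = "normal"
    return sign, kind


# Single precision uses an 8-bit exponent and a 23-bit mantissa.
print(classify(0x80000000, 8, 23))  # negative zero
print(classify(0x7F800000, 8, 23))  # positive infinity
print(classify(0x7C00, 5, 10))      # half precision: positive infinity
```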

Except for the preceding special values, the numerical value of a float type is given in terms of its bit-wise representation by:

f = (-1)^sign × 2^(exponent-bias) × (1+(mantissa/(2^mantissa_width)))

where:

  • Sign, exponent, and mantissa are the base-10 equivalents of the respective bit sequences
  • You specify the widths of exponent and mantissa (exponent_width and mantissa_width), the width of the sign bit is 1, and the value of the bias is given by:

bias = 2^(exponent_width-1) - 1

For example, for a 32-bit single-precision floating-point number with a bit-wise representation of 0x40300000:

  • sign = 0b = 0
  • exponent = 10000000b = 128
  • mantissa = 01100000000000000000000b = 3145728

Then:

f = (-1)^0 × 2^(128-127) × (1+(3145728/(2^23))) = 1 × 2 × (1+0.375) = 2.75
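The worked example can be reproduced with a short Python sketch that decodes a normal (non-special) value for arbitrary field widths. The function name `decode_float` is hypothetical, and the bias of 2^(exponent width - 1) - 1 is assumed to follow the IEEE 754 convention, which yields 127 for single precision as used above.

```python
def decode_float(bits, exp_width, man_width):
    """Decode a normal floating-point bit pattern into a Python float.

    Hypothetical helper; does not handle zero, infinity, NaN, or subnormals.
    """
    mantissa = bits & ((1 << man_width) - 1)
    exponent = (bits >> man_width) & ((1 << exp_width) - 1)
    sign = (bits >> (exp_width + man_width)) & 1
    # Assumption: IEEE 754 style bias, 2^(exp_width - 1) - 1.
    bias = (1 << (exp_width - 1)) - 1
    return (-1) ** sign * 2.0 ** (exponent - bias) * (1 + mantissa / (1 << man_width))


print(decode_float(0x40300000, 8, 23))  # single precision, prints 2.75
print(decode_float(0x4180, 5, 10))      # same value in half precision, prints 2.75
```

Passing different `exp_width` and `man_width` values shows how the same decoding rule applies to custom formats, since only the field widths and the derived bias change.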