Developer Guide

Intel oneAPI FPGA Handbook

ID 785441
Date 2/07/2024
Public

Conversion Rules for the ap_float Data Type

You can convert between different sizes of ap_float data types through assignment or by using the convert_to() function. For example,

using namespace ihc;
ap_float<8, 23> myFloat = ...;
// implicit conversion: uses the rounding mode defined by the target ap_float type
ap_float<5, 10> myFloat2 = myFloat;

// explicit conversion: uses the rounding mode passed to the convert_to() call
ap_float<5, 10> myFloat3 = myFloat.convert_to<5, 10, ihc::fp_config::FP_Round::RZERO>();

When a binary operation takes two ap_float variables of different types, the variable with the larger exponent bit-width is considered the larger variable. If both variables have the same exponent bit-width, the variable with the larger mantissa bit-width is considered the larger variable. Both operands are then unified to the larger type before the binary operation occurs.
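
The ordering rule above can be sketched at compile time in standard C++. This is only an illustration of the selection logic, not the ap_float implementation: FloatFormat, is_larger, and unified_t are hypothetical names introduced here, and FloatFormat merely records an exponent width and a mantissa width.

```cpp
#include <type_traits>

// Hypothetical stand-in for ap_float<E, M>: it only records the two widths.
template <int E, int M>
struct FloatFormat {
    static constexpr int exponent_bits = E;
    static constexpr int mantissa_bits = M;
};

// A is "larger" than B when its exponent is wider; the mantissa width only
// breaks ties between equal exponent widths.
template <typename A, typename B>
struct is_larger {
    static constexpr bool value =
        (A::exponent_bits > B::exponent_bits) ||
        (A::exponent_bits == B::exponent_bits &&
         A::mantissa_bits > B::mantissa_bits);
};

// The type both operands would be unified to before a binary operation.
template <typename A, typename B>
using unified_t = typename std::conditional<is_larger<A, B>::value, A, B>::type;

static_assert(std::is_same<unified_t<FloatFormat<8, 23>, FloatFormat<5, 10>>,
                           FloatFormat<8, 23>>::value,
              "wider exponent wins");
static_assert(std::is_same<unified_t<FloatFormat<5, 20>, FloatFormat<8, 7>>,
                           FloatFormat<8, 7>>::value,
              "exponent width is compared before mantissa width");
static_assert(std::is_same<unified_t<FloatFormat<8, 7>, FloatFormat<8, 23>>,
                           FloatFormat<8, 23>>::value,
              "mantissa width breaks exponent ties");
```

The second assertion shows the consequence worth remembering: a type with a wider mantissa but a narrower exponent still counts as the smaller type, so its operand is the one that gets converted.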

To convert between ap_float and other data types:

  • To convert between native data types (for example, float, double) and ap_float data types, assign to or from the types. Type conversion in an assignment occurs according to the rules mentioned in Table 1.
  • To convert between ac_int data types and ap_float data types:
    • To convert from ac_int data types to ap_float, use explicit conversion. The type conversion occurs according to the rules mentioned in Table 1.
    • To convert from ap_float data types to ac_int, use the to_ac_int<W,S> function, where W is the bit-width, and S is a true/false value indicating if the ac_int variable is signed. For example,
      using namespace ihc;
      ac_int<5,true> myAcInt1 = ...;
      auto myFloat = ap_float<8,23>(myAcInt1);
      auto myAcInt2 = myFloat.to_ac_int<5,true>();
  • To convert between ac_fixed data types and ap_float, cast to or from the data types. Type conversion occurs according to the rules mentioned in Table 1. For example,
    using namespace ihc;
    ac_fixed<8,7> myAcFixed = ...;
    // use rounding rules defined by ap_float type
    auto myFloat2 = ap_float<5,10,fp_config::FP_Round::RZERO>(myAcFixed); 
    // use rounding & overflow rules defined by ac_fixed type
    auto myAcFixed2 = ac_fixed<10,0,true,AC_RND,AC_SAT>(myFloat2);
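
Table 1 describes conversion to and from float as a bit cast through ap_float<8, 23>, which works because the two share a 1 + 8 + 23 bit layout. The following standard C++ sketch shows what such a bit cast exposes. It assumes float is an IEEE 754 binary32 value on the host, and Binary32Fields and split_float are hypothetical names used only for this illustration.

```cpp
#include <cstdint>
#include <cstring>

// Fields of an IEEE 754 binary32 value (the layout assumed here for float).
struct Binary32Fields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, stored with a bias of 127
    std::uint32_t mantissa;  // 23 bits, the implicit leading 1 is not stored
};

// Copy the bits of a float unchanged into an integer (a bit cast), then
// split them into the sign, exponent, and mantissa fields.
Binary32Fields split_float(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return {bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

For example, split_float(1.0f) yields sign 0, exponent 127, and mantissa 0. No rounding happens at this step; any rounding occurs only in the subsequent ap_float to ap_float conversion.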

The Intel® oneAPI DPC++/C++ Compiler also provides some operations that leave the precision of input types untouched and provide control over the output precision. For more details, refer to Operations with Explicit Precision Controls.

Table 1. Default Conversion Rules for ap_float Variables

Data Type: ap_float with higher representable range
  • From ap_float to data type: Keep exponent equivalent. The mantissa is rounded according to the rounding mode of the target ap_float (with the higher representable range).
  • From data type to ap_float: ±Inf if the source of the conversion is out of the representable range. Otherwise, keep exponent equivalent. The mantissa is rounded according to the rounding mode of the target ap_float (with the smaller representable range).
    IMPORTANT: If the input number is in the subnormal range of the target ap_float number, the output value is flushed to zero.

Data Type: float
  • From ap_float to data type: Convert the original ap_float to ap_float<8, 23> using the ap_float to ap_float rules described previously, and then bit cast to float.
  • From data type to ap_float: Bit cast float to ap_float<8, 23>, and then convert to the target ap_float precision using the ap_float to ap_float rules described previously.

Data Type: double
  • From ap_float to data type: Convert the original ap_float to ap_float<11, 52> using the ap_float to ap_float rules described previously, and then bit cast to double.
  • From data type to ap_float: Bit cast double to ap_float<11, 52>, and then convert to the target ap_float precision using the ap_float to ap_float rules described previously.

Data Type: C++ native integer types
  • From ap_float to data type: Truncate towards zero. Converting an ap_float value that is larger than the range of the integer type is undefined behavior.
  • From data type to ap_float: Round to the nearest, ties break to even. If the integer value is too large, the ap_float value saturates to plus infinity.

Data Type: ac_int
  • From ap_float to data type: Truncate towards zero. Converting an ap_float value that is larger than the range of the integer type is undefined behavior. ac_int width in bits: W ≤ 64
  • From data type to ap_float: Round to the nearest, ties break to even. If the integer value is too large, the ap_float value saturates to +∞ (plus infinity). ac_int width in bits: W ≤ 64

Data Type: ac_fixed
  • From ap_float to data type: Rounding and overflow mode depend on the type parameters of the target ac_fixed. Converting Inf/NaN to ac_fixed is undefined behavior.
  • From data type to ap_float: Rounding mode depends on the rounding mode type parameter of the target ap_float. ac_fixed width in bits: W ≤ 64
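
To get a feel for when a narrowing ap_float conversion overflows to infinity or flushes to zero, the following standard C++ sketch classifies a value against a target exponent and mantissa width. It is not the ap_float implementation. It assumes an IEEE 754-style encoding (bias of 2^(E-1) - 1, all-ones exponent reserved), which the source does not state for arbitrary widths, and it ignores rounding effects near the boundaries as well as NaN inputs. NarrowingOutcome and classify are hypothetical names.

```cpp
#include <cmath>

enum class NarrowingOutcome { Zero, Normal, Infinity };

// Classify the magnitude of `value` against a target format with the given
// exponent and mantissa widths.
// Assumption: IEEE 754-style bias and a reserved all-ones exponent.
NarrowingOutcome classify(double value, int exponent_bits, int mantissa_bits) {
    const int bias = (1 << (exponent_bits - 1)) - 1;
    // Largest finite magnitude: (2 - 2^-M) * 2^bias.
    const double largest_finite =
        std::ldexp(2.0 - std::ldexp(1.0, -mantissa_bits), bias);
    // Smallest normal magnitude: 2^(1 - bias).
    const double smallest_normal = std::ldexp(1.0, 1 - bias);

    const double magnitude = std::fabs(value);
    if (magnitude > largest_finite) {
        return NarrowingOutcome::Infinity;  // out of range: result is +-Inf
    }
    if (magnitude < smallest_normal) {
        return NarrowingOutcome::Zero;  // zero or subnormal range: flushed to zero
    }
    return NarrowingOutcome::Normal;  // exponent kept, mantissa rounded
}
```

Under these assumptions, a target with 5 exponent bits and 10 mantissa bits has a largest finite value of 65504 and a smallest normal value of 2^-14, so classify(70000.0, 5, 10) reports Infinity and classify(1e-5, 5, 10) reports Zero.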

Notes:

  • Avoid assigning the result of the convert_to function to another ap_float variable. If the left-hand side of the assignment has a different exponent or mantissa width than the one specified in the convert_to call on the right-hand side, a second conversion can occur.
  • Converting between floating-point data types and integer or fixed-point data types on FPGA devices can be expensive in terms of area.
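
The numeric behavior of the integer rows in Table 1 can be observed with native C++ types. The sketch below uses float and a 32-bit integer as stand-ins and does not use ap_float itself. It assumes float is IEEE 754 binary32 and that the default rounding mode (round to nearest, ties to even) is active; the function names are hypothetical.

```cpp
#include <cstdint>

// Floating point to integer: the fractional part is discarded, which
// truncates toward zero. The result is undefined if the truncated value
// does not fit in the integer type.
std::int32_t to_int_truncating(float value) {
    return static_cast<std::int32_t>(value);
}

// Integer to floating point: an integer that is not exactly representable is
// rounded. Assumption: binary32 float with round to nearest, ties to even.
float to_float_nearest_even(std::int32_t value) {
    return static_cast<float>(value);
}
```

For example, to_int_truncating(-2.9f) returns -2 rather than -3. Under the stated assumptions, 16777217 (2^24 + 1) lies exactly halfway between two representable float values and converts to 16777216.0f, the neighbor with an even mantissa.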