ID 683349
Date 3/28/2022
Public

## 8.5.1. Operators and Return Types Supported by the hls_float Data Type

The hls_float data type supports all overloaded math operators and a limited set of the math functions provided by the Intel® HLS Compiler Pro Edition. For some math operators, you can control the precision of the output by using templated versions of the functions.

Important: Due to the differences in the internal math implementations and rounding errors, the results from hls_float operations might not always be bit-accurate to those produced by C++ native floating-point types with the same exponent and mantissa bit widths. However, these results are validated against the infinitely accurate results.

### Supported Math Functions

In addition to supporting all overloaded math operators, the Intel® HLS Compiler supports the following additional math functions for the hls_float data type through the HLS/hls_float_math.h header file:
• Exponential and logarithmic functions*:
  • ln, log2, log10, ln(1+x)
  • e^x, 2^x, 10^x, e^x − 1
• reciprocal
• reciprocal_sqrt
• sqrt*
• cbrt (cube root)
• hypot (hypotenuse)
• Power functions*:
  • pow, powr, pown
• Trigonometric functions*:
  • sin, cos, sincos
  • sinpi, cospi
  • asin, asinpi, acos, acospi, atan, atanpi, atan2

### Conversion Rules

You can convert between different sizes of hls_float data types through assignment or by using the convert_to() function. For example,
hls_float<8, 32> myFloat = ...;
hls_float<3, 18> myFloat2 = myFloat; // use rounding rules defined by hls_float type
hls_float<3, 18> myFloat3 = myFloat.convert_to<3, 18, ihc::fp_config::FP_Round::RZERO>();
// use rounding rules defined in convert_to() function call
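
To make the effect of a rounding mode such as RZERO concrete, the following is a minimal sketch of narrowing a significand under two modes. It is plain C++ and does not use the HLS headers; the names `RoundMode` and `narrow_significand` are hypothetical, and round-to-nearest-even is shown only as a common point of comparison, not as a statement about which mode a given hls_float declaration uses.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not part of the HLS API.
enum class RoundMode { ToZero, NearestEven };

// Narrow an unsigned significand from from_bits to to_bits bits.
// Assumes the number of dropped bits is less than 64.
// The result can equal 1 << to_bits when rounding carries out; a full
// conversion would then increment the exponent (not modeled here).
inline std::uint64_t narrow_significand(std::uint64_t sig, int from_bits,
                                        int to_bits, RoundMode mode) {
    const int drop = from_bits - to_bits;            // low bits to discard
    if (drop <= 0) return sig << (to_bits - from_bits);  // widening is exact
    const std::uint64_t kept = sig >> drop;
    if (mode == RoundMode::ToZero) return kept;      // simply truncate
    const std::uint64_t rem  = sig & ((std::uint64_t{1} << drop) - 1);
    const std::uint64_t half = std::uint64_t{1} << (drop - 1);
    // Round up when above the halfway point, or exactly halfway and kept is odd.
    const bool round_up = rem > half || (rem == half && (kept & 1u) != 0);
    return kept + (round_up ? 1u : 0u);
}
```

For a 4-bit significand 1011 narrowed to 2 bits, truncation gives 10 while round-to-nearest-even gives 11, which is the kind of difference the rounding-mode argument of convert_to() selects between.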

To convert between native types (for example, float, double) and hls_float data types, assign to or from the types. Type conversion in an assignment occurs according to the rules in the Default Conversion Rules for hls_float Variables table that follows.

For two hls_float variables in a binary operation, the hls_float variable with the larger exponent bit-width is considered to be the "larger" variable. If the two variables have the same exponent bit width, the variable with the larger mantissa bit-width is considered to be the larger variable. The operands are then unified to the "larger" type before the binary operation occurs.
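
The selection rule above can be sketched as a compile-time type choice. This is an illustrative model in standard C++, not the compiler's implementation; `Format` and `Unified` are hypothetical names standing in for hls_float formats.

```cpp
#include <type_traits>

// Stand-in for an hls_float format: E exponent bits, M mantissa bits.
template <int E, int M>
struct Format {
    static constexpr int exponent_bits = E;
    static constexpr int mantissa_bits = M;
};

// Pick the "larger" format: wider exponent wins; on a tie, wider mantissa wins.
template <typename A, typename B>
using Unified = typename std::conditional<
    (A::exponent_bits > B::exponent_bits) ||
        (A::exponent_bits == B::exponent_bits &&
         A::mantissa_bits >= B::mantissa_bits),
    A, B>::type;

// A wider exponent wins even when the other operand has more mantissa bits.
static_assert(std::is_same<Unified<Format<8, 23>, Format<5, 30>>,
                           Format<8, 23>>::value,
              "exponent width is compared first");
```

One consequence of the rule as stated is that an operand with more mantissa bits but a narrower exponent can lose mantissa precision when it is unified.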

Native floating point data types and hls_float data types are converted to hls_float data types according to the rules in the Default Conversion Rules for hls_float Variables table that follows.

The Intel® HLS Compiler also provides some operations that leave the precision of input types untouched and provide control over the output precision. For details, see Operations With Explicit Precision Controls.

Table 21.  Default Conversion Rules for hls_float Variables

| Data Type | From hls_float to Data Type | From Data Type to hls_float |
|---|---|---|
| hls_float with higher representable range | Keep exponent equivalent. The mantissa is rounded according to the rounding mode of the target hls_float (with the higher representable range). | ±Inf if the source of the conversion is out of the representable range. Otherwise, keep exponent equivalent. The mantissa is rounded according to the rounding mode of the target hls_float (with the smaller representable range). |
| float | Convert the original hls_float to hls_float<8, 23> with the earlier hls_float rule, then bit-cast to float. | Bit-cast float to hls_float<8, 23>, and then convert to the target hls_float precision using the hls_float to hls_float rules described earlier. |
| double | Convert the original hls_float to hls_float<11, 52> with the earlier hls_float rule, then bit-cast to double. | Bit-cast double to hls_float<11, 52>, and then convert to the target hls_float precision using the hls_float to hls_float rules described earlier. |
| long double (emulation only) (Linux only) | Convert the original hls_float to hls_float<15, 63> with the earlier hls_float rule, then insert an explicit 1 bit at the MSB of the fraction bits to get an approximate equivalent of the 80-bit representation of long double. | Drop the explicit 1 fraction bit to convert long double to the 79-bit hls_float<15, 63>. |
| long double (emulation only) (Windows only) | Same as double. | Same as double. |
| C++ native integer types | Truncate towards zero. Converting from an hls_float value that is larger than the range of the integer type is undefined behavior. | Round to nearest, ties to even. If the integer value is too large, the hls_float value saturates to plus infinity. |
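
The integer rules in the table can be illustrated with native float, which the table treats as bit-equivalent to hls_float<8, 23>. This is a sketch in plain C++, assuming an IEEE-754 platform in its default rounding mode; `to_int` and `to_float` are hypothetical helper names, and the sketch does not exercise the hls_float type itself.

```cpp
#include <cstdint>

// float -> integer: the fractional part is discarded (truncate toward zero).
inline std::int32_t to_int(float x) { return static_cast<std::int32_t>(x); }

// integer -> float: an integer that needs more than 24 significant bits
// cannot be stored exactly. On IEEE-754 platforms in the default rounding
// mode, the conversion rounds to nearest with ties going to the even value.
inline float to_float(std::int32_t n) { return static_cast<float>(n); }
```

For example, 2.7 and -2.7 truncate to 2 and -2, while 16777217 (2^24 + 1) lies exactly between two representable floats and rounds to the even neighbour 16777216.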

### Operations With Explicit Precision Controls

The Intel® HLS Compiler provides the following operations that leave the precision of input hls_float-type variables untouched and let you control the output precision:

Rounding Mode Control For hls_float to hls_float Conversions
Syntax
convert_to<output_exponent_width, output_mantissa_width, rounding_mode>
Description

Use this method to override the rounding mode set for an hls_float variable when you convert the variable to a different precision.

By default, hls_float to hls_float conversions use the rounding mode that you specified when you declared the variable.

Multiplication
Syntax
ihc::hls_float< output_exponent_width, output_mantissa_width > ::mul <[accuracy_setting], [subnormal_setting]> (hls_float_a, hls_float_b)
Where the optional parameters are defined as follows:
subnormal_setting
Optional parameter that specifies whether input and output numbers in the subnormal range are flushed to zero when the operation is carried out.
Set this parameter with one of the following values:
• ihc::fp_config::FP_Subnormal::ON

Input and output numbers in the subnormal range are preserved.

The target FPGA device must have subnormal support.

Subnormal support might require more FPGA area.

• ihc::fp_config::FP_Subnormal::OFF

Input or output numbers in the subnormal range are flushed to zero.

• ihc::fp_config::FP_Subnormal::AUTO

With this setting, the Intel® HLS Compiler enables subnormal support only when it is directly supported by the target FPGA device and it does not incur any extra FPGA area overhead.

If you do not set this parameter, the Intel® HLS Compiler uses the ihc::fp_config::FP_Subnormal::AUTO subnormal setting.
accuracy_setting
Optional parameter that influences trade-offs between the accuracy of the result due to different rounding decisions in the intermediary calculations and the FPGA area utilized by the generated hardware. Floating-point operations with less accurate results typically use fewer logic elements.

For example, a divider with high accuracy might use 20% more FPGA area than a divider with low accuracy. The low-accuracy divider has a higher error bound [1 unit of least precision (ULP)] than the high-accuracy divider (0.5 ULP).

Set this parameter with one of the following values:
• ihc::fp_config::FP_Accuracy::LOW
• ihc::fp_config::FP_Accuracy::HIGH
If you do not set this parameter, the Intel® HLS Compiler uses the ihc::fp_config::FP_Accuracy::HIGH accuracy setting.
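
The flush-to-zero behavior described for the subnormal_setting parameter can be modeled on a native float. This is only an illustrative sketch in standard C++: `flush_subnormal` is a hypothetical helper, and preserving the sign of the flushed zero is an assumption that the text above does not confirm.

```cpp
#include <cmath>

// Model of flushing a subnormal value to zero on a native float.
// Assumption: the sign of the input is kept on the resulting zero.
inline float flush_subnormal(float x) {
    return std::fpclassify(x) == FP_SUBNORMAL ? std::copysign(0.0f, x) : x;
}
```

With this model, a value such as 1e-40f (below the smallest normal float) becomes zero, while normal values pass through unchanged.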
Description

This math function supplements the basic multiplication operation performed by the multiplication (*) operator.

Multiplies hls_float_a and hls_float_b without changing the input types, and outputs an hls_float at the specified precision.
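
The error bounds quoted for accuracy_setting (0.5 ULP versus 1 ULP) can be made concrete by measuring a result against a higher-precision reference. The following sketch uses native float and double as stand-ins; `ulp_of` and `ulp_error` are hypothetical helpers and are not part of the HLS headers.

```cpp
#include <cmath>

// Size of one ULP at the magnitude of x (x must be finite and nonzero).
inline double ulp_of(float x) {
    const float ax = std::fabs(x);
    return static_cast<double>(std::nextafter(ax, INFINITY)) -
           static_cast<double>(ax);
}

// Error of result against a higher-precision reference, in ULPs of result.
inline double ulp_error(float result, double reference) {
    return std::fabs(static_cast<double>(result) - reference) / ulp_of(result);
}
```

A correctly rounded operation, such as native IEEE-754 float division, stays within 0.5 ULP under this measure; a 1 ULP bound permits results up to twice as far from the reference.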

Addition, Subtraction, and Division
Syntax
ihc::hls_float< output_exponent_width, output_mantissa_width > ::add <[optional parameters]> (hls_float_a, hls_float_b)

ihc::hls_float< output_exponent_width, output_mantissa_width > ::sub <[optional parameters]> (hls_float_a, hls_float_b)

ihc::hls_float< output_exponent_width, output_mantissa_width > ::div <[optional parameters]> (hls_float_a, hls_float_b)

Description

These math functions supplement the basic math operations performed by the addition, subtraction, and division (+, -, /) operators.

Adds, subtracts, or divides hls_float_a and hls_float_b by first casting hls_float_a and hls_float_b to the specified hls_float precision. The operation and output are at the specified precision.

You can also specify the optional parameters that are the accuracy_setting and subnormal_setting parameters described earlier.
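
The descriptions above distinguish two behaviors: add, sub, and div cast their inputs to the output precision first, while mul leaves the input types unchanged. The sketch below shows why that distinction can change results, using native double inputs and a float output as stand-ins. The function names are hypothetical, and the code models the idea rather than the compiler's hardware implementation.

```cpp
// Inputs are double (wider), the output is float (narrower).

// add/sub/div style: cast both inputs first, then operate at output precision.
inline float add_cast_first(double a, double b) {
    const float fa = static_cast<float>(a);
    const float fb = static_cast<float>(b);
    return fa + fb;
}

// mul style: the product is formed from the unmodified inputs,
// and only the result is rounded to the output precision.
inline float mul_inputs_untouched(double a, double b) {
    return static_cast<float>(a * b);
}

// For contrast only: multiplication with the inputs cast first.
inline float mul_cast_first(double a, double b) {
    return static_cast<float>(a) * static_cast<float>(b);
}
```

With a = 16777217 (2^24 + 1), which float cannot represent exactly, multiplying by 3 gives 50331652 when the inputs are left untouched but 50331648 when they are cast first, because the cast already discarded the low bit of a.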

### Comparison Operators

Comparison operators (>, <, ==, !=, >=, <=) are subject to the conversion rules described earlier.

The == and != operators perform a bitwise comparison of the converted values.

Comparisons with NaN always return false.
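
A simple model of the two statements above (bitwise comparison, and NaN comparisons returning false) is sketched here on native float. The helper names are hypothetical. Note that under a purely bitwise model +0 and -0 compare unequal; whether hls_float treats signed zeros specially is not stated in this section, so treat that outcome as a property of the model only.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Raw bit pattern of a float, obtained without violating aliasing rules.
inline std::uint32_t bits_of(float x) {
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// Model of ==: false if either operand is NaN, otherwise compare raw bits.
inline bool model_equal(float a, float b) {
    if (std::isnan(a) || std::isnan(b)) return false;
    return bits_of(a) == bits_of(b);
}
```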

The hls_float data type also has the following additional functions:
Table 22.

| Category | Function | Description |
|---|---|---|
| Getters and Setters | hls_float::get_exponent, hls_float::set_exponent | Gets/sets the exponent value of the hls_float variable. |
| Getters and Setters | hls_float::get_mantissa, hls_float::set_mantissa | Gets/sets the mantissa value of the hls_float variable. |
| Getters and Setters | hls_float::get_sign, hls_float::set_sign | Gets/sets the sign bit of the hls_float variable. |
| Special Constants | hls_float<e,m>::nan() | Constant used to assign the hls_float variable a value of NaN. |
| Special Constants | hls_float<e,m>::pos_inf() | Constant used to assign the hls_float variable a value of +∞. |
| Special Constants | hls_float<e,m>::neg_inf() | Constant used to assign the hls_float variable a value of −∞. |
| Value Queries | hls_float::is_nan() | Returns true if the value of the hls_float variable is NaN. |
| Value Queries | hls_float::is_inf() | Returns true if the value of the hls_float variable is ±∞. |
| Value Queries | hls_float::is_zero() | Returns true if the value of the hls_float variable is zero. |
| Special Functions | hls_float::next_after(next_val) | Returns the next representable value towards next_val. |
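
As a rough native-type analogy for the functions in this table, the sketch below splits a float (bit-equivalent to hls_float<8, 23> per the conversion rules) into sign, exponent, and mantissa fields and steps to the next representable value. All names are hypothetical. Whether the hls_float getters return exactly these raw, biased fields is an assumption that this section does not confirm.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

struct Fields {
    unsigned sign;           // 1 bit
    unsigned exponent;       // 8 bits, biased (raw field)
    std::uint32_t mantissa;  // 23 bits, without the implicit leading 1
};

// Split an IEEE-754 single into its raw fields.
inline Fields split(float x) {
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return Fields{static_cast<unsigned>(u >> 31),
                  static_cast<unsigned>((u >> 23) & 0xFFu),
                  u & 0x7FFFFFu};
}

// Native analog of stepping to the next representable value toward target.
inline float next_toward(float x, float target) {
    return std::nextafter(x, target);
}

// Native analog of a positive-infinity constant.
inline float pos_inf() { return std::numeric_limits<float>::infinity(); }
```

For 1.0f the raw fields are sign 0, exponent 127, and mantissa 0, and the next value toward 2.0f is 1.0f plus the float machine epsilon.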
* Not supported for hls_float<15,63> precision variables. This note applies to the function groups marked in the Supported Math Functions list.