Visible to Intel only — GUID: GUID-3A0BD42E-B6C7-469F-8F67-BB612828BEE5
Declare the ap_float Data Type
The ap_float.hpp header file provides support for arbitrary-precision floating-point numbers. The floating-point representation for ap_float data types adopts the same data layout as the IEEE 754 floating-point representation.
An ap_float variable carries an explicit sign bit and an arbitrary number of bits for the exponent and mantissa. Due to the differences in the internal math implementations and rounding errors, the results from ap_float operations might not always be bit-accurate compared to those produced by C++ native floating-point types with the same exponent and mantissa bit widths.
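The layout described above can be sketched in plain C++ without the ap_float.hpp header. The `FloatLayout` helper below is hypothetical and only does the arithmetic: one sign bit plus the exponent and mantissa widths gives the total width. The bias formula is the IEEE 754 convention and is assumed here to carry over, since ap_float adopts the IEEE 754 data layout.

```cpp
#include <cstdint>

// Hypothetical helper, not part of ap_float.hpp: describes a format with
// one explicit sign bit, E exponent bits, and M mantissa bits.
template <int E, int M>
struct FloatLayout {
  static constexpr int exponent_bits = E;
  static constexpr int mantissa_bits = M;
  // Total storage width: sign + exponent + mantissa.
  static constexpr int total_bits = 1 + E + M;
  // IEEE 754 exponent bias, 2^(E-1) - 1 (assumed to apply to ap_float too).
  static constexpr std::int64_t exponent_bias =
      (std::int64_t{1} << (E - 1)) - 1;
};

// Widths that correspond to well-known formats.
static_assert(FloatLayout<5, 10>::total_bits == 16, "binary16 is 16 bits");
static_assert(FloatLayout<8, 7>::total_bits == 16, "bfloat16 is 16 bits");
static_assert(FloatLayout<8, 23>::total_bits == 32, "binary32 is 32 bits");
static_assert(FloatLayout<11, 52>::total_bits == 64, "binary64 is 64 bits");
static_assert(FloatLayout<8, 23>::exponent_bias == 127, "binary32 bias");
static_assert(FloatLayout<11, 52>::exponent_bias == 1023, "binary64 bias");
```

For example, `FloatLayout<8, 23>` mirrors the widths you would pass to `ihc::ap_float<8, 23>`.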
Perform the following steps to declare the ap_float data type:
- Include the ap_float.hpp header file as follows:
#include <sycl/ext/intel/ac_types/ap_float.hpp>
- Declare your ap_float variables as follows:
ihc::ap_float<exponent_width, mantissa_width[,rounding_mode]>
Where the template parameters are defined as follows:
- exponent_width, mantissa_width: The bit-width of the exponent and mantissa of the floating-point variable.
The ap_float.hpp header file also provides aliases to declare bfloat16 and bfloat19 data types directly. The ap_float data type supports the following exponent_width, mantissa_width combinations:
Exponent- and Mantissa-Width Combinations Supported by the ap_float Data Type

exponent_width | mantissa_width |
---|---|
5 | 10 |
8 | 7 |
8 | 10 |
8 | 17 |
8 | 23 |
8 | 26 |
10 | 35 |
11 | 44 |
11 | 52 |
15 | 63 |

Some of these width combinations map to commonly used floating-point formats, as listed in the following table:
Exponent- and Mantissa-Width Setting for Various Floating-point Formats

Floating-point Format | exponent_width, mantissa_width Setting |
---|---|
IEEE 754 half-precision (binary16) | 5, 10 |
bfloat16 ¹ | 8, 7 |
bfloat19 ² | 8, 10 |
IEEE 754 single-precision (binary32) ³ | 8, 23 |
IEEE 754 double-precision (binary64) ⁴ | 11, 52 |
80-bit extended precision ⁵ | 15, 63 |

- rounding_mode: Optional parameter to specify the IEEE 754 rounding mode used when converting between data types. Set the rounding mode with one of the following values:
Rounding Mode Values

Rounding Mode | Description |
---|---|
ihc::fp_config::FP_Round::RNE | Round to the nearest, tie break to even. This rounding mode is more accurate (0.5 ULP) but requires more FPGA area. |
ihc::fp_config::FP_Round::RZERO | Round towards zero. This rounding mode is less accurate (1 ULP) and requires less FPGA area. |
NOTE: If you do not set the rounding_mode parameter, the ihc::fp_config::FP_Round::RNE rounding mode is used by default.
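To illustrate how the two rounding modes differ during a conversion, the following plain C++ sketch narrows a value with an 8-bit exponent and 23-bit mantissa to an 8-bit exponent and 7-bit mantissa by working directly on bit patterns. It does not use ap_float.hpp and is not how ap_float is implemented. The helpers `narrow_rzero` and `narrow_rne` are hypothetical names, and the sketch assumes finite inputs (NaN handling is omitted).

```cpp
#include <cstdint>

// Hypothetical helpers, not part of ap_float.hpp. Each takes the raw 32-bit
// pattern of a finite (8, 23) value and returns the 16-bit pattern of the
// (8, 7) result.

// Round towards zero: discard the 16 low mantissa bits.
constexpr std::uint16_t narrow_rzero(std::uint32_t bits) {
  return static_cast<std::uint16_t>(bits >> 16);
}

// Round to nearest, ties to even: add just under half of the kept LSB,
// plus one more when the kept LSB is odd, then discard the low bits.
constexpr std::uint16_t narrow_rne(std::uint32_t bits) {
  return static_cast<std::uint16_t>(
      (bits + 0x7FFFu + ((bits >> 16) & 1u)) >> 16);
}

// 0x3F808000 is 1 + 2^-8, exactly halfway between two (8, 7) values.
static_assert(narrow_rzero(0x3F808000u) == 0x3F80, "truncates to 1.0");
static_assert(narrow_rne(0x3F808000u) == 0x3F80, "tie goes to the even LSB");

// 0x3F818000 is also a tie, but the kept LSB is odd, so RNE rounds up.
static_assert(narrow_rzero(0x3F818000u) == 0x3F81, "truncates");
static_assert(narrow_rne(0x3F818000u) == 0x3F82, "tie goes to the even LSB");

// 0x3F80C000 is above the halfway point: RNE rounds up, RZERO does not.
static_assert(narrow_rzero(0x3F80C000u) == 0x3F80, "truncates");
static_assert(narrow_rne(0x3F80C000u) == 0x3F81, "rounds to nearest");
```

Truncation can be off by almost one unit in the last place, while round-to-nearest-even is off by at most half a unit, which matches the 1 ULP and 0.5 ULP figures given for RZERO and RNE.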
Math Functions Supported by ap_float Data Type
The ap_float data type supports all overloaded math operators and a limited set of the math functions provided by the Intel® oneAPI DPC++/C++ Compiler. For some math operators, you can control the output's precision by using templated versions of the functions.
Due to differences in the internal math implementations and rounding errors, the results from ap_float operations might not always be bit-accurate when compared to those produced by C++ native floating-point types with the same exponent and mantissa bit widths. However, these results are validated against infinitely precise reference results.
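Because results might not be bit-accurate, a testbench that compares against native floating-point results may want a tolerance rather than an exact match. The following plain C++ sketch measures the distance between two single-precision bit patterns in units in the last place (ULPs). The helper names are hypothetical, the sketch assumes finite inputs, and the acceptable tolerance is application-specific.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of ap_float.hpp or ap_float_math.hpp.

// Map a sign-magnitude binary32 pattern to a signed integer so that
// adjacent representable values differ by exactly 1 (+0 and -0 both map to 0).
constexpr std::int64_t ordered_key(std::uint32_t bits) {
  return (bits & 0x80000000u)
             ? -static_cast<std::int64_t>(bits & 0x7FFFFFFFu)
             : static_cast<std::int64_t>(bits);
}

// Number of representable values between two finite binary32 patterns.
constexpr std::int64_t ulp_distance(std::uint32_t a, std::uint32_t b) {
  return ordered_key(a) > ordered_key(b) ? ordered_key(a) - ordered_key(b)
                                         : ordered_key(b) - ordered_key(a);
}

// True when the two patterns are at most max_ulps apart.
constexpr bool within_ulps(std::uint32_t a, std::uint32_t b,
                           std::int64_t max_ulps) {
  return ulp_distance(a, b) <= max_ulps;
}

// 1.0f and the next representable value above it are 1 ULP apart.
static_assert(ulp_distance(0x3F800000u, 0x3F800001u) == 1, "adjacent values");
// +0 and -0 compare as equal.
static_assert(ulp_distance(0x00000000u, 0x80000000u) == 0, "signed zeros");
```

A check such as `within_ulps(expected_bits, actual_bits, 1)` then accepts results that differ only in the last mantissa bit.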
The following additional math functions are supported through the ap_float_math.hpp header file:
Function Type | Math Function | Comment |
---|---|---|
Exponential and logarithmic functions | | Supported only for ap_float data types with exponent width less than or equal to 15 bits and mantissa width less than or equal to 63 bits. |
Exponential and logarithmic functions | | Supported only for ap_float data types with exponent width less than or equal to 11 bits and mantissa width less than or equal to 52 bits. |
Advanced functions | | |
Power functions | | |
Trigonometric functions | | |
1 You can also declare this format directly as an ihc::bfloat16 data type.
2 You can also declare this format directly as an ihc::bfloat19 data type.
3 You can also declare this format directly as an ihc::FPsingle data type.
4 You can also declare this format directly as an ihc::FPdouble data type.
5 Not a bit-to-bit mapping. The integer part (bit 63) of the 80-bit extended precision value is dropped when converting it to ap_float<15,63>.