IEEE Floating-point Operations
Understanding the IEEE Standard for Floating-point Arithmetic, IEEE 754-2008
- Signed Zero:The sign of zero is the same as the sign of a nonzero number. Comparisons consider +0 to be equal to -0. A signed zero is useful in certain numerical analysis algorithms, but in most applications the sign of zero is invisible.
- Denormalized Numbers:Denormalized numbers (denormals) fill the gap between the smallest positive and the smallest negative normalized number, otherwise only (+/-) 0 occurs in the interval. Denormalized numbers extend the range of computable results by allowing for gradual underflow.Systems based on the IA-32 architecture support a Denormal Operand status flag. When this is set, at least one of the input operands to a Floating-point operation is a denormal. The Underflow status flag is set when a number loses precision and becomes a denormal.
- Signed Infinity:Infinities are the result of arithmetic in the limiting case of operands with arbitrarily large magnitude. They provide a way to continue when an overflow occurs. The sign of an infinity is simply the sign you obtain for a finite number in the same operation as the finite number approaches an infinite value.By retrieving the status flags, you can differentiate between an infinity that results from an overflow and one that results from division by zero. The compiler treats infinity as signed by default. The output value of infinity is +Infinity or -Infinity.
- Not a Number:Not a Number (NaN) may result from an invalid operation. For example,0/0andSQRT(-1)result in NaN. In general, an operation involving a NaN produces another NaN. Because the fraction of a NaN is unspecified, there are many possible NaNsThe compiler treats all NaNs identically, but there are two classes of NaNs:
The floating-point hardware usually converts a signaling NaN into a quiet NaN during computational operations. An invalid exception is raised and the resulting Floating-point value is a quiet NaN.
- Signaling NaNs: Have an initial mantissa bit of 0. They usually raise an invalid exception when used in an operation.
- Quiet NaNs: Have an initial mantissa bit of 1.