• 2019 Update 4
  • 03/20/2019

Considering native_ and half_ Versions of Math Built-Ins

The OpenCL™ API offers two basic ways to trade precision for speed:
  • native_* and half_* math built-ins, which have lower precision but are faster than their un-prefixed variants
  • Compiler optimization options that enable floating-point arithmetic optimizations for the whole OpenCL program (for example, the -cl-fast-relaxed-math flag).
For the list of other compiler options and their descriptions, refer to the Intel® Code Builder for OpenCL™ API - User Manual. In general, while the -cl-fast-relaxed-math flag is a quick way to get potentially large performance gains for kernels with many math operations, it does not permit fine control of numeric accuracy. Consider experimenting with native_* equivalents separately for each specific case, keeping track of the resulting accuracy.
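One way to keep track of the accuracy is to run the same computation once with the un-prefixed built-ins and once with the native_* ones, read both result buffers back, and compare them on the host. The C sketch below is a hypothetical helper, not part of any Intel or OpenCL API; the names ordered_bits, ulp_distance, and max_ulp_error are made up for illustration. It counts how many representable float values lie between each pair of results. This is a simple proxy for the error in ULPs, measured against the reference buffer rather than against the exact mathematical result.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map a float's bit pattern to an integer that grows monotonically with
   the float's value, so that adjacent floats differ by exactly 1. */
static int64_t ordered_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u); /* well-defined way to read the bits */
    return (u & 0x80000000u) ? -(int64_t)(u & 0x7fffffffu) : (int64_t)u;
}

/* Number of representable float steps between a and b (non-NaN inputs). */
static int64_t ulp_distance(float a, float b)
{
    int64_t d = ordered_bits(a) - ordered_bits(b);
    return d < 0 ? -d : d;
}

/* Largest step distance between a reference buffer (un-prefixed built-ins)
   and a test buffer (native_* built-ins). Returns -1 if a NaN is found. */
static int64_t max_ulp_error(const float *ref, const float *test, size_t n)
{
    assert(n == 0 || (ref != NULL && test != NULL));
    int64_t worst = 0;
    for (size_t i = 0; i < n; ++i) {
        if (isnan(ref[i]) || isnan(test[i]))
            return -1;
        int64_t d = ulp_distance(ref[i], test[i]);
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

A per-kernel threshold on the returned value can then decide whether the native_* variant is acceptable for that specific case.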
The native_ versions of math built-ins are generally supported in hardware and run substantially faster, while offering lower, implementation-defined accuracy. Use native trigonometry and transcendental functions, such as native_sin, native_cos, native_exp, or native_log, when performance is more important than precision.
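As an illustration of the per-call approach, the sketch below holds a small OpenCL C kernel as a string in host-side C, the way it is typically handed to clCreateProgramWithSource. The kernel name compare_math, its arguments, and the expression it evaluates are assumptions made for this example only. The kernel writes the same expression twice, once with the un-prefixed built-ins and once with their native_ counterparts, so both results can be read back and compared. For contrast, the whole-program alternative is shown as the options string that would be passed as the options argument of clBuildProgram.

```c
#include <string.h>

/* Hypothetical kernel source (OpenCL C). Only the second statement trades
   accuracy for speed; the first keeps the un-prefixed built-ins. */
static const char kernel_source[] =
    "__kernel void compare_math(__global const float *x,\n"
    "                           __global float *precise,\n"
    "                           __global float *fast)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    precise[i] = sin(x[i]) * exp(-x[i]);\n"
    "    fast[i]    = native_sin(x[i]) * native_exp(-x[i]);\n"
    "}\n";

/* Whole-program alternative: relaxes floating-point math for every kernel
   in the program, without per-call control. */
static const char relaxed_build_options[] = "-cl-fast-relaxed-math";
```

Switching a single call between the two forms keeps the accuracy impact local to that call, whereas the build option applies to every kernel in the program.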
The list of functions that have optimized versions is provided in the "Working with cl-fast-relaxed-math Flag" section of the OpenCL Code Builder - User’s Guide.
See Also
OpenCL™ Build and Linking Options chapter of the Intel® Code Builder for OpenCL™ API - User Manual
