Optimizer bugs can cause compiler crashes or incorrect results at run time. LLVM* offers a tool, Bisect, which lets the user determine which optimization pass introduces the issue. This article describes how to debug optimization issues in the Intel® oneAPI DPC++/C++ Compiler and the Intel® Fortran Compiler.
Steps to Investigate Optimizer Errors:
- Compile the application with optimizations enabled (-O2 or -O3) to observe the issue.
- Compile the application with the -O0 option (optimizations disabled) to make sure the issue is no longer reproducible. If the issue is still there, the optimizer is not the root cause, so please submit a ticket to the Online Service Center (OSC).
- Compile the application with the optimization level that reproduces the issue and add -mllvm -opt-bisect-limit=N, where N is an arbitrary starting number. N is the number of the last optimization pass that the compiler will execute; passes with higher numbers are disabled. Check the result.
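- If the issue is still reproducible, the offending pass is numbered N or lower, so retry with -opt-bisect-limit=M, where M is smaller than N. If the issue disappears, the offending pass is numbered higher than the limit, so retry with a larger value.
- Perform a binary search to find the smallest M for which the issue still reproduces while M-1 does not. Pass M is the one that introduces the issue; its name is printed in the "BISECT: running pass (M)" line of the compiler output.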
- Once the pass name is identified, submit a ticket to OSC. Please refer to How to Create a Support Request at the Online Service Center in case you need some guidance.
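The binary search in the steps above can be automated. Because a limit of N runs passes 1 through N, the search looks for the smallest limit at which the issue appears. The following Python sketch is only an illustration, not an Intel-provided tool: the names find_culprit_pass and crash_reproduces are hypothetical, and the sketch assumes that the issue does not reproduce at the known-good limit, does reproduce at the known-bad limit, and stays reproducible for every limit above the offending pass.

```python
import subprocess
from typing import Callable


def find_culprit_pass(reproduces: Callable[[int], bool], good: int, bad: int) -> int:
    """Return the smallest limit in (good, bad] for which the issue reproduces.

    Assumes reproduces(good) is False and reproduces(bad) is True.
    """
    if good >= bad:
        raise ValueError("good must be smaller than bad")
    while bad - good > 1:
        mid = (good + bad) // 2
        if reproduces(mid):
            bad = mid   # the offending pass is numbered mid or lower
        else:
            good = mid  # the offending pass is numbered higher than mid
    return bad


def crash_reproduces(limit: int) -> bool:
    """Hypothetical predicate for a compiler crash: a non-zero exit code
    from the compiler counts as 'reproduces'. The command line mirrors the
    one used in this article; adjust it for your own source file."""
    cmd = ["icx", "-mllvm", f"-opt-bisect-limit={limit}", "icx_bisect.cpp"]
    return subprocess.run(cmd, capture_output=True).returncode != 0
```

A call such as find_culprit_pass(crash_reproduces, good, bad) would then return the number of the pass to report, where good and bad are limits you have already checked by hand. For incorrect results rather than a crash, the predicate would instead build and run the program and compare its output with the -O0 result.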
A few examples will be discussed below. Source code is not provided as the issues have already been fixed and are no longer reproducible with the latest compiler. However, a demonstration of the output and how to enable Bisect may be helpful.
Debugging C++ and Fortran Optimization Issues
The example below is a C++ CPU application; the same steps apply to Fortran. It demonstrates how to deal with a compiler crash that is reproducible at the -O2 level only (the level icx uses by default). The compiler gives some hints and provides the path to the reproducer, which is expected to be attached to the bug report.
$ icx icx_bisect.cpp
fatal error: error in backend: cannot lower memory intrinsic in address space 256
clang-12: error: clang frontend command failed with exit code 70 (use -v to see invocation)
<…>
clang-12: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-12: note: diagnostic msg: /tmp/icx_bisect-910797.cpp
clang-12: note: diagnostic msg: /tmp/icx_bisect-910797.sh
The issue is not reproducible with -O0. After some experiments with -opt-bisect-limit, we found that 41 is the pass where the crash happens:
$ icx -mllvm -opt-bisect-limit=41 icx_bisect.cpp
BISECT: running pass (1) Simplify the CFG on function (_Z2f1PU5AS256hm)
BISECT: running pass (2) SROA on function (_Z2f1PU5AS256hm)
BISECT: running pass (3) Early CSE on function (_Z2f1PU5AS256hm)
…
BISECT: running pass (41) Induction Variable Simplification on loop
BISECT: NOT running pass (42) Recognize loop idioms on loop
BISECT: NOT running pass (43) Delete dead loops on loop
BISECT: NOT running pass (44) SROA on function (_Z2f1PU5AS256hm)
BISECT: NOT running pass (45) MergedLoadStoreMotion on function (_Z2f1PU5AS256hm)
…
BISECT: NOT running pass (168) CodeGen Prepare on function (main)
BISECT: NOT running pass (169) X86 DAG->DAG Instruction Selection on function (_Z2f1PU5AS256hm)
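The pass name for the ticket can be read directly from the log, but with long logs it may be convenient to extract it with a script. The sketch below is a hypothetical helper (last_running_pass is not part of any Intel or LLVM tool) that assumes the log lines have exactly the "BISECT: running pass (N) description" shape shown above and that the compiler output has already been captured into a string.

```python
import re
from typing import Optional, Tuple

# Matches both "running" and "NOT running" lines as printed in the log above.
_BISECT_LINE = re.compile(r"^BISECT: (NOT )?running pass \((\d+)\) (.*)$")


def last_running_pass(log_text: str) -> Optional[Tuple[int, str]]:
    """Return (number, description) of the highest-numbered pass that ran,
    or None if the text contains no 'BISECT: running pass' lines."""
    best = None
    for line in log_text.splitlines():
        match = _BISECT_LINE.match(line.strip())
        if match is None or match.group(1) is not None:
            continue  # not a BISECT line, or a "NOT running" line
        number = int(match.group(2))
        if best is None or number > best[0]:
            best = (number, match.group(3))
    return best
```

Applied to the log above, the helper would report pass 41 together with its description, which is the information to include in the ticket.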
Debugging C++ and Fortran OpenMP* Offload Optimization Issues
The following example demonstrates how to deal with incorrect results that are reproducible at the -O3 level only. This is a heterogeneous application containing OpenMP* Offload pragmas, and the issue happens somewhere in the target code:
$ icpx -O3 -fiopenmp -fopenmp-targets=spir64 ./t.cpp
$ ./a.out
wrong results 7
$ icpx -O0 -fiopenmp -fopenmp-targets=spir64 ./t.cpp
$ ./a.out
28
To debug GPU optimizations, pass -mllvm -opt-bisect-limit as a quoted argument of the -fopenmp-targets option so that the limit applies to the target (device) compilation:
$ icpx -fiopenmp -fopenmp-targets=spir64="-mllvm -opt-bisect-limit=86" -O3 t.cpp && ./a.out
...
BISECT: running pass (85) Value Propagation on function (openmp.descriptor_reg)
BISECT: running pass (86) Aggressive Dead Code Elimination on function (openmp.descriptor_reg)
BISECT: NOT running pass (87) MemCpy Optimization on function (openmp.descriptor_reg)
...
wrong results 7
$ icpx -fiopenmp -fopenmp-targets=spir64="-mllvm -opt-bisect-limit=85" -O3 t.cpp && ./a.out
...
BISECT: running pass (84) Jump Threading on function (openmp.descriptor_reg)
BISECT: running pass (85) Value Propagation on function (openmp.descriptor_reg)
BISECT: NOT running pass (86) Aggressive Dead Code Elimination on function (openmp.descriptor_reg)
...
28
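The result is correct with a limit of 85 and wrong with a limit of 86, so pass 86 (Aggressive Dead Code Elimination) is the one that introduces the issue. This approach applies to Fortran OpenMP* Offload applications as well.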
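When the offload command is launched from a script rather than typed in a shell, the quoting needs some care: the quotes in the command above are shell syntax, so the whole -fopenmp-targets=... text, including the space, must reach the compiler as a single argument. The following Python sketch illustrates this under stated assumptions. The helper names are hypothetical, and the file name t.cpp, the ./a.out binary, and the expected output "28" are taken from the example above.

```python
import subprocess
from typing import List


def offload_bisect_argv(limit: int, source: str = "t.cpp",
                        opt_level: str = "-O3") -> List[str]:
    """Build the icpx argument list used in the example above.

    The device flags stay inside ONE list element, which is what the
    shell produces from -fopenmp-targets=spir64="-mllvm ...".
    """
    device_flags = f"-mllvm -opt-bisect-limit={limit}"
    return ["icpx", "-fiopenmp",
            f"-fopenmp-targets=spir64={device_flags}", opt_level, source]


def wrong_results(limit: int, expected: str = "28") -> bool:
    """Hypothetical predicate: compile with the given limit, run ./a.out,
    and report whether its output differs from the -O0 reference."""
    subprocess.run(offload_bisect_argv(limit), check=True, capture_output=True)
    run = subprocess.run(["./a.out"], capture_output=True, text=True)
    return run.stdout.strip() != expected
```

A predicate of this kind can be handed to any binary search over the limit; in the example above it would be true for a limit of 86 and false for 85.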
Debugging DPC++ Optimization Issues
The -opt-bisect-limit option is not yet supported for debugging GPU kernel optimizations. Users can debug host code optimizations as described above in Debugging C++ and Fortran Optimization Issues.
Notices and Disclaimers
Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.