Problem: Broken Call Tree
- A code region is duplicated.
- A code region is located at a wrong place.
- A code region has an incorrect number of trip counts reported in any column of the Trip Counts column group.
- A code region with your code has a System Module diagnostics message and a Cannot be modeled: System Module reason for not offloading.
Cause:
- Call stacks were detected incorrectly.
- A heavy optimization was used.
- Debug information has issues.
Solution:
- Make sure you compiled the binary with the -g option. You can recompile it with the -debug inline-debug-info option to get enhanced debug information.
- Recompile the binary with a lower optimization level: use -O2.
- If you collect performance metrics with the advisor CLI, try the following when running the Survey analysis:
  - Remove the --stackwalk-mode=online option if you used it.
- Offload only specific code regions if their estimated execution time on a target device is greater than or equal to the original execution time. Rerun the performance modeling with --select-loops to specify loops of interest and --enforce-offloads to make sure all of them are offloaded. For example:
    advisor-python <APM>/analyze.py <project-dir> --select-loops=[<file-name1>:<line-number1>,<file-name1>:<line-number2>,<file-name2>:<line-number3>] --enforce-offloads
  Replace <APM> with $APM on Linux* OS or %APM% on Windows* OS. For details, see Enforce Offloading for Specific Loops.
- If you model multithreaded code that runs with a complicated scheduler, you might see a code region with suspiciously low trip counts and multiple instances of the same loop present in the scheduler. This means that Offload Modeling could not correctly detect the call stacks. Use the --enable-batching option to artificially increase the number of trip counts by using the total number of executions instead of the average number of trip counts.
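
The --select-loops value in the rerun command above is easy to mistype when several loops are involved. The following is a minimal sketch, not part of the product, that assembles that command from a list of file and line pairs. The helper name build_analyze_command, the example paths, and the source file names are hypothetical; only the command layout and the two options come from the example in the text.

```python
def build_analyze_command(apm_dir, project_dir, loops):
    """Return the analyze.py command as an argument list.

    apm_dir     -- directory that contains analyze.py (the <APM> placeholder)
    project_dir -- project directory (the <project-dir> placeholder)
    loops       -- iterable of (file_name, line_number) pairs to offload
    """
    loops = list(loops)
    if not loops:
        raise ValueError("at least one loop is required")

    # Format each loop as <file-name>:<line-number> and join with commas,
    # matching the bracketed list syntax shown in the text.
    selection = ",".join(f"{name}:{line}" for name, line in loops)

    return [
        "advisor-python",
        f"{apm_dir}/analyze.py",
        project_dir,
        f"--select-loops=[{selection}]",
        "--enforce-offloads",
    ]


if __name__ == "__main__":
    # Hypothetical paths and source locations, for illustration only.
    command = build_analyze_command(
        "/path/to/apm",
        "./advi_results",
        [("main.cpp", 42), ("kernel.cpp", 17)],
    )
    print(" ".join(command))
```

Returning a list rather than a single string means the result can be handed to subprocess.run without going through a shell, so the square brackets in the --select-loops value are not subject to shell globbing.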