Analyze Vectorization Perspective
Run the Vectorization and Code Insights perspective to improve your application performance, get code-specific recommendations for fixing vectorization issues, and quickly view the related source code and assembly code.
The Vectorization and Code Insights perspective can help you to identify:
- Where vectorization, or parallelization with threads, will pay off the most
- Whether vectorized loops provide a benefit and, if not, why not
- Which loops are not vectorized, and why
- Memory usage issues
- Performance insights and problems in general
How It Works
The Vectorization and Code Insights perspective includes the following steps:
- Get integrated compiler report data and performance data by running a Survey analysis.
- Identify the number of times loops are invoked and executed, and the number of floating-point and integer operations, by running the Characterization analysis. It measures the call count/loop count and iteration count metrics for your application. Enable it to make better decisions about your vectorization strategy for particular loops and to optimize already-vectorized loops.
- Check for various memory issues by running the Memory Access Patterns (MAP) analysis. It can warn you about non-contiguous memory accesses, unit stride vs. non-unit stride accesses, and other issues. Enable it to identify issues that could significantly slow down vector code execution or block automatic vectorization by the compiler.
- Check for data dependencies in loops the compiler did not vectorize by running the Dependencies analysis. It checks for real data dependencies and, if it detects any, provides additional details to help you resolve them. Choose it to identify and better characterize real data dependencies that could make forced vectorization unsafe.
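To make the terms used in the Memory Access Patterns and Dependencies steps concrete, here is a minimal C sketch of three loop shapes: a unit-stride loop, a constant non-unit-stride loop, and a loop with a real loop-carried dependency. The function names are hypothetical and chosen only for illustration; they are not part of Intel Advisor, and how a given compiler or analysis reports each loop depends on the compiler, its options, and the target hardware.

```c
#include <stddef.h>

/* Unit-stride access: iteration i touches element i, so consecutive
 * iterations read and write adjacent memory locations. */
void scale_unit_stride(float *dst, const float *src, size_t n, float k)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = k * src[i];
}

/* Constant non-unit stride: walking one column of a row-major matrix
 * advances the address by `cols` elements on every iteration. */
float sum_column(const float *matrix, size_t rows, size_t cols, size_t col)
{
    float sum = 0.0f;
    for (size_t r = 0; r < rows; ++r)
        sum += matrix[r * cols + col];
    return sum;
}

/* Real loop-carried (read-after-write) dependency: iteration i reads
 * a[i - 1], which iteration i - 1 has just written. Forcing naive
 * vectorization of this loop would change its result. */
void prefix_sum(float *a, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        a[i] += a[i - 1];
}
```

Calling `sum_column` on a 2x3 matrix with `col = 1` reads elements 1 and 4 of the underlying array, which is the kind of strided pattern a memory access analysis is meant to surface. `prefix_sum` is the kind of loop where a dependency check matters before applying any forced-vectorization directive.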
The Vectorization and Code Insights perspective collects data about your application performance, including the following:
- Performance metrics, including vectorization efficiency for the whole application and for each vectorized loop/function
- Top five time-consuming loops sorted by self time
- Integrated compiler report data and code-specific recommendations for fixing performance issues