Analyze Vectorization Perspective
Improve your application performance, get code-specific recommendations for fixing vectorization issues, and get quick visibility into source code and assembly code by running the Vectorization and Code Insights perspective.
The Vectorization and Code Insights perspective can help you identify:
- Where vectorization, or parallelization with threads, will pay off the most
- Whether vectorized loops provide a benefit, and if not, why not
- Which loops are not vectorized, and why
- Memory usage issues
- Performance insights and problems in general
How It Works
The Vectorization and Code Insights perspective includes the following steps:
- Get integrated compiler report data and performance data by running a Survey analysis.
- Identify how many times loops are invoked and executed, and how many floating-point and integer operations they perform, by running the Characterization analysis. It measures the call count/loop count and iteration count metrics for your application. Enable it to make better decisions about your vectorization strategy for particular loops and to optimize already-vectorized loops.
- Check for various memory issues by running the Memory Access Patterns (MAP) analysis. It can warn you about non-contiguous memory accesses, unit-stride vs. non-unit-stride accesses, and other issues. Enable it to identify issues that could significantly slow down vector code execution or block automatic vectorization by the compiler.
- Check for data dependencies in loops the compiler did not vectorize by running the Dependencies analysis. It checks for real data dependencies and, if any are detected, provides additional details to help resolve them. Choose it to identify and better characterize real data dependencies that could make forced vectorization unsafe.
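To make the terms in the steps above concrete, here is a minimal C sketch of the loop shapes that the Memory Access Patterns and Dependencies analyses are meant to distinguish. The function names are hypothetical and chosen only for illustration; they are not part of the tool or of any sample it ships with.

```c
#include <stddef.h>

/* Unit-stride access: consecutive iterations touch adjacent elements.
   `restrict` tells the compiler that dst and src do not overlap, which
   removes an assumed dependency that could otherwise block vectorization. */
void scale_unit_stride(float *restrict dst, const float *restrict src,
                       size_t n, float k)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = k * src[i];
}

/* Non-unit (constant) stride: each iteration skips ahead by `stride`
   elements, so vector loads cannot read one contiguous block. */
float sum_strided(const float *src, size_t n, size_t stride)
{
    float total = 0.0f;
    if (stride == 0)            /* guard against an endless loop */
        return total;
    for (size_t i = 0; i < n; i += stride)
        total += src[i];
    return total;
}

/* Real loop-carried dependency: a[i] needs the value of a[i - 1] written
   by the previous iteration, so forcing vectorization of this loop as
   written would be unsafe. */
void prefix_sum(float *a, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        a[i] += a[i - 1];
}
```

The first loop is the kind a compiler typically vectorizes without help. The second is a candidate for a stride warning, and the third is a case where a dependency check would be expected to confirm that the dependency is real rather than merely assumed.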
Vectorization Summary
The Vectorization and Code Insights perspective collects data about your application performance, including the following:
- Performance metrics, including vectorization efficiency for the whole application and for each vectorized loop/function
- Top five time-consuming loops sorted by self time
- Integrated compiler report data and code-specific recommendations for fixing performance issues
