Optimization Steps
The key to performance measurement is twofold: know exactly what you are measuring, and collect your baseline data. Next, profile your application and set a specific, realistic performance goal based on the profiling data. Follow these steps to optimize your software.
Fundamental Concepts
The Intel Compilers provide a number of features for generating vectorized code. Auto-vectorization is the method used by the Intel Compilers to generate vectorized code for a given application without requiring code changes. Developers can also implement simple coding changes in the source code to enforce vectorization behavior.
Intel Compiler Auto-vectorization (C++ | Fortran)
Performance Essentials with OpenMP Vectorization (webinar)
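As an illustration of the kind of loop auto-vectorization targets, here is a minimal sketch. The function name saxpy and the use of std::vector are assumptions made for this example, not taken from the linked material. The loop has unit stride, independent iterations, and a trip count known on entry, so a vectorizing compiler can usually handle it without source changes. Whether it actually vectorizes depends on the compiler and the options used.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: computes y[i] = a * x[i] + y[i].
// Unit-stride access and independent iterations make this loop a
// typical candidate for auto-vectorization at higher optimization levels.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    const float* xp = x.data();
    float* yp = y.data();
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

Calling saxpy(2.0f, x, y) updates y in place. No pragma or intrinsic is involved, so any vectorization is the compiler's decision.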
Intermediate Techniques
This section lists proven code optimization techniques and recommended source changes. Whether a given recommendation helps depends entirely on the application.
Fortran Array Data and Arguments and Vectorization
Explicit Vector Programming in Fortran
Data Alignment to Assist Vectorization
Random Number Function Vectorization
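Two of the techniques named above, explicit vector programming and data alignment, can be combined in a short C++ sketch. The type AlignedBuffer, the function scale_add, the buffer size, and the 64-byte alignment are all assumptions chosen for illustration. The OpenMP simd directive only takes effect when OpenMP SIMD support is enabled at compile time (for example with -qopenmp-simd on Intel compilers or -fopenmp-simd on GCC and Clang). Without it the pragma is ignored and the loop stays correct.

```cpp
#include <cstddef>

constexpr std::size_t kN = 1024;  // illustrative buffer length

// Hypothetical buffer type whose storage starts on a 64-byte boundary
// (64 bytes is an assumed alignment that suits 512-bit vectors).
struct alignas(64) AlignedBuffer {
    float data[kN];
};

// Computes out[i] += s * in[i] for every element.
void scale_add(AlignedBuffer& out, const AlignedBuffer& in, float s) {
    float* o = out.data;
    const float* p = in.data;
    // Asks the compiler to vectorize the loop and states that both
    // pointers are 64-byte aligned, which alignas(64) guarantees here.
    #pragma omp simd aligned(o, p : 64)
    for (std::size_t i = 0; i < kN; ++i) {
        o[i] += s * p[i];
    }
}
```

The aligned clause is a promise to the compiler, not a request. It is only safe because the data really is aligned, so the alignment is fixed in the type rather than left to the caller.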
Optimization Reports
Further code changes may be required to facilitate vectorization. Once a developer has changed the code, how does one know that the changes elicit the expected behavior? Use the compiler's optimization reports to guide source code changes and to verify that the code does indeed vectorize.
Vectorization and Optimization Reports
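To make the role of a report concrete, the sketch below contrasts two loops that a report would be expected to treat differently. Both function names are hypothetical. On Intel compilers a report is requested with an option such as -qopt-report; the exact levels and message wording vary by compiler version, so treat the comments as expectations, not guaranteed output.

```cpp
#include <cstddef>

// Loop-carried dependence: a[i] needs the value of a[i - 1] written by
// the previous iteration. A vectorization report is likely to flag this
// loop as not vectorized because of that dependence.
void prefix_sum(float* a, std::size_t n) {
    for (std::size_t i = 1; i < n; ++i) {
        a[i] += a[i - 1];
    }
}

// Independent iterations: each out[i] depends only on x[i] and y[i].
// A report would typically show this loop as vectorized, possibly with
// a note about a runtime check for overlapping pointers.
void add_arrays(float* out, const float* x, const float* y, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = x[i] + y[i];
    }
}
```

Comparing the report entries for the two loops before and after a source change is one way to confirm that the change had the intended effect.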
Advanced Methods
The techniques that offer the most control require deeper knowledge of the application and the skill to know where to apply them. Used properly, however, these more intensive techniques, such as intrinsics, can yield greater performance.
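As a minimal sketch of what programming with intrinsics looks like, the hypothetical function below adds two float arrays four elements at a time using SSE intrinsics. The function name, the choice of SSE over wider instruction sets, and the macro EXAMPLE_HAS_SSE are assumptions made for illustration. Production code would normally select the vector width for the target processor.

```cpp
#include <cstddef>

// Use SSE intrinsics only where the target supports them; elsewhere the
// scalar loop below processes every element.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <immintrin.h>
#define EXAMPLE_HAS_SSE 1
#else
#define EXAMPLE_HAS_SSE 0
#endif

// Hypothetical helper: out[i] = x[i] + y[i] for i in [0, n).
void add_arrays_sse(float* out, const float* x, const float* y, std::size_t n) {
    std::size_t i = 0;
#if EXAMPLE_HAS_SSE
    // Main loop: four floats per iteration. Unaligned loads and stores
    // are used so callers need not guarantee 16-byte alignment.
    for (; i + 4 <= n; i += 4) {
        const __m128 vx = _mm_loadu_ps(x + i);
        const __m128 vy = _mm_loadu_ps(y + i);
        _mm_storeu_ps(out + i, _mm_add_ps(vx, vy));
    }
#endif
    // Remainder loop: the leftover elements when n is not a multiple of 4.
    for (; i < n; ++i) {
        out[i] = x[i] + y[i];
    }
}
```

The programmer, not the compiler, now chooses the vector width and handles the remainder. That is the extra control, and the extra responsibility, these methods bring.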
References
Intel® Fortran Vectorization Diagnostics
Vectorization Diagnostics for Intel® C++ Compiler
Intel® Fortran Compiler Developer Guide and Reference
Intel® C++ Compiler Developer Guide and Reference