Intel® Xeon Phi™ Processor Targets
- To enable system-wide and uncore event collection, which allows measurement of DRAM and MCDRAM memory bandwidth as part of the Memory Access and HPC Performance Characterization analysis types, use root or sudo to set /proc/sys/kernel/perf_event_paranoid to 0:
  echo 0 > /proc/sys/kernel/perf_event_paranoid
- To enable collection with the Microarchitecture Exploration analysis type, increase the default limit of open file descriptors. Use root or sudo to raise the value in /etc/security/limits.conf to 100 * <number_of_logical_CPU_cores>:
  <user> hard nofile <100 * number_of_logical_CPU_cores>
  <user> soft nofile <100 * number_of_logical_CPU_cores>
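The arithmetic behind the limits.conf entries can be sketched in a few lines of shell. This is an illustrative sketch, not part of the product: the helper names `nofile_limit` and `print_limits_entries` are hypothetical, and the script only prints the entries and the current perf_event_paranoid value; it does not modify any system file.

```shell
#!/bin/sh
# Sketch only: nofile_limit and print_limits_entries are hypothetical helper names.

# Print the suggested nofile limit for a given number of logical CPU cores ($1).
nofile_limit() {
    echo $(( 100 * $1 ))
}

# Print the two limits.conf entries for a user ($1) and a core count ($2).
print_limits_entries() {
    limit=$(nofile_limit "$2")
    printf '%s hard nofile %s\n' "$1" "$limit"
    printf '%s soft nofile %s\n' "$1" "$limit"
}

# Show the current perf_event_paranoid value, if the file exists on this system.
# To set it to 0 with sudo, the redirection itself must run as root, for example:
#   sudo sh -c 'echo 0 > /proc/sys/kernel/perf_event_paranoid'
if [ -r /proc/sys/kernel/perf_event_paranoid ]; then
    echo "perf_event_paranoid: $(cat /proc/sys/kernel/perf_event_paranoid)"
fi

# Print entries for the current user and this machine's logical core count.
cores=$(getconf _NPROCESSORS_ONLN)
print_limits_entries "$(id -un)" "$cores"
```

With 272 logical cores, for instance, the printed limit would be 27200. Review the printed lines before adding them to /etc/security/limits.conf as root.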
1. Configure and run analysis on the target system with an Intel Xeon Phi processor
- Finalization on host system (recommended): Run the analysis on the system with the Intel Xeon Phi processor without finalizing. This option results in the best performance. From a command prompt, run the collection with the deferred finalization option, which calculates the binary checksum needed for proper symbol resolution on the host system. For example, to run a Memory Access analysis:
  vtune -collect memory-access -finalization-mode=deferred -r <my_result_dir> ./my_app
  You can also generate a command using the VTune Profiler GUI as described below. After generating the command, add the -finalization-mode=deferred option to delay finalization.
- Finalization on target system: Use the VTune Profiler GUI on the host system to generate a command for the target system with the Intel Xeon Phi processor. Run and finalize the analysis on the target system. This method may not provide the fastest results.
  - In the WHERE pane, select the Arbitrary Host button, set the processor architecture to Intel® Processor code named Knights Landing, and specify the operating system type.
  - In the WHAT pane, select Launch Application and configure the analysis:
    - Enter the application name and parameters.
    - Select the Use MPI Launcher checkbox and provide the launcher name, number of ranks, ranks to profile, and result location.
  - Click the Command Line button at the bottom of the window to generate the command.
  - Copy the generated command to a command prompt on the target system and run the analysis. Finalization begins after the analysis completes and may take several minutes.
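Adding the deferred finalization option to a GUI-generated command can be sketched as a small shell helper. The function name `add_deferred` is hypothetical, and the sketch assumes that the option may appear anywhere before the application and that the path to the vtune binary contains no spaces. It only prints the adjusted command; it does not run it.

```shell
#!/bin/sh
# Sketch only: add_deferred is a hypothetical helper, not a VTune Profiler feature.

# Print the command ($1) with -finalization-mode=deferred placed right after
# the first word (the vtune binary). Assumes option order does not matter.
add_deferred() {
    cmd=$1
    case "$cmd" in
        *-finalization-mode=*)
            # A finalization mode is already present: leave the command unchanged.
            echo "$cmd" ;;
        *" "*)
            # ${cmd%% *} is the first word, ${cmd#* } is everything after it.
            echo "${cmd%% *} -finalization-mode=deferred ${cmd#* }" ;;
        *)
            # The command is a single word: append the option.
            echo "$cmd -finalization-mode=deferred" ;;
    esac
}

# Example with a placeholder result directory and application.
add_deferred "vtune -collect memory-access -r my_result_dir ./my_app"
```

Paste the printed command into a command prompt on the target system to run the collection without finalizing.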
2. Open the result on the host system
- Copy the result to the host system using SSH or a similar method.
- [Optional] Finalize the result by providing the result file and search directories to the binaries of interest if the module paths are different from the target system. For example:
  vtune -finalize -r <my_result_dir> -search-dir <my_binary_dir>
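The copy and optional finalize steps on the host can be sketched together as follows. The helper names, the `user@knl-node` address, and all paths are hypothetical placeholders, and `scp -r` is just one of the SSH-based copy methods that would work. The helpers print the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: helper names, host name, and paths are hypothetical placeholders.

# Print an scp command that copies a result directory from the target system.
# $1: user@target, $2: result directory on the target, $3: destination on the host
copy_cmd() {
    echo "scp -r $1:$2 $3"
}

# Print the optional finalize command for a copied result.
# $1: local result directory, $2: directory holding the binaries of interest
finalize_cmd() {
    echo "vtune -finalize -r $1 -search-dir $2"
}

copy_cmd user@knl-node /tmp/my_result_dir .
finalize_cmd ./my_result_dir ./my_binary_dir
```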
3. Open and interpret analysis results
- View results in the command line by running a command to generate a report based on the data collected. For example, the following command creates a hotspots report:
  vtune -report hotspots -r <my_result_dir>
- Launch Intel VTune Profiler on the host system and view the result file.
  - Open Intel VTune Profiler.
  - Use the open result action on the toolbar or from the menu button to browse to the result file.