- Intel® VTune™ Amplifier XE opens with the Summary page. Use this page as a starting point for the analysis of your application. In the Elapsed Time section of the Summary page, find the elapsed time. For the current application it is 0.463 seconds. This display also indicates that this is a single-threaded application with a CPU time of 0.080 seconds.
- In the Top Hotspots section, see the most time-consuming functions. For the poisson application, they are poisson_red_black_ and mpi_recv.
- To analyze the most time-consuming functions, click the Bottom-up tab. In the CPU Time column, you can see that it took 70.010 milliseconds to execute the most time-consuming function of the application and 9.990 milliseconds to execute MPI_Recv.

  Note: To see MPI functions under the Bottom-up tab, make sure that Call Stack Mode at the bottom of the tab is set to User Functions + 1.

  These CPU Time values confirm the result seen in the Intel® Trace Analyzer Event Timeline: it is the MPI_Recv call that generates imbalance in the application. Since there is no need to optimize this kind of logical imbalance, proceed with the analysis.
- To see the imbalance created by the other function, filter MPI_Recv out of the analysis scope. To do this, right-click the function in the Bottom-up tab and select Filter Out By Selection.
- Take a look at the function with poor CPU usage. Double-click the poisson_red_black_ function to open the source and identify the hotspot code regions. The beginning of the hotspot function is highlighted. The source code in the Source pane is not editable.

  Note: To enable the Source pane, make sure to build the target with debugging symbols using the -g (Linux* OS) or /Zi (Windows* OS) compiler flag.
- For the poisson application, you can see the loop in which computation took most of the CPU time. Two options for resolving the issue are to vectorize or to parallelize the loop.
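
The two options named in the last step can be sketched on a generic red-black Gauss-Seidel sweep. The source of poisson_red_black_ is not shown here, so this is not the application's code: the function names `sweep_color` and `red_black_iteration`, the row-major grid layout, and the five-point stencil are all assumptions made for illustration. The sketch relies on the fact that points of one color read only points of the other color, so the updates within one color are independent and can be distributed across threads or SIMD lanes with standard OpenMP directives.

```c
#include <assert.h>

/* Hypothetical sketch: update all interior points of one color
 * (color = (i + j) % 2) for the 2-D Poisson problem -laplace(u) = f
 * on an n x n row-major grid. h2 is the squared grid spacing.
 * Boundary values of u are left untouched. */
void sweep_color(double *u, const double *f, int n, double h2, int color)
{
    assert(n >= 3); /* at least one interior point is required */

    /* Rows are independent within one color: parallelize across threads. */
    #pragma omp parallel for
    for (int i = 1; i < n - 1; ++i) {
        /* First interior column in row i that has the requested color. */
        int j0 = 1 + ((i + 1 + color) % 2);

        /* No loop-carried dependence within one color: vectorization hint. */
        #pragma omp simd
        for (int j = j0; j < n - 1; j += 2) {
            u[i * n + j] = 0.25 * (u[(i - 1) * n + j] + u[(i + 1) * n + j]
                                 + u[i * n + j - 1] + u[i * n + j + 1]
                                 + h2 * f[i * n + j]);
        }
    }
}

/* One full iteration: color 0 first, then color 1 using the fresh values. */
void red_black_iteration(double *u, const double *f, int n, double h)
{
    sweep_color(u, f, n, h * h, 0);
    sweep_color(u, f, n, h * h, 1);
}
```

The directives take effect only when OpenMP is enabled at compile time (for example, `-fopenmp` with GCC); otherwise the code runs serially with the same result. Whether threading or vectorization helps more depends on the grid size per MPI rank, so it is worth re-running the hotspot analysis after either change.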