4.2.2. Kernel Execution Tab
If you run the host application from a networked directory with slow network disk accesses, the GUI can display the resulting delays between kernel launches while the runtime stores profile output data to disk.
The horizontal bar graph represents kernel execution through time. The combination of the two bars shown in the first entry (fft1d) represents the total time. The second and last entries show kernel executions that occupy the time span. These bars represent the concurrent execution of output_kernel and input_kernel, and indicate that the kernels share common resources such as memory bandwidth.
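To make the idea of concurrent execution concrete, the following minimal sketch computes how long two kernel executions overlap in time. It only illustrates how to read overlapping bars and is not how the GUI derives them. The function name overlap and the timestamps are hypothetical and are not taken from any real profile.

```python
def overlap(a, b):
    """Return how long intervals a and b run at the same time.

    Each interval is a (start, end) pair expressed in the same time unit.
    """
    start = max(a[0], b[0])  # the later of the two start times
    end = min(a[1], b[1])    # the earlier of the two end times
    return max(0, end - start)  # zero when the intervals do not overlap


# Hypothetical timestamps in milliseconds, chosen only for illustration.
input_kernel = (0, 40)
output_kernel = (10, 50)

concurrent_ms = overlap(input_kernel, output_kernel)
```

A nonzero result corresponds to bars that overlap horizontally in the graph, meaning the two kernels were executing during the same period.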
The Kernel Execution tab also displays information on memory transfers between the host and your devices, shown below:
To display memory transfer information, set the environment variable ACL_PROFILE_TIMER to 1 and then run your host application. Setting this variable enables the recording of memory transfers. The recorded information is stored in the profile.mon file, which the GUI then parses.
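One way to set the variable for a single run, without changing your shell environment, is to launch the host application from a small wrapper script. The sketch below assumes that the host application is started from a command line. The helper names profiling_env and run_host_with_transfer_profiling are hypothetical, and only the ACL_PROFILE_TIMER variable and its value of 1 come from the text above.

```python
import os
import subprocess
import sys


def profiling_env(base_env=None):
    """Return a copy of the environment with ACL_PROFILE_TIMER set to 1."""
    env = dict(os.environ if base_env is None else base_env)
    env["ACL_PROFILE_TIMER"] = "1"  # enables recording of memory transfers
    return env


def run_host_with_transfer_profiling(command):
    """Run the host application with ACL_PROFILE_TIMER=1 and return its exit code.

    `command` is a list such as ["./host"]; the executable name is a
    placeholder for your own host application.
    """
    completed = subprocess.run(command, env=profiling_env(), check=False)
    return completed.returncode
```

For example, calling run_host_with_transfer_profiling(["./host"]) runs a host executable named host (a placeholder) with the variable set only for that process.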