Examine Kernel Details
Review Compute and Memory Bandwidth Utilization
- The total number of floating-point and integer operations executed by the kernel per second as a percentage of the maximum compute capacity of your hardware. The red bar represents the dominant operation data type used in the kernel.
- The amount of data transferred by the kernel at each cache memory level per second as a percentage of the bandwidth of that memory level. Cache memory level bandwidth utilization (in percent) is the ratio of the effective bandwidth to the maximum bandwidth of a given memory level. This metric shows how well the kernel uses the capabilities of your hardware and can help you identify bottlenecks for your kernel.
- Review how much time the kernel spends processing requests for each memory level relative to the total time, as reported in the Impacts histogram. A large value indicates a memory level that bounds the selected kernel. Examine the difference between the two largest bars to see how much throughput you can gain if you reduce the impact of your main bottleneck. The histogram also gives you a long-term plan for reducing memory-bound limitations: once you resolve the problems behind the largest bar, your next issue comes from the second-largest bar, and so on. Ideally, you should see L3 or SLM as the most impactful memory level.
- Review the amount of data that passes through each memory level, as reported in the Shares histogram.
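The bandwidth utilization ratio and the Impacts comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the function names, the memory level labels other than L3 and SLM, and all numbers are hypothetical and chosen only to show the arithmetic.

```python
def utilization_percent(effective_bw, max_bw):
    """Effective bandwidth as a percentage of the maximum bandwidth of one memory level."""
    if max_bw <= 0:
        raise ValueError("maximum bandwidth must be positive")
    return 100.0 * effective_bw / max_bw


def rank_impacts(time_by_level):
    """Return (level, percent of total time) pairs, largest share first."""
    total = sum(time_by_level.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    shares = {level: 100.0 * t / total for level, t in time_by_level.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical time (in arbitrary units) spent on requests to each memory level.
time_by_level = {"L3": 1.5, "SLM": 0.5, "GTI": 5.0, "DRAM": 3.0}

ranked = rank_impacts(time_by_level)
main_bottleneck = ranked[0]   # largest bar in the histogram
next_bottleneck = ranked[1]   # second-largest bar
# Difference between the two largest bars, in percentage points.
gap = main_bottleneck[1] - next_bottleneck[1]

# Hypothetical effective and maximum bandwidth of one level, in GB/s.
l3_utilization = utilization_percent(120.0, 480.0)
```

With these made-up inputs, the largest bar is not L3 or SLM, which is the situation the text suggests you investigate first.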
Explore Operation Types Used During Application Execution
Compute (FLOP and INTOP)
LOAD, STORE, SLM_LOAD, SLM_STORE types depending on the argument:
send, sendc, sends, sendsc
- Examine the instruction count for each category, as well as its percentage of the overall instruction mix, to determine the dominating category of instructions in the kernel.
- Examine the instruction count for each type of compute, memory, atomics, and other instructions.
- For compute instructions, view the dominating data type for each type of instruction. The data type that dominates the entire kernel is highlighted in blue.
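As a rough illustration of how the instruction mix percentages and the dominating category relate to the raw counts, consider the following sketch. The function name and the counts are hypothetical; only the category names (compute, memory, atomics, other) come from the text above.

```python
def instruction_mix(counts):
    """Return per-category percentages and the category with the largest share."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no instructions counted")
    percentages = {category: 100.0 * n / total for category, n in counts.items()}
    dominant = max(percentages, key=percentages.get)
    return percentages, dominant


# Hypothetical instruction counts for one kernel.
counts = {"compute": 600, "memory": 300, "atomics": 20, "other": 80}

percentages, dominant = instruction_mix(counts)
for category, share in percentages.items():
    print(f"{category}: {counts[category]} instructions, {share:.1f}% of the mix")
print(f"dominating category: {dominant}")
```

The same calculation applies one level down, for example to the counts per data type within the compute category.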