Window: Summary - GPU Analysis
EU Array Stalled/Idle
| Factor responsible for Low Peak Occupancy | Suggested fix |
| --- | --- |
| SLM size requested per workgroup in a computing task is too high | Decrease the SLM size or increase the Local size |
| Global size (the number of work items to be processed by a computing task) is too low | Increase the Global size |
| Barrier synchronization (the sync primitive can cause low occupancy due to the limited number of hardware barriers on a GPU subslice) | Remove barrier synchronization or increase the Local size |

- A tiny computing task could cause considerable overhead when compared to the task execution time.
- There may be high imbalance between the threads executing a computing task.
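
To make the low peak occupancy factors more concrete, the following is a minimal Python sketch of how the SLM size, Global size, Local size, and barrier usage can each cap the number of workgroups resident on one subslice. It is not the formula the tool uses. The `SubsliceLimits` values (hardware threads, SLM capacity, number of barriers), the `simd_width` default, and the names `SubsliceLimits` and `peak_occupancy` are all hypothetical and chosen only for illustration; real limits depend on the GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubsliceLimits:
    """Hypothetical per-subslice limits; real values depend on the GPU."""
    hw_threads: int = 56          # assumed: 8 EUs x 7 hardware threads
    slm_bytes: int = 64 * 1024    # assumed SLM capacity in bytes
    barriers: int = 16            # assumed number of hardware barriers


def peak_occupancy(global_size, local_size, slm_per_group=0,
                   uses_barrier=False, simd_width=16,
                   hw=SubsliceLimits()):
    """Estimate the fraction of hardware threads one subslice can keep busy."""
    # Each workgroup needs ceil(local_size / simd_width) hardware threads.
    threads_per_group = -(-local_size // simd_width)

    # Workgroups that fit by thread count alone.
    limits = [hw.hw_threads // threads_per_group]
    # SLM is shared by all resident workgroups.
    if slm_per_group > 0:
        limits.append(hw.slm_bytes // slm_per_group)
    # A workgroup that synchronizes occupies one hardware barrier.
    if uses_barrier:
        limits.append(hw.barriers)
    # The task cannot supply more workgroups than it contains.
    limits.append(-(-global_size // local_size))

    resident_groups = min(limits)
    return resident_groups * threads_per_group / hw.hw_threads


if __name__ == "__main__":
    big = 1 << 20
    print(peak_occupancy(big, 64, slm_per_group=16 * 1024))   # SLM-bound
    print(peak_occupancy(big, 64, slm_per_group=4 * 1024))    # smaller SLM
    print(peak_occupancy(big, 16, uses_barrier=True))         # barrier-bound
    print(peak_occupancy(big, 64, uses_barrier=True))         # larger Local size
    print(peak_occupancy(128, 64))                            # Global size too low
```

Under these assumed limits, a 16 KB SLM request with a Local size of 64 lets only 4 workgroups (16 of 56 threads) run at once, while a 4 KB request lets the thread count become the limit instead. Likewise, with barriers and a Local size of 16, the 16 assumed hardware barriers cap occupancy at 16 threads, and raising the Local size to 64 packs more threads behind each barrier.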