Identify the Real Bottlenecks
This topic is part of a tutorial that shows how to use the automated Roofline chart to make prioritized optimization decisions.
Perform the following steps:
Key take-aways from these steps:
- The first roofline above a dot position isn't always the bottleneck; any roofline above a dot position could be the culprit.
- Even a roofline below a dot position can be a bottleneck; however, the farther a dot is positioned above a roofline, the less likely that roofline is causing the bottleneck.
- If the first roofline above a dot position does not make logical sense, investigate the next roofline, and just keep working your way up the Roofline chart, using common sense, other Intel Advisor features, and your familiarity with your application to inform your investigation.
- The Roofline chart is not a Data-In-Answers-Out utility; however, it puts you in the ballpark and guides you in the right direction to optimize your code.
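The "work your way up the chart" advice can be sketched in code. The following is a minimal illustration, not part of Intel Advisor: the `Roof` type, the `ceiling_at` and `roofs_above` helpers, and any roof values passed to them are hypothetical. It assumes the usual roofline model, in which a bandwidth roof caps performance at bandwidth times arithmetic intensity and a compute roof caps it at a flat peak.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one roofline (not an Intel Advisor type).
struct Roof {
    std::string name;
    bool is_bandwidth;  // true: value is in GB/s; false: value is in GFLOPS
    double value;
};

// Performance ceiling (GFLOPS) that a roof imposes at a given
// arithmetic intensity (FLOP/byte).
double ceiling_at(const Roof& roof, double intensity) {
    return roof.is_bandwidth ? roof.value * intensity : roof.value;
}

// Names of all roofs at or above a dot, nearest roof first.
// Each one is a candidate bottleneck to investigate in turn.
std::vector<std::string> roofs_above(const std::vector<Roof>& roofs,
                                     double intensity, double gflops) {
    std::vector<std::pair<double, std::string>> candidates;
    for (const Roof& roof : roofs) {
        const double ceiling = ceiling_at(roof, intensity);
        if (ceiling >= gflops) {
            candidates.emplace_back(ceiling, roof.name);
        }
    }
    // Sort by ceiling height so the closest roof comes first.
    std::sort(candidates.begin(), candidates.end());

    std::vector<std::string> names;
    for (const auto& candidate : candidates) {
        names.push_back(candidate.second);
    }
    return names;
}
```

The returned list is only an investigation order. Deciding which candidate is the real bottleneck still depends on what you know about the application.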
Open a Result Snapshot
Do one of the following:
- If you prefer to work in the standalone GUI, use the File menu to open the Result3.advixeexpz result.
- If you prefer to work in the Visual Studio* IDE, use the File menu to open the Result3.advixeexpz result.
Focus the Roofline Chart on the Data of Most Interest
- On the Intel Advisor toolbar, click the Loops And Functions filter drop-down and choose Loops.
- In the Roofline chart:
  - Select the Use Single-Threaded Loops checkbox.
  - Click the control, then deselect the Visibility checkbox for all SP... roofs. (All variables in this sample code are double-precision, so there is no need to clutter the chart with single-precision rooflines.) In the Point Colorization section, choose Colors of Point Weight Ranges to differentiate dot colors by runtime (red, yellow, and green). Save your changes.
  - Click the control. In the x-axis fields, backspace over the existing values and enter 0.05 and 0.7. In the y-axis fields, backspace over the existing values and enter 1.0 and 14.8. Click the button to save your changes.
Interpret Roofline Chart Data

Notice the position of the dot representing the loop in main at roofline.cpp:138 (the red dot).
One possible reason for the dot position: The loop is suffering from a memory bandwidth bottleneck, based on the dot position below the L3 Bandwidth roofline.
However, based on our familiarity with the sample code, we know the dataset definitely fits into L1 cache, so L3 bandwidth is unlikely to be the limit. The next roofline up, L2 Bandwidth, does not seem to be the likely culprit either.
Another possible reason for the dot position: The next roofline up is the Scalar Add Peak roofline, so perhaps the loop is suffering from a compute capacity bottleneck.
Using the Survey Report, we can quickly verify the loop is scalar (blue loop icon).
What happens if we vectorize the loop but add no memory optimizations? This is exactly what we did. The outcome is the loop in main at roofline.cpp:151 (the yellow dot).
Notice the dot representing this loop is positioned above the Scalar Add Peak roofline and closer to the L1 Bandwidth roofline.
This proves the bottleneck was compute capacity, not memory bandwidth.
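
This topic does not reproduce the sample's source, so the following is only a hypothetical sketch of what "vectorize the loop but add no memory optimizations" can look like. It is not the code at roofline.cpp:138 or roofline.cpp:151. The function names `add_baseline` and `add_simd` are invented for illustration, and the `#pragma omp simd` hint is an assumption about one common way to request vectorization. Compilers honor it only when OpenMP SIMD support is enabled, and otherwise ignore it.

```cpp
#include <cstddef>

// Hypothetical baseline: one double-precision add per element.
// Whether a compiler vectorizes this on its own depends on the
// compiler and its flags.
void add_baseline(const double* a, const double* b, double* out,
                  std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// Same loop with an explicit vectorization request. The arrays, the
// access pattern, and the bytes moved per element are all unchanged;
// only the way the adds are issued can differ.
void add_simd(const double* a, const double* b, double* out,
              std::size_t n) {
#pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Both functions produce identical results and touch memory identically. Any speedup from the second one therefore comes from the compute side, which is the same reasoning the chart comparison above relies on.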