2.4. Reviewing Area Information
The <your_kernel_filename>/reports/report.html file contains information about area usage of your OpenCL system. You may view the area usage information either by source (that is, code line) or by system.
The area report serves the following purposes:
- Provides a detailed area breakdown of the whole OpenCL system, mapped back to the source code.
- Provides architectural details to give insight into the generated hardware and offers actionable suggestions to resolve potential inefficiencies.
As observed in the following figure, the area report is divided into three levels of hierarchy:
- System area: The area used by all kernels, channels, interconnects, and board logic.
- Kernel area: The area used by a specific kernel, including overheads such as dispatch logic.
- Basic block area: The area used by a specific basic block within a kernel. A basic block represents a branch-free section of your source code, for example, a loop body.
Figure 23. Area Report Hierarchy
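To make the basic block level of the hierarchy more concrete, the following is a minimal sketch in plain C that stands in for the body of a single work-item kernel. The function name `scale_add`, its parameters, and the block labels in the comments are hypothetical and chosen only for illustration. The compiler may split or merge blocks differently in the hardware it generates, so treat the labels as an approximation of what the area report might show rather than its actual output.

```c
#include <stddef.h>

/* Plain C stand-in for a single work-item kernel body (hypothetical example).
 * In OpenCL C, the function would be declared __kernel and the pointer
 * arguments would carry the __global qualifier. */
void scale_add(const float *in, float *out, float factor, size_t n)
{
    /* Block A (entry): straight-line setup code with no branches. */
    float bias = factor * 0.5f;

    /* The loop condition is a branch, so it marks the end of block A. */
    for (size_t i = 0; i < n; i++) {
        /* Block B (loop body): one branch-free multiply-add per iteration. */
        out[i] = in[i] * factor + bias;
    }

    /* Block C (exit): any code after the loop would form another block. */
}
```

In this sketch, the loop body (block B) is the kind of branch-free section whose area would be reported at the basic block level, while the total for the enclosing kernel, including overheads, would appear one level up at the kernel level.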
Note: The area usage data are estimates that the compiler generates. These estimates might differ from the final area utilization results.
To open either view, select Area analysis by source or Area analysis of system from the View reports... pull-down menu in the report.