2.5.1. Features of the System Viewer
The system viewer is an interactive graphical report of your OpenCL™ system that allows you to review information such as the sizes and types of loads and stores, stalls, and latencies.
You may interact with the system viewer in the following ways:
- Use the mouse wheel to zoom in and out within the system viewer.
- Review portions of your design that are associated with red logic blocks. For example, a logic block that contains a pipelined loop with a high initiation interval (II) value might be highlighted in red because a high II can reduce design throughput.
- Hover over any node within a block to view information on that node in the tooltip and in the details pane.
- Choose which types of connections the system viewer displays by unchecking the types you want to hide. By default, both Control and Memory are checked. Control refers to connections between blocks and loops. Memory refers to connections to and from global or local memories. If your design includes connections to and from read or write channels, the system viewer also provides a Channels option.
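
As a rough illustration of the kind of loop that could appear as a red logic block, the sketch below shows a loop in which each iteration depends on a value written by the previous one. It is written as plain C so it can run on a host; the function name `prefix_accumulate` and its parameters are hypothetical, and whether the compiler actually reports a high II for such a loop depends on the compiler version and target device.

```c
#include <stddef.h>

/* Hypothetical example, written as plain C so that it runs on a host.
 * In an OpenCL single work-item kernel, the equivalent signature would
 * be roughly:
 *   __kernel void prefix_accumulate(__global float *restrict data, int n)
 */
void prefix_accumulate(float *data, size_t n)
{
    /* Each iteration reads data[i - 1], which the previous iteration
     * wrote. A loop-carried memory dependency of this kind is one
     * pattern that may prevent a new iteration from launching every
     * clock cycle, which shows up as an II value greater than 1. */
    for (size_t i = 1; i < n; i++) {
        data[i] = data[i] + data[i - 1];
    }
}
```

If a block containing a loop like this is highlighted, hovering over its nodes in the system viewer is one way to inspect the loads and stores involved.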