7.4. The FPGA AI Suite Compiler Report
After you run the FPGA AI Suite compiler with the performance or resource estimation flags, the compiler produces a detailed report for the architecture and the network. Subject to the fMAX assumptions for this compilation, the report provides estimates for the following metrics:
| Metric | Description |
|---|---|
| Total DDR Space Required | The estimated total DDR space, including the static buffer for configuration data, the static buffer for filters, and the buffer for features. Changing the arch_precision parameter in your .arch file changes the storage size of the filters and therefore affects this estimate. |
| Total DDR Transfers Required | The estimated total DDR transfers for one instance. Changing the arch_precision parameter in your .arch file also affects this estimate. |
| Minimum Average DDR Bandwidth Required | The estimated average DDR bandwidth. The peak DDR transfer rate will exceed this estimate, so use it only as a baseline for system budgeting. |
| PE-only Conv Throughput No DDR | The estimated throughput of the PEs assuming no DDR transfers, that is, the design is DDR-free and the weights are stored on-chip. This estimate is usually the highest of the throughput figures because it excludes all system and memory overhead. |
| PE-only Conv Throughput | The estimated throughput of the PEs for a single IP instance. This estimate isolates the performance of the PEs by excluding input and output memory transfers, and it highlights the computational efficiency of the PE array in your architecture. |
| Final throughput | The total estimated throughput of a single IP instance, including any memory transfer overhead. |
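As a rough sanity check on the bandwidth figure, an average bandwidth can be sketched as the data moved per inference multiplied by the number of inferences per second. This relationship is an assumption made here for illustration; the report does not state how the compiler derives its estimate. The function name and the input values below are hypothetical and are not taken from a real compiler report.

```python
def average_ddr_bandwidth(bytes_per_inference: float,
                          inferences_per_second: float) -> float:
    """Return an average DDR bandwidth in bytes per second.

    Assumption: average bandwidth = data moved per inference * inference rate.
    This ignores burstiness, so peak bandwidth will be higher.
    """
    return bytes_per_inference * inferences_per_second


# Hypothetical inputs, chosen only to illustrate the arithmetic.
bytes_per_inference = 50e6      # 50 MB of DDR traffic per inference (assumed)
inferences_per_second = 100.0   # final throughput estimate (assumed)

bandwidth = average_ddr_bandwidth(bytes_per_inference, inferences_per_second)
print(f"{bandwidth / 1e9:.1f} GB/s")  # prints "5.0 GB/s"
```

Because this is an average, size the memory interface with headroom above the result to absorb peak transfers.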
Leave some ALMs, DSPs, and M20Ks unused on the FPGA device so that your design can fit, remain routable, and achieve timing closure.
Work with your FPGA team to set a resource utilization target, and then budget according to that target. As a general rule, aim for a maximum ALM utilization of 70%, a maximum DSP utilization of 95%, and a maximum M20K utilization of 95%.
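The general-rule ceilings above can be checked with a few lines of script once you have the resource estimates and the device capacity. The following is a minimal sketch: the function name, the dictionary layout, and all resource counts are hypothetical and do not correspond to the compiler's actual report format or to any specific device.

```python
# Rule-of-thumb ceilings from the guidance above, as fractions of device capacity.
MAX_UTILIZATION = {"ALM": 0.70, "DSP": 0.95, "M20K": 0.95}


def check_budget(estimated, available, limits=MAX_UTILIZATION):
    """Return {resource: (utilization, within_budget)} for each limited resource."""
    result = {}
    for name, limit in limits.items():
        utilization = estimated[name] / available[name]
        result[name] = (utilization, utilization <= limit)
    return result


# Hypothetical estimates and device capacities, for illustration only.
estimated = {"ALM": 300_000, "DSP": 1_400, "M20K": 2_000}
available = {"ALM": 400_000, "DSP": 1_500, "M20K": 2_500}

report = check_budget(estimated, available)
for name, (utilization, ok) in report.items():
    status = "OK" if ok else "over budget"
    print(f"{name}: {utilization:.1%} ({status})")
```

With these made-up numbers, ALM utilization is 75% and exceeds the 70% ceiling, while DSP and M20K usage stay within budget. Replace the limits with whatever target your FPGA team agrees on.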