FPGA AI Suite Handbook

ID 863373
Date 11/21/2025

2.4.1. The FPGA AI Suite IP Overlay Architecture

Machine learning (ML) tasks differ significantly in their throughput and resource utilization requirements, and the characteristics of ML graphs vary accordingly. As a result, no single FPGA AI Suite IP configuration achieves the best performance in every scenario. Because your model is unique to your application, you must supply the FPGA AI Suite compiler with an architecture description file, which the compiler uses to generate the overlay IP.
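
As a concrete sketch, the compile step typically receives the architecture description file on the command line. The flag names below are assumptions based on the FPGA AI Suite documentation and the paths are placeholders; verify both against the Getting Started Guide for your installed version:

    # Hypothetical compile invocation; flag names are assumed and
    # paths are placeholders.
    dla_compiler \
        --march <path to your .arch file> \
        --network-file <OpenVINO IR .xml file for your model>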

The FPGA AI Suite IP functions as an FPGA overlay, analogous to a soft processor, in which modules can be added or removed to meet specific design requirements. This configurability enables trade-offs between inference performance (throughput and latency) and FPGA resource utilization (area). Configuration options are defined in Architecture Description Files (Arch files).

An Arch file is a text-based protobuf file with the .arch extension that describes the parameters of the FPGA AI Suite IP. After you are familiar with the concepts in this section, refer to Creating an Architecture File for the FPGA AI Suite IP to create your own Arch file.
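
The fragment below sketches what an Arch file can look like. The field names are illustrative assumptions modeled on the example Arch files shipped with the FPGA AI Suite; consult those examples and the IP reference documentation for the exact schema:

    # Illustrative .arch fragment in protobuf text format. All field
    # names here are assumptions; check the shipped example Arch files.
    family: "AGX7"            # target FPGA device family
    arch_precision: "FP16"    # numeric precision of the datapath
    k_vector: 32              # PE parallelism across output filters
    c_vector: 16              # PE parallelism across input channels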

Conceptually, the FPGA AI Suite overlay IP consists of two sets of components: control logic and datapath components.

  • The datapath components do the heavy lifting: they accept and buffer incoming data, perform computations, and output inference results. The datapath components expose a series of parameters for customization.
  • The control logic orchestrates data movement, storage, and computation in the datapath. The control logic in the overlay architecture receives instructions from the Config Network.
    • The control logic offers little configurability for customization.

The datapath components of the overlay architecture can be precisely customized through the Architecture Description File (.arch). This file exposes critical parameters that determine how the datapath is organized and how data is moved and stored within it.

For instance, the most important concept is processing element (PE) parallelism, which controls how many computational operations execute simultaneously and therefore directly influences inference speed. Because the other components are built around the PE, their descriptions frequently refer back to concepts introduced for the PE. A thorough understanding of PE concepts is therefore essential to achieving a successful design with the FPGA AI Suite.
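
As a rough, illustrative calculation: if the PE array is parameterized by vector sizes along the input-channel and output-filter dimensions (named c_vector and k_vector here by assumption, following the sketch above), peak multiply-accumulate throughput scales with their product:

    # Illustrative fragment; field names assumed as above. With these
    # values, the PE array performs on the order of
    # c_vector * k_vector multiply-accumulates per clock cycle,
    # that is, 16 * 32 = 512 MACs/cycle.
    c_vector: 16   # input-channel parallelism
    k_vector: 32   # output-filter parallelism

Under this assumption, doubling either vector size roughly doubles peak throughput at the cost of a proportionally larger PE array, which is the central area-versus-performance trade-off that the Arch file controls.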