FPGA AI Suite Handbook

ID 863373
Date 11/21/2025
Public

2.2. The FPGA AI Suite Tool Flow

The FPGA AI Suite takes a trained ML model and a user-defined architecture file. It analyzes the ML model structure, maps operations to FPGA hardware resources, quantizes weights and activations for optimal precision, and generates an IP that you can integrate into an FPGA system. The FPGA AI Suite also provides a C/C++ interface (via OpenVINO) to bit-accurate emulation. You can use this emulation for pre-hardware development of the runtime stack and pre-hardware validation of ML model accuracy.

The following diagram illustrates the FPGA AI Suite tool flow:
Figure 3.  FPGA AI Suite Tool Flow

Moving from ML Model to IR

The FPGA AI Suite tool flow starts with your trained ML models from TensorFlow, PyTorch, Keras, MXNet, or ONNX. These models are fed to the OpenVINO™ Model Converter to generate an intermediate representation (IR) of your model. The IR consists of a .xml file that describes the model topology and a .bin file that contains the model weights. These IR files are the main input into the FPGA AI Suite tool flow.
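As a sketch of the conversion step, the OpenVINO converter CLI (`ovc`) can generate the IR pair from a trained model. The input file name below is a placeholder, and the exact flags depend on your installed OpenVINO release; check `ovc --help`:

```shell
# Convert a trained model to OpenVINO IR (.xml topology + .bin weights).
# "model.onnx" is a placeholder; flag names follow the ovc CLI and may
# differ across OpenVINO releases.
if command -v ovc >/dev/null 2>&1; then
  ovc model.onnx --output_model model   # emits model.xml and model.bin
else
  echo "OpenVINO converter (ovc) not on PATH; install OpenVINO to run this step"
fi
```

The resulting model.xml and model.bin files are the inputs passed to the FPGA AI Suite compiler in the steps that follow.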

Collaborate Early to Choose an Initial Architecture

Your AI developers and FPGA hardware engineers collaborate early to determine a starting architecture description (.arch) file. They can choose one of the predefined architecture description files provided with the FPGA AI Suite, or they can take a more complex route and create a custom architecture file.
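To survey the predefined starting points, you can list the architecture files shipped with the suite. The environment variable and directory name below ($COREDLA_ROOT/example_architectures) are assumptions based on a typical FPGA AI Suite installation; verify them against your own install layout:

```shell
# List the predefined .arch files bundled with the FPGA AI Suite.
# The $COREDLA_ROOT/example_architectures path is an assumption; check
# your installation's directory layout.
if [ -n "${COREDLA_ROOT:-}" ] && [ -d "$COREDLA_ROOT/example_architectures" ]; then
  ls "$COREDLA_ROOT/example_architectures"/*.arch
else
  echo "COREDLA_ROOT not set; source the FPGA AI Suite environment first"
fi
```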

Enabling Parallel Software and Hardware Development Flows

After you have the IR for your ML model and an initial architecture file, the Architecture Optimizer in the FPGA AI Suite compiler (dla_compiler) takes the IR and enables two parallel development paths:
  • AI & Software Development Flow

    The Architecture Optimizer output feeds into FPGA software emulation. This emulation enables functional validation and performance profiling before any hardware synthesis, which is critical for rapid iteration. The software flow targets Linux* hosts through a PCIe interface, a JTAG-Avalon® interface, or an AXI interface (for an HPS host on an SoC FPGA device). Different deployment targets have different runtime stacks.

  • FPGA Hardware Development Flow

    Use the Architecture Optimizer to update the architecture file with FPGA resource specifications. The optimizer consumes resource constraints (ALMs, DSPs, RAM blocks) and performance targets (FPS) and generates an optimized .arch file. The FPGA AI Suite IP Generator consumes the optimized .arch file and produces a customized IP core. Example designs across a variety of platforms (PCIe-attach, SoC, and hostless) integrate the custom IP to perform machine learning inference.
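A minimal compile step that feeds both paths might look like the following. The flag names follow the dla_compiler reference for recent FPGA AI Suite releases, and the architecture file name is an assumption; confirm both with `dla_compiler --help` and your installed example architectures:

```shell
# Compile the IR against an architecture file, producing a compiled model
# that the OpenVINO runtime can dispatch to software emulation or to the
# FPGA plugin. Flag names and the .arch file name are assumptions; verify
# them against your FPGA AI Suite release.
if command -v dla_compiler >/dev/null 2>&1; then
  dla_compiler \
    --march "$COREDLA_ROOT/example_architectures/A10_Performance.arch" \
    --network-file model.xml \
    --foutput-format=open_vino_hetero_plugin \
    --o compiled_model.bin
else
  echo "dla_compiler not on PATH; source the FPGA AI Suite environment first"
fi
```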

Final Hardware Implementation

Both development paths converge in the Quartus® Prime Pro Edition software, which combines the validated architecture with the generated IP and performs synthesis, place-and-route, and timing closure to produce the final FPGA bitstream.
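The final compilation can be driven from the command line. The project and revision names below are placeholders for your own design:

```shell
# Run the full Quartus Prime Pro compilation flow (synthesis, place-and-route,
# timing analysis, bitstream assembly). "my_project" and "my_revision" are
# placeholders for your design's project and revision names.
if command -v quartus_sh >/dev/null 2>&1; then
  quartus_sh --flow compile my_project -c my_revision
else
  echo "quartus_sh not on PATH; set up the Quartus Prime Pro environment first"
fi
```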