FPGA AI Suite Handbook

2.3. The FPGA AI Suite User Flow

While the tool flow shows the technical pipeline, this user flow shows how AI/software developers and FPGA engineers collaborate to turn ML models into deployed accelerators.

Figure 4.  FPGA AI Suite User Flow

Both groups start together with the same software stack: FPGA AI Suite, Quartus® Prime Pro Edition, and the OpenVINO™ toolkit. The two groups also decide on the target platform: PCIe*-attached for the data center, embedded HPS for the edge, 4K video processing, or DDR-free for ultra-low latency. This shared decision anchors the entire deployment strategy.

After that, AI/software development and FPGA hardware development can proceed and iterate on their parts of the application roughly in parallel:
  • AI/software development work

    This work involves optimizing neural network graphs with the OpenVINO™ Model Converter, defining architecture files to tune hardware performance, and compiling the design with the FPGA AI Suite compiler into custom inference IP. The design can then be validated iteratively through software emulation, ensuring a robust architecture before FPGA hardware development proceeds (see the first sketch after this list).

  • FPGA hardware development work

    Starting with the first iteration of the custom inference IP, the FPGA engineers integrate the IP into the overall system design. This process includes connecting the IP to existing system components through Visual Designer Studio or Platform Designer, performing RTL simulations, and running the Quartus compilation (see the second sketch after this list). The FPGA hardware engineers must ensure timing closure, optimize resource utilization, and generate the final bitstream.
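
The AI/software track can be sketched in Python with the OpenVINO toolkit. The following minimal example, which assumes a hypothetical model file named resnet50.onnx, converts a trained model into OpenVINO IR and runs a quick CPU check; the FPGA AI Suite compiler step is indicated only as a comment because its exact command-line options depend on the release.

    import openvino as ov

    # Convert a trained framework model (hypothetical file name) into
    # the OpenVINO intermediate representation (IR).
    model = ov.convert_model("resnet50.onnx")

    # Save the IR (.xml graph + .bin weights); this is the input that
    # the FPGA AI Suite compiler consumes along with an architecture file.
    ov.save_model(model, "resnet50.xml", compress_to_fp16=True)

    # Quick functional check on CPU before targeting the inference IP.
    compiled = ov.Core().compile_model(model, "CPU")

    # Next step (outside this script): compile the IR and the architecture
    # file with the FPGA AI Suite compiler (dla_compiler); see the compiler
    # reference for the exact options of your release.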
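
On the hardware track, the Quartus compilation step is commonly scripted. This sketch, with hypothetical project and revision names, launches a full Quartus Prime compilation through quartus_sh and surfaces failures, the point at which timing closure and resource utilization reports become available.

    import subprocess

    # Hypothetical project/revision names for the system design that
    # integrates the custom inference IP.
    PROJECT = "inference_system"
    REVISION = "top"

    # quartus_sh --flow compile runs the full flow: synthesis, fitting,
    # timing analysis, and bitstream assembly.
    result = subprocess.run(
        ["quartus_sh", "--flow", "compile", PROJECT, "-c", REVISION],
        capture_output=True,
        text=True,
    )

    if result.returncode != 0:
        print(result.stdout)  # inspect the log for timing or fitter errors
        raise SystemExit("Quartus compilation failed")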

Architecture definition requires both ML insight and awareness of hardware constraints. IP parameterization needs performance targets from the AI engineers and resource budgets from the FPGA engineers. Final inference validation depends on both teams to verify functionality: software engineers on CPU, FPGA engineers on silicon.

Software emulation results can show whether further architecture optimization is needed, hardware resource usage can inform model quantization decisions, and performance benchmarks can drive parameter tuning. With each iteration of the parallel development cycles, the fit between the ML model and the FPGA hardware tightens.
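
The decision logic of that feedback loop can be pictured in a few lines of Python. Everything here is a hypothetical stand-in: the Feedback fields represent the emulation, resource, and benchmark results named above, not an FPGA AI Suite API.

    from dataclasses import dataclass

    @dataclass
    class Feedback:
        # Hypothetical stand-ins for one iteration's results.
        meets_performance: bool   # from performance benchmarks / emulation
        fits_resources: bool      # from Quartus resource utilization

    def next_action(fb: Feedback) -> str:
        """Map one iteration's feedback to the next co-design step."""
        if fb.meets_performance and fb.fits_resources:
            return "done: model/architecture fit achieved"
        if not fb.fits_resources:
            return "quantize the model further or shrink the architecture"
        return "tune architecture parameters toward the performance target"

    print(next_action(Feedback(meets_performance=False, fits_resources=True)))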

The parallel-track approach helps eliminate development bottlenecks: AI engineers do not need to wait for synthesis to validate models, and FPGA engineers do not need ML expertise to integrate the IP. The collaboration points still ensure optimal co-design. This structured flexibility enables rapid deployment while maintaining development rigor.