2.2. The FPGA AI Suite Tool Flow
The FPGA AI Suite takes a trained ML model and a user-defined architecture file. It analyzes the ML model structure, maps operations to FPGA hardware resources, quantizes weights and activations for optimal precision, and generates an IP core that you can integrate into an FPGA system. The FPGA AI Suite also provides a C/C++ interface (via OpenVINO) to bit-accurate emulation. You can use this emulation for pre-hardware development of the runtime stack and pre-hardware validation of ML model accuracy.
Moving from ML Model to IR
The FPGA AI Suite tool flow starts with your trained ML models from TensorFlow, PyTorch, Keras, MXNet, or ONNX. These models are fed to the OpenVINO™ Model Converter to generate an intermediate representation (IR) of your model. The IR consists of a .xml file that describes the model topology and a .bin file that contains the model weights. These IR files are the primary inputs to the FPGA AI Suite tool flow.
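Because the IR's .xml file is plain XML, you can inspect a model's topology with standard tooling before feeding it to the FPGA AI Suite. A minimal sketch, assuming a deliberately simplified IR fragment (real Model Converter output carries many more attributes per layer):

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment in the style of an OpenVINO IR .xml file.
# Real IR files generated by the Model Converter are far more detailed.
IR_XML = """\
<net name="example_net" version="11">
  <layers>
    <layer id="0" name="input" type="Parameter"/>
    <layer id="1" name="conv1" type="Convolution"/>
    <layer id="2" name="output" type="Result"/>
  </layers>
  <edges>
    <edge from-layer="0" from-port="0" to-layer="1" to-port="0"/>
    <edge from-layer="1" from-port="1" to-layer="2" to-port="0"/>
  </edges>
</net>
"""

def summarize_ir(xml_text: str) -> list[tuple[str, str]]:
    """Return (name, type) for each layer in an IR topology description."""
    root = ET.fromstring(xml_text)
    return [(layer.get("name"), layer.get("type"))
            for layer in root.iter("layer")]

print(summarize_ir(IR_XML))
# [('input', 'Parameter'), ('conv1', 'Convolution'), ('output', 'Result')]
```

The .bin weights file is a separate binary blob; only the .xml topology is human-readable in this way.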
Collaborate Early to Choose an Initial Architecture
Your AI developers and FPGA hardware engineers collaborate early to determine a starting architecture description (.arch) file. They can choose one of the predefined architecture description files provided with the FPGA AI Suite, or they can take a more complex route and create a custom architecture file.
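An architecture description file is a plain-text configuration that fixes the shape of the generated IP. The fragment below is illustrative only; the field names are hypothetical, not the actual .arch schema, so consult the predefined architecture files shipped with the FPGA AI Suite for the real parameter names:

```
# Illustrative sketch only -- not the real .arch schema.
family: "Agilex"      # target device family
target_fmax: 400      # desired clock frequency (MHz)
pe_array {
  c_vector: 16        # input-channel parallelism (area/throughput trade-off)
  k_vector: 32        # output-channel parallelism
}
```

Starting from a predefined file and adjusting a few parallelism parameters is usually faster than authoring a custom architecture from scratch.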
Enabling Parallel Software and Hardware Development Flows
- AI & Software Development Flow
The Architecture Optimizer output feeds into FPGA software emulation. This emulation enables functional validation and performance profiling before any hardware synthesis, which is critical for rapid iteration. The software flow targets Linux* hosts through a PCIe interface, a JTAG-Avalon® interface, or an AXI interface (for an HPS host on an SoC FPGA device). Different deployment targets have different runtime stacks.
- FPGA Hardware Development Flow
Use the Architecture Optimizer to refine the architecture file for your FPGA: the optimizer consumes resource budgets and performance targets (ALMs, DSPs, RAM blocks, and frames-per-second goals) and generates an optimized .arch file. The FPGA AI Suite IP Generator consumes the .arch file and produces a customized IP core. Example designs across a variety of platforms (PCIe-attach, SoC, and hostless) integrate the custom IP to perform machine learning inference.
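The hardware flow pairs the optimized .arch file with the IR model at the command line. The tool name and flag spellings below are assumptions for illustration (check the FPGA AI Suite reference documentation for the actual compiler interface); the sketch only assembles the invocation rather than running it:

```python
import shlex

# NOTE: "dla_compiler" and its flags are illustrative assumptions; consult
# the FPGA AI Suite documentation for the actual compiler options.
def build_compile_command(arch_file: str, model_xml: str, out_dir: str) -> str:
    """Assemble an invocation that pairs an .arch file with an IR model."""
    return shlex.join([
        "dla_compiler",
        "--march", arch_file,          # optimized architecture description
        "--network-file", model_xml,   # OpenVINO IR topology (.xml)
        "--output-dir", out_dir,
    ])

print(build_compile_command("agx7_performance.arch", "resnet50.xml", "build"))
# dla_compiler --march agx7_performance.arch --network-file resnet50.xml --output-dir build
```

Keeping the invocation in a small helper like this makes it easy to sweep several candidate .arch files against the same model.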
Final Hardware Implementation
Both development paths converge in Quartus® Prime Pro Edition software, which combines the validated architecture with the generated IP and performs synthesis, place-and-route, and timing closure to produce the final FPGA bitstream.
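The final compile can be driven from the command line with `quartus_sh --flow compile`. A minimal sketch, where the project and revision names are placeholders; the compile itself is left commented out because it requires a Quartus Prime installation on the PATH:

```python
import subprocess

def quartus_compile_command(project: str, revision: str) -> list[str]:
    """Build a full-flow Quartus Prime compile command (synthesis,
    place-and-route, and assembly in a single run)."""
    return ["quartus_sh", "--flow", "compile", project, "-c", revision]

# Uncomment to launch the compile (requires Quartus Prime on the PATH):
# subprocess.run(quartus_compile_command("top", "top_rev"), check=True)
print(" ".join(quartus_compile_command("top", "top_rev")))
# quartus_sh --flow compile top -c top_rev
```

Scripting the compile this way lets the hardware flow run unattended once the software flow has validated the architecture.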