2. What is the FPGA AI Suite?
Given a trained ML model and user-provided configuration information, the FPGA AI Suite generates:
- Hardware description language (HDL) source that targets a specific FPGA device
- C/C++ emulation code
- An inference runtime for the host control path (a usage sketch follows this list)
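The runtime API itself is not described in this section, so the following C++ sketch only illustrates the general shape of the host control path: load a compiled model, write inputs, trigger inference, and read outputs. The InferenceRuntime class, its methods, and the file name are hypothetical stand-ins for illustration, not FPGA AI Suite identifiers.

```cpp
// Minimal sketch of a host control path. InferenceRuntime is a hypothetical
// stand-in, not the actual FPGA AI Suite runtime API; it only mimics the
// load / write / run / read flow a host application typically follows.
#include <cstdio>
#include <string>
#include <vector>

class InferenceRuntime {                                 // hypothetical stand-in
public:
    explicit InferenceRuntime(const std::string& compiledModel)
        : model_(compiledModel) {}
    void writeInput(const std::vector<float>& tensor) { input_ = tensor; }
    void run() { output_.assign(input_.size(), 0.0f); }  // placeholder "inference"
    const std::vector<float>& readOutput() const { return output_; }
private:
    std::string model_;
    std::vector<float> input_, output_;
};

int main() {
    InferenceRuntime runtime("model_compiled.bin");      // illustrative file name
    runtime.writeInput(std::vector<float>(224 * 224 * 3, 0.5f));  // dummy image
    runtime.run();                                       // host triggers the IP
    std::printf("output elements: %zu\n", runtime.readOutput().size());
    return 0;
}
```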
The user-provided configuration information takes the form of parameters that configure the FPGA AI Suite IP, which is built on an FPGA overlay architecture. You can use the IP parameters to match the overlay to your ML model and to tune its performance for your needs. The overlay architecture is a collection of optimized RTL blocks that together form a complete inference pipeline: tensor processing units, memory controllers, data movers, and interconnect logic, all working as a cohesive system.
An architecture generator combines your parameters with the FPGA overlay to produce custom RTL that is optimized for the target ML model and FPGA device.
The IP has two main components: the inference engine and the stream buffer unit. The inference engine is a highly parallel processing element (PE) array that executes convolutions, pooling, activations, and the other neural network operations in hardware. The stream buffer unit manages data movement between external memory and the processing elements to keep the FPGA datapath as full as possible.
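To make the stream buffer's role concrete, here is a small C++ model of double buffering, a common scheme for keeping a datapath fed: while the processing elements consume one buffer, the next data tile is fetched from external memory into the other. This is a conceptual illustration under that assumption, not FPGA AI Suite code; the software loop below runs the two steps sequentially where the hardware would overlap them.

```cpp
// Conceptual model of double buffering between external memory and a PE
// array. Not FPGA AI Suite code; purely an illustration of the idea.
#include <array>
#include <cstdio>
#include <numeric>
#include <vector>

int main() {
    std::array<std::vector<float>, 2> buf;            // two on-chip buffers
    const int tiles = 4, tileSize = 8;

    auto fetchTile = [&](int tile, int slot) {        // stand-in for a DMA read
        buf[slot].assign(tileSize, static_cast<float>(tile));
    };
    auto processTile = [&](int slot) {                // stand-in for the PE array
        return std::accumulate(buf[slot].begin(), buf[slot].end(), 0.0f);
    };

    fetchTile(0, 0);                                  // prime the first buffer
    for (int t = 0; t < tiles; ++t) {
        int cur = t % 2;
        if (t + 1 < tiles) fetchTile(t + 1, 1 - cur); // prefetch the next tile
        float sum = processTile(cur);                 // while consuming this one
        std::printf("tile %d -> %.1f\n", t, sum);
    }
    return 0;
}
```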
You can create your own IP configuration files (often called architecture files, or .arch files), or you can use one of the provided preconfigured example architecture files, each of which targets a combination of ML model and FPGA device family.
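For illustration, the sketch below shows the kind of content an architecture file might hold: a target device family plus sizing parameters for blocks such as the PE array and the stream buffer. Every field name and value here is invented for the example and is not the documented .arch schema; consult the shipped example architecture files for the actual format.

```
# Hypothetical .arch sketch -- field names and values are illustrative only.
family: "Agilex 7"          # target FPGA device family
arch_precision: "FP16"      # numeric precision of the datapath
pe_array {
  num_lanes: 16             # parallel processing lanes
  num_pes_per_lane: 32      # processing elements per lane
}
stream_buffer {
  depth_kb: 2048            # on-chip buffering between memory and the PEs
}
```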
The FPGA AI Suite also includes evaluation and prototyping platforms (the FPGA AI Suite design examples) that you can use as a starting point based on your deployment methodology.