FPGA AI Suite: PCIe-based Design Example User Guide

ID 768977
Date 3/29/2024
Public

6.3.2. FPGA AI Suite Runtime

The FPGA AI Suite runtime implements lower-level classes and functions that interact with the memory-mapped device (MMD). The MMD communicates requests to the OPAE driver. The OPAE driver connects to the OPAE FPGA BSP and, through it, to the FPGA AI Suite IP instance or instances.

The runtime source files are located under runtime/coredla_device. The three most important classes in the runtime are the Device class, the GraphJob class, and the BatchJob class.

Device class

  • Acquires a handle to the MMD for performing operations by calling aocl_mmd_open.
  • Initializes a DDR memory allocator, sized to one DDR bank, for each FPGA AI Suite IP instance on the device.
  • Implements and registers a callback function on the MMD DMA (host to FPGA) thread to launch the FPGA AI Suite IP for batch=1 after the batch input data is transferred from host to DDR.
  • Implements and registers a callback function (interrupt service routine) on the MMD kernel interrupt thread to service interrupts from hardware after one batch job completes.
  • Provides the CreateGraphJob function to create a GraphJob object for each FPGA AI Suite IP instance on the device.
  • Provides the WaitForDla(instance id) function to wait for a batch inference job to complete on a given instance. The function returns immediately if the number of batch jobs finished (that is, the number of jobs processed by the interrupt service routine) is greater than the number of batch jobs already waited for on this instance. Otherwise, it blocks until the interrupt service routine notifies it. Before returning, the function increments the count of batch jobs that have been waited for on this instance.
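
The counting rule described for WaitForDla can be sketched with a mutex and a condition variable. This is a minimal illustration under stated assumptions, not the runtime's actual code: the class name CompletionTracker, its member names, and the demo function are hypothetical, and the interrupt service routine is simulated by direct calls on the same thread rather than by the MMD kernel interrupt thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical helper (not part of the FPGA AI Suite runtime): tracks finished
// versus waited-for batch jobs for a single IP instance.
class CompletionTracker {
 public:
  // Called by the (simulated) interrupt service routine when one batch job finishes.
  void NotifyJobFinished() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ++jobs_finished_;
    }
    cv_.notify_all();
  }

  // Returns at once if a finished job has not yet been waited for; otherwise
  // blocks until NotifyJobFinished() is called. Increments the waited count
  // before returning.
  void Wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return jobs_finished_ > jobs_waited_; });
    ++jobs_waited_;
  }

  std::uint64_t jobs_waited() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return jobs_waited_;
  }

 private:
  mutable std::mutex mutex_;
  std::condition_variable cv_;
  std::uint64_t jobs_finished_ = 0;
  std::uint64_t jobs_waited_ = 0;
};

// Two jobs finish before anyone waits, so both Wait() calls return immediately.
inline std::uint64_t RunTrackerDemo() {
  CompletionTracker tracker;
  tracker.NotifyJobFinished();
  tracker.NotifyJobFinished();
  tracker.Wait();
  assert(tracker.jobs_waited() == 1);
  tracker.Wait();
  return tracker.jobs_waited();
}
```

Comparing two monotonically increasing counters, rather than using a single "done" flag, means a completion that arrives before the corresponding wait is not lost.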

GraphJob class

  • Represents a compiled network that is loaded onto one instance of the FPGA AI Suite IP on an FPGA device.
  • Allocates buffers in DDR memory to transfer configuration, filter, and bias data.
  • Creates BatchJob objects for a given number of pipelines and allocates input and output buffers for each pipeline in DDR.
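
One way to picture the buffer layout described above is a simple bump allocator over a single DDR bank. The sketch below is an assumption-laden illustration only: DdrBumpAllocator, GraphLayout, PlanGraphLayout, the 64-byte alignment in the usage note, and the split into one configuration buffer plus one filter-and-bias buffer are all hypothetical and may not match how the runtime actually allocates memory.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bump allocator over one DDR bank; the runtime's allocator may differ.
class DdrBumpAllocator {
 public:
  DdrBumpAllocator(std::uint64_t base, std::uint64_t size, std::uint64_t alignment)
      : next_(base), end_(base + size), alignment_(alignment) {
    // The rounding trick in Allocate() requires a power-of-two alignment.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  }

  // Writes the aligned start address of a new region to *address.
  // Returns false if the remaining space in the bank is too small.
  bool Allocate(std::uint64_t bytes, std::uint64_t* address) {
    const std::uint64_t start = (next_ + alignment_ - 1) & ~(alignment_ - 1);
    if (start > end_ || bytes > end_ - start) return false;
    *address = start;
    next_ = start + bytes;
    return true;
  }

 private:
  std::uint64_t next_;
  std::uint64_t end_;
  std::uint64_t alignment_;
};

struct PipelineBuffers {
  std::uint64_t input_addr = 0;
  std::uint64_t output_addr = 0;
};

struct GraphLayout {
  std::uint64_t config_addr = 0;
  std::uint64_t filter_bias_addr = 0;
  std::vector<PipelineBuffers> pipelines;
};

// Places the shared graph buffers first, then one input/output pair per pipeline.
inline bool PlanGraphLayout(DdrBumpAllocator& alloc, std::uint64_t config_bytes,
                            std::uint64_t filter_bias_bytes, std::uint64_t input_bytes,
                            std::uint64_t output_bytes, int num_pipelines,
                            GraphLayout* layout) {
  if (!alloc.Allocate(config_bytes, &layout->config_addr)) return false;
  if (!alloc.Allocate(filter_bias_bytes, &layout->filter_bias_addr)) return false;
  layout->pipelines.clear();
  for (int i = 0; i < num_pipelines; ++i) {
    PipelineBuffers buffers;
    if (!alloc.Allocate(input_bytes, &buffers.input_addr)) return false;
    if (!alloc.Allocate(output_bytes, &buffers.output_addr)) return false;
    layout->pipelines.push_back(buffers);
  }
  return true;
}
```

For example, with a 4096-byte bank at base address 0 and 64-byte alignment, calling PlanGraphLayout with sizes 100, 200, 10, and 20 bytes and two pipelines places the configuration buffer at 0, the filter and bias buffer at 128, and the pipeline input/output pairs at 384/448 and 512/576.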

BatchJob class

  • Represents a single batch inference job.
  • Stores the DDR addresses for batch input and output data.
  • Provides the LoadInputFeatureToDdr function to transfer input data to DDR and start inference for this batch asynchronously.
  • Provides the ReadOutputFeatureFromDdr function to transfer output data from DDR. This function must be called only after inference for this batch has completed.
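
The ordering constraint between the two BatchJob functions can be illustrated with a mock. Only the names LoadInputFeatureToDdr and ReadOutputFeatureFromDdr come from the text above. The signatures, the MockBatchJob class, the SimulateCompletion step, and the placeholder "inference" that doubles each value are assumptions made for illustration. In the real runtime, completion would be observed through Device::WaitForDla rather than a direct call.

```cpp
#include <cassert>
#include <vector>

// Mock stand-in for a batch job. No hardware is involved: "DDR" is a pair of
// host vectors, and the signatures are hypothetical.
class MockBatchJob {
 public:
  // Copies the input into the fake DDR buffer and marks inference as started.
  // The real call is described as asynchronous, so nothing is complete yet.
  void LoadInputFeatureToDdr(const std::vector<float>& input) {
    ddr_input_ = input;
    started_ = true;
    done_ = false;
  }

  // Stands in for the hardware finishing and the interrupt being serviced.
  void SimulateCompletion() {
    assert(started_);
    ddr_output_.clear();
    for (float v : ddr_input_) ddr_output_.push_back(v * 2.0f);  // placeholder result
    done_ = true;
  }

  // Copies the output from the fake DDR buffer.
  // Returns false if inference for this batch has not completed.
  bool ReadOutputFeatureFromDdr(std::vector<float>* output) const {
    if (!done_) return false;
    *output = ddr_output_;
    return true;
  }

 private:
  std::vector<float> ddr_input_;
  std::vector<float> ddr_output_;
  bool started_ = false;
  bool done_ = false;
};
```

In this mock, reading before completion fails explicitly. The source states only that the read must come after completion, so a caller of the real runtime should wait for the job to finish before reading rather than rely on any error reporting.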