FPGA AI Suite: Design Examples User Guide

ID 848957
Date 4/30/2025
Public

3.3.1. OpenVINO™ FPGA Runtime Overview

The purpose of the runtime front end is as follows:
  • Provide input to the FPGA AI Suite IP
  • Consume output from the FPGA AI Suite IP
  • Control the FPGA AI Suite IP
Typically, this front-end layer provides the following items:
  • The .arch file that was used to configure the FPGA AI Suite IP on the FPGA.
  • The ML model, possibly precompiled into an ahead-of-time (AOT) .bin file by the FPGA AI Suite compiler (dla_compiler).
  • A target device that is passed to OpenVINO™.

    The target device may instruct OpenVINO™ to use the HETERO plugin, which allows a graph to be partitioned onto multiple devices.
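
As a minimal sketch of how a front end might assemble and inspect such a target device string: OpenVINO™ HETERO targets take the form HETERO:<device>,<device>, with devices listed in priority order. The device name FPGA and the helper names make_hetero_target and parse_target are assumptions made for illustration; they are not part of the FPGA AI Suite or OpenVINO™ APIs.

```python
def make_hetero_target(devices):
    """Build a HETERO target string from device names in priority order."""
    if not devices:
        raise ValueError("at least one device is required")
    return "HETERO:" + ",".join(devices)


def parse_target(target):
    """Split a target string into (plugin name, list of device names)."""
    plugin, sep, rest = target.partition(":")
    if not sep:
        # A plain device name such as "CPU" carries no priority list.
        return plugin, [plugin]
    return plugin, [d for d in rest.split(",") if d]


# "FPGA" is an assumed device name used only for this example.
target = make_hetero_target(["FPGA", "CPU"])
print(target)                # HETERO:FPGA,CPU
print(parse_target(target))  # ('HETERO', ['FPGA', 'CPU'])
```

The resulting string is what a front end would hand to OpenVINO™ as the device name when it compiles or loads the model.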

The FPGA AI Suite installation includes a runtime/ directory, which contains the source code for building a selection of OpenVINO™ applications. The runtime/ directory also includes the dla_benchmark command-line utility, which you can use to generate inference requests and benchmark inference speed.

The following applications use the OpenVINO™ API. They support the OpenVINO™ HETERO plugin, which allows unsupported graph layers to fall back onto the CPU.
  • dla_benchmark (adapted from OpenVINO™ benchmark_app)
  • classification_sample_async
  • object_detection_demo_yolov3_async
  • segmentation_demo
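
The CPU fallback described above can be pictured with a simplified model: each layer is assigned to the first device in the priority list that supports its operation type. The sketch below illustrates that idea only. The layer names, operation types, and per-device support sets are hypothetical, and the function assign_layers is not an OpenVINO™ or FPGA AI Suite API.

```python
def assign_layers(layers, priority, supported):
    """Assign each layer to the first device in `priority` that supports it.

    layers:    list of (layer_name, op_type) tuples in graph order
    priority:  device names, highest priority first
    supported: dict mapping device name -> set of supported op types
    """
    assignment = {}
    for name, op_type in layers:
        for device in priority:
            if op_type in supported.get(device, set()):
                assignment[name] = device
                break
        else:
            # No device in the priority list can run this layer.
            raise ValueError(f"no device supports layer {name!r} ({op_type})")
    return assignment


# Hypothetical graph and support sets, for illustration only.
layers = [("conv1", "Convolution"), ("relu1", "ReLU"),
          ("nms", "NonMaxSuppression"), ("fc", "MatMul")]
supported = {
    "FPGA": {"Convolution", "ReLU", "MatMul"},
    "CPU": {"Convolution", "ReLU", "MatMul", "NonMaxSuppression"},
}
result = assign_layers(layers, ["FPGA", "CPU"], supported)
print(result)
# {'conv1': 'FPGA', 'relu1': 'FPGA', 'nms': 'CPU', 'fc': 'FPGA'}
```

In this toy example, only the layer that the first device cannot handle lands on the CPU; every other layer stays on the higher-priority device.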

Each of these applications serves as a runtime executable for the FPGA AI Suite. You might want to write your own OpenVINO™-based front ends to wrap the FPGA plugin. For information about writing your own OpenVINO™-based front ends, refer to the OpenVINO™ documentation.

Some of the responsibilities of the OpenVINO™ FPGA plugin are as follows:

  • Inference Execution
    • Mapping inference requests to an IP instance and internal buffers
    • Executing inference requests on the IP, including managing synchronization and all data transfer between the host and the device
  • Input/Output Data Transform
    • Converting the memory layout of input/output data
    • Converting the numeric precision of input/output data
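
To make the two data transforms concrete, the following sketch reorders a flat tensor from NCHW to NHWC layout and packs values as IEEE 754 half-precision (FP16). The specific layouts and precisions are assumptions chosen for illustration; this section does not state which conversions the plugin performs, and the function names are hypothetical. Only the Python standard library is used.

```python
import struct


def nchw_to_nhwc(data, n, c, h, w):
    """Reorder a flat NCHW buffer into a flat NHWC buffer."""
    if len(data) != n * c * h * w:
        raise ValueError("buffer size does not match the given shape")
    out = [0.0] * len(data)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = ((ni * c + ci) * h + hi) * w + wi  # NCHW offset
                    dst = ((ni * h + hi) * w + wi) * c + ci  # NHWC offset
                    out[dst] = data[src]
    return out


def to_fp16_bytes(values):
    """Pack Python floats as little-endian IEEE 754 half-precision values."""
    return struct.pack(f"<{len(values)}e", *values)


def from_fp16_bytes(raw):
    """Unpack little-endian half-precision values into Python floats."""
    return list(struct.unpack(f"<{len(raw) // 2}e", raw))


# A 1x2x2x2 tensor: channel 0 holds 0..3, channel 1 holds 4..7.
nchw = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
nhwc = nchw_to_nhwc(nchw, n=1, c=2, h=2, w=2)
print(nhwc)  # [0.0, 4.0, 1.0, 5.0, 2.0, 6.0, 3.0, 7.0]

packed = to_fp16_bytes(nhwc)
print(len(packed))              # 16 (two bytes per value)
print(from_fp16_bytes(packed))  # small integers round-trip exactly
```

Half precision keeps only about three significant decimal digits, so values that are not exactly representable change slightly when they are converted back.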