FPGA AI Suite Handbook

ID 863373
Date 11/21/2025
Public

3.1. Consider an FPGA AI Suite Design Example as a Starting Point

The FPGA AI Suite provides several design examples that demonstrate how to integrate the FPGA AI Suite IP with real hardware platforms. These examples serve as a foundation to evaluate features, prototype workflows, and understand runtime interactions.

To choose the most appropriate platform, evaluate your application architecture and system-level constraints by asking yourself the following questions:
  • Do you want to offload inference to the FPGA from a host CPU (look-aside)?
  • Do you require standalone (hostless) operation without DDR memory?
  • Is your system embedded, with an onboard ARM core (SoC)?
  • Are you targeting PCIe*-attached cards, or SoCs with integrated HPS?
  • Do you want to use prebuilt bitstreams or build custom ones with Quartus?
To decide whether an FPGA AI Suite design example is a good starting point for your design, review the following platforms and their associated design examples, along with the key hardware and software components each one involves:
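The questions above amount to a small decision tree. The helper below captures it in plain Python; the returned names mirror the platform sections in this chapter, but the selection logic is illustrative only, since the real choice also depends on board availability, licensing, and toolchain versions.

```python
def suggest_platform(host_cpu: bool, has_ddr: bool, soc_hps: bool,
                     video_input: bool = False) -> str:
    """Map the high-level platform questions to a design-example family.

    Illustrative only: mirrors the platform sections of this chapter,
    not an official selection tool.
    """
    if soc_hps:
        # Embedded systems with an onboard ARM core use the HPS flows.
        return "AI Video (SoC Host)" if video_input else "SoC Host"
    if host_cpu:
        # Host-driven look-aside offload over PCIe.
        return "PCIe Host"
    # Hostless flows: the choice hinges on whether external DDR exists.
    return "Hostless JTAG" if has_ddr else "Hostless DDR-Free"

print(suggest_platform(host_cpu=True, has_ddr=True, soc_hps=False))
```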

PCIe* Host

Use Case: Server or workstation offloading inference to a PCIe*-attached FPGA.

Recommended boards:

  • Terasic* DE10-Agilex Development Board (DE10-Agilex-B2E2)
  • Open FPGA Stack (OFS)-based boards:
    • Agilex™ 5 FPGA E-Series 065B Modular Development Kit (MK-A5E065BB32AES1)
    • Agilex™ 7 FPGA I-Series Development Kit ES2 (DK-DEV-AGI027RBES)
    • Intel® FPGA SmartNIC N6001-PL Platform (without Ethernet controller)

Key Characteristics:

  • Supports look-aside model with host-driven control.
  • Integrates with the Intel® Distribution of OpenVINO™ toolkit (x86 host).
  • Build scripts support architecture selection and optional bitstream regeneration.
  • Designed for performance benchmarking and throughput-optimized inference.
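In the look-aside model, the host submits an inference request to the FPGA and continues useful work until the result returns. The sketch below models that pattern with a thread pool standing in for the device; `fpga_infer` is a placeholder, as the actual design examples drive the IP through the OpenVINO™ runtime rather than a stub like this.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fpga_infer(batch):
    """Stand-in for an inference call to the PCIe-attached FPGA."""
    time.sleep(0.01)               # model device latency
    return [x * 2 for x in batch]  # placeholder "result"

# Host-driven control: enqueue work, then overlap it with host-side tasks.
with ThreadPoolExecutor(max_workers=1) as accelerator:
    future = accelerator.submit(fpga_infer, [1, 2, 3])
    host_side_work = sum(range(10))  # host stays busy (look-aside)
    result = future.result()         # collect when the device is done

print(result)  # [2, 4, 6]
```

The same submit/collect structure applies whether the queue depth is one request, as here, or many requests pipelined for throughput benchmarking.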

Hostless DDR-Free

Use Case: Fully autonomous AI inference on FPGA, with no host processor and no external DDR memory.

Recommended boards:

  • Agilex™ 7 FPGA I-Series Development Kit ES2 (DK-DEV-AGI027RBES)

Key Characteristics:

  • Inputs, weights, and configurations stored in on-chip RAM or MIF.
  • No external DDR or runtime required.
  • Data streaming and results are handled via direct hardware interfaces.
  • Suitable for ultra-low-latency and minimal-footprint deployments.
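Because there is no external DDR, parameters are baked into on-chip RAM through Memory Initialization Files (MIFs). The FPGA AI Suite compiler generates these files for you; the writer below only illustrates the Quartus-style MIF text format so the artifact is recognizable.

```python
def write_mif(words, width_bits=8):
    """Emit a Quartus-style Memory Initialization File (.mif) as text.

    Sketch only: the FPGA AI Suite compiler generates MIFs for real
    graphs; this just illustrates the on-chip initialization format.
    """
    lines = [
        f"DEPTH = {len(words)};",
        f"WIDTH = {width_bits};",
        "ADDRESS_RADIX = HEX;",
        "DATA_RADIX = HEX;",
        "CONTENT BEGIN",
    ]
    hex_digits = width_bits // 4
    for addr, word in enumerate(words):
        lines.append(f"    {addr:X} : {word:0{hex_digits}X};")
    lines.append("END;")
    return "\n".join(lines)

print(write_mif([0x1A, 0x2B, 0x3C, 0x4D]))
```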

DDR-Free Scenarios

The DDR-Free architecture trades on-chip memory blocks for more efficient access to filter data. Using the DDR-Free architecture is beneficial in the following scenarios:

  • When DDR bandwidth bottlenecks system performance but on-chip memory is sufficient. Storing filter and configuration data on-chip reduces the need for transfers between on-chip memory and external memory.
  • When the graph is reasonably sized, its weights and biases fit in on-chip memory, and low latency or high throughput is critical to your application. Storing the weights and biases in on-chip memory accelerates access and reduces per-layer latency.
  • When the multilane feature is enabled and the PE arrays consume data in parallel at a higher rate. Storing filter data on-chip and sharing it across lanes incurs less resource overhead than fetching it from external memory for each lane.

DDR-Free Constraints

DDR-free mode imposes certain constraints:

  • Memory Constraints: The architecture file must have sufficient on-chip memory to accommodate all graph parameters in the filter scratchpad. There must also be sufficient on-chip memory to store all intermediate surfaces in the stream buffer.
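The memory constraint above reduces to a capacity check like the one sketched below. The function and its byte figures are placeholders; the FPGA AI Suite compiler performs the authoritative allocation of the filter scratchpad and stream buffer when it processes your architecture file.

```python
def fits_ddr_free(weight_bytes, bias_bytes, max_surface_bytes,
                  filter_scratchpad_bytes, stream_buffer_bytes):
    """Rough feasibility check for DDR-free mode.

    Placeholder arithmetic: all graph parameters must fit in the
    filter scratchpad, and every intermediate surface must fit in
    the stream buffer.
    """
    params_fit = weight_bytes + bias_bytes <= filter_scratchpad_bytes
    surfaces_fit = max_surface_bytes <= stream_buffer_bytes
    return params_fit and surfaces_fit

# Example: a 2 MiB graph with a 512 KiB peak intermediate surface
# against a 4 MiB scratchpad and 1 MiB stream buffer.
print(fits_ddr_free(weight_bytes=2 * 2**20, bias_bytes=16 * 2**10,
                    max_surface_bytes=512 * 2**10,
                    filter_scratchpad_bytes=4 * 2**20,
                    stream_buffer_bytes=1 * 2**20))  # True
```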

Hostless JTAG

Use Case: Direct control of FPGA-based inference through JTAG interface, typically in lab environments or tightly controlled edge systems.

Recommended boards:
  • Agilex™ 3 FPGA C-Series Development Kit (DK-A3Y135BM16AEA)
  • Agilex™ 5 FPGA E-Series 065B Modular Development Kit (MK-A5E065BB32AES1)

Key Characteristics:

  • External DDR is used for weights/features.
  • Host communicates over JTAG to FPGA.
  • Good for development, bring-up, or research workflows.

SoC Host

Use Case: Embedded AI inference using the FPGA hard processor system (HPS).

Recommended boards:

  • Agilex™ 5 FPGA E-Series 065B Modular Development Kit (MK-A5E065BB32AES1)
  • Agilex™ 7 FPGA I-Series Transceiver-SoC Development Kit (DK-SI-AGI027FC)
  • Arria® 10 SX SoC FPGA Development Kit (DK-SOC-10AS066S)

Key Characteristics:

  • Supports a CPU-offload model using the OpenVINO™ ARM plugin on Linux.
  • Uses Yocto-based builds to generate bootable SD card images.
  • Offers two execution modes:
    • M2M (Memory-to-Memory): Benchmark-style execution.
    • S2M (Streaming-to-Memory): Demonstrates live streaming from CPU to FPGA.
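The difference between the two execution modes can be sketched in plain Python: M2M stages the complete input in memory before running, while S2M consumes frames one at a time as they arrive. The function names and the trivial "inference" are hypothetical stand-ins, not the design example's API.

```python
def infer(frame):
    """Stand-in for one inference call on the FPGA AI Suite IP."""
    return sum(frame)  # placeholder computation

def run_m2m(buffered_frames):
    """Memory-to-Memory: all inputs are staged in memory up front."""
    return [infer(f) for f in buffered_frames]

def run_s2m(frame_stream):
    """Streaming-to-Memory: consume frames as they arrive, yield results."""
    for frame in frame_stream:
        yield infer(frame)

frames = [[1, 2], [3, 4]]
print(run_m2m(frames))              # [3, 7]
print(list(run_s2m(iter(frames))))  # [3, 7]
```

Both modes produce the same results; M2M suits benchmark-style measurement, while S2M demonstrates live data flowing from the CPU to the FPGA.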

AI Video (SoC Host)

Use Case: Embedded AI inference on video inputs using the FPGA hard processor system (HPS).

Recommended boards:

  • Agilex™ 5 FPGA E-Series 065B Modular Development Kit (MK-A5E065BB32AES1)

Custom Platform

Use Case: Support for non-standard or production-specific platforms that are not directly covered by default design examples.

Examples:

  • Carrier boards with custom pin maps or I/O.
  • PCIe add-in cards with proprietary form factors.
  • Edge or embedded systems with bespoke power or memory configurations.

Key Considerations:

  • Platform Definition: Use a custom Open FPGA Stack (OFS) Platform Interface Manager (PIM) or modify an existing BSP.
  • Interface Integration: Ensure compatibility with chosen I/O (PCIe, JTAG, streaming interfaces).
  • Bitstream Generation: You must have a valid FPGA AI Suite license that allows custom compilation.
  • Toolchain Compatibility: Align Quartus, BSP, and runtime versions with FPGA AI Suite requirements.
  • Software Stack: Modify or extend the OpenVINO™ integration layer (for host-side) or use lightweight inference APIs if hostless.

This structured platform breakdown helps ensure you're targeting the appropriate flow for your system constraints. Once your platform is selected, you can follow the design example’s instructions to build, modify, or integrate the FPGA AI Suite into your deployment pipeline.