Streaming DMA Accelerator Functional Unit User Guide: Intel FPGA Programmable Acceleration Card D5005

ID 683424
Date 11/04/2019

2. Streaming DMA AFU Description

The streaming DMA AFU design example shows how to transfer data between memory and Avalon®-ST sources and sinks. Most commonly, a streaming DMA transfers data from host memory into a hardware accelerator and streams the results back to host memory without using the local FPGA memory as a temporary buffer. These streams typically operate in parallel and reduce the latency of a hardware accelerator by removing the additional memory copy operations.

The streaming DMA AFU comprises the following sub-modules:
  • Memory Properties Factory (MPF) Basic Building Block (BBB)
  • Core Cache Interface (CCI-P) to Avalon®-MM Adapter
  • Streaming DMA Test System, which includes:
    • Memory-to-Stream (M2S) DMA BBB
    • Stream-to-Memory (S2M) DMA BBB
    • Streaming Pattern Checker and Generator
The streaming DMA AFU design example includes a user-space driver and a host application that transfers data between host memory and the FPGA pattern checker and generator. You can use this design example as a starting point to implement streaming data transfers in your own AFU design by replacing the pattern checker and generator with your hardware accelerator and modifying the host application accordingly.
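
The data flow described above can be pictured with a small software model. The following Python sketch is only an analogy, not the AFU driver or its API: the names `m2s`, `s2m`, `invert_bits`, and `run`, and the 64-byte beat size, are assumptions made for illustration. It shows data moving from a host buffer through a replaceable accelerator stage and back to host memory one beat at a time, without staging the whole payload in an intermediate buffer.

```python
from typing import Callable, Iterable, Iterator

# Assumed size of one streaming transfer (beat); chosen for illustration only.
BEAT_BYTES = 64


def m2s(host_buffer: bytes) -> Iterator[bytes]:
    """Model of the memory-to-stream direction: emit the buffer beat by beat."""
    for offset in range(0, len(host_buffer), BEAT_BYTES):
        yield host_buffer[offset:offset + BEAT_BYTES]


def s2m(stream: Iterable[bytes]) -> bytes:
    """Model of the stream-to-memory direction: collect beats into host memory."""
    return b"".join(stream)


def invert_bits(stream: Iterable[bytes]) -> Iterator[bytes]:
    """Hypothetical accelerator stage that stands in for your own hardware."""
    for beat in stream:
        yield bytes(b ^ 0xFF for b in beat)


def run(host_buffer: bytes,
        accelerator: Callable[[Iterable[bytes]], Iterator[bytes]]) -> bytes:
    # Each beat flows from source to sink as it is produced; nothing holds
    # the complete payload between the two DMA directions.
    return s2m(accelerator(m2s(host_buffer)))
```

In this model, swapping `invert_bits` for another function plays the role of replacing the pattern checker and generator with your own accelerator.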

Both the M2S and S2M DMA BBBs support packetized data, so the streaming data includes the start-of-packet (SOP), end-of-packet (EOP), and empty signals. You can use this packet support to transfer a hardware-driven payload size. For example, a compression accelerator typically receives a known payload size, but the length of the compression results is unknown until the accelerator completes the task. The compression accelerator simply issues a packet to the S2M DMA BBB, and the driver provides the host application with metadata that describes how much data was transferred.
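
To illustrate how the packet signals convey a payload size that only the hardware knows, the following Python sketch computes packet lengths from a sequence of beats. It is a simplified model, not the driver's implementation: the `Beat` type, the `packet_lengths` function, and the 64-byte beat width are assumptions, and it treats each Avalon®-ST symbol as one byte so that the empty value on the EOP beat counts unused bytes.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Assumed data width of one beat in bytes (illustrative, 8-bit symbols).
BYTES_PER_BEAT = 64


@dataclass
class Beat:
    sop: bool        # start-of-packet asserted on this beat
    eop: bool        # end-of-packet asserted on this beat
    empty: int = 0   # unused bytes in this beat; only meaningful with eop


def packet_lengths(beats: Iterable[Beat]) -> List[int]:
    """Return the payload size in bytes of each complete packet."""
    lengths: List[int] = []
    current: Optional[int] = None  # bytes seen in the open packet, if any
    for beat in beats:
        if beat.sop:
            if current is not None:
                raise ValueError("SOP received inside an open packet")
            current = 0
        if current is None:
            raise ValueError("beat received outside a packet")
        current += BYTES_PER_BEAT
        if beat.eop:
            if not 0 <= beat.empty < BYTES_PER_BEAT:
                raise ValueError("empty value out of range")
            # The final beat carries fewer valid bytes than a full beat.
            lengths.append(current - beat.empty)
            current = None
    return lengths
```

For example, a three-beat packet whose last beat reports `empty=24` carries 3 × 64 − 24 = 168 bytes. A host application would receive an equivalent length through the driver's metadata rather than by counting beats itself.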