DMA Accelerator Functional Unit User Guide: Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA

ID 683263
Date 3/06/2020
Public

2.3. The DMA AFU Hardware Components

The DMA AFU interfaces with the FPGA Interface Unit (FIU) and two banks of local DDR4-SDRAM. The device has 8 GB of addressable memory, organized as two 4 GB banks.

Note: The currently available hardware dictates this memory configuration. Future hardware may support different memory configurations.
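
The two-bank layout described above can be illustrated with a short address-decoding sketch. It assumes a flat, contiguous device address map with bank 0 starting at address 0, which this section does not state; the function name `decode_device_address` is hypothetical and is not part of any Intel or OPAE API.

```python
from typing import Tuple

GIB = 1 << 30
BANK_SIZE = 4 * GIB   # each DDR4 bank is 4 GB (from the text above)
NUM_BANKS = 2         # two banks, 8 GB in total


def decode_device_address(addr: int) -> Tuple[int, int]:
    """Return (bank index, byte offset within that bank).

    Assumption: banks are mapped back to back, bank 0 first.
    """
    if not 0 <= addr < NUM_BANKS * BANK_SIZE:
        raise ValueError("address outside the 8 GB device memory range")
    # divmod yields (quotient, remainder) = (bank, offset)
    return divmod(addr, BANK_SIZE)
```

Under this assumed map, a transfer that starts below the 4 GB boundary and ends above it touches both banks, so software that cares about per-bank bandwidth may want to split such transfers at the boundary.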

You can use the DMA AFU to copy data between the following source and destination locations:

  • Host memory to device FPGA memory
  • Device FPGA memory to host memory

A Platform Designer system, $OPAE_PLATFORM_ROOT/hw/samples/dma_afu/hw/rtl/qsys/<device>/dma_test_system.qsys, implements most of the DMA AFU. The DMA BBB subsystem is located at <installation path>/hw/samples/dma_afu/hw/rtl/qsys/<device>/msgdma_bbb.qsys.

Figure 1. DMA AFU Hardware Block Diagram

The DMA AFU includes the following internal modules to interface with the FPGA Interface Unit (FIU):

  • Memory-Mapped IO (MMIO) Decoder Logic: This module detects MMIO read and write transactions and separates them from the other traffic on CCI-P RX channel 0, where they arrive. As a result, MMIO traffic never reaches the MPF BBB and is serviced by an independent MMIO command channel.
  • Memory Properties Factory (MPF): This module ensures that responses to DMA read requests return in the order in which the requests were issued, because the Avalon®-MM protocol requires in-order read responses.
  • CCI-P to Avalon®-MM Adapter: This module translates between CCI-P and Avalon®-MM transactions, as follows:
    • CCI-P to Avalon® MMIO Adapter: This path translates CCI-P MMIO transactions into Avalon®-MM transactions.
      Note: MMIO accesses do not support backpressure. As a result, the CCI-P to Avalon®-MM Adapter does not support the waitrequest signal. Intel recommends that you add an Avalon®-MM Clock Crossing Bridge, available in the IP Catalog, between the CCI-P to Avalon® MMIO Adapter master port and the DMA Test System Avalon®-MM slave port. Intel also recommends that you set the clock crossing command depth to 64 entries and disable burst support.
    • Avalon®-MM to CCI-P: This direction provides separate read-only and write-only paths for the DMA to access host memory.

    The Avalon®-MM write slave of the CCI-P to Avalon® Adapter includes an extra high-order bit to implement write fences. When the high-order bit is set to 1'b1, the adapter first issues a write fence. Then, the adapter writes the data to the host physical address space with the high-order bit set to 1'b0. This operation allows the DMA to synchronize writes to host memory. Because the DMA BBB cannot receive write responses, it relies on the write fence to synchronize the write data with the host.

  • DMA Test System: This module serves as a wrapper around the DMA BBB to expose the DMA masters and interrupt interfaces to the rest of the logic in the AFU. It provides the interface between the DMA BBB and the CCI-P to Avalon® Adapter. It also provides the interface between the DMA BBB and the local FPGA SDRAM banks.
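
The in-order guarantee described for the MPF can be modeled with a small reorder buffer. This is a behavioral sketch only and does not reflect how the MPF BBB is implemented; the class `ReorderBuffer` and its methods are hypothetical names introduced for illustration.

```python
class ReorderBuffer:
    """Release read responses strictly in the order the reads were issued."""

    def __init__(self):
        self._next_tag = 0   # tag assigned to the next issued read
        self._next_out = 0   # tag of the next response allowed to leave
        self._pending = {}   # tag -> data for responses that arrived early

    def issue(self):
        """Register a new read request and return its tag."""
        tag = self._next_tag
        self._next_tag += 1
        return tag

    def complete(self, tag, data):
        """Accept a response and return the data that can now be released."""
        outstanding = self._next_out <= tag < self._next_tag
        if not outstanding or tag in self._pending:
            raise ValueError("unknown or duplicate response tag")
        self._pending[tag] = data
        released = []
        # Drain every response that is now contiguous with the output pointer.
        while self._next_out in self._pending:
            released.append(self._pending.pop(self._next_out))
            self._next_out += 1
        return released
```

In this model, a response that arrives ahead of an older outstanding read is held back until the older read completes, at which point both are released together in issue order.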
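
The write-fence mechanism of the adapter's write slave can be sketched as follows. The sketch assumes that the extra bit sits directly above the host address bits and that the host address is 48 bits wide; the width, the constant names, and the functions `with_fence` and `adapter_write` are all hypothetical and chosen only for illustration.

```python
HOST_ADDR_BITS = 48                 # assumed host address width (hypothetical)
FENCE_BIT = 1 << HOST_ADDR_BITS     # the extra high-order bit
HOST_ADDR_MASK = FENCE_BIT - 1


def with_fence(host_addr):
    """Return the write-slave address that requests a fence before the write."""
    if not 0 <= host_addr <= HOST_ADDR_MASK:
        raise ValueError("host address does not fit in HOST_ADDR_BITS")
    return host_addr | FENCE_BIT


def adapter_write(slave_addr, data):
    """Model the adapter: return the ordered host-side operations for one write."""
    ops = []
    if slave_addr & FENCE_BIT:
        # High-order bit set: the fence is issued first.
        ops.append(("fence", None, None))
    # The data is then written with the high-order bit cleared.
    ops.append(("write", slave_addr & HOST_ADDR_MASK, data))
    return ops
```

A typical use of this pattern would be to issue the payload writes with the bit cleared and then set the bit on a final status write, so that the host observes the status only after the preceding data, although the exact sequence used by the DMA BBB is not described in this section.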