Intel® FPGA SDK for OpenCL™: Intel® Arria® 10 GX FPGA Development Kit Reference Platform Porting Guide

ID 683267
Date 3/28/2022
Public

3.1.6.1. Implementing a DMA Transfer

Implement a DMA transfer in the MMD on Windows (INTELFPGAOCLSDKROOT\board\a10_ref\source\host\mmd\acl_pcie_dma_windows.cpp) or in the kernel driver on Linux (INTELFPGAOCLSDKROOT/board/a10_ref/linux64/driver/aclpci_dma).
Note:

On Windows, the Jungo WinDriver limits the number of interrupts received in user mode to between 5,000 and 10,000 per second. This limit translates to a DMA bandwidth of 2.5 gigabytes per second (GBps) to 5 GBps when a full 128-entry table of 4 KB pages is transferred per interrupt.
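The bandwidth range follows from the table size and the interrupt rate. The following is a minimal sketch of that arithmetic, assuming one 4 KB (4096-byte) page per table entry; the constant and function names are illustrative and are not taken from the MMD source.

```cpp
#include <cstdint>

// Values stated in the note: a 128-entry table of 4 KB pages per interrupt.
constexpr std::uint64_t kTableEntries = 128;
constexpr std::uint64_t kPageBytes = 4 * 1024;

// Bytes moved per interrupt when a full table is transferred.
constexpr std::uint64_t kBytesPerInterrupt = kTableEntries * kPageBytes;

// Upper bound on DMA throughput, in bytes per second, for a given
// user-mode interrupt rate.
constexpr std::uint64_t max_dma_bytes_per_second(std::uint64_t interrupts_per_second) {
    return kBytesPerInterrupt * interrupts_per_second;
}

static_assert(kBytesPerInterrupt == 524288, "128 x 4 KB = 512 KB per interrupt");
static_assert(max_dma_bytes_per_second(5000) == 2621440000ULL, "lower bound");
static_assert(max_dma_bytes_per_second(10000) == 5242880000ULL, "upper bound");
```

At 5,000 and 10,000 interrupts per second, the result is about 2.6 and 5.2 billion bytes per second, which corresponds to the rounded 2.5 GBps to 5 GBps range.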

On Windows, polling is the default method for maximizing PCIe DMA bandwidth at the expense of CPU run time. To use interrupts instead of polling, assign a non-NULL value to the ACL_PCIE_DMA_USE_MSI environment variable.
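The choice between polling and interrupts can be sketched as a single environment check. Only the variable name ACL_PCIE_DMA_USE_MSI comes from this guide; the helper names are hypothetical, and the sketch assumes that any set value, regardless of its content, selects interrupts.

```cpp
#include <cstdlib>

// Hypothetical helper: interprets the raw environment value.
// A null pointer means the variable is not set, so polling stays in effect.
bool use_msi(const char* env_value) {
    return env_value != nullptr;
}

// Hypothetical wrapper: reads the variable named in this guide.
bool dma_uses_interrupts() {
    return use_msi(std::getenv("ACL_PCIE_DMA_USE_MSI"));
}
```

Separating the decision from the std::getenv call keeps the logic testable without modifying the process environment.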

To implement a DMA transfer:

  1. Verify that the previous DMA transfer sent all the requested bytes of data.
  2. Map the virtual memory that is requested for the DMA transfer to physical addresses.
    Note: The amount of virtual memory that can be mapped at a time is system dependent. Large DMA transfers require multiple mapping and unmapping operations. For higher bandwidth, map the virtual memory ahead of time in a separate thread that runs in parallel with the transfer.
  3. Set up the DMA descriptor table in local memory.
  4. Write the location of the DMA descriptor table, which is in user memory, to the DMA control registers (that is, RC Read Status and Descriptor Base and RC Write Status and Descriptor Base).
  5. Write the Platform Designer address of the descriptor FIFOs to the DMA control registers (that is, EP Read Descriptor FIFO Base and EP Write Status and Descriptor FIFO Base).
  6. Write the start signal to the RD_DMA_LAST_PTR and WR_DMA_LAST_PTR DMA control registers.
  7. After the current DMA transfer finishes, repeat the procedure to implement the next DMA transfer.
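
The procedure above can be sketched for the read direction as follows. This is an illustration under stated assumptions, not the MMD or driver implementation: the descriptor layout, the FakeRegisters stand-in for memory-mapped register access, and all function names are hypothetical. It also assumes that the start signal written to RD_DMA_LAST_PTR is the index of the last valid descriptor. Refer to the PCIe IP documentation for the actual descriptor format and register offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

constexpr std::size_t kMaxDescriptors = 128;  // table size stated in this guide
constexpr std::uint64_t kPageBytes = 4096;    // one 4 KB page per descriptor

// Hypothetical descriptor layout, for illustration only.
struct DmaDescriptor {
    std::uint64_t src_addr;      // physical address of a mapped host page
    std::uint64_t dst_addr;      // destination address on the device
    std::uint32_t length_bytes;  // bytes moved by this descriptor
    std::uint32_t id;            // position of the descriptor in the table
};

// Symbolic register names from this guide; no real offsets are assumed.
enum class Reg { RcReadStatusAndDescriptorBase, EpReadDescriptorFifoBase, RdDmaLastPtr };

// Stand-in for memory-mapped register access; it only records writes.
struct FakeRegisters {
    std::map<Reg, std::uint64_t> values;
    void write(Reg reg, std::uint64_t value) { values[reg] = value; }
};

// Step 1: the previous transfer must have sent every requested byte.
bool previous_transfer_done(std::uint64_t bytes_requested, std::uint64_t bytes_sent) {
    return bytes_sent == bytes_requested;
}

// Step 3: build one descriptor per mapped physical page (output of step 2).
std::vector<DmaDescriptor> build_table(const std::vector<std::uint64_t>& host_pages,
                                       std::uint64_t device_addr) {
    assert(host_pages.size() <= kMaxDescriptors);
    std::vector<DmaDescriptor> table;
    table.reserve(host_pages.size());
    for (std::size_t i = 0; i < host_pages.size(); ++i) {
        table.push_back({host_pages[i], device_addr + i * kPageBytes,
                         static_cast<std::uint32_t>(kPageBytes),
                         static_cast<std::uint32_t>(i)});
    }
    return table;
}

// Steps 4 to 6: program the table location, the FIFO address, then start.
void start_read_dma(FakeRegisters& regs, std::uint64_t table_addr,
                    std::uint64_t fifo_addr, std::size_t descriptor_count) {
    assert(descriptor_count > 0 && descriptor_count <= kMaxDescriptors);
    regs.write(Reg::RcReadStatusAndDescriptorBase, table_addr);
    regs.write(Reg::EpReadDescriptorFifoBase, fifo_addr);
    regs.write(Reg::RdDmaLastPtr, descriptor_count - 1);  // assumed start signal
}
```

In this sketch, a transfer that spans more than 128 pages repeats the sequence once per table, as described in step 7, with previous_transfer_done checked before each repetition.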