MACsec Intel® FPGA System Design User Guide


2.3.2. MCDMA

The Multi Channel DMA for PCI Express enables efficient data transfer between the host and device. It supports multiple DMA channels over the underlying PCIe link; a DMA channel consists of an H2D (host-to-device) and a D2H (device-to-host) queue pair.

As shown in the figure below, the Multi Channel DMA can be used in a server’s hardware infrastructure to allow communication between various VM clients and their FPGA-device-based counterparts. The Multi Channel DMA operates on descriptor-based queues, set up by driver software, to transfer data between the local FPGA and the host. The Multi Channel DMA’s control logic reads the queue descriptors and executes them.

The Multi Channel DMA IP integrates the Intel® PCIe Hard IP and interfaces with the host Root Complex over the PCIe serial lanes. On the user logic side, the Avalon-MM/Avalon-ST interfaces allow the designer to easily integrate the Multi Channel DMA IP for PCI Express with other Platform Designer components.
Figure 17. MCDMA
The MCDMA engine operates on a software DMA queue to transfer data between the local FPGA and the host. The elements of each queue are software descriptors written by the driver/software; hardware reads the queue descriptors and executes them. Data mover blocks for both directions (H2D, D2H) fetch the descriptors and transfer data between the system memory locations specified in the descriptors and the user application in the hardware.
Figure 18. Descriptor Linked-List

A DMA channel consists of a pair of descriptor queues: one H2D descriptor queue and one D2H descriptor queue. As shown in the figure above, the descriptors are arranged contiguously within a 4 KB page. Each descriptor is 32 bytes, so each 4 KB page holds up to 128 descriptors. The descriptors are kept in host memory as a linked list of 4 KB pages. The last descriptor in a 4 KB page must be a “link descriptor”: a descriptor containing a link to the next 4 KB page, with the link bit set to 1. The last entry in the linked list must be a link pointing back to the base address programmed in the QCSR, so that the linked list of 4 KB pages forms a circular buffer.
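To make the layout concrete, the C sketch below shows a hypothetical 32-byte descriptor and how 4 KB pages could be chained into the circular linked list described above. The field names, the link-bit position, and the use of the first field to carry the next page’s address are illustrative assumptions, not the actual MCDMA descriptor format.

```c
#include <stdint.h>

/* Hypothetical 32-byte descriptor layout; field names and the link-bit
 * position are illustrative, not the actual MCDMA format. */
struct mcdma_desc {
    uint64_t src_addr;  /* source address; assumed to hold the next-page
                           address when used as a link descriptor */
    uint64_t dst_addr;  /* destination address */
    uint32_t len;       /* transfer length in bytes */
    uint32_t control;   /* control flags; link bit assumed to be bit 31 */
    uint64_t reserved;  /* pad to 32 bytes */
};

#define DESCS_PER_PAGE (4096 / sizeof(struct mcdma_desc)) /* = 128 */
#define DESC_LINK_BIT  (1u << 31)                         /* assumed position */

/* Chain npages 4 KB descriptor pages into a circular linked list: the last
 * descriptor of each page is a link descriptor pointing at the next page,
 * and the last page links back to the first page (the base address
 * programmed in the QCSR). */
static void mcdma_link_pages(struct mcdma_desc *pages[],
                             const uint64_t page_bus_addr[], unsigned npages)
{
    for (unsigned i = 0; i < npages; i++) {
        struct mcdma_desc *link = &pages[i][DESCS_PER_PAGE - 1];
        link->src_addr = page_bus_addr[(i + 1) % npages]; /* wraps to page 0 */
        link->control  = DESC_LINK_BIT;
    }
}
```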

Software and hardware communicate and manage the descriptors through the tail index pointer (Q_TAIL_POINTER) and head index pointer (Q_HEAD_POINTER) QCSR registers, as shown in the figure below. The DMA starts when software writes the index of the last valid descriptor to the Q_TAIL_POINTER register.
Figure 19. Buffer Descriptor (BD) Ring

Descriptors between Q_HEAD_POINTER and Q_TAIL_POINTER are available for hardware use, and descriptors between Q_TAIL_POINTER and Q_HEAD_POINTER are under software control. DMA operations for a particular queue pause when the two pointers are equal.
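As a minimal sketch of this producer/consumer protocol, the C fragment below shows how a driver might compute the free slots it owns and publish new descriptors by advancing the tail index. The QCSR accessor functions and register offsets are hypothetical stand-ins for the real register interface, and the ring size of 1024 is an example value.

```c
#include <stdint.h>

/* Hypothetical QCSR accessors and offsets -- stand-ins for the real
 * register interface provided by the driver framework. */
extern uint32_t qcsr_read(unsigned queue, unsigned reg);
extern void     qcsr_write(unsigned queue, unsigned reg, uint32_t val);
#define Q_HEAD_POINTER 0x10 /* assumed offset */
#define Q_TAIL_POINTER 0x14 /* assumed offset */

#define RING_SIZE 1024u /* total descriptors in the ring (example value) */

/* Slots software may fill: the descriptors between Q_TAIL_POINTER and
 * Q_HEAD_POINTER are under SW control. One slot is kept empty so a full
 * ring is distinguishable from an idle one (head == tail => DMA pauses). */
static unsigned ring_free_slots(unsigned queue)
{
    uint32_t head = qcsr_read(queue, Q_HEAD_POINTER);
    uint32_t tail = qcsr_read(queue, Q_TAIL_POINTER);
    return (head - tail - 1u) % RING_SIZE;
}

/* After writing n descriptors into the ring starting at the cached tail
 * index, publish them to hardware by advancing the tail; this QCSR write
 * is what starts the DMA. */
static void ring_submit(unsigned queue, uint32_t tail, unsigned n)
{
    qcsr_write(queue, Q_TAIL_POINTER, (tail + n) % RING_SIZE);
}
```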

The Multi Channel DMA IP also offers a DMA-bypass capability that lets the host perform PIO reads/writes to device memory. This interface is used for downstream-logic CSR access. PCIe BAR2 is mapped to the Avalon-MM PIO Master, so any TLP targeting BAR2 is forwarded to the user logic. TLP addresses targeting the PIO interface must be 8-byte aligned. The PIO interface supports non-bursting 64-bit write and read transfers.
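Assuming BAR2 has already been mapped into the application’s address space (for example via a sysfs resource file or a vendor driver; the mapping itself is not shown), host-side PIO access reduces to aligned 64-bit loads and stores, as in this sketch:

```c
#include <stdint.h>

/* Base of the BAR2 mapping, obtained elsewhere (e.g. mmap() of the PCIe
 * resource); acquisition is outside the scope of this sketch. */
extern volatile uint8_t *bar2_base;

/* The PIO path is non-bursting and 64-bit only: offsets must be 8-byte
 * aligned and each access is exactly one 64-bit load or store. */
static uint64_t pio_read64(uint64_t offset)   /* requires offset % 8 == 0 */
{
    return *(volatile uint64_t *)(bar2_base + offset);
}

static void pio_write64(uint64_t offset, uint64_t val)
{
    *(volatile uint64_t *)(bar2_base + offset) = val;
}
```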

The PIO interface address mapping is as follows (see the sketch after this list): PIO address = {vf_active, pf, vf, csr_addr}
  1. vf_active: Indicates that SR-IOV is enabled.
  2. pf [PF_NUM-1:0]: Physical function number decoded from the PCIe header received from the HIP. PF_NUM, which is $clog2(pf_num_tcl), is an RTL design parameter you select so that the Multi Channel DMA IP allocates only the required number of bits on the Avalon-MM side, limiting the number of wires on the user interface.
  3. vf [VF_NUM-1:0]: Virtual function number decoded from the PCIe header received from the HIP. VF_NUM, which is $clog2(vf_num_tcl), is an RTL design parameter you select in the same way, so that only the required number of bits is allocated on the Avalon-MM side.
  4. csr_addr [ADDR_SIZE-1:0]: Number of bits required for the BAR2 size requested across all functions (PFs and VFs). Example: if BAR2 is 4 MB, ADDR_SIZE = 22.
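The C sketch below illustrates the {vf_active, pf, vf, csr_addr} concatenation. The field widths are design parameters of the IP, so the values used here (2 PFs, 32 VFs, a 4 MB BAR2) are example settings only:

```c
#include <stdint.h>

/* Example parameterization: PF_BITS = $clog2(pf_num_tcl),
 * VF_BITS = $clog2(vf_num_tcl), ADDR_SIZE from the BAR2 size. */
#define PF_BITS   1u  /* 2 PFs  -> $clog2(2)  = 1 */
#define VF_BITS   5u  /* 32 VFs -> $clog2(32) = 5 */
#define ADDR_SIZE 22u /* 4 MB BAR2 -> 22 address bits */

/* Concatenate the fields exactly as {vf_active, pf, vf, csr_addr}:
 * vf_active is the most significant bit, csr_addr the least significant. */
static uint64_t pio_address(unsigned vf_active, unsigned pf,
                            unsigned vf, uint64_t csr_addr)
{
    return ((uint64_t)(vf_active & 1u) << (PF_BITS + VF_BITS + ADDR_SIZE)) |
           ((uint64_t)pf              << (VF_BITS + ADDR_SIZE)) |
           ((uint64_t)vf              << ADDR_SIZE) |
           (csr_addr & ((1ull << ADDR_SIZE) - 1));
}
```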
The table below shows the DMA channel mapping for the following MCDMA IP parameter settings:
  • Number of PFs – 2
  • Number of VFs per PF – 0
  • Number of DMA channels per PF – 1
Table 2.  DMA Channel Mapping w.r.t. MCDMA IP Parameters
PF | MCDMA SW Channel | User Channel
0  | 0                | 0
1  | 0                | 1