1.7.2. Throughput for Reads

AN 690: PCI Express* Avalon® -MM DMA Reference Design

ID 683824

Date 5/08/2017

Public

PCI Express uses a split transaction model for reads. The read transaction includes the following steps:

The requester sends a Memory Read Request.
The completer sends an ACK DLLP to acknowledge the Memory Read Request.
The completer returns a Completion with Data. The completer can split the Completion into multiple completion packets.

Read throughput is typically lower than write throughput because a read requires two transactions, the request and the completion, where a write of the same amount of data requires only one. Read throughput also depends on the round-trip delay between the time the Application Layer issues a Memory Read Request and the time the requested data returns. To maximize throughput, the application must keep enough read requests outstanding to cover this delay.
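As a rough illustration of covering the round-trip delay, the sketch below estimates how many read requests must be in flight so that returning data keeps the link busy. The function name, the bandwidth figure, and the delay figure are assumptions chosen for illustration; they are not values from this reference design.

```python
import math

def outstanding_reads_needed(link_bytes_per_s, round_trip_s, request_bytes):
    """Smallest number of in-flight read requests whose data spans one round trip.

    Hypothetical helper: it treats the link as a pipe whose capacity is
    bandwidth multiplied by delay, and ignores packet overhead.
    """
    bytes_in_flight = link_bytes_per_s * round_trip_s
    return max(1, math.ceil(bytes_in_flight / request_bytes))

# Assumed numbers: 4 GB/s usable bandwidth, 1 us round trip, 512-byte requests.
needed = outstanding_reads_needed(4e9, 1e-6, 512)
print(needed)  # 8
```

With fewer requests in flight than this estimate, the link sits idle for part of every round trip.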

Figure 6. Read Request Timing

Figure 6 contrasts two timing cases for Memory Read Requests (MRd) and Completions with Data (CplD). In the first case, the requester waits for each completion before issuing the next request, which lowers throughput. In the second case, the requester keeps multiple read requests outstanding, which eliminates the idle time after the first data returns and raises throughput.
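The two timing cases can be compared with a simple model. This is a minimal sketch, assuming a fixed round-trip delay per request and a fixed time to transfer one completion's data; both function names and all numbers are hypothetical.

```python
def serial_read_time(num_requests, round_trip_ns, data_time_ns):
    """Requester waits for each completion, so every request pays the full round trip."""
    return num_requests * (round_trip_ns + data_time_ns)

def pipelined_read_time(num_requests, round_trip_ns, data_time_ns):
    """Requests are issued back to back, so the round trip is paid once
    and the completions then stream without gaps (assumes enough tags
    and RX buffer space to keep every request outstanding)."""
    return round_trip_ns + num_requests * data_time_ns

# Assumed numbers: 8 requests, 1000 ns round trip, 128 ns of data per completion.
print(serial_read_time(8, 1000, 128))     # 9024
print(pipelined_read_time(8, 1000, 128))  # 2024
```

In this model the pipelined case approaches the raw data transfer time as the number of requests grows, while the serial case stays dominated by the round-trip delay.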

To maintain maximum throughput for the completion data packets, the requester must optimize the following settings:

The number of completions in the RX buffer
The rate at which the Application Layer issues read requests and processes the completion data

Read Request Size

Another factor that affects throughput is the read request size. If a requester needs 4 KB of data, it can issue four 1 KB read requests or a single 4 KB read request. The single 4 KB request results in higher throughput than the four 1 KB reads. The Maximum Read Request Size field in the Device Control register, bits [14:12], sets the upper limit on the read request size.
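The field in bits [14:12] uses the encoding defined by the PCI Express Base Specification, where 000b means 128 bytes and each increment doubles the size up to 4096 bytes at 101b. The sketch below decodes a Device Control register value and counts the requests needed for a transfer; the function names are hypothetical, and reading the register from hardware is outside its scope.

```python
def max_read_request_size(device_control):
    """Decode Max_Read_Request_Size (bits [14:12]) into a byte count."""
    field = (device_control >> 12) & 0x7
    if field > 0b101:
        # 110b and 111b are reserved encodings.
        raise ValueError(f"reserved Max_Read_Request_Size encoding: {field:03b}")
    return 128 << field

def read_requests_needed(total_bytes, request_bytes):
    """Number of read requests needed to fetch total_bytes (ceiling division)."""
    return -(-total_bytes // request_bytes)

# Example register values with only bits [14:12] set.
print(max_read_request_size(0x3000))  # 1024
print(max_read_request_size(0x5000))  # 4096
print(read_requests_needed(4096, 1024), read_requests_needed(4096, 4096))  # 4 1
```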

Outstanding Read Requests

A final factor that can affect the throughput is the number of outstanding read requests. If the requester sends multiple read requests to improve throughput, the number of available header tags limits the number of outstanding read requests. To achieve higher performance, Intel® Arria® 10 and Intel® Cyclone® 10 GX read DMA can use up to 16 header tags. The Intel® Stratix® 10 read DMA can use up to 32 header tags.
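To see how the tag count caps throughput, the sketch below assumes each outstanding read request holds one header tag for a full round trip. The function name, request size, round-trip delay, and link bandwidth are illustrative assumptions, not measurements from these devices.

```python
def tag_limited_read_throughput(num_tags, request_bytes, round_trip_s, link_bytes_per_s):
    """Upper bound on read throughput in bytes per second.

    At most num_tags requests can be in flight, so at most
    num_tags * request_bytes can return per round trip. The link
    bandwidth caps the result.
    """
    in_flight_limit = num_tags * request_bytes / round_trip_s
    return min(in_flight_limit, link_bytes_per_s)

# Assumed numbers: 512-byte requests, 2 us round trip, 7 GB/s usable link bandwidth.
for tags in (16, 32):
    bound = tag_limited_read_throughput(tags, 512, 2e-6, 7e9)
    print(tags, f"{bound / 1e9:.3f} GB/s")
```

Under these assumed numbers, 16 tags leave the link partly idle, while 32 tags are enough for the link bandwidth to become the limit.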
