4.3.1.4. Avalon® Streaming TX Interface
The Application Layer transfers data to the Transaction Layer of the R-tile PCI Express IP core over the Avalon®-ST TX interface. The R-tile PCI Express IP core must assert pX_tx_st_ready_o before transmission begins.
If the R-tile PCI Express IP core is configured in Configuration Mode 0 (1x16) with a double-width configuration, there are four segments with a 256-bit data width that allows multiple TLPs per cycle. This means there are four pX_tx_stN_sop_i signals and four pX_tx_stN_eop_i signals for Configuration Mode 0 (1x16).
Unlike the fixed readyLatency defined by the Avalon Interface Specifications, this interface does not follow a fixed latency between the pX_tx_st_ready_o and pX_tx_stN_dvalid_i signals.
In Configuration Mode 0 (1x16) with a double-width configuration, the R-tile PCI Express core provides four segments, each with 256 bits of data (pX_tx_stN_data_i[255:0]), 128 bits of header (pX_tx_stN_hdr_i[127:0]), and 32 bits of TLP prefix (pX_tx_stN_prefix_i[31:0]). All four segments are used in this mode, so the data bus is effectively 1024 bits wide, consisting of pX_tx_st0_data_i[255:0], pX_tx_st1_data_i[255:0], pX_tx_st2_data_i[255:0], and pX_tx_st3_data_i[255:0].
Parity generation is done via a 32:1 XOR (i.e. there is one parity bit for every 32 data, header or prefix bits).
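The 32:1 XOR grouping can be illustrated with a short Python sketch. The function name `parity_bits` is hypothetical, and the sketch assumes a plain XOR reduction (even parity) with parity bit [0] covering bits [31:0], bit [1] covering bits [63:32], and so on, as described for the data parity signal; confirm the polarity against your IP configuration before relying on it.

```python
def parity_bits(value: int, width: int) -> int:
    """Return one parity bit per 32-bit group of `value`.

    Bit k of the result is the XOR of value[32*k+31 : 32*k].
    Assumption: plain XOR (even parity); the polarity is not stated
    in the text above.
    """
    if width % 32 != 0:
        raise ValueError("width must be a multiple of 32")
    if value < 0 or value >> width:
        raise ValueError("value does not fit in width bits")
    result = 0
    for k in range(width // 32):
        group = (value >> (32 * k)) & 0xFFFF_FFFF
        # XOR of all 32 bits equals the low bit of the population count.
        result |= (bin(group).count("1") & 1) << k
    return result


# A 256-bit data segment yields 8 parity bits, a 128-bit header 4,
# and a 32-bit prefix 1.
data_par = parity_bits(0x1, 256)   # only bit 0 set -> parity bit 0 set
hdr_par = parity_bits(1 << 127, 128)  # top header bit set -> parity bit 3 set
```

With these widths, the header result fits the 4-bit pX_tx_stN_hdr_par_i[3:0] port listed in the signal table.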
- Transmission of a TLP must be uninterrupted when pX_tx_st_ready_o is asserted. The application must not deassert pX_tx_stN_valid_i between pX_tx_stN_sop_i and pX_tx_stN_eop_i on a ready cycle unless there is backpressure from the R-tile PCIe IP core indicated by the deassertion of pX_tx_st_ready_o.
Note: Failing to meet this guideline may cause the transmission of a TLP with an invalid LCRC.
- For Configuration Mode 0 (1x16) in double-width mode, a TLP (pX_tx_stN_sop_i) can start only in segment 0 (st0) or segment 2 (st2); a TLP cannot start in segment 1 or segment 3.
- For Configuration Mode 0 (1x16) in double-width mode, the segment 2 header (st2_hdr) can be used only if segment 0 and segment 1 are also used (i.e. st0_hdr, st1_hdr and st0_data, st1_data are also used).
- For a single TLP spanning across multiple segments, the application logic needs to send the TLP in the order of the segment index (segment st0 → st1 → st2 → st3 → st0).
- If the length of the TLP being transmitted is greater than the segment size, the segment in which pX_tx_stN_eop_i is asserted is dictated by the TLP length.
- If the length of the TLP being transmitted is less than the segment size (256 bits), the corresponding pX_tx_stN_eop_i signal must be asserted in the same segment where pX_tx_stN_sop_i is asserted.
- The maximum latency between the deassertion of pX_tx_st_ready_o and the deassertion of pX_tx_stN_valid_i is 16 coreclkout_hip cycles.
- For Configuration Mode 0 (1x16) in single-width mode, only one segment can be used per clock cycle (i.e. st0_hdr/st0_data or st1_hdr/st1_data). In addition, if segment 1 is used, st0_data must be used by the previous TLP.
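The segment-ordering rules above can be sketched as a small Python helper that predicts where pX_tx_stN_eop_i lands for a given start segment and payload size. This is only an illustration: the name `eop_position` is hypothetical, and the sketch assumes the payload is packed contiguously, 256 bits per segment, starting in the data bus of the sop segment and wrapping from st3 to st0 of the next ready cycle.

```python
SEGMENT_BITS = 256
NUM_SEGMENTS = 4  # Configuration Mode 0 (1x16), double-width


def eop_position(sop_segment: int, payload_bits: int) -> tuple[int, int]:
    """Return (cycle_offset, segment) at which eop is asserted.

    cycle_offset counts ready cycles from the cycle that carries sop.
    Assumption: data fills consecutive segments in index order.
    """
    if sop_segment not in (0, 2):
        raise ValueError("a TLP may only start on segment 0 or 2")
    if payload_bits < 0:
        raise ValueError("payload_bits must be non-negative")
    # A header-only TLP still occupies the segment that carries its header.
    segments_needed = max(1, -(-payload_bits // SEGMENT_BITS))  # ceil division
    last = sop_segment + segments_needed - 1
    return last // NUM_SEGMENTS, last % NUM_SEGMENTS


# A TLP that fits in one segment ends where it starts.
short_tlp = eop_position(0, 128)    # (0, 0)
# 768 bits starting on st2 use st2, st3, then st0 of the next cycle.
wrapped_tlp = eop_position(2, 768)  # (1, 0)
```

A check like this can be useful in a testbench scoreboard to confirm that the application logic asserts eop in the segment implied by the TLP length.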
Signal Name | Direction | Description | EP/RP/BP | Clock Domain |
---|---|---|---|---|
pX_tx_stN_data_i[255:0] where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Application Layer data for transmission. The data bus is organized in multiple 256-bit segments. In x16 mode, all four segments are used to effectively form a 1024-bit data bus. In x8 mode, two segments are used to form a 512-bit data bus. In x4 mode, each 256-bit segment is an independent data bus. The Application Layer must provide a properly formatted TLP on the TX interface. The data is valid when the corresponding tx_stN_valid_i signal is asserted. The mapping of message TLPs is the same as the mapping of Transaction Layer TLPs with 4-dword headers. The number of data cycles must be correct for the length and address fields in the header. Issuing a packet with an incorrect number of data cycles results in the TX interface hanging and becoming unable to accept further requests. Note: There must be no Idle cycle between the tx_stN_sop_i and tx_stN_eop_i cycles unless there is backpressure with the deassertion of tx_st_ready_o. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_hdr_i[127:0] where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | This is the header to be transmitted, which follows the TLP header format of the PCIe specifications. Consider the following guidelines: These signals are valid when the corresponding tx_stN_sop_i signal is asserted. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_prefix_i[31:0] where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | This is the TLP prefix to be transmitted, which follows the TLP prefix format of the PCIe specifications. PASID is supported. These signals are valid when the corresponding tx_stN_sop_i signal is asserted. The TLP prefix uses a Big Endian implementation (i.e. the Fmt field is in bits [31:29] and the Type field is in bits [28:24]). If no prefix is present for a given TLP, that dword, including the Fmt field, is all zeros. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_sop_i where X = 0,1,2,3 (IP core number) and N = 0,2 (segment number) | Input | Indicates the first cycle of a TLP when asserted in conjunction with the corresponding bit of tx_stN_valid_i. For the x16 configuration: These signals are asserted for one clock cycle per TLP. They also qualify the corresponding tx_stN_hdr_i and tx_stN_prefix_i signals. Note: pX_tx_stN_sop_i pulses can only be sent on segments 0 or 2 (st0 or st2). | EP/RP/BP | coreclkout_hip |
pX_tx_stN_eop_i where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Indicates the last cycle of a TLP when asserted in conjunction with the corresponding bit of tx_stN_valid_i. For the x16 configuration: These signals are asserted for one clock cycle per TLP. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_dvalid_i where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Qualifies the data of the corresponding segment of tx_stN_data_i into the IP core on ready cycles. To facilitate timing closure, Intel recommends that you register both the tx_st_ready_o and tx_stN_dvalid_i signals. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_hvalid_i where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Qualifies the header of the corresponding segment (tx_stN_hdr_i) into the IP core on ready cycles. To facilitate timing closure, Intel recommends that you register both the tx_st_ready_o and tx_stN_hvalid_i signals. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_pvalid_i where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Qualifies the prefix of the corresponding segment (tx_stN_prefix_i) into the IP core on ready cycles. To facilitate timing closure, Intel recommends that you register both the tx_st_ready_o and tx_stN_pvalid_i signals. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_data_par_i[Z:0] where X = 0,1,2,3 (IP core number), N = 0,1,2,3 (segment number), and Z varies based on the core | Input | Parity for tx_stN_data_i. Bit [0] corresponds to tx_stN_data_i[31:0], bit [1] corresponds to tx_stN_data_i[63:32], and so on. By default, the PCIe Hard IP generates the parity for the TX data. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_hdr_par_i[3:0] where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Parity for tx_stN_hdr_i. By default, the PCIe Hard IP generates the parity for the TX header. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_prefix_par_i where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Parity for tx_stN_prefix_i. By default, the PCIe Hard IP generates the parity for the TX TLP prefix. | EP/RP/BP | coreclkout_hip |
pX_tx_st_ready_o where X = 0,1,2,3 (IP core number) | Output | Indicates that the PCIe Hard IP is ready to accept data. The readyLatency maximum is 16 cycles. If tx_st_ready_o is asserted by the Transaction Layer in the PCIe Hard IP on cycle <n>, then <n> + readyLatency is a ready cycle, during which the Application may assert tx_stN_valid_i and transfer data. If tx_st_ready_o is deasserted by the Transaction Layer on cycle <n>, then the Application must deassert tx_stN_valid_i within the readyLatency number of cycles after cycle <n>. tx_st_ready_o can be deasserted in the following conditions: | EP/RP/BP | coreclkout_hip |
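The backpressure rule in the tx_st_ready_o description can be checked on a recorded trace with a sketch like the one below. The function name `late_valid_cycles` and the trace representation (one boolean per coreclkout_hip cycle) are hypothetical. The sketch also assumes one particular reading of "within the readyLatency number of cycles": if ready falls on cycle n, valid must be low from cycle n + 16 onward for as long as ready stays low. It checks only this upper bound, since the interface does not define a fixed latency.

```python
READY_LATENCY_MAX = 16  # coreclkout_hip cycles, per the ready signal description


def late_valid_cycles(ready: list[bool], valid: list[bool]) -> list[int]:
    """Return cycles where valid is still high too long after ready fell.

    ready[t] samples tx_st_ready_o; valid[t] is the OR of the application's
    valid strobes on cycle t. A trace that starts with ready low is treated
    as if ready fell on cycle 0 (an assumption of this sketch).
    """
    if len(ready) != len(valid):
        raise ValueError("traces must have the same length")
    violations = []
    fell_at = None  # cycle on which the current low period of ready began
    for t, (r, v) in enumerate(zip(ready, valid)):
        if r:
            fell_at = None
        elif fell_at is None:
            fell_at = t
        if fell_at is not None and v and t - fell_at >= READY_LATENCY_MAX:
            violations.append(t)
    return violations


# ready falls on cycle 4 and stays low; valid never drops.
ready_trace = [True] * 4 + [False] * 20
bad = late_valid_cycles(ready_trace, [True] * 24)                # [20, 21, 22, 23]
ok = late_valid_cycles(ready_trace, [True] * 20 + [False] * 4)   # []
```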
As an example, the figure Avalon® Streaming TX Interface Timings below shows the behavior of the Avalon Streaming TX interface for back-to-back TLPs whose data spans multiple segments. The following text describes the waveforms clock cycle by clock cycle:
- Clock cycle 1: The R-tile Intel FPGA IP for PCI Express asserts the p0_tx_st_ready_o signal, indicating that the Hard IP is ready to accept TLPs from the Application logic.
- Clock cycle 2:
  - The start of the first TLP (T0) is in segment 0, indicated by the assertion of p0_tx_st0_sop_i.
  - The signal p0_tx_st0_hvalid_i is asserted to validate the header of this first TLP (T0H0) in the p0_tx_st0_hdr_i bus.
  - The signal p0_tx_st0_dvalid_i is asserted to validate the data of this first TLP (T0D0) in the p0_tx_st0_data_i bus.
  - The signal p0_tx_st1_dvalid_i is asserted to validate the next portion of the data of this first TLP (T0D1) in the p0_tx_st1_data_i bus.
  - The signal p0_tx_st2_dvalid_i is asserted to validate the next portion of the data of this first TLP (T0D2) in the p0_tx_st2_data_i bus.
  - The signal p0_tx_st3_dvalid_i is asserted to validate the final portion of the data of this first TLP (T0D3) in the p0_tx_st3_data_i bus.
  - The end of this first TLP (T0) is in segment 3, denoted by the assertion of p0_tx_st3_eop_i.
- Clock cycle 3:
  - The next TLP (T1) arrives in segment 0, as denoted by p0_tx_st0_sop_i staying high.
  - The signal p0_tx_st0_hvalid_i is asserted to validate the header of this TLP (T1H0) in the p0_tx_st0_hdr_i bus.
  - The signal p0_tx_st0_dvalid_i is asserted to validate the data of this TLP (T1D0) in the p0_tx_st0_data_i bus.
  - The signal p0_tx_st1_dvalid_i is asserted to validate the next portion of the data of this TLP (T1D1) in the p0_tx_st1_data_i bus.
  - The signal p0_tx_st2_dvalid_i is asserted to validate the next portion of the data of this TLP (T1D2) in the p0_tx_st2_data_i bus.
  - The signal p0_tx_st3_dvalid_i is asserted to validate the final portion of the data of this TLP (T1D3) in the p0_tx_st3_data_i bus.
  - The end of this TLP (T1) is in segment 3, denoted by p0_tx_st3_eop_i staying high.
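The per-cycle description above can also be written down as data, which is convenient when building expected values for a testbench. The following Python sketch is illustrative only: the function name `back_to_back_beats` is hypothetical, the labels (T0H0, T0D0, and so on) simply mirror the ones used in the walkthrough, and each TLP is assumed to fill all four segments starting in segment 0.

```python
SEGMENTS = 4  # st0..st3 in Configuration Mode 0 (1x16), double-width


def back_to_back_beats(num_tlps: int) -> list[dict[str, object]]:
    """Build one beat (ready cycle) per TLP, each TLP filling st0..st3."""
    beats = []
    for tlp in range(num_tlps):
        beat: dict[str, object] = {
            "p0_tx_st0_sop_i": 1,                 # TLP starts in segment 0
            "p0_tx_st0_hvalid_i": 1,              # qualifies the header
            "p0_tx_st0_hdr_i": f"T{tlp}H0",
            f"p0_tx_st{SEGMENTS - 1}_eop_i": 1,   # TLP ends in segment 3
        }
        for seg in range(SEGMENTS):
            beat[f"p0_tx_st{seg}_dvalid_i"] = 1   # qualifies this data segment
            beat[f"p0_tx_st{seg}_data_i"] = f"T{tlp}D{seg}"
        beats.append(beat)
    return beats


# Clock cycle 1 only carries p0_tx_st_ready_o, so the TLP beats start at cycle 2.
for cycle, beat in enumerate(back_to_back_beats(2), start=2):
    print(f"cycle {cycle}: " + ", ".join(f"{k}={v}" for k, v in sorted(beat.items())))
```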