L- and H-Tile Avalon® Streaming and Single Root I/O Virtualization (SR-IOV) Intel® FPGA IP for PCI Express* User Guide
ID
683111
Date
9/12/2024
Public
1. Introduction
2. Quick Start Guide
3. Interface Overview
4. Parameters
5. Designing with the IP Core
6. Block Descriptions
7. Interrupts
8. Registers
9. Testbench and Design Example
10. Document Revision History
A. PCI Express Core Architecture
B. TX Credit Adjustment Sample Code
C. Root Port Enumeration
D. Troubleshooting and Observing the Link Status
1.1. Avalon-ST Interface with Optional SR-IOV for PCIe Introduction
1.2. Features
1.3. Release Information
1.4. Device Family Support
1.5. Recommended Fabric Speed Grades
1.6. Performance and Resource Utilization
1.7. Transceiver Tiles
1.8. PCI Express IP Core Package Layout
1.9. Channel Availability
2.1. Design Components
2.2. Hardware and Software Requirements
2.3. Directory Structure
2.4. Generating the Design Example
2.5. Simulating the Design Example
2.6. Compiling the Design Example and Programming the Device
2.7. Installing the Linux Kernel Driver
2.8. Running the Design Example Application
3.1. Avalon-ST RX Interface
3.2. Avalon-ST TX Interface
3.3. TX Credit Interface
3.4. TX and RX Serial Data
3.5. Clocks
3.6. Function-Level Reset (FLR) Interface
3.7. Control Shadow Interface for SR-IOV
3.8. Configuration Extension Bus Interface
3.9. Hard IP Reconfiguration Interface
3.10. Interrupt Interfaces
3.11. Power Management Interface
3.12. Reset
3.13. Transaction Layer Configuration Interface
3.14. PLL Reconfiguration Interface
3.15. PIPE Interface (Simulation Only)
4.1. Stratix 10 Avalon-ST Settings
4.2. Multifunction and SR-IOV System Settings
4.3. Base Address Registers
4.4. Device Identification Registers
4.5. TPH/ATS Capabilities
4.6. PCI Express and PCI Capabilities Parameters
4.7. Configuration, Debug and Extension Options
4.8. PHY Characteristics
4.9. Example Designs
6.1.1. TLP Header and Data Alignment for the Avalon-ST RX and TX Interfaces
6.1.2. Avalon-ST 256-Bit RX Interface
6.1.3. Avalon-ST 512-Bit RX Interface
6.1.4. Avalon-ST 256-Bit TX Interface
6.1.5. Avalon-ST 512-Bit TX Interface
6.1.6. TX Credit Interface
6.1.7. Interpreting the TX Credit Interface
6.1.8. Clocks
6.1.9. Update Flow Control Timer and Credit Release
6.1.10. Function-Level Reset (FLR) Interface
6.1.11. Resets
6.1.12. Interrupts
6.1.13. Control Shadow Interface for SR-IOV
6.1.14. Transaction Layer Configuration Space Interface
6.1.15. Configuration Extension Bus Interface
6.1.16. Hard IP Status Interface
6.1.17. Hard IP Reconfiguration
6.1.18. Power Management Interface
6.1.19. Serial Data Interface
6.1.20. PIPE Interface
6.1.21. Test Interface
6.1.22. PLL IP Reconfiguration
6.1.23. Message Handling
8.1.1. Register Access Definitions
8.1.2. PCI Configuration Header Registers
8.1.3. PCI Express Capability Structures
8.1.4. Intel Defined VSEC Capability Header
8.1.5. General Purpose Control and Status Register
8.1.6. Uncorrectable Internal Error Status Register
8.1.7. Uncorrectable Internal Error Mask Register
8.1.8. Correctable Internal Error Status Register
8.1.9. Correctable Internal Error Mask Register
8.1.10. SR-IOV Virtualization Extended Capabilities Registers Address Map
8.1.10.1. ARI Enhanced Capability Header
8.1.10.2. SR-IOV Enhanced Capability Registers
8.1.10.3. Initial VFs and Total VFs Registers
8.1.10.4. VF Device ID Register
8.1.10.5. Page Size Registers
8.1.10.6. VF Base Address Registers (BARs) 0-5
8.1.10.7. Secondary PCI Express Extended Capability Header
8.1.10.8. Lane Status Registers
8.1.10.9. Transaction Processing Hints (TPH) Requester Enhanced Capability Header
8.1.10.10. TPH Requester Capability Register
8.1.10.11. TPH Requester Control Register
8.1.10.12. Address Translation Services ATS Enhanced Capability Header
8.1.10.13. ATS Capability Register and ATS Control Register
9.4.1. ebfm_barwr Procedure
9.4.2. ebfm_barwr_imm Procedure
9.4.3. ebfm_barrd_wait Procedure
9.4.4. ebfm_barrd_nowt Procedure
9.4.5. ebfm_cfgwr_imm_wait Procedure
9.4.6. ebfm_cfgwr_imm_nowt Procedure
9.4.7. ebfm_cfgrd_wait Procedure
9.4.8. ebfm_cfgrd_nowt Procedure
9.4.9. BFM Configuration Procedures
9.4.10. BFM Shared Memory Access Procedures
9.4.11. BFM Log and Message Procedures
9.4.12. Verilog HDL Formatting Functions
A.3. Physical Layer
The Physical Layer is the lowest level of the PCI Express protocol stack. It is the layer closest to the serial link. It encodes and transmits packets across a link and accepts and decodes received packets. The Physical Layer connects to the link through a high‑speed SERDES interface running at 2.5 Gbps for Gen1 implementations, at 2.5 or 5.0 Gbps for Gen2 implementations, and at 2.5, 5.0 or 8.0 Gbps for Gen3 implementations.
The Physical Layer is responsible for the following actions:
- Training the link
- Scrambling/descrambling and 8B/10B encoding/decoding for 2.5 Gbps (Gen1), 5.0 Gbps (Gen2), or 128b/130b encoding/decoding of 8.0 Gbps (Gen3) per lane
- Serializing and deserializing data
- Equalization (Gen3)
- Operating the PIPE 3.0 Interface
- Implementing auto speed negotiation (Gen2 and Gen3)
- Transmitting and decoding the training sequence
- Providing hardware autonomous speed control
- Implementing auto lane reversal
The Physical Layer is subdivided by the PIPE Interface Specification into two layers (bracketed horizontally in above figure):
- PHYMAC—The MAC layer includes the LTSSM and the scrambling/descrambling. byte reordering, and multilane deskew functions.
- PHY Layer—The PHY layer includes the 8B/10B encode and decode functions for Gen1 and Gen2. It includes 128b/130b encode and decode functions for Gen3. The PHY also includes elastic buffering and serialization/deserialization functions.
The Physical Layer integrates both digital and analog elements. Intel designed the PIPE interface to separate the PHYMAC from the PHY. The Intel Hard IP for PCI Express complies with the PIPE interface specification.
Note: The internal PIPE interface is visible for simulation. It is not available for debugging in hardware using a logic analyzer such as Signal Tap. If you try to connect Signal Tap to this interface the design fails compilation.
Figure 77. Physical Layer Architecture
The PHYMAC block comprises four main sub-blocks:
- MAC Lane—Both the RX and the TX path use this block.
- On the RX side, the block decodes the Physical Layer packet and reports to the LTSSM the type and number of TS1/TS2 ordered sets received.
- On the TX side, the block multiplexes data from the DLL and the Ordered Set and SKP sub-block (LTSTX). It also adds lane specific information, including the lane number and the force PAD value when the LTSSM disables the lane during initialization.
- LTSSM—This block implements the LTSSM and logic that tracks TX and RX training sequences on each lane.
- For transmission, it interacts with each MAC lane sub-block and with the LTSTX sub-block by asserting both global and per-lane control bits to generate specific Physical Layer packets.
- On the receive path, it receives the Physical Layer packets reported by each MAC lane sub-block. It also enables the multilane deskew block. This block reports the Physical Layer status to higher layers.
- LTSTX (Ordered Set and SKP Generation)—This sub-block generates the Physical Layer packet. It receives control signals from the LTSSM block and generates Physical Layer packet for each lane. It generates the same Physical Layer Packet for all lanes and PAD symbols for the link or lane number in the corresponding TS1/TS2 fields. The block also handles the receiver detection operation to the PCS sub-layer by asserting predefined PIPE signals and waiting for the result. It also generates a SKP Ordered Set at every predefined timeslot and interacts with the TX alignment block to prevent the insertion of a SKP Ordered Set in the middle of packet.
- Deskew—This sub-block performs the multilane deskew function and the RX alignment between the initialized lanes and the datapath. The multilane deskew implements an eight-word FIFO buffer for each lane to store symbols. Each symbol includes eight data bits, one disparity bit, and one control bit. The FIFO discards the FTS, COM, and SKP symbols and replaces PAD and IDL with D0.0 data. When all eight FIFOs contain data, a read can occur. When the multilane lane deskew block is first enabled, each FIFO begins writing after the first COM is detected. If all lanes have not detected a COM symbol after seven clock cycles, they are reset and the resynchronization process restarts, or else the RX alignment function recreates a 64-bit data word which is sent to the DLL.