Intel FPGA P-Tile Avalon Streaming IP for PCI Express User Guide
Version Information
Updated for: | |
---|---|
Intel® Quartus® Prime Design Suite | 20.4 |
IP Version | 4.0.0 |
1. Introduction
1.1. Overview
P-Tile is an FPGA companion tile die that supports PCI Express* Gen4 in Endpoint, Root Port and TLP Bypass modes.
It serves as a companion tile for both Intel® Stratix® 10 DX and Intel® Agilex™ devices.
P-Tile natively supports PCI Express* Gen3 and Gen4 configurations.
1.2. Features
The P-tile Avalon® streaming IP for PCI Express* supports the following features:
- Complete protocol stack including the Transaction, Data Link, and Physical Layers implemented as a Hard IP.
- Configurations supported:
Table 1. Configurations Supported by the P-Tile Avalon® streaming IP for PCI Express
Mode | Gen3/Gen4 x16 | Gen3/Gen4 x8 | Gen3/Gen4 x4 |
---|---|---|---|
Endpoint (EP) | Yes | Yes | N/A |
Root Port (RP) | Yes | N/A | Yes |
TLP Bypass | Yes | Yes | Yes |
Note: Gen1/Gen2 configurations are supported via link down-training.
- Static port bifurcation (four x4 Root Ports, two x8 Endpoints).
- Supports TLP Bypass mode (upstream and downstream).
- Supports one x16, two x8, or four x4 interfaces.
- Supports up to 512-byte maximum payload size (MPS).
- Supports up to 4096-byte (4 KB) maximum read request size (MRRS).
- Single Virtual Channel (VC).
- Page Request Services (PRS).
- Completion Timeout Ranges.
- Atomic Operations (FetchAdd/Swap/CAS).
- Extended Tag Support.
- 10-bit Tag Support (Port 0 x16 Controller only)
- Separate Refclk with Independent Spread Spectrum Clocking (SRIS).
- Separate Refclk with no Spread Spectrum Clocking (SRNS).
- Common Refclk architecture.
- PCI Express* Advanced Error Reporting (PF only). Note: Advanced Error Reporting is always enabled in the P-Tile Avalon® streaming IP for PCIe.
- ECRC generation and checking.
- Data bus parity protection.
- Supports D0 and D3 PCIe power states.
- Lane Margining at Receiver.
- Retimers presence detection.
- SR-IOV support (8 PFs, 2K VFs per Endpoint).
- Access Control Service (ACS) capability. Note: For ACS, only Ports 0 and 1 are supported.
- Alternative Routing-ID Interpretation (ARI).
- Function Level Reset (FLR).
- TLP Processing Hint (TPH). Note: TPH supports the "No Steering Tag (ST)" mode only.
- Address Translation Services (ATS). (For more information, refer to Implementation of Address Translation Services (ATS) in Endpoint Mode).
- Process Address Space ID (PASID).
- Configuration Intercept Interface (for VirtIO).
- User packet interface with separate header, data and prefix.
- User packet interface with a split-bus architecture where the header, data and prefix busses consist of two segments each (x16 mode only). This improves the bandwidth efficiency of this interface as it can handle up to 2 TLPs in any given cycle.
- Maximum numbers of outstanding Non-Posted Requests (NPRs) supported when 8-bit tags or 10-bit tags are enabled are summarized in the table below:
Table 2. Outstanding Non-Posted Requests Supported
Ports | Active Cores | 8-bit Tags | 10-bit Tags |
---|---|---|---|
0 | x16 | 256 | 512 (*) |
1 | x8 | 256 | N/A |
2 and 3 | x4 | 256 | N/A |
Note: (*): Use tags 256 to 767.
- Completion timeout interface.
- The PCIe Hard IP can optionally track outgoing non-posted packets to report completion timeout information to the application.
- You cannot change the pin allocations for the P-Tile Avalon® streaming IP for PCI Express* in the Intel® Quartus® Prime project. However, this IP does support lane reversal and polarity inversion on the PCB by default.
- Supports Autonomous Hard IP mode.
- This mode allows the PCIe Hard IP to communicate with the Host before the FPGA configuration and entry into User mode are complete. Note: Unless Readiness Notifications mechanisms are used, the Root Complex and/or system software must allow at least 1.0 s after a Conventional Reset of a device before it may determine that a device that fails to return a Successful Completion status for a valid Configuration Request is a broken device. This period is independent of how quickly Link training completes.
- FPGA core configuration via PCIe link (CvP Init and CvP Update). Note: CvP Init and CvP Update are available for Intel® Stratix® 10 DX devices. For Intel® Agilex™ devices, CvP Init is available, and CvP Update will be available in a future Intel® Quartus® Prime release. Note: For Gen3 and Gen4 x16 variants, Port 0 (corresponding to lanes 0 - 15) supports the CvP features. For Gen3 and Gen4 x8 variants, only Port 0 (corresponding to lanes 0 - 7) supports the CvP features. Port 1 (corresponding to lanes 8 - 15) does not support CvP.
- Device-dependent PLD clock (coreclkout_hip) frequency.
- 350 MHz / 400 MHz for Intel® Stratix® 10 DX devices, 350 MHz / 400 MHz / 500 MHz for Intel® Agilex™ devices.
- P-Tile Debug Toolkit including the following features:
- Protocol and link status information.
- Basic and advanced debugging capabilities including PMA register access and Eye viewing capability.
- ModelSim and VCS are the simulators supported in the 20.3 release of Intel® Quartus® Prime. Other simulators may be supported in a future release.
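As an illustration of the outstanding-tag limits in Table 2 (including the note restricting 10-bit tags to the range 256 - 767 on the x16 port), the sketch below models a simple non-posted tag allocator. The class and method names are hypothetical, not part of the IP:

```python
class NprTagAllocator:
    """Illustrative model of the non-posted request tag limits in Table 2.

    Port 0 (x16) supports 256 outstanding NPRs with 8-bit tags, or
    512 with 10-bit tags; the note in Table 2 restricts the 10-bit
    range to tags 256..767.
    """

    def __init__(self, ten_bit_tags=False):
        if ten_bit_tags:
            self.free = list(range(256, 768))   # 512 tags: 256..767
        else:
            self.free = list(range(256))        # 256 tags: 0..255

    def allocate(self):
        """Take a tag for a new non-posted request (None if exhausted)."""
        return self.free.pop(0) if self.free else None

    def release(self, tag):
        """Return a tag when the matching completion arrives."""
        self.free.append(tag)

alloc = NprTagAllocator(ten_bit_tags=True)
print(alloc.allocate())  # 256 -- first tag in the 10-bit range
```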
1.3. Release Information
Item | Description |
---|---|
IP Version | 4.0.0 |
Intel® Quartus® Prime Version | 20.4 |
Release Date | December 2020 |
Ordering Codes | No ordering code is required |
IP versions are the same as the Intel Quartus Prime Design Suite software versions up to v19.1. Starting with Intel Quartus Prime Design Suite software version 19.2, IPs have a new X.Y.Z versioning scheme:
- X indicates a major revision of the IP. If you update your Intel Quartus Prime software, you must regenerate the IP.
- Y indicates the IP includes new features. Regenerate your IP to include these new features.
- Z indicates the IP includes minor changes. Regenerate your IP to include these changes.
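The X.Y.Z rules above can be expressed as a small helper; the function names are illustrative, not part of any Intel tool:

```python
def parse_ip_version(version):
    """Split an 'X.Y.Z' IP version string into integer components."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def describe_change(old, new):
    """Classify the difference between two IP versions per the X.Y.Z scheme."""
    old_x, old_y, old_z = parse_ip_version(old)
    new_x, new_y, new_z = parse_ip_version(new)
    if new_x != old_x:
        return "major revision: regenerate the IP after updating Quartus"
    if new_y != old_y:
        return "new features: regenerate to include them"
    if new_z != old_z:
        return "minor changes: regenerate to include them"
    return "no change"

print(describe_change("4.0.0", "4.0.0"))  # no change
```

In every case where a component differs, the action is the same: regenerate the IP.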
Intel verifies that the current version of the Intel® Quartus® Prime Pro Edition software compiles the previous version of each IP core, if this IP core was included in the previous release. Intel reports any exceptions to this verification in the Intel IP Release Notes or clarifies them in the Intel® Quartus® Prime Pro Edition IP Update tool. Intel does not verify compilation with IP core versions older than the previous release.
1.4. Device Family Support
The following terms define device support levels for Intel® FPGA IP cores:
- Advance support—the IP core is available for simulation and compilation for this device family. Timing models include initial engineering estimates of delays based on early post-layout information. The timing models are subject to change as silicon testing improves the correlation between the actual silicon and the timing models. You can use this IP core for system architecture and resource utilization studies, simulation, pinout, system latency assessments, basic timing assessments (pipeline budgeting), and I/O transfer strategy (data-path width, burst depth, I/O standards tradeoffs).
- Preliminary support—the IP core is verified with preliminary timing models for this device family. The IP core meets all functional requirements, but might still be undergoing timing analysis for the device family. It can be used in production designs with caution.
- Final support—the IP core is verified with final timing models for this device family. The IP core meets all functional and timing requirements for the device family and can be used in production designs.
Device Family | Support Level |
---|---|
Intel® Stratix® 10 DX | Final support |
Intel® Agilex™ | Preliminary support |
Other device families | No support. Refer to the Intel PCI Express Solutions web page on the Intel website for support information on other device families. |
1.5. Performance and Resource Utilization
The following table shows the recommended FPGA fabric speed grades for all the configurations that the Avalon® -ST IP core supports.
Lane Rate | Link Width | Application Interface Data Width | Application Clock Frequency (MHz) | Recommended FPGA Fabric Speed Grades |
---|---|---|---|---|
Gen4 | x4 | 128-bit | 350 / 400 (Intel® Stratix® 10 DX); 350 / 400 / 500 (Intel® Agilex™) | -1, -2 |
Gen4 | x8 | 256-bit | 350 / 400 (Intel® Stratix® 10 DX); 350 / 400 / 500 (Intel® Agilex™) | -1, -2 |
Gen4 | x16 | 512-bit | 350 / 400 (Intel® Stratix® 10 DX); 350 / 400 / 500 (Intel® Agilex™) | -1, -2 |
Gen3 | x4 | 128-bit | 250 | -1, -2, -3 |
Gen3 | x8 | 256-bit | 250 | -1, -2, -3 |
Gen3 | x16 | 512-bit | 250 | -1, -2, -3 |
The following table shows the typical resource utilization information for selected configurations.
The resource usage is based on the Avalon® -ST IP core top-level entity (intel_pcie_ptile_ast) that includes IP core soft logic implemented in the FPGA fabric.
Design Example Used | Link Configuration | Device Family | ALMs | M20Ks | Logic Registers |
---|---|---|---|---|---|
Programmed I/O (PIO) | Gen4 x16, EP | Intel® Stratix® 10 DX | 3,191 | 0 | 10,255 |
Programmed I/O (PIO) | Gen4 x16, EP | Intel® Agilex™ | 3,513 | 0 | 9,896 |
For details on the application clock frequencies that the IP core can support, refer to Table 10.
1.6. IP Core and Design Example Support Levels
The following table shows the support levels of the Avalon® -ST IP core and design example in Intel® Stratix® 10 DX devices.
Configuration | PCIe IP: EP | PCIe IP: RP | PCIe IP: BP | Design Example: EP | Design Example: RP | Design Example: BP |
---|---|---|---|---|---|---|
Gen4 x16 512-bit | S C T H | S C T H | S C T H | S C T H | N/A | N/A |
Gen4 x8/x8 256-bit | S C T H | N/A | S C T H | S C T H | N/A | N/A |
Gen4 x4/x4/x4/x4 128-bit | N/A | S C T H | S C T H | N/A | N/A | N/A |
Gen3 x16 512-bit | S C T H | S C T H | S C T H | S C T H | N/A | N/A |
Gen3 x8/x8 256-bit | S C T H | N/A | S C T H | S C T H | N/A | N/A |
Gen3 x4/x4/x4/x4 128-bit | N/A | S C T H | S C T H | N/A | N/A | N/A |
The following table shows the support levels of the Avalon® -ST IP core and design example in Intel® Agilex™ devices.
Configuration | PCIe IP: EP | PCIe IP: RP | PCIe IP: BP | Design Example: EP | Design Example: RP | Design Example: BP |
---|---|---|---|---|---|---|
Gen4 x16 512-bit | S C T H | S C T H | S C T H | S C T H | N/A | N/A |
Gen4 x8/x8 256-bit | S C T H | N/A | S C T H | S C T H | N/A | N/A |
Gen4 x4/x4/x4/x4 128-bit | N/A | S C T H | S C T H | N/A | N/A | N/A |
Gen3 x16 512-bit | S C T H | S C T H | S C T H | S C T H | N/A | N/A |
Gen3 x8/x8 256-bit | S C T H | N/A | S C T H | S C T H | N/A | N/A |
Gen3 x4/x4/x4/x4 128-bit | N/A | S C T H | S C T H | N/A | N/A | N/A |
2. IP Architecture and Functional Description
2.1. Architecture
- PMA/PCS
- Four PCIe* cores (one x16 core, one x8 core and two x4 cores)
- Embedded Multi-die Interconnect Bridge (EMIB)
- Soft logic blocks in the FPGA fabric to implement functions such as VirtIO, etc.
The four cores in the PCIe Hard IP can be configured to support the following topologies:
Configuration Mode | Native IP Mode | Endpoint (EP) / Root Port (RP) / TLP Bypass (BP) | Active Cores |
---|---|---|---|
Configuration Mode 0 | Gen3x16 or Gen4x16 | EP/RP/BP | x16 |
Configuration Mode 1 | Gen3x8/Gen3x8 or Gen4x8/Gen4x8 | EP/BP | x16, x8 |
Configuration Mode 2 | Gen3x4/Gen3x4/Gen3x4/Gen3x4 or Gen4x4/Gen4x4/Gen4x4/Gen4x4 | RP/BP | x16, x8, x4_0, x4_1 |
In Configuration Mode 0, only the x16 core is active, and it operates in x16 mode (in either Gen3 or Gen4).
In Configuration Mode 2, all four cores (x16, x8, x4_0, x4_1) are active, and they operate as four Gen3 x4 cores or four Gen4 x4 cores.
Each of the cores has its own Avalon® -ST interface to the user logic. The number of IP-to-User Logic interfaces exposed to the FPGA fabric are different based on the configuration modes. For more details, refer to the Overview section of the Interfaces chapter.
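The configuration-mode table above can be captured as a small lookup; the dictionary layout and function name below are illustrative, not IP signal or parameter names:

```python
# Active cores and allowed port modes per configuration mode,
# taken from the topology table above (layout is illustrative).
CONFIG_MODES = {
    0: {"native_mode": "Gen3x16 or Gen4x16",
        "port_modes": ("EP", "RP", "BP"),
        "active_cores": ["x16"]},
    1: {"native_mode": "Gen3x8/Gen3x8 or Gen4x8/Gen4x8",
        "port_modes": ("EP", "BP"),
        "active_cores": ["x16", "x8"]},
    2: {"native_mode": "Gen3x4 or Gen4x4, four ports",
        "port_modes": ("RP", "BP"),
        "active_cores": ["x16", "x8", "x4_0", "x4_1"]},
}

def active_core_count(mode):
    """Number of active PCIe cores, and hence Avalon-ST interfaces
    exposed to the FPGA fabric, for a configuration mode."""
    return len(CONFIG_MODES[mode]["active_cores"])

print(active_core_count(2))  # 4
```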
2.1.1. Clock Domains
- PHY clock domain (i.e. core_clk domain): this clock is synchronous to the SerDes parallel clock.
- EMIB/FPGA fabric interface clock domain (i.e. pld_clk domain): this clock is derived from the same reference clock (refclk0) as the one used by the SerDes. However, this clock is generated from a stand-alone core PLL.
- Application clock domain (coreclkout_hip): this clock is an output from the P-Tile IP, and it has the same frequency as pld_clk.
The PHY clock domain (i.e. core_clk domain) is a dynamic frequency domain. The PHY clock frequency is dependent on the current link speed.
Link Speed | PHY Clock Frequency | Application Clock Frequency |
---|---|---|
Gen1 | 125 MHz | Gen1 is supported only via link down-training and not natively. Hence, the application clock frequency depends on the configuration you choose in the IP Parameter Editor. For example, if you choose a Gen3 configuration, the application clock frequency is 250 MHz. |
Gen2 | 250 MHz | Gen2 is supported only via link down-training and not natively. Hence, the application clock frequency depends on the configuration you choose in the IP Parameter Editor. For example, if you choose a Gen3 configuration, the application clock frequency is 250 MHz. |
Gen3 | 500 MHz | 250 MHz |
Gen4 | 1000 MHz | 350 MHz / 400 MHz (Intel® Stratix® 10 DX); 350 MHz / 400 MHz / 500 MHz (Intel® Agilex™) |
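The clock relationships in the table above can be summarized in a small sketch; the names below are illustrative, not IP port names:

```python
# PHY clock (core_clk) frequency per trained link speed, from the table above.
PHY_CLOCK_MHZ = {"Gen1": 125, "Gen2": 250, "Gen3": 500, "Gen4": 1000}

def application_clock_mhz(configured_gen, device):
    """Valid coreclkout_hip frequencies for a configured link speed.

    Gen1/Gen2 are reached only by down-training, so the application
    clock follows the *configured* generation, not the trained speed.
    """
    if configured_gen == "Gen3":
        return [250]
    if configured_gen == "Gen4":
        if device == "Stratix 10 DX":
            return [350, 400]
        if device == "Agilex":
            return [350, 400, 500]
    raise ValueError("unsupported configuration")

print(application_clock_mhz("Gen4", "Agilex"))  # [350, 400, 500]
```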
2.1.2. Refclk
P-Tile has two reference clock inputs at the package level, refclk0 and refclk1. You must connect a 100 MHz reference clock source to these two inputs. Depending on the port mode, you can drive the two refclk inputs using either a single clock source or two independent clock sources.
In 1x16 and 4x4 modes, drive the refclk inputs with a single clock source (through a fanout buffer) as shown in the figure below.
- If the link can handle two separate reference clocks, drive the refclk0 of P-Tile with the on-board free-running oscillator.
- If the link needs to use a common reference clock, then PERST# needs to indicate the stability of this reference clock. If this reference clock goes down, the entire P-Tile must be reset.
2.1.3. Reset
- pin_perst_n is a "power good" indicator from the power domain to which P-Tile is connected. It must also qualify that both the P-Tile refclk0 and refclk1 are stable: if one of the reference clocks becomes stable later, deassert pin_perst_n only after that reference clock is stable.
- pin_perst_n assertion is required for proper Autonomous P-Tile functionality. In Autonomous mode (enabled by default), P-Tile can successfully link up upon the release of pin_perst_n regardless of the FPGA fabric configuration and will send out CRS (Configuration Retry Status) until the FPGA fabric is configured and ready.
The following is an example where a single PERST# (pin_perst_n) is driven with independent refclk0 and refclk1. In this example, the add-in card (FPGA and SoC) is powered up first. P-Tile refclk0 is fed by the on-board free-running oscillator. P-Tile refclk1, driven by the Host, becomes stable later. Hence, PERST# is connected to the Host.
2.2. Functional Description
2.2.1. PMA/PCS
The P-Tile Avalon® -ST IP for PCI Express contains Physical Medium Attachment (PMA) and PCI Express Physical Coding Sublayer (PCIe PCS) blocks for handling the Physical layer (PHY) packets. The PMA receives and transmits high-speed serial data on the serial lanes. The PCS acts as an interface between the PMA and the PCIe controller, and performs functions like data encoding and decoding, scrambling and descrambling, block synchronization etc. The PCIe PCS in the P-Tile Avalon® -ST IP for PCI Express is based on the PHY Interface for PCI Express (PIPE) Base Specification 4.4.1.
In this IP, the PMA consists of up to four quads. Each quad contains a pair of transmit PLLs and four SerDes lanes capable of running up to 16 GT/s to perform the various TX and RX functions.
PLLA generates the required transmit clocks for Gen1/Gen2 speeds, while PLLB generates the required clocks for Gen3/Gen4 speeds. For the x8 and x16 lane widths, one of the quads acts as the master PLL source to drive the clock inputs for each of the lanes in the other quads.
The PMA performs functions such as serialization/deserialization, clock data recovery, and analog front-end functions such as Continuous Time Linear Equalizer (CTLE), Decision Feedback Equalizer (DFE) and transmit equalization.
The transmitter consists of a 3-tap equalizer with one tap of pre-cursor, one tap of main cursor and one tap of post-cursor.
Lane Margining at the Receiver is supported with the following characteristics:
- Maximum Timing Offset: -0.2 UI to +0.2 UI.
- Number of timing steps: 9.
- Independent left and right timing margining is supported.
- Independent Error Sampler is not supported (lane margining may produce logical errors in the data stream and cause the LTSSM to go to the Recovery state).
The PHY layer uses a fixed 16-bit PCS-PMA interface width to output the PHY clock (core_clk). The frequency of this clock is dependent on the current link speed. Refer to Table 10 for the frequencies at various link speeds.
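The core_clk frequencies follow from the lane rate and the fixed 16-bit PCS-PMA interface width. As a quick arithmetic check (assuming the standard PCIe line encodings: 8b/10b at Gen1/Gen2, so 16 bits of data occupy 20 bit times, and 128b/130b at Gen3/Gen4, where the PIPE convention uses 16 bit times per cycle):

```python
def core_clk_mhz(gen):
    """PHY clock derived from the lane rate and the fixed 16-bit
    PCS-PMA interface width.

    Gen1/Gen2 use 8b/10b encoding (20 bit times per 16-bit word);
    Gen3/Gen4 use 128b/130b (16 bit times per cycle, ignoring the
    small 130/128 overhead, per the PIPE parallel-clock convention).
    """
    rate_gtps = {"Gen1": 2.5, "Gen2": 5.0, "Gen3": 8.0, "Gen4": 16.0}[gen]
    bit_times_per_cycle = 20 if gen in ("Gen1", "Gen2") else 16
    return rate_gtps * 1000 / bit_times_per_cycle

print(core_clk_mhz("Gen4"))  # 1000.0
```

The results match Table 10: 125 MHz at Gen1, 250 MHz at Gen2, 500 MHz at Gen3, and 1000 MHz at Gen4.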
2.2.2. Data Link Layer Overview
The Data Link Layer (DLL) is located between the Transaction Layer and the Physical Layer. It maintains packet integrity and communicates (by DLL packet transmission) at the PCI Express link level.
The DLL implements the following functions:
- Link management through the reception and transmission of DLL Packets (DLLPs), which are used for the following functions:
- Power management DLLP reception and transmission
- ACK/NAK packet transmission and reception
- Data integrity through the generation and checking of CRCs for TLPs and DLLPs
- TLP retransmission in case of NAK DLLP reception or replay timeout, using the retry (replay) buffer
- Management of the retry buffer
- Link retraining requests in case of error through the Link Training and Status State Machine (LTSSM) of the Physical Layer
The DLL has the following sub-blocks:
- Data Link Control and Management State Machine—This state machine connects to both the Physical Layer’s LTSSM state machine and the Transaction Layer. It initializes the link and flow control credits and reports status to the Transaction Layer.
- Power Management—This function handles the handshake to enter low power mode. Such a transition is based on register values in the Configuration Space and received Power Management (PM) DLLPs. For more details on the power states supported by the P-Tile Avalon® -ST IP for PCIe, refer to section Power Management Interface.
- Data Link Layer Packet Generator and Checker—This block is associated with the DLLP’s 16-bit CRC and maintains the integrity of transmitted packets.
- Transaction Layer Packet Generator—This block generates transmit packets, including a sequence number and a 32-bit Link CRC (LCRC). The packets are also sent to the retry buffer for internal storage. In retry mode, the TLP generator receives the packets from the retry buffer and generates the CRC for the transmit packet.
- Retry Buffer—The retry buffer stores TLPs and retransmits all unacknowledged packets in the case of NAK DLLP reception. In case of ACK DLLP reception, the retry buffer discards all acknowledged packets.
- ACK/NAK Packets—The ACK/NAK block handles ACK/NAK DLLPs and generates the sequence number of transmitted packets.
- Transaction Layer Packet Checker—This block checks the integrity of the received TLP and generates a request for transmission of an ACK/NAK DLLP.
- TX Arbitration—This block arbitrates transactions, prioritizing in the following order:
- Initialize FC Data Link Layer packet
- ACK/NAK DLLP (high priority)
- Update FC DLLP (high priority)
- PM DLLP
- Retry buffer TLP
- TLP
- Update FC DLLP (low priority)
- ACK/NAK DLLP (low priority)
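The retry-buffer behavior described above (discard on ACK, retransmit on NAK) can be sketched as follows; the class and method names are illustrative, not IP signal names:

```python
from collections import deque

class RetryBuffer:
    """Illustrative model of the DLL retry (replay) buffer.

    Transmitted TLPs are held with their sequence numbers until
    acknowledged; a NAK triggers retransmission of every packet
    after the last good sequence number.
    """

    def __init__(self):
        self.pending = deque()  # (seq, tlp) in transmission order

    def transmit(self, seq, tlp):
        self.pending.append((seq, tlp))

    def ack(self, seq):
        """ACK DLLP: discard all packets up to and including seq."""
        while self.pending and self.pending[0][0] <= seq:
            self.pending.popleft()

    def nak(self, seq):
        """NAK DLLP: packets up to seq are good; replay the rest."""
        self.ack(seq)
        return [tlp for _, tlp in self.pending]  # retransmit in order

buf = RetryBuffer()
for n in range(4):
    buf.transmit(n, f"TLP{n}")
buf.ack(1)             # TLP0 and TLP1 acknowledged and discarded
print(buf.nak(2))      # ['TLP3'] -- replay everything after seq 2
```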
2.2.3. Transaction Layer Overview
The following figure shows the major blocks in the P-Tile Avalon® -ST IP for PCI Express Transaction Layer:
The RAS (Reliability, Availability, and Serviceability) block includes a set of features to maintain the integrity of the link.
For example: Transaction Layer inserts an optional ECRC in the transmit logic and checks it in the receive logic to provide End-to-End data protection.
When the application logic sets the TLP Digest (TD) bit in the Header of the TLP, the P-Tile Avalon® -ST IP for PCIe will append the ECRC automatically.
Note that in TLP Bypass mode, the PCIe Hard IP does not generate/check the ECRC and will not remove it if the received TLP has the ECRC.
The TX block sends out the TLPs that it receives as-is. It also sends the information about non-posted TLPs to the CPL Timeout Block for CPL timeout detection.
- Filtering block: This module checks if the TLP is good or bad and generates the associated error message and completion. It also tracks received completions and updates the completion timeout (CPL timeout) block.
- RX Buffer Queue: The P-Tile IP for PCIe has separate queues for posted/non-posted transactions and completions. This avoids head-of-queue blocking on the received TLPs and provides flexibility to extract TLPs according to the PCIe ordering rules.
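For illustration of the ECRC append/check flow described above, the sketch below uses Python's generic `zlib.crc32` as a stand-in digest. This is a simplification: the real ECRC algorithm, bit ordering, and the masking of the variant bits (EP and Type[0]) are defined by the PCIe Base Specification and differ from this model:

```python
import zlib

def append_ecrc(tlp_bytes):
    """Append a 32-bit digest when the TD bit is set (simplified).

    Stand-in only: zlib.crc32 replaces the exact PCIe ECRC, which
    masks the variant bits and uses a spec-defined bit ordering.
    """
    digest = zlib.crc32(tlp_bytes).to_bytes(4, "little")
    return tlp_bytes + digest

def check_ecrc(tlp_with_digest):
    """Recompute the digest over the payload and compare (simplified)."""
    payload, digest = tlp_with_digest[:-4], tlp_with_digest[-4:]
    return zlib.crc32(payload).to_bytes(4, "little") == digest

pkt = append_ecrc(b"\x00\x00\x00\x01header+data")
print(check_ecrc(pkt))  # True
```

Note that in TLP Bypass mode this whole flow is skipped: the Hard IP neither generates, checks, nor strips the ECRC.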
3. Parameters
This chapter provides a reference for all the parameters that are configurable in the Intel® Quartus® Prime IP Parameter Editor for the P-Tile Avalon® -ST IP for PCIe.
3.1. Top-Level Settings
Parameter | Value | Default Value | Description |
---|---|---|---|
Hard IP Mode |
Gen4x16, Interface - 512-bit Gen3x16, Interface - 512-bit Gen4x8, Interface - 256-bit Gen3x8, Interface - 256-bit Gen4x4, Interface - 128-bit Gen3x4, Interface - 128-bit |
Gen4x16, Interface - 512-bit |
Selects the lane data rate (Gen3 or Gen4) and lane width (x4, x8, or x16). |
Port Mode |
Root Port Native Endpoint Note: These are the available options when Enable TLP Bypass is set to False. If TLP Bypass mode is enabled, refer to the table Port Mode Options in TLP Bypass below for available port mode options.
|
Native Endpoint |
Specifies the port type. |
Enable PHY Reconfiguration | True/False | False | Enable the PHY Reconfiguration Interface. |
PLD Clock Frequency |
500 MHz 400 MHz 350 MHz 250 MHz |
400 MHz (for Gen4 modes) 250 MHz (for Gen3 modes) |
Select the frequency of the Application clock. The options available vary depending on the setting of the Hard IP Mode parameter. For Gen4 modes, the available clock frequencies are 500 MHz / 400 MHz / 350 MHz (for Intel® Agilex™ ) and 400 MHz / 350 MHz (for Intel® Stratix® 10 DX). For Gen3 modes, the available clock frequency is 250 MHz (for Intel® Agilex™ and Intel® Stratix® 10 DX). |
Enable TLP Bypass | True/False | False | Enable the TLP Bypass feature. Note: For configurations where multiple ports are available, it is possible to enable TLP Bypass on a per-port basis. Refer to Table 12 for the available port modes and configurations. |
Enable SRIS Mode | True/False | False |
Enable the Separate Reference Clock with Independent Spread Spectrum Clocking (SRIS) feature. |
P-Tile Sim Mode | True/False | False | Enabling this parameter reduces the simulation time of Hot Reset tests by 5 ms. Note: Do not enable this option if you need to run synthesis. |
Enable RST of PCS & Controller | True/False | False | Enable the reset of the PCS and Controller in User Mode for Endpoint and Bypass Upstream modes. When this parameter is True, depending on the topology, new signals (p<n>_pld_clrpcs_n) are exported to the Avalon® Streaming interface. When this parameter is False (default), the IP internally ties off these signals instead of exporting them. Note: This feature is only supported in the x8/x8 Endpoint/Bypass Upstream topology. Note: If you have more questions regarding the bifurcation feature and its usage, contact your Application Engineer. |
Each row in the following table lists one valid combination of per-port modes.
Configuration | Port 0 | Port 1 | Port 2 | Port 3 |
---|---|---|---|---|
1x16 (Gen4x16 or Gen3x16) | TLP Bypass On: Downstream (Default) | N/A | N/A | N/A |
1x16 (Gen4x16 or Gen3x16) | TLP Bypass On: Upstream | N/A | N/A | N/A |
2x8 (Gen4x8/Gen4x8 or Gen3x8/Gen3x8) | TLP Bypass On: Downstream (Default) | TLP Bypass On: Downstream (Default) | N/A | N/A |
2x8 (Gen4x8/Gen4x8 or Gen3x8/Gen3x8) | TLP Bypass On: Upstream | TLP Bypass On: Upstream | N/A | N/A |
2x8 (Gen4x8/Gen4x8 or Gen3x8/Gen3x8) | TLP Bypass Off: Endpoint | TLP Bypass On: Upstream | N/A | N/A |
2x8 (Gen4x8/Gen4x8 or Gen3x8/Gen3x8) | TLP Bypass On: Upstream | TLP Bypass On: Downstream | N/A | N/A |
2x8 (Gen4x8/Gen4x8 or Gen3x8/Gen3x8) | TLP Bypass On: Upstream | TLP Bypass Off: Endpoint | N/A | N/A |
4x4 (Gen4x4/Gen4x4/Gen4x4/Gen4x4 or Gen3x4/Gen3x4/Gen3x4/Gen3x4) | TLP Bypass On: Downstream (Default) | TLP Bypass On: Downstream (Default) | TLP Bypass On: Downstream (Default) | TLP Bypass On: Downstream (Default) |
4x4 (Gen4x4/Gen4x4/Gen4x4/Gen4x4 or Gen3x4/Gen3x4/Gen3x4/Gen3x4) | TLP Bypass On: Upstream | TLP Bypass On: Upstream | TLP Bypass On: Upstream | TLP Bypass On: Upstream |
3.2. Core Parameters
Depending on which Hard IP Mode you choose in the Top-Level Settings tab, you will see different tabs for setting the core parameters.
3.2.1. System Parameters
Parameter | Value | Default Value | Description |
---|---|---|---|
Enable Multiple Physical Functions | True/False | False | Enable support for multiple physical functions. |
3.2.2. Avalon Parameters
Parameter | Value | Default Value | Description |
---|---|---|---|
Enable Power Management Interface and Hard IP Status Interface | True/False | False | When enabled, the Power Management Interface and Hard IP Status Interface are exported. For more details, refer to section Power Management Interface. |
Enable Legacy Interrupt | True/False | False |
Enable the support for legacy interrupts. For more details, refer to section Legacy Interrupts. |
Enable Parity Error | True/False | True | Enable the support for parity error checking. Parity errors are indicated by outputs rx_par_err_o and tx_par_err_o. |
Enable Completion Timeout Interface | True/False | False | Enable the Completion Timeout Interface. For more details, refer to section Completion Timeout Interface. |
Enable Configuration Intercept Interface | True/False | False | Enable the Configuration Intercept Interface. For more details, refer to section Configuration Intercept Interface (EP Only). Note: This parameter is only available in EP mode. |
Enable PRS Event | True/False | False | Enable the Page Request Service (PRS) Event Interface. For more details, refer to section Page Request Service (PRS) Interface (EP Only). Note: This parameter is only available in EP mode. |
Enable Error Interface | True/False | False |
Enable the Error Interface. For more details, refer to section Error Interface. |
Enable Byte Parity Ports on Avalon® -ST Interface | True/False | False | When this parameter is enabled, the byte parity ports appear on the block symbol. These byte parity ports include: rx_st_data_par_o, rx_st_hdr_par_o, rx_st_tlp_prfx_par_o, tx_st_data_par_o, tx_st_hdr_par_o, and tx_st_tlp_prfx_par_o ports. |
3.2.3. Base Address Registers
Parameter | Value | Description |
---|---|---|
BAR0 Type |
Disabled 64-bit prefetchable memory 64-bit non-prefetchable memory 32-bit non-prefetchable memory 32-bit prefetchable memory |
If you select 64-bit prefetchable memory, 2 contiguous BARs are combined to form a 64-bit prefetchable BAR; you must set the higher-numbered BAR to Disabled. Defining memory as prefetchable allows contiguous data to be fetched ahead. Prefetching memory is advantageous when the requestor may require more data from the same region than was originally requested. If you specify that a memory is prefetchable, it must have the following 2 attributes: reads do not have side effects, and write merging is allowed. |
BAR1 Type |
Disabled 32-bit non-prefetchable memory 32-bit prefetchable memory |
For a definition of prefetchable memory, refer to the BAR0 Type description. |
BAR2 Type |
Disabled 64-bit prefetchable memory 64-bit non-prefetchable memory 32-bit non-prefetchable memory 32-bit prefetchable memory |
For a definition of prefetchable memory and a description of what happens when you select the 64-bit prefetchable memory option, refer to the BAR0 Type description. |
BAR3 Type |
Disabled 32-bit non-prefetchable memory 32-bit prefetchable memory |
For a definition of prefetchable memory, refer to the BAR0 Type description. |
BAR4 Type |
Disabled 64-bit prefetchable memory 64-bit non-prefetchable memory 32-bit non-prefetchable memory 32-bit prefetchable memory |
For a definition of prefetchable memory and a description of what happens when you select the 64-bit prefetchable memory option, refer to the BAR0 Type description. |
BAR5 Type |
Disabled 32-bit non-prefetchable memory 32-bit prefetchable memory |
For a definition of prefetchable memory, refer to the BAR0 Type description. |
BARn Size |
128 Bytes - 16 EBytes |
Specifies the size of the address space accessible to BARn when BARn is enabled. n = 0, 1, 2, 3, 4 or 5 |
Expansion ROM |
Disabled 4 KBytes - 12 bits 8 KBytes - 13 bits 16 KBytes - 14 bits 32 KBytes - 15 bits 64 KBytes - 16 bits 128 KBytes - 17 bits 256 KBytes - 18 bits 512 KBytes - 19 bits 1 MByte - 20 bits 2 MBytes - 21 bits 4 MBytes - 22 bits 8 MBytes - 23 bits 16 MBytes - 24 bits |
Specifies the size of the expansion ROM from 4 KBytes to 16 MBytes when enabled. |
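The BAR options above follow standard PCIe sizing rules: sizes are powers of two, software sizes a BAR by writing all-ones and reading back a mask of the fixed low address bits, and a 64-bit BAR consumes two consecutive BAR registers. A sketch under those rules (function names are illustrative):

```python
def bar_address_mask(size_bytes):
    """Address mask a memory BAR of the given power-of-two size reports.

    The low bits read back as zero when software writes all-ones to
    size the BAR; the minimum size here is 128 bytes, per the BARn
    Size row of the table above.
    """
    if size_bytes < 128 or size_bytes & (size_bytes - 1):
        raise ValueError("BAR size must be a power of two, >= 128 bytes")
    return ~(size_bytes - 1) & 0xFFFFFFFFFFFFFFFF

def bar_slots(bar_type):
    """A 64-bit BAR pairs two consecutive 32-bit BAR registers."""
    return 2 if bar_type.startswith("64-bit") else 1

print(hex(bar_address_mask(4096)))              # 0xfffffffffffff000
print(bar_slots("64-bit prefetchable memory"))  # 2
```

This is why, after selecting a 64-bit prefetchable BAR0, the table requires BAR1 to be set to Disabled: BAR1 holds the upper 32 bits of the 64-bit BAR0.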
3.2.4. Multi-function and SR-IOV
Parameter | Value | Default Value | Description |
---|---|---|---|
Total Physical Functions (PFs) | 1 - 8 | 1 | Set the number of physical functions. The IP core can support 1 - 8 PFs. This parameter is visible only if Enable multiple physical functions is set to True (under the PCIe Device tab). |
Enable SR-IOV Support | True/False | False | Enable SR-IOV support. |
Total Virtual Functions of Physical Function 0 (PF0 VFs) | 0 - 2048 | 0 | Set the number of VFs to be assigned to Physical Function 0. |
Enable VirtIO Support | True/False | False | Enable VirtIO support. This parameter is visible only if Enable SR-IOV Support is True and Enable multiple physical functions is also set to True (under the PCIe Device tab). |
3.2.4.1. VirtIO Parameters
To enable VirtIO support, first enable the support for multiple physical functions in the IP Parameter Editor. Make sure that SR-IOV support is also enabled, then enable VirtIO support. Finally, you can configure the appropriate VirtIO capability parameters in the VirtIO tabs of the IP Parameter Editor.
The following table provides a reference for all the configurable high-level parameters of the VirtIO block for P-Tile. Parameters below are dedicated to each core.
Parameter | Description | Allowed Range | Default Value |
---|---|---|---|
Total Physical Functions (PFs) Count Number | The number of supported Physical Functions. | 1-8 | 1 |
Total Physical Functions (PFs) Count Number Width | Width of the supported Physical Functions number count. | 3 | 3 |
Total Virtual Functions Count Number of PFs | Total number of VFs associated with PFs. Only present when SR-IOV is enabled. | 0-2K | 0 |
Total Virtual Functions Count Number Width of PFs | Width of the count of the total number of VFs associated with PFs. Only present when SR-IOV is enabled. | 11 | 11 |
Virtual Functions Count Number associated with PF 0-7 | Number of VFs associated with PFs 0-7. The sum of all the VF counts for PFs 0-7 cannot exceed the total number of VFs. | 0-2K | 0 |
Enable PF VirtIO | Enable Physical Function 0-7 VirtIO capability. | 1’b1 / 1’b0 | 1’b0 |
Enable VF VirtIO | Enable VirtIO capability of VFs associated with PFs 0-7. | 1’b1 / 1’b0 | 1’b0 |
Base Address of VirtIO Common Configuration Capability Structure | Start byte address of VirtIO common configuration capability structure for both PFs and VFs. | 0x48 | 0x48 |
The next table summarizes the parameters associated with the five VirtIO device configuration structures:
Parameter | Description | Allowed Range | Default Value |
---|---|---|---|
PF/VF VirtIO Common Configuration Structure Capability Parameters | |||
PFs 0-7 Common Configuration Structure BAR Indicator | Indicates BAR holding the Common Configuration Structure of PFs 0-7. | 0-5 | 0 |
PFs 0-7 VFs Common Configuration Structure BAR Indicator | Indicates BAR holding the Common Configuration Structure of VFs associated with PFs 0-7. | 0-5 | 0 |
PFs 0-7 Common Configuration Structure Offset within BAR | Indicates starting position of Common Config Structure in a given BAR of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs Common Configuration Structure Offset within BAR | Indicates starting position of Common Config Structure in a given BAR of VFs associated with PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 Common Configuration Structure Length | Indicates length in bytes of Common Config Structure of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs Common Configuration Structure Length | Indicates length in bytes of Common Config Structure of VFs associated with PFs 0-7. | 0-536870911 | 0 |
PF/VF VirtIO Notifications Structure Capability Parameters | |||
PFs 0-7 Notifications Structure BAR Indicator | Indicates BAR holding the Notifications Structure of PFs 0-7. | 0-5 | 0 |
PFs 0-7 VFs Notifications Structure BAR Indicator | Indicates BAR holding the Notifications Structure of VFs associated with PFs 0-7. | 0-5 | 0 |
PFs 0-7 Notifications Structure Offset within BAR | Indicates starting position of Notifications Structure in given BAR of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs Notifications Structure Offset within BAR | Indicates starting position of Notifications Structure in given BAR of VFs associated with PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 Notifications Structure Length | Indicates length in bytes of Notifications Structure of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs Notifications Structure Length | Indicates length in bytes of Notifications Structure of VFs associated with PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 Notifications Structure Notify Off Multiplier | Indicates multiplier for queue_notify_off in Notifications Structure of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs Notifications Structure Notify Off Multiplier | Indicates multiplier for queue_notify_off in Notifications Structure of VFs associated with PFs 0-7. | 0-536870911 | 0 |
PF/VF VirtIO ISR Status Structure Capability Parameters | |||
PFs 0-7 ISR Status Structure BAR Indicator | Indicates BAR holding the ISR Status Structure of PFs 0-7. | 0-5 | 0 |
PFs 0-7 VFs ISR Status Structure BAR Indicator | Indicates BAR holding the ISR Status Structure of VFs associated with PFs 0-7. | 0-5 | 0 |
PFs 0-7 ISR Status Structure Offset within BAR | Indicates starting position of ISR Status Structure in given BAR of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs ISR Status Structure Offset within BAR | Indicates starting position of ISR Status Structure in given BAR of VFs associated with PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 ISR Status Structure Length | Indicates length in bytes of ISR Status Structure of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs ISR Status Structure Length | Indicates length in bytes of ISR Status Structure of VFs associated with PFs 0-7. | 0-536870911 | 0 |
PF/VF VirtIO Device-Specific Configuration Structure Capability Parameters | |||
Enable PFs 0-7 VirtIO Device-Specific Capability | Enable PFs 0-7 VirtIO Device-Specific Configuration Structure Capability. | True / False | False |
Enable PFs 0-7 VFs VirtIO Device-Specific Capability | Enable VirtIO Device-Specific Configuration Structure Capability of VFs associated with PFs 0-7. | True / False | False |
PFs 0-7 Device-Specific Configuration Structure BAR Indicator | Indicates BAR holding the Device-Specific Configuration Structure of PFs 0-7. | 0-5 | 0 |
PFs 0-7 VFs Device-Specific Configuration Structure BAR Indicator | Indicates BAR holding the Device-Specific Configuration Structure of VFs associated with PFs 0-7. | 0-5 | 0 |
PFs 0-7 Device-Specific Configuration Structure Offset within BAR | Indicates starting position of Device-Specific Configuration Structure in given BAR of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs Device-Specific Configuration Structure Offset within BAR | Indicates starting position of Device-Specific Configuration Structure in given BAR of VFs associated with PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 Device-Specific Configuration Structure Length | Indicates length in bytes of Device-Specific Configuration Structure of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs Device-Specific Configuration Structure Length | Indicates length in bytes of Device-Specific Configuration Structure of VFs associated with PFs 0-7. | 0-536870911 | 0 |
PF/VF VirtIO PCI Configuration Access Structure Capability Parameters | |||
PFs 0-7 PCI Configuration Access Structure BAR Indicator | Indicates BAR holding the PCI Configuration Access Structure of PFs 0-7. | 0-5 | 0 |
PFs 0-7 VFs PCI Configuration Access Structure BAR Indicator | Indicates BAR holding the PCI Configuration Access Structure of VFs associated with PFs 0-7. | 0-5 | 0 |
PFs 0-7 PCI Configuration Access Structure Offset within BAR | Indicates starting position of PCI Configuration Access Structure in given BAR of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs PCI Configuration Access Structure Offset within BAR | Indicates starting position of PCI Configuration Access Structure in given BAR of VFs associated with PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 PCI Configuration Access Structure Length | Indicates length in bytes of PCI Configuration Access Structure of PFs 0-7. | 0-536870911 | 0 |
PFs 0-7 VFs PCI Configuration Access Structure Length | Indicates length in bytes of PCI Configuration Access Structure of VFs associated with PFs 0-7. | 0-536870911 | 0 |
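The Notify Off Multiplier parameters above determine where, within the Notifications Structure, a driver writes to kick a given virtqueue: per the VirtIO specification, the notification address is the structure's offset within the BAR plus `queue_notify_off` times the multiplier. A minimal sketch of that calculation (the function name is illustrative, not part of the IP):

```python
def virtio_notify_address(struct_offset: int, queue_notify_off: int,
                          notify_off_multiplier: int) -> int:
    """Byte offset within the BAR that the driver writes to notify a
    virtqueue, following the VirtIO 1.x notification layout."""
    return struct_offset + queue_notify_off * notify_off_multiplier

# Example: Notifications Structure at BAR offset 0x3000, queue_notify_off = 2,
# multiplier of 4 bytes: the driver kicks this queue at BAR offset 0x3008.
assert virtio_notify_address(0x3000, 2, 4) == 0x3008
```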
3.2.5. TLP Processing Hints (TPH)/Address Translation Services (ATS) Capabilities
Parameter | Value | Default Value | Description |
---|---|---|---|
Enable Address Translation Services (ATS) | True/False | False | Enable or disable Address Translation Services (ATS) capability. When ATS is enabled, senders can request and cache translated addresses using the RP memory space for later use. |
Enable TLP Processing Hints (TPH) | True/False | False | Enable or disable TLP Processing Hints (TPH) capability. Using TPH may improve the latency performance and reduce traffic congestion. |
3.2.6. PCI Express and PCI Capabilities Parameters
For each core (PCIe0/PCIe1/PCIe2/PCIe3), the PCI Express / PCI Capabilities tab contains separate tabs for the device, PRS (Endpoint mode), MSI (Endpoint mode), ACS capabilities (Root Port mode), slot (Root Port mode), MSI-X, and legacy interrupt pin register parameters.

3.2.6.1. Device Capabilities
Parameter | Value | Default Value | Description |
---|---|---|---|
Maximum payload sizes supported | 128 bytes / 256 bytes / 512 bytes | 512 bytes | Specifies the maximum payload size supported. This parameter sets the read-only value of the Max Payload Size Supported field of the Device Capabilities register. |
Enable Multiple Physical Functions | True/False | False | Enables multiple physical functions. |
Enable Function Level Reset | True/False | False | When this option is True, each function has its own individual reset. Required for all SR-IOV functions. This option appears only when Enable Multiple Physical Functions is set to True. |
3.2.6.2. Link Capabilities
Parameter | Value | Default Value | Description |
---|---|---|---|
Link port number (Root Port only) | 0 - 255 | 1 | Sets the read-only value of the port number field in the Link Capabilities register. This parameter is for Root Ports only. It should not be changed. |
Slot clock configuration | True/False | True | When this parameter is True, it indicates that the Endpoint uses the same physical reference clock that the system provides on the connector. When it is False, the IP core uses an independent clock regardless of the presence of a reference clock on the connector. This parameter sets the Slot Clock Configuration bit (bit 12) in the PCI Express Link Status register. |
3.2.6.3. Legacy Interrupt Pin Register
Parameter | Value | Default Value | Description |
---|---|---|---|
Enable Legacy Interrupts for PF0 | True/False | False | Enable Legacy Interrupts (INTx) for PF0 of PCIe0. |
Set Interrupt Pin for PF0 | NO INT / INTA / INTB / INTC / INTD | NO INT | When Legacy Interrupts are not enabled, the only option available is NO INT. When Legacy Interrupts are enabled and multifunction is disabled, the only option available is INTA. When Legacy Interrupts are enabled and multifunction is enabled, the options available are INTA, INTB, INTC and INTD. |
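The option-availability rule described above can be sketched as a small helper (hypothetical, for illustration only; it is not part of the IP or the parameter editor):

```python
def interrupt_pin_options(legacy_enabled: bool, multifunction: bool) -> list:
    """Return the Set Interrupt Pin options available for PF0, per the
    rules in the Legacy Interrupt Pin Register description."""
    if not legacy_enabled:
        return ["NO INT"]          # Legacy Interrupts disabled
    if not multifunction:
        return ["INTA"]            # enabled, single function
    return ["INTA", "INTB", "INTC", "INTD"]  # enabled, multifunction

assert interrupt_pin_options(False, True) == ["NO INT"]
assert interrupt_pin_options(True, False) == ["INTA"]
```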
3.2.6.4. MSI Capabilities
Parameter | Value | Default Value | Description |
---|---|---|---|
PF0 Enable MSI | True/False | False | Enables MSI functionality for PF0. If this parameter is True, the Number of MSI messages requested parameter appears, allowing you to set the number of MSI messages. |
PF0 MSI Extended Data Capable | True/False | False | Enables or disables MSI extended data capability for PF0. |
PF0 Number of MSI messages requested | 1 / 2 / 4 / 8 / 16 / 32 | 1 | Sets the number of messages that the application can request in the multiple message capable field of the Message Control register. |
3.2.6.5. MSI-X Capabilities
Parameter | Value | Default Value | Description |
---|---|---|---|
Enable MSI-X (Endpoint only) | True/False | False | Enables the MSI-X functionality. |
MSI-X Table Size | 0x0 - 0x7FF (only values of powers of two minus 1 are valid) | 0 | System software reads this field to determine the MSI-X table size <n>, which is encoded as <n-1>. For example, a returned value of 2047 indicates a table size of 2048. This field is read-only. Address offset: 0x068[26:16] |
MSI-X Table Offset | 0x0 - 0xFFFFFFFF | 0 | Points to the base of the MSI-X table. The lower 3 bits of the table BAR indicator (BIR) are set to zero by software to form a 32-bit qword-aligned offset. This field is read-only after being programmed. |
Table BAR indicator | 0x0 - 0x5 | 0 | Specifies which one of a function's BARs, located beginning at 0x10 in Configuration Space, is used to map the MSI-X table into memory space. This field is read-only after being programmed. |
Pending bit array (PBA) offset | 0x0 - 0xFFFFFFFF | 0 | Used as an offset from the address contained in one of the function's Base Address registers to point to the base of the MSI-X PBA. The lower 3 bits of the PBA BIR are set to zero by software to form a 32-bit qword-aligned offset. This field is read-only after being programmed. |
PBA BAR indicator | 0x0 - 0x5 | 0 | Specifies the function's Base Address register, located beginning at 0x10 in Configuration Space, that maps the MSI-X PBA into memory space. This field is read-only after being programmed. |
VF Table size | 0x0 - 0x7FF (only values of powers of two minus 1 are valid) | 0 | Sets the number of entries in the MSI-X table for VFs. MSI-X cannot be disabled for VFs. Set to 1 to save resources. |
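The <n-1> encoding of the MSI-X Table Size field can be sketched as follows; a minimal illustration under the parameter's "powers of two minus 1" constraint (the helper names are hypothetical):

```python
def encode_msix_table_size(n_entries: int) -> int:
    """Encode an MSI-X table size <n> as the <n-1> value held in the
    capability (parameter range 0x0-0x7FF, i.e. up to 2048 entries).
    Only power-of-two entry counts are accepted, matching the
    'powers of two minus 1' constraint on the parameter."""
    if not (1 <= n_entries <= 2048) or n_entries & (n_entries - 1):
        raise ValueError("table size must be a power of two, 1..2048")
    return n_entries - 1

def decode_msix_table_size(encoded: int) -> int:
    """What system software computes after reading bits [26:16] of 0x068."""
    return encoded + 1

assert encode_msix_table_size(2048) == 0x7FF   # max table, encoded as 2047
assert decode_msix_table_size(2047) == 2048
```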
3.2.6.6. Slot Capabilities
Parameter | Value | Default Value | Description |
---|---|---|---|
Use Slot register | True/False | False | This parameter is only supported in Root Port mode. The slot capability is required for Root Ports if a slot is implemented on the port. Slot status is recorded in the PCI Express Capabilities register. |
Slot power scale | 0 - 3 | 0 | Specifies the scale used for the slot power limit. The following coefficients are defined: 0 = 1.0x, 1 = 0.1x, 2 = 0.01x, 3 = 0.001x. The default value prior to hardware and firmware initialization is b'00. Writes to this register also cause the port to send the Set_Slot_Power_Limit message. |
Slot power limit | 0 - 255 | 0 | In combination with the Slot power scale value, specifies the upper limit in watts for the power supplied by the slot. |
Slot number | 0 - 8191 | 0 | Specifies the slot number. |
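The slot power limit in watts is the Slot power limit value multiplied by the scale coefficient. A minimal sketch, assuming the PCI Express-defined coefficients (0 = 1.0x, 1 = 0.1x, 2 = 0.01x, 3 = 0.001x); the function name is illustrative:

```python
def slot_power_watts(slot_power_limit: int, slot_power_scale: int) -> float:
    """Upper limit in watts for power supplied by the slot, derived from
    the Slot power limit (0-255) and Slot power scale (0-3) parameters."""
    coefficients = {0: 1.0, 1: 0.1, 2: 0.01, 3: 0.001}
    return slot_power_limit * coefficients[slot_power_scale]

# A 25 W slot can be expressed as limit=25/scale=0 or limit=250/scale=1.
assert slot_power_watts(25, 0) == 25.0
```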
3.2.6.7. Latency Tolerance Reporting (LTR)
This capability allows the P-Tile Avalon streaming IP, when operating in Endpoint mode, to report the delay that it can tolerate when requesting service from the Host. This information can help software optimize performance when the Endpoint needs a fast response, or optimize system power when a fast response is not necessary.
Parameter | Value | Default Value | Description |
---|---|---|---|
PCIe0 Enable LTR | True/False | False | Enable or disable LTR capability for PCIe0. |
3.2.6.8. Process Address Space ID (PASID)
Parameter | Value | Default Value | Description |
---|---|---|---|
PCIe0 PF0 Enable PASID | True/False | False | Enable or disable PASID capability for PCIe0 PF0. |
PCIe0 PF0 Enable Execute Permission Support | True/False | False | Enable or disable PASID Execute Permission Support for PCIe0 PF0. |
PCIe0 PF0 Enable Privileged Mode Support | True/False | False | Enable or disable PASID Privileged Mode Support for PCIe0 PF0. |
PCIe0 PF0 Max PASID Width | 0 - 20 | 0 | Set the Max PASID Width for PCIe0 PF0. |
3.2.6.9. Device Serial Number Capability
Parameter | Value | Default Value | Description |
---|---|---|---|
Enable Device Serial Number Capability | True/False | False | Enables the device serial number capability. This is an optional extended capability that provides a unique identifier for the PCIe device. |
3.2.6.10. Page Request Service (PRS)
Parameter | Value | Default Value | Description |
---|---|---|---|
Enable PRS | True/False | False | Enable or disable Page Request Service (PRS) capability. |
3.2.6.11. Access Control Service (ACS) Capabilities
Parameter | Value | Default Value | Description |
---|---|---|---|
Enable Access Control Service (ACS) | True/False | False | ACS defines a set of control points within a PCI Express topology to determine whether a TLP is to be routed normally, blocked, or redirected. |
Enable ACS P2P Traffic Support | True/False | False | Indicates if the component supports Peer to Peer Traffic. |
Enable ACS P2P Egress Control | True/False | False | Indicates if the component implements ACS P2P Egress Control. This parameter is visible only if Enable ACS P2P Traffic Support is set to True. |
Enable ACS P2P Egress Control Vector Size | 0 - 255 | 0 | Indicates the number of bits in the ACS P2P Egress Control Vector. |
3.2.6.12. Power Management
Parameter | Value | Default Value | Description |
---|---|---|---|
Endpoint L0s acceptable latency | Maximum of 64 ns / 128 ns / 256 ns / 512 ns / 1 us / 2 us / 4 us / No limit | Maximum of 64 ns | This design parameter specifies the maximum acceptable latency that the application layer can tolerate for any link between the device and the root complex to exit the L0s state. It sets the read-only value of the Endpoint L0s acceptable latency field of the Device Capabilities Register (0x084). This Endpoint does not support the L0s or L1 states. However, in a switched system, there may be links connected to switches that have L0s and L1 enabled. This parameter is set to allow system configuration software to read the acceptable latencies for all devices in the system and the exit latency for each link to determine which links can enable Active State Power Management (ASPM). This setting is disabled for Root Ports. The default value of 64 ns is the safest setting for most designs. |
Endpoint L1 acceptable latency | Maximum of 1 us / 2 us / 4 us / 8 us / 16 us / 32 us / 64 us / No limit | Maximum of 1 us | This value indicates the acceptable latency that an Endpoint can withstand in the transition from the L1 state to L0 state. It is an indirect measure of the Endpoint's internal buffering. It sets the read-only value of the Endpoint L1 acceptable latency field of the Device Capabilities Register. This Endpoint does not support the L0s or L1 states. However, a switched system may include links connected to switches that have L0s and L1 enabled. This parameter is set to allow system configuration software to read the acceptable latencies for all devices in the system and the exit latency for each link to determine which links can enable Active State Power Management (ASPM). This setting is disabled for Root Ports. |
3.2.6.13. Vendor Specific Extended Capability (VSEC) Registers
Parameter | Value | Default Value | Description |
---|---|---|---|
Vendor Specific Extended Capability | 0/1 | 0 | Enables the Vendor Specific Extended Capability (VSEC). |
User ID register from the Vendor Specific Extended Capability | 0 - 65534 | 0 | Sets the read-only value of the 16-bit User ID register from the Vendor Specific Extended Capability. This parameter is only valid for Endpoints. |
Drops Vendor Type0 Messages | 0/1 | 0 | When this parameter is set to 1, the IP core drops vendor Type 0 messages while treating them as Unsupported Requests (UR). When it is set to 0, the IP core passes these messages on to the user logic. |
Drops Vendor Type1 Messages | 0/1 | 0 | When this parameter is set to 1, the IP core silently drops vendor Type 1 messages. When it is set to 0, the IP core passes these messages on to the user logic. |
3.2.7. Device Identification Registers
The following table lists the default values of the Device ID registers. You can use the parameter editor to change the values of these registers.
Register Name | Range | Default Value | Description |
---|---|---|---|
Vendor ID | 16 bits | 0x00001172 | Sets the read-only value of the Vendor ID register. This parameter cannot be set to 0xFFFF per the PCI Express Base Specification. Note: Set your own Vendor ID by changing this parameter. Address offset: 0x000. |
Device ID | 16 bits | 0x00000000 | Sets the read-only value of the Device ID register. This register is only valid in the Type 0 (Endpoint) Configuration Space. Address offset: 0x000. |
Revision ID | 8 bits | 0x00000001 | Sets the read-only value of the Revision ID register. Address offset: 0x008. |
Class Code | 24 bits | 0x00FF0000 | Sets the read-only value of the Class Code register. This parameter cannot be set to 0x0 per the PCI Express Base Specification. Address offset: 0x008. |
Subsystem Vendor ID | 16 bits | 0x00000000 | Sets the read-only value of the Subsystem Vendor ID register in the PCI Type 0 Configuration Space. This parameter cannot be set to 0xFFFF per the PCI Express Base Specification. This value is assigned by PCI-SIG to the device manufacturer. Address offset: 0x02C. |
Subsystem Device ID | 16 bits | 0x00000000 | Sets the read-only value of the Subsystem Device ID register in the PCI Type 0 Configuration Space. Address offset: 0x02C. |
3.2.8. Configuration, Debug and Extension Options
Parameter | Value | Default Value | Description |
---|---|---|---|
Gen 3 Requested equalization far-end TX preset vector | 0 - 65535 | 0x00000004 | Specifies the Gen 3 requested phase 2/3 far-end TX preset vector. Choosing a value different from the default is not recommended for most designs. |
Gen 4 Requested equalization far-end TX preset vector | 0 - 65535 | 0x00000270 | Specifies the Gen 4 requested phase 2/3 far-end TX preset vector. Choosing a value different from the default is not recommended for most designs. |
Enable RX Buffer Limit Ports | True/False | False | When selected, RX buffer limit ports are exported allowing you to control the buffer limits for RX Posted, Non-Posted and CplD packets. Otherwise, the Maximum Buffer Size is used. |
Port 1 REFCLK Init Active | True/False | True | If this parameter is True (default), refclk1 is stable after pin_perst and is free-running. This parameter must be set to True for Type A/B/C systems. If this parameter is False, refclk1 is only available later in User Mode. This parameter must be set to False for Type D systems. This parameter is only available in the PCIe1 Settings tab for a X8X8 topology. Note: If you have more questions regarding the bifurcation feature and its usage, contact your Application Engineer. |
Enable Debug Toolkit | True/False | False | Enable the P-Tile Debug Toolkit for JTAG-based System Console debug access. |
Enable HIP dynamic reconfiguration of PCIe* registers | True/False | False | Enable the user Hard IP reconfiguration Avalon® -MM interface. |

4. Interfaces
This section focuses mainly on the signal interfaces that the P-Tile IP for PCIe uses to communicate with the Application Layer in the FPGA fabric core. However, it also briefly covers the Serial Data Interface, which allows the IP to communicate with the link partner across the PCIe link.
4.1. Overview
The P-Tile IP uses the following port-to-core mapping:
- p0 : x16 core
- p1 : x8 core
- p2 : x4_0 core
- p3 : x4_1 core
Figure 20 shows the top-level signals of this IP. Note that the signal names in the figure will get the appropriate prefix pn (where n = 0, 1, 2 or 3) depending on which of the three supported configurations (1x16, 2x8, or 4x4) the P-Tile Avalon® -ST IP for PCI Express* is in.
- In the 1x16 configuration, only the x16 core is active. In this case, this bus appears as p0_rx_st_data_o[511:0].
- In the 2x8 configuration, both the x16 core and x8 core are active. In this case, this bus is split into p0_rx_st_data_o[255:0] and p1_rx_st_data_o[255:0].
- In the 4x4 configuration, all four cores are active. In this case, this bus is split into p0_rx_st_data_o[127:0], p1_rx_st_data_o[127:0], p2_rx_st_data_o[127:0] and p3_rx_st_data_o[127:0].
The only cases where the interface signal names do not get the pn prefixes are the interfaces that are common for all the cores, like the PHY reconfiguration interface, clocks and resets. For example, there is only one xcvr_reconfig_clk that is shared by all the cores.
You can enable the PHY reconfiguration interface from the Top Level Settings in the GUI.
Each of the cores has its own Avalon® -ST interface to the user logic. The number of IP-to-User Logic interfaces exposed to the FPGA fabric differs based on the configuration mode:
Mode | Avalon-ST Interface Count | Data Width (each Interface) | Header Width (each Interface) | TLP Prefix Width (each Interface) | Application Clock Frequency |
---|---|---|---|---|---|
Gen4 x16 EP/RP mode | 1 | 512-bit | 256-bit | 64-bit | 350 MHz / 400 MHz (Intel® Stratix® 10 DX); 350 MHz / 400 MHz / 500 MHz (Intel® Agilex™) |
Gen3 x16 EP/RP mode | 1 | 512-bit | 256-bit | 64-bit | 250 MHz |
Gen4 x8 x8 EP mode | 2 | 256-bit | 128-bit | 32-bit | 350 MHz / 400 MHz (Intel® Stratix® 10 DX); 350 MHz / 400 MHz / 500 MHz (Intel® Agilex™) |
Gen3 x8 x8 EP mode | 2 | 256-bit | 128-bit | 32-bit | 250 MHz |
Gen4 x4 x4 x4 x4 RP mode | 4 | 128-bit | 128-bit | 32-bit | 350 MHz / 400 MHz (Intel® Stratix® 10 DX); 350 MHz / 400 MHz / 500 MHz (Intel® Agilex™) |
Gen3 x4 x4 x4 x4 RP mode | 4 | 128-bit | 128-bit | 32-bit | 250 MHz |
The following table lists the values of the width variables <w>, <n>, <p>, and <b> used in the signal descriptions for each configuration:
Variable | 1x16 Configuration | 2x8 Configuration | 4x4 Configuration |
---|---|---|---|
w | 4 | 2 | 1 |
n | 2 | 1 | 1 |
p | 6 | 3 | 2 |
b | 16 | 8 | 4 |
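The width variables in the table above scale the interface signal widths per configuration. A minimal sketch, under the assumption (inferred from the signal tables that follow) that <w> scales the 128-bit data unit, <n> counts SOP/EOP lanes, <p> sizes rx_st_empty, and <b> is the serial lane count:

```python
# Per-configuration width variables from the table above.
WIDTH_VARS = {
    "1x16": dict(w=4, n=2, p=6, b=16),
    "2x8":  dict(w=2, n=1, p=3, b=8),
    "4x4":  dict(w=1, n=1, p=2, b=4),
}

def port_widths(config: str) -> dict:
    """Signal widths implied by the variables (interpretation assumed)."""
    v = WIDTH_VARS[config]
    return {
        "rx_st_data_o": 128 * v["w"],   # e.g. 512 bits in the 1x16 case
        "rx_st_empty_o": v["p"],        # matches rx_st_empty_o[p-1:0]
        "rx_st_sop_o": v["n"],          # SOP/EOP lanes per interface
        "tx_p_out": v["b"],             # serial lanes, tx_p_out[b-1:0]
    }

assert port_widths("1x16")["rx_st_data_o"] == 512
```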
4.2. Clocks and Resets
4.2.1. Interface Clock Signals
Name | I/O | Description | EP/RP/BP | Clock Frequency |
---|---|---|---|---|
coreclkout_hip | O | This clock drives the Application Layer. The frequency depends on the data rate and the number of lanes being used. | EP/RP/BP | Native Gen3: 250 MHz; Native Gen4: 350 MHz / 400 MHz (Intel® Stratix® 10 DX) or 350 MHz / 400 MHz / 500 MHz (Intel® Agilex™) |
refclk[1:0] | I | These are the input reference clocks for the IP core. These clocks must be free-running. For more details on how to connect these clocks, refer to the section Clock Sharing in Bifurcation Modes. | EP/RP/BP | 100 MHz ± 300 ppm |
p0_hip_reconfig_clk | I | Clock for the hip_reconfig interface. This is an Avalon® -MM interface. It is an optional interface that is enabled when the Enable HIP dynamic reconfiguration of PCIe read-only registers option in the PCIe Configuration, Debug and Extension Options tab is enabled. | EP/RP/BP | 50 MHz - 125 MHz (range); 100 MHz (recommended) |
xcvr_reconfig_clk | I | Clock for the PHY reconfiguration interface. This is an Avalon® -MM interface. This optional interface is enabled when you turn on the Enable PHY reconfiguration option in the Top-Level Settings tab. This interface is shared among all the cores. | EP/RP/BP | 50 MHz - 125 MHz (range); 100 MHz (recommended) |
p0_cpl_timeout_avmm_clk | I | Avalon® -MM clock for the Completion Timeout interface. This interface is optional, and is enabled when the Enable Completion Timeout Interface option in the PCIe Avalon Settings tab is enabled. | EP/RP/BP | 50 MHz - 125 MHz (range); 100 MHz (recommended) |
4.2.2. Resets
4.2.2.1. Interface Reset Signals
Signal Name | Direction | Clock | EP/RP/BP | Description |
---|---|---|---|---|
pin_perst_n | Input | Asynchronous | EP/RP/BP | This is an active-low input to the PCIe* Hard IP, and implements the PERST# function defined by the PCIe* specification. |
p<n>_pin_perst_n where n = 0, 1, 2, 3 | Output | Asynchronous | EP/RP/BP | This is the PERST output signal from the Hard IP. It is derived from the pin_perst_n input signal. |
p<n>_reset_status_n where n = 0, 1, 2, 3 | Output | Synchronous | EP/RP/BP | This active-low signal is held low until pin_perst_n has been deasserted and the PCIe* Hard IP has come out of reset. This signal is synchronous to coreclkout_hip. When port bifurcation is used, there is one such signal for each Avalon® -ST interface. The signals are differentiated by the prefixes pn. This is a per-port signal. |
ninit_done | Input | Asynchronous | EP/RP | A "1" on this signal indicates that the FPGA device is not yet fully configured. A "0" indicates the device has been configured and is in normal operating mode. |
4.2.2.2. Function-Level Reset (FLR) Interface (EP Only)
FLR allows specific physical/virtual functions to be reset without affecting other physical/virtual functions or the link they share. FLR can be enabled by checking the Enable Function Level Reset (FLR) check-box in the Device tab of the PCI Express / PCI Capabilities tab in the GUI.
This interface is only present in EP mode (for x16/x8 configurations).
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
p0_flr_rcvd_pf_o[7:0] | O | Active high signals. Once asserted, the signals remain high until the Application Layer sets the p0_flr_completed_pf_i[7:0] high for the associated function. The Application Layer must perform actions necessary to clear any pending transactions associated with the function being reset. The Application Layer must assert p0_flr_completed_pf_i[7:0] to indicate it has completed the FLR actions and is ready to re-enable the PF. These busses are differentiated by the prefixes pn. | coreclkout_hip | EP |
p0_flr_rcvd_vf_o | O | A one-cycle pulse indicates that an FLR was received from host targeting a VF. When port bifurcation is used, there is one such signal for each Avalon-ST interface. These signals are differentiated by the prefixes pn. | coreclkout_hip | EP |
p0_flr_rcvd_pf_num_o[2:0] | O | Parent PF number of the VF undergoing FLR. When port bifurcation is used, there is one such bus for each Avalon-ST interface. These busses are differentiated by the prefixes pn. | coreclkout_hip | EP |
p0_flr_rcvd_vf_num_o[10:0] | O | VF number offset of the VF undergoing FLR. When port bifurcation is used, there is one such bus for each Avalon-ST interface. These busses are differentiated by the prefixes pn. | coreclkout_hip | EP |
p0_flr_completed_pf_i[7:0] | I | One bit per PF. A one-cycle pulse on any bit indicates that the application has completed the FLR sequence for the corresponding PF and is ready to be enabled. When port bifurcation is used, there is one such bus for each Avalon-ST interface. These busses are differentiated by the prefixes pn. | coreclkout_hip | EP |
p0_flr_completed_vf_i | I | One-cycle pulse from the application re-enables a VF. When port bifurcation is used, there is one such signal for each Avalon-ST interface. These signals are differentiated by the prefixes pn. | coreclkout_hip | EP |
p0_flr_completed_pf_num_i[2:0] | I | Parent PF number of the VF to re-enable. When port bifurcation is used, there is one such bus for each Avalon-ST interface. These busses are differentiated by the prefixes pn. | coreclkout_hip | EP |
p0_flr_completed_vf_num_i[10:0] | I | VF number offset of the VF to re-enable. When port bifurcation is used, there is one such bus for each Avalon-ST interface. These busses are differentiated by the prefixes pn. | coreclkout_hip | EP |
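The per-PF handshake described above (flr_rcvd stays asserted until the application signals completion) can be modeled behaviorally. This is a sketch of the protocol, not RTL; the class and method names are illustrative:

```python
class FlrTracker:
    """Behavioral model of the per-PF FLR handshake: p0_flr_rcvd_pf_o[k]
    remains high until the application pulses p0_flr_completed_pf_i[k]
    after quiescing that function's traffic."""

    def __init__(self, num_pfs: int = 8):
        self.flr_rcvd = [False] * num_pfs  # models p0_flr_rcvd_pf_o[7:0]

    def host_flr(self, pf: int) -> None:
        """Hard IP asserts flr_rcvd when the host issues an FLR to this PF."""
        self.flr_rcvd[pf] = True

    def app_completed(self, pf: int) -> None:
        """Application pulses flr_completed once its cleanup is done,
        which re-enables the PF and clears flr_rcvd."""
        self.flr_rcvd[pf] = False

t = FlrTracker()
t.host_flr(3)
assert t.flr_rcvd[3]       # remains high until the application responds
t.app_completed(3)
assert not t.flr_rcvd[3]   # PF 3 is ready to be re-enabled
```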
4.3. Serial Data Interface
P-Tile natively supports 4, 8, or 16 PCIe lanes. Each lane includes a TX differential pair and an RX differential pair. Data is striped across all available lanes.
Signal Name | Direction | Description |
---|---|---|
tx_p_out[<b>-1:0], tx_n_out[<b>-1:0] | O | Transmit serial data outputs using the High Speed Differential I/O standard. |
rx_p_in[<b>-1:0], rx_n_in[<b>-1:0] | I | Receive serial data inputs using the High Speed Differential I/O standard. |
4.4. Avalon-ST Interface
The P-Tile PCIe Hard IP provides an Avalon® -ST-like interface with separate header and data buses to improve bandwidth utilization.
The Avalon® -ST interface has different data bus widths depending on the link width configuration of the PCIe IP.
PCIe Link Width | Data Width (bits) | Header Width (bits) | TLP Prefix Width (bits) |
---|---|---|---|
x16 | 512 (2 x 256) | 256 (2 x 128) | 64 (2 x 32) |
x8 | 256 | 128 | 32 |
x4 | 128 | 128 | 32 |
- For the x16 configuration, two segments of 256-bit data and two segments of 128-bit header are available.
- x4 configuration is only present in Root Port mode.
4.4.1. TLP Header and Data Alignment for the Avalon-ST RX and TX Interfaces
The TLP prefix, header and data are sent and received on the TX and RX interfaces.
The ordering of bytes in the header and data portions of packets is different. The first byte of the header dword is located in the most significant byte of the dword. The first byte of the data dword is located in the least significant byte of the dword on the data bus.


4.4.2. Avalon® -ST RX Interface
The Application Layer receives data from the Transaction Layer of the PCI Express* IP core over the Avalon® -ST RX interface. The application must assert rx_st_ready_i before transfers can begin.
This interface supports two rx_st_sop_o signals and two rx_st_eop_o signals per cycle when the P-Tile IP is operating in a x16 configuration. It also does not follow a fixed latency between rx_st_ready_i and rx_st_valid_o as specified by the Avalon Interface Specifications.
The x16 core provides two segments with each one having 256 bits of data (rx_st_data_o[511:256] and rx_st_data_o[255:0]), 128 bits of header (rx_st_hdr_o[255:128] and rx_st_hdr_o[127:0]), and 32 bits of TLP prefix (rx_st_tlp_prfx_o[63:32] and rx_st_tlp_prfx_o[31:0]). If this core is configured in the 1x16 mode, both segments are used, so the data bus becomes a 512-bit bus rx_st_data_o[511:0]. The start of packet can appear in the upper segment or lower segment, as indicated by the rx_st_sop_o[1:0] signals.
If this core is configured in the 2x8 mode, only the lower segment is used. In this case, the data bus is a 256-bit bus rx_st_data_o[255:0].
Finally, if this core is configured in the 4x4 mode, only the lower segment is used and only the MSB 128 bits of data are valid. In this case, the data bus is a 128-bit bus rx_st_data_o[127:0].
The x8 core provides one segment with 256 bits of data, 128 bits of header and 32 bits of TLP prefix. If this core is configured in 4x4 mode, only the LSB 128 bits of data are used.
The x4 core provides one segment with 128 bits of data, 128 bits of header and 32 bits of TLP prefix.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
x16 PCIe configuration: rx_st_data_o[511:0] x8 PCIe configuration: rx_st_data_o[255:0] x4 configuration: rx_st_data_o[127:0] |
O |
This is the Receive data bus. The Application Layer receives data from the Transaction Layer on this bus. For TLPs with an end-of-packet cycle in the lower 256 bits, the 512-bit interface supports a start-of-packet cycle in the upper 256 bits. |
coreclkout_hip | EP/RP/BP |
x16: rx_st_empty_o[5:0] x8: rx_st_empty_o[2:0] x4: rx_st_empty_o[1:0] |
O |
Specify the number of dwords that are empty during cycles when the rx_st_eop_o signals are asserted. These signals are not valid when the rx_st_eop_o signals are not asserted. |
coreclkout_hip | EP/RP/BP |
rx_st_ready_i | I |
Indicates the Application Layer is ready to accept data. The readyLatency is 27 cycles. If rx_st_ready_i is deasserted by the Application Layer on cycle <n>, the Transaction Layer in the PCIe Hard IP continues to send traffic for up to <n> + readyLatency cycles after the deassertion of rx_st_ready_i. Once rx_st_ready_i reasserts, rx_st_valid_o resumes data transfer within readyLatency cycles. To achieve the best performance, the Application Layer must include a receive buffer large enough to avoid the deassertion of rx_st_ready_i. |
coreclkout_hip | EP/RP/BP |
x16: rx_st_sop_o[1:0] x8/x4: rx_st_sop_o |
O |
Signals the first cycle of the TLP when asserted in conjunction with the corresponding bit of rx_st_valid_o[1:0]. rx_st_sop_o[1]: When asserted, signals the start of a TLP on rx_st_data_o[511:256]. rx_st_sop_o[0]: When asserted, signals the start of a TLP on rx_st_data_o[255:0]. |
coreclkout_hip | EP/RP/BP |
x16: rx_st_eop_o[1:0] x8/x4: rx_st_eop_o |
O |
Signals the last cycle of the TLP when asserted in conjunction with the corresponding bit of rx_st_valid_o[1:0]. rx_st_eop_o[1]: When asserted, signals the end of a TLP on rx_st_data_o[511:256]. rx_st_eop_o[0]: When asserted, signals the end of a TLP on rx_st_data_o[255:0]. |
coreclkout_hip | EP/RP/BP |
x16: rx_st_valid_o[1:0] x8/x4: rx_st_valid_o |
O |
These signals qualify the rx_st_data_o signals going into the Application Layer. |
coreclkout_hip | EP/RP/BP |
x16: rx_st_hdr_o[255:0] x8/x4: rx_st_hdr_o[127:0] |
O |
This is the received header, which follows the TLP header format of the PCIe specifications. |
coreclkout_hip | EP/RP/BP |
x16: rx_st_tlp_prfx_o[63:0] x8/x4: rx_st_tlp_prfx_o[31:0] |
O |
This is the first TLP prefix received, which follows the TLP prefix format of the PCIe specifications. PASID is included. These signals are valid when the corresponding rx_st_sop_o is asserted. The TLP prefix uses a Big Endian implementation (i.e., the Fmt field is in bits [31:29] and the Type field is in bits [28:24]). If no prefix is present for a given TLP, that dword (including the Fmt field) is all zeros. |
coreclkout_hip | EP/RP/BP |
x16: rx_st_vf_active_o[1:0] x8: rx_st_vf_active_o x4: NA |
O |
When asserted, these signals indicate that the received TLP is targeting a virtual function. When these signals are deasserted, the received TLP is targeting a physical function and the rx_st_func_num signals indicate the function number. These signals are valid when the corresponding rx_st_sop_o is asserted. These signals are multiplexed with the rx_st_hdr_o signals in the x4 configuration. These signals are valid in Endpoint mode only. |
coreclkout_hip | EP |
x16: rx_st_func_num_o[5:0] x8: rx_st_func_num_o[2:0] x4: NA |
O |
Specify the target physical function number for the received TLP. These signals are valid when the corresponding rx_st_sop_o is asserted. These signals are multiplexed with the rx_st_hdr_o signals in the x4 configuration. These signals are valid in Endpoint mode only. |
coreclkout_hip | EP |
x16: rx_st_vf_num_o[19:0] x8: rx_st_vf_num_o[10:0] x4: NA |
O |
Specify the target VF number for the received TLP. The application uses this information for both request and completion TLPs. For a completion TLP, these bits specify the VF number of the requester for this completion TLP. These signals are valid when rx_st_vf_active_o and the corresponding rx_st_sop_o are asserted. These signals are multiplexed with the rx_st_hdr_o signals in the x4 configuration. These signals are valid in Endpoint mode only. |
coreclkout_hip | EP |
x16: rx_st_bar_range_o[5:0] x8/x4: rx_st_bar_range_o[2:0] |
O |
Specify the BAR for the TLP being output. For each BAR range, the following encodings are defined:
These outputs are valid when both rx_st_sop_o and rx_st_valid_o are asserted. |
coreclkout_hip | EP/RP |
x16: rx_st_tlp_abort_o[1:0] x8/x4: rx_st_tlp_abort_o |
O |
By default, the PCIe Hard IP drops an errored TLP (a malformed TLP, or a TLP with an ECRC error or tag/requester ID (RID) mismatches). The PCIe Hard IP asserts rx_st_tlp_abort_o to notify the application an errored TLP has been dropped. |
coreclkout_hip | EP/RP |
x16: rx_st_data_par_o[63:0] x8: rx_st_data_par_o[31:0] x4: rx_st_data_par_o[15:0] |
O | Byte parity signals for rx_st_data_o. These parity signals are not available when ECC is enabled. | coreclkout_hip | EP/RP/BP |
x16: rx_st_hdr_par_o[31:0] x8/x4: rx_st_hdr_par_o[15:0] |
O | Byte parity signals for rx_st_hdr_o. These parity signals are not available when ECC is enabled. | coreclkout_hip | EP/RP/BP |
x16: rx_st_tlp_prfx_par_o[7:0] x8/x4: rx_st_tlp_prfx_par_o[3:0] |
O | Byte parity signals for rx_st_tlp_prfx_o. These parity signals are not available when ECC is enabled. | coreclkout_hip | EP/RP/BP |
rx_par_err_o | O | Asserted for a single cycle to indicate that a parity error was detected in a TLP at the input of the RX buffer. This error is logged as an uncorrectable internal error in the VSEC registers. If this error occurs, you must reset the Hard IP because parity errors can leave the Hard IP in an unknown state. | coreclkout_hip | EP/RP/BP |
4.4.3. Avalon® -ST RX Interface rx_st_ready Behavior
The following timing diagram illustrates the timing of the RX interface when the application throttles the P-Tile IP for PCIe by deasserting rx_st_ready_i. The Transaction Layer in the P-Tile IP deasserts rx_st_valid_o within 27 cycles of the rx_st_ready_i deassertion. It also reasserts rx_st_valid_o within 27 cycles after rx_st_ready_i reasserts if there is more data to send. This behavior means that the readyLatency of this interface is 27. Refer to the Avalon® Interface Specifications for a detailed definition of readyLatency. rx_st_data_o is held until the application is able to accept it.
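The buffering requirement implied by this readyLatency can be sketched with a small model. This is a hedged illustration, not production logic; the 27-cycle figure comes from the interface description above.

```python
READY_LATENCY = 27  # readyLatency of the Avalon-ST RX interface, per this guide

def valid_may_be_high(cycle: int, ready_deassert_cycle: int) -> bool:
    """After rx_st_ready_i deasserts on cycle n, the Hard IP may keep
    rx_st_valid_o asserted through cycle n + READY_LATENCY."""
    return cycle <= ready_deassert_cycle + READY_LATENCY

def min_skid_buffer_beats() -> int:
    """Worst-case data beats the application must absorb after
    deasserting rx_st_ready_i (one beat per residual valid cycle)."""
    return READY_LATENCY

# ready deasserts on cycle 10: data may still arrive through cycle 37.
assert valid_may_be_high(37, 10)
assert not valid_may_be_high(38, 10)
```

This is why the table above recommends a receive buffer large enough to avoid deasserting rx_st_ready_i in the first place.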
4.4.4. RX Flow Control Interface
The RX flow control interface provides information on the application's available RX buffer space to the PCIe Hard IP in a time-division multiplexing (TDM) manner. It reports the space available in number of TLPs.
The RX flow control interface is optional and disabled by default in the IP GUI. When it is disabled, the Hard IP assumes there is no limit on the application RX buffer space.
- Posted (P) transactions: TLPs that do not require a response.
- Non-posted (NP) transactions: TLPs that require a completion.
- Completions (CPL): TLPs that respond to non-posted transactions.
TLP Type | Category |
---|---|
Memory Write | Posted |
Memory Read | Non-posted |
Memory Read Lock | Non-posted |
I/O Read | Non-posted |
I/O Write | Non-posted |
Configuration Read | Non-posted |
Configuration Write | Non-posted |
Message | Posted |
Completion | Completion |
Completion with Data | Completion |
Completion Lock | Completion |
Completion Lock with Data | Completion |
Fetch and Add AtomicOp | Non-posted |
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
rx_buffer_limit_i[11:0] | I |
When the RX Flow Control Interface is enabled, the application can use these signals for TLP flow control. These signals indicate the application RX buffer space made available since reset/initialization. Initially, the signals are set according to the buffer size (in terms of the number of TLPs the RX buffer can take). The value of these signals always increments and rolls over. For example, if the initial value is 0xfff, the rx_buffer_limit_i[11:0] value increments by 1 and rolls over to 0x000 when one received TLP exits the application RX buffer. If a TLP type is blocked due to a lack of the corresponding RX buffer space in the application layer, other TLP types may bypass it per the PCIe transaction ordering rules. Note that the initial value of rx_buffer_limit_i[11:0] cannot be larger than 2048 TLPs. |
coreclkout_hip | EP/RP/BP |
rx_buffer_limit_tdm_idx_i[1:0] | I | These signals indicate the type of buffer for the corresponding rx_buffer_limit_i[11:0] signals. The Application Layer should provide the buffer limit information for all the enabled ports in a TDM manner. The following encodings are defined: |
coreclkout_hip | EP/RP/BP |
For more details on the usage of the scale factors, refer to Section 3.4.2 of the PCI Express Base Specification, Rev. 4.0 Version 1.0.
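The rolling, modulo-4096 behavior of rx_buffer_limit_i described above can be modeled as follows. This is an illustrative sketch; the 12-bit width and 2048-TLP cap come from the signal table.

```python
LIMIT_WIDTH = 12          # rx_buffer_limit_i[11:0]
MAX_INITIAL_LIMIT = 2048  # initial value must not exceed 2048 TLPs

def advance_rx_buffer_limit(limit: int, tlps_freed: int = 1) -> int:
    """Advance the rolling buffer-limit value as TLPs exit the
    application RX buffer; the counter wraps modulo 2**12."""
    return (limit + tlps_freed) % (1 << LIMIT_WIDTH)

# Example from the description: 0xFFF rolls over to 0x000 when one
# received TLP leaves the application RX buffer.
assert advance_rx_buffer_limit(0xFFF) == 0x000
```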
4.4.5. Avalon® -ST TX Interface
The Application Layer transfers data to the Transaction Layer of the PCI Express* IP core over the Avalon® -ST TX interface. The Transaction Layer must assert tx_st_ready_o before transmission begins. Transmission of a packet must be uninterrupted when tx_st_ready_o is asserted.
This 512-bit interface supports two locations for the beginning of a TLP, bit[0] and bit[256]. The interface supports multiple TLPs per cycle only when an end-of-packet cycle occurs in the lower 256 bits.
The x16 core provides two segments with each one having 256 bits of data (tx_st_data_i[511:256] and tx_st_data_i[255:0]), 128 bits of header (tx_st_hdr_i[255:128] and tx_st_hdr_i[127:0]), and 32 bits of TLP prefix (tx_st_tlp_prfx_i[63:32] and tx_st_tlp_prfx_i[31:0]). If this core is configured in the 1x16 mode, both segments are used, so the data bus becomes a 512-bit bus tx_st_data_i[511:0]. The start of packet can appear in the upper segment or lower segment, as indicated by the tx_st_sop_i[1:0] signals.
If this core is configured in the 2x8 mode, only the lower segment is used. In this case, the data bus is a 256-bit bus tx_st_data_i[255:0].
Finally, if this core is configured in the 4x4 mode, only the lower segment is used and only the LSB 128 bits of data are valid. In this case, the data bus is a 128-bit bus tx_st_data_i[127:0].
The x8 core provides one segment with 256 bits of data, 128 bits of header and 32 bits of TLP prefix. If this core is configured in 4x4 mode, only the LSB 128 bits of data are used.
The x4 core provides one segment with 128 bits of data, 128 bits of header and 32 bits of TLP prefix.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
x16: tx_st_data_i[511:0] x8: tx_st_data_i[255:0] x4: tx_st_data_i[127:0] |
I |
Application Layer data for transmission. The Application Layer must provide a properly formatted TLP on the TX interface. Valid when the corresponding tx_st_valid_i signal is asserted. The mapping of message TLPs is the same as the mapping of Transaction Layer TLPs with 4-dword headers. The number of data cycles must be correct for the length and address fields in the header. Issuing a packet with an incorrect number of data cycles results in the TX interface hanging and becoming unable to accept further requests. Note: There must be no Idle cycle between the tx_st_sop_i and tx_st_eop_i cycles unless there is backpressure with the deassertion of tx_st_ready_o. |
coreclkout_hip | EP/RP/BP |
x16: tx_st_sop_i[1:0] x8/x4: tx_st_sop_i |
I |
Indicate the first cycle of a TLP when asserted in conjunction with the corresponding bit of tx_st_valid_i. For the x16 configuration: tx_st_sop_i[1], when asserted, signals the start of a TLP on tx_st_data_i[511:256]; tx_st_sop_i[0], when asserted, signals the start of a TLP on tx_st_data_i[255:0]. These signals are asserted for one clock cycle per TLP. They also qualify the corresponding tx_st_hdr_i and tx_st_tlp_prfx_i signals. |
coreclkout_hip | EP/RP/BP |
x16: tx_st_eop_i[1:0] x8/x4: tx_st_eop_i |
I |
Indicate the last cycle of a TLP when asserted in conjunction with the corresponding bit of tx_st_valid_i. For the x16 configuration: tx_st_eop_i[1], when asserted, signals the end of a TLP on tx_st_data_i[511:256]; tx_st_eop_i[0], when asserted, signals the end of a TLP on tx_st_data_i[255:0]. These signals are asserted for one clock cycle per TLP. |
coreclkout_hip | EP/RP/BP |
x16: tx_st_valid_i[1:0] x8/x4: tx_st_valid_i |
I |
Qualify the corresponding data segment of tx_st_data_i into the IP core on ready cycles. To facilitate timing closure, Intel recommends that you register both the tx_st_ready_o and tx_st_valid_i signals. Note: There must be no Idle cycle between the tx_st_sop_i and tx_st_eop_i cycles unless there is backpressure with the deassertion of tx_st_ready_o. |
coreclkout_hip | EP/RP/BP |
tx_st_ready_o | O |
Indicates that the PCIe Hard IP is ready to accept data for transmission. The readyLatency is three cycles. If tx_st_ready_o is asserted by the Transaction Layer in the PCIe Hard IP on cycle <n>, then <n> + readyLatency is a ready cycle, during which the Application may assert tx_st_valid_i and transfer data. If tx_st_ready_o is deasserted by the Transaction Layer on cycle <n>, then the Application must deassert tx_st_valid_i within the readyLatency number of cycles after cycle <n>. tx_st_ready_o can be deasserted in the following conditions: |
coreclkout_hip | EP/RP/BP |
x16: tx_st_err_i[1:0] x8/x4: tx_st_err_i |
I |
When asserted, indicate an error in the transmitted TLP. These signals are asserted with tx_st_eop_i and nullify a packet. |
coreclkout_hip | EP/RP/BP |
x16: tx_st_hdr_i[255:0] x8/x4: tx_st_hdr_i[127:0] |
I |
This is the header to be transmitted, which follows the TLP header format of the PCIe specifications except for the requester ID/completer ID fields (tx_st_hdr_i[95:80]): These signals are valid when the corresponding tx_st_sop_i signal is asserted. The header uses a Big Endian implementation. |
coreclkout_hip | EP/RP/BP |
x16: tx_st_tlp_prfx_i[63:0] x8/x4: tx_st_tlp_prfx_i[31:0] |
I |
This is the TLP prefix to be transmitted, which follows the TLP prefix format of the PCIe specifications. PASID is included. These signals are valid when the corresponding tx_st_sop_i signal is asserted. The TLP prefix uses a Big Endian implementation (i.e. the Fmt field is in bits [31:29] and the Type field is in bits [28:24]). If no prefix is present for a given TLP, that dword, including the Fmt field, is all zeros. |
coreclkout_hip | EP/RP/BP |
x16: tx_st_data_par_i[63:0] x8: tx_st_data_par_i[31:0] x4: tx_st_data_par_i[15:0] |
I |
Byte parity for tx_st_data_i. Bit [0] corresponds to tx_st_data_i[7:0], bit [1] corresponds to tx_st_data_i[15:8], and so on. By default, the PCIe Hard IP generates the parity for the TX data. However, when ECC is off, the parity can be passed in from the FPGA core by setting the k_pcie_parity_bypass register. |
coreclkout_hip | EP/RP/BP |
x16: tx_st_hdr_par_i[31:0] x8/x4: tx_st_hdr_par_i[15:0] |
I |
Byte parity for tx_st_hdr_i. By default, the PCIe Hard IP generates the parity for the TX header. However, when ECC is off, the parity can be passed in from the FPGA core by setting the k_pcie_parity_bypass register. |
coreclkout_hip | EP/RP/BP |
x16: tx_st_tlp_prfx_par_i[7:0] x8/x4: tx_st_tlp_prfx_par_i[3:0] |
I |
Byte parity for tx_st_tlp_prfx_i. By default, the PCIe Hard IP generates the parity for the TX TLP prefix. However, when ECC is off, the parity can be passed in from the FPGA core by setting the k_pcie_parity_bypass register. |
coreclkout_hip | EP/RP/BP |
tx_par_err_o | O | Asserted for a single cycle to indicate a parity error during TX TLP transmission. The IP core transmits TX TLP packets even when a parity error is detected. | coreclkout_hip | EP/RP/BP |
4.4.6. Avalon® -ST TX Interface tx_st_ready Behavior
The following timing diagram illustrates the behavior of tx_st_ready_o, which is deasserted to pause the data transmission to the Transaction Layer of the P-Tile IP for PCIe, and then reasserted. The timing diagram shows a readyLatency of three cycles. Refer to the Avalon® Interface Specifications for a detailed definition of readyLatency. The application deasserts tx_st_valid_i three cycles after tx_st_ready_o is deasserted.
The application must not deassert tx_st_valid_i between tx_st_sop_i and tx_st_eop_i on a ready cycle. For the definition of a ready cycle, refer to the Avalon® Interface Specifications.
4.4.7. TX Flow Control Interface
Before a TLP can be transmitted, flow control logic verifies that the link partner's RX port has sufficient buffer space to accept it. The TX Flow Control interface reports the link partner's available RX buffer space to the Application. It reports the space available in units called Flow Control credits for posted, non-posted and completion TLPs (as defined in the RX Flow Control Interface section).
TX credit limit signals are provided in a TDM manner similar to how the RX credit limit signals are provided.
This example shows how this interface is updated when multiple MWr requests are sent. The tx_cdts_limit_o[15:0] bus value is incremented when a TLP is acknowledged by the receiver and will roll over when reaching 0xFFFF.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
tx_cdts_limit_o[15:0] | O |
Indicate the Flow Control (FC) credit units advertised by the remote Receiver. These signals represent the total number of FC credits made available by the Receiver since Flow Control initialization. Initially, these signals indicate the number of FC credits available in the remote Receiver. The value of these signals always increments and rolls over. For example, if the remote Receiver advertises an initial Non-Posted Header (NPH) FC credit of 0xFFFF, after it receives an MRd request, the NPH FC credit value increments by 1 and rolls over to 0x0000. The tx_cdts_limit_tdm_idx_o[2:0] signals determine the traffic type. When the traffic type is header credit, only the LSB 12 bits are valid. Note that, in addition to the TLPs transmitted by the user application, internally generated TLPs also consume FC credits. |
coreclkout_hip | EP/RP/BP |
tx_cdts_limit_tdm_idx_o[2:0] | O |
Indicate the traffic type for the tx_cdts_limit_o[15:0] signals. This interface provides credit limit information for all enabled ports in a TDM manner. The following encodings are defined: |
coreclkout_hip | EP/RP/BP |
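One way to use these rolling credit-limit values: the application keeps its own rolling count of credits it has consumed, and the modulo-65536 difference is the headroom still available. The helper below is a hedged sketch of that arithmetic; the function name is illustrative.

```python
CDTS_WIDTH = 16  # tx_cdts_limit_o[15:0]; header credit types use only 12 bits

def credits_available(cdts_limit: int, cdts_consumed: int,
                      width: int = CDTS_WIDTH) -> int:
    """Headroom between the advertised rolling credit limit and the
    transmitter's rolling count of credits consumed. Modular
    subtraction handles the wrap (e.g. a limit of 0x0000 that has
    just rolled over from 0xFFFF)."""
    return (cdts_limit - cdts_consumed) % (1 << width)

assert credits_available(0x0005, 0x0002) == 3
assert credits_available(0x0000, 0xFFFF) == 1  # across the rollover
```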
4.4.8. Tag Allocation
The P-Tile PCIe Hard IP supports the 10-bit tag Requester capability in the x16 Controller (Port 0) only. It supports up to 512 outstanding Non-Posted Requests (NPRs) with valid tag values ranging from 256 to 767.
The x8 (Port 1) and x4 (Port 2/3) Controllers do not support the 10-bit tag Requester capability, although they support the 10-bit tag Completer capability.
Both x8 and x4 Controllers can allow up to 256 outstanding NPRs with valid tag values ranging from 0 to 255.
- 8-bit tags : 0 - 63
- 10-bit tags : 320 - 511, 576 - 767
Note that all PFs and their associated VFs share the same tag space. This means that different PFs and VFs cannot have outstanding requests with the same tag value.
In the TLP bypass mode, there is no restriction on the tag allocation since the P-Tile PCIe Hard IP does not do any tag management. Hence, 10-bit tags can be used without any restriction across all the cores.
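The per-port tag rules above can be summarized in a small validity check. This is a sketch based only on the limits stated in this section (Port 0 x16: tags 256-767; Ports 1-3: tags 0-255; TLP Bypass: unrestricted); the function name is illustrative.

```python
def requester_tag_is_valid(port: int, tag: int,
                           tlp_bypass: bool = False) -> bool:
    """Check a requester tag against the per-port limits:
    Port 0 (x16): 10-bit tags, valid range 256-767 (512 outstanding NPRs).
    Ports 1-3 (x8/x4): 8-bit tags, valid range 0-255.
    TLP Bypass mode: no Hard IP tag management, so any 10-bit tag."""
    if tlp_bypass:
        return 0 <= tag <= 1023
    if port == 0:
        return 256 <= tag <= 767
    return 0 <= tag <= 255

assert requester_tag_is_valid(0, 256) and not requester_tag_is_valid(0, 255)
assert requester_tag_is_valid(2, 255) and not requester_tag_is_valid(2, 256)
```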
4.4.8.1. Completion Buffer Size
P-tile implements Completion (Cpl) buffers for header and data for each PCIe core. In Endpoint mode, where Completion credits are advertised as infinite, the user application needs to manage the number of outstanding requests to prevent buffer overflow and lost Completions.
Completion Buffer | Depth | Width |
---|---|---|
Port 0 Cpl header | 1144 | NA |
Port 0 Cpl data | 1444 | 256 |
Port 1 Cpl header | 572 | NA |
Port 1 Cpl data | 1444 | 128 |
Port 2 Cpl header | 286 | NA |
Port 2 Cpl data | 1444 | 64 |
Port 3 Cpl header | 286 | NA |
Port 3 Cpl data | 1444 | 64 |
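As a rough sizing aid, the table translates into a byte capacity per completion data buffer, which bounds how much read data may safely be outstanding. This is a hedged sketch: it ignores the header-buffer limits and per-TLP packing overhead, so treat the result as an upper bound only.

```python
def cpl_data_capacity_bytes(depth: int, width_bits: int) -> int:
    """Total completion data a buffer can hold: depth entries, each
    width_bits wide."""
    return depth * width_bits // 8

def max_outstanding_reads(depth: int, width_bits: int,
                          read_bytes: int) -> int:
    """Upper bound on same-size outstanding MRd requests whose
    completions the data buffer can absorb without overflow."""
    return cpl_data_capacity_bytes(depth, width_bits) // read_bytes

# Port 0 (x16): 1444 entries x 256 bits = 46,208 bytes of Cpl data.
assert cpl_data_capacity_bytes(1444, 256) == 46208
assert max_outstanding_reads(1444, 256, 512) == 90
```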
4.5. Hard IP Status Interface
This interface includes the signals that are useful for debugging, such as the link status signal, LTSSM state outputs, etc. These signals are available when the optional Power Management interface is enabled.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
link_up_o | O | When asserted, this signal indicates the link is up. | coreclkout_hip | EP/RP/BP |
dl_up_o | O | When asserted, this signal indicates the Data Link (DL) Layer is active. | coreclkout_hip | EP/RP/BP |
ltssm_state_o[5:0] | O | Indicates the LTSSM state: |
coreclkout_hip | EP/RP/BP |
4.6. Interrupt Interface
The P-Tile Avalon® -ST IP for PCI Express* supports Message Signaled Interrupts (MSI), MSI-X interrupts, and legacy interrupts. MSI and legacy interrupts are mutually exclusive.
The user application generates MSIs, which are single-Dword memory write TLPs, to implement interrupts. This interrupt mechanism conserves pins because it does not use separate wires for interrupts. In addition, the single Dword provides flexibility for the data presented in the interrupt message. The MSI Capability structure is stored in the Configuration Space and is programmed using Configuration Space accesses.
The user application generates MSI-X messages which are single-Dword memory writes. The MSI-X Capability structure points to an MSI-X table structure and an MSI-X Pending Bit Array (PBA) structure which are stored in memory. This scheme is different than the MSI Capability structure, which contains all the control and status information for the interrupts.
Enable legacy interrupts by programming the Interrupt Disable bit (bit[10]) of the Command register in the Configuration Space to 1'b0. When legacy interrupts are enabled, the IP core emulates INTx interrupts using virtual wires. The app_int_i ports control legacy interrupt generation.
4.6.1. Legacy Interrupts
Legacy interrupts mimic the original PCI level-sensitive interrupts using virtual wire messages. The P-tile IP for PCIe signals legacy interrupts on the PCIe link using Message TLPs. The term INTx refers collectively to the four legacy interrupts, INTA#, INTB#, INTC# and INTD#. The P-tile IP for PCIe asserts app_int_i to cause an Assert_INTx Message TLP to be generated and sent upstream. A deassertion of app_int_i (i.e., a transition of this signal from high to low) causes a Deassert_INTx Message TLP to be generated and sent upstream. To use legacy interrupts, you must clear the Interrupt Disable bit, which is bit 10 of the Command register in the configuration header, and turn off the MSI Enable bit.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
x16/x8: app_int_i[7:0] x4: NA |
I | When asserted, these signals indicate an assertion of an INTx message is requested. A transition from high to low indicates a deassertion of the INTx message is requested. This bus is for EP only. Each bit is associated with a corresponding physical function. These signals must be asserted for at least 8 cycles. | coreclkout_hip | EP |
int_status_o[7:0] | O | These signals drive legacy interrupts to the Application Layer in Root Port mode. The source of the interrupt will be logged in the Root Port Interrupt Status registers in the Port Configuration and Status registers. | coreclkout_hip | RP |
app_int_i[0] is asserted for at least eight clock cycles to cause an Assert_INTx Message TLP to be generated and sent upstream for physical function 0. For a multi-function implementation, app_int_i[0] is for physical function 0, app_int_i[1] is for physical function 1, and so on. Deasserting an app_int_i signal by driving it from high to low causes a Deassert_INTx Message TLP to be generated and sent upstream.
4.6.2. MSI
MSI interrupts are signaled on the PCI Express link using a single dword Memory Write TLP. The user application issues an MSI request (MWr) through the Avalon® -ST interface and updates the configuration space register using the MSI interface.
For more details on the MSI Capability Structure, refer to Figure 84.
The Mask Bits register and Pending Bits register are 32 bits in length each, with each potential interrupt message having its own mask bit and pending bit. If bit[0] of the Mask Bits register is set, interrupt message 0 is masked. When an interrupt message is masked, the MSI for that vector cannot be sent. If software clears the mask bit and the corresponding pending bit is set, the function must send the MSI request at that time.
You should obtain the necessary MSI information (such as the message address and data) from the configuration output interface (tl_cfg_*) to create the MWr TLP in the format shown below to be sent via the Avalon® -ST interface.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
msi_pnd_func_i[2:0] | I | Function number select for the Pending Bits register in the MSI capability structure. | coreclkout_hip | EP |
msi_pnd_addr_i[1:0] | I | Byte select for Pending Bits Register in the MSI Capability Structure. For example if msi_pnd_addr_i[1:0] = 00, bits [7:0] of the Pending Bits register will be updated with msi_pnd_byte_i[7:0]. If msi_pnd_addr_i[1:0] = 01, bits [15:8] of the Pending Bits register will be updated with msi_pnd_byte_i[7:0]. | coreclkout_hip | EP |
msi_pnd_byte_i[7:0] | I | Byte of the Pending Bits register for the function selected by msi_pnd_func_i, written to the byte lane selected by msi_pnd_addr_i. Each bit indicates that the corresponding message of the selected function is pending. | coreclkout_hip | EP |
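The byte-lane update that msi_pnd_addr_i and msi_pnd_byte_i perform on the 32-bit Pending Bits register can be sketched as follows (an illustrative software model of the behavior described in the table, not the hardware itself):

```python
def update_pending_bits(pending_reg: int, msi_pnd_addr: int,
                        msi_pnd_byte: int) -> int:
    """Write msi_pnd_byte into the byte lane of the 32-bit Pending
    Bits register selected by msi_pnd_addr (00 -> bits [7:0],
    01 -> bits [15:8], and so on)."""
    shift = 8 * msi_pnd_addr
    return (pending_reg & ~(0xFF << shift)) | ((msi_pnd_byte & 0xFF) << shift)

# msi_pnd_addr_i = 01 updates bits [15:8], as the table describes.
assert update_pending_bits(0x00000000, 0b01, 0xAB) == 0x0000AB00
```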
The following figure shows the timings of msi_pnd_* signals in three scenarios. The first scenario shows the case when the MSI pending bits register is not used. The second scenario shows the case when only physical function 0 is enabled and the MSI pending bits register is used. The last scenario shows the case when four physical functions are enabled and the MSI pending bits register is used.
There are 32 possible MSI messages. The number of messages requested by a particular component does not necessarily correspond to the number of messages allocated. For example, in the following figure, the Endpoint requests eight MSIs but is only allocated two. In this case, you must design the Application Layer to use only two allocated messages.
The following table describes three example implementations. The first example allocates all 32 MSI messages. The second and third examples only allocate 4 interrupts.
MSI Usage | Example 1: 32 MSIs Allocated | Example 2: 4 MSIs Allocated | Example 3: 4 MSIs Allocated |
---|---|---|---|
System Error | 31 | 3 | 3 |
Hot Plug and Power Management Event | 30 | 2 | 3 |
Application Layer | 29:0 | 1:0 | 2:0 |
MSI interrupts generated for Hot Plug, Power Management Events, and System Errors always use Traffic Class 0. MSI interrupts generated by the Application Layer can use any Traffic Class. For example, a DMA that generates an MSI at the end of a transmission can use the same Traffic Class as was used to transfer data.
The following figure illustrates a possible implementation of the Interrupt Handler Module with a per-vector enable bit in the Application Layer. Alternatively, the Application Layer could implement a global interrupt enable instead of these per-vector MSI enables.
4.6.3. MSI-X
The P-Tile IP for PCIe provides a Configuration Intercept Interface. User soft logic can monitor this interface to get MSI-X Enable and MSI-X function mask related information. User application logic needs to implement the MSI-X tables for all PFs and VFs in the memory space pointed to by the BARs, as part of the Application Layer.
For more details on the MSI-X related information that you can obtain from the Configuration Intercept Interface, refer to the MSI-X Registers section in the Registers chapter.
MSI-X is an optional feature that allows the user application to support a large number of vectors, with an independent message address and message data for each vector.
When MSI-X is supported, you need to specify the size and the location (BARs and offsets) of the MSI-X table and PBA. MSI-X can support up to 2048 vectors per function versus 32 vectors per function for MSI.
A function is allowed to send MSI-X messages when MSI-X is enabled and the function is not masked. The application uses the Configuration Output Interface (address 0x0C bit[5:4]) or Configuration Intercept Interface to access this information.
When the application needs to generate an MSI-X, it will use the contents of the MSI-X Table (Address and Data) and generate a Memory Write through the Avalon® -ST interface.
You can enable MSI-X interrupts by turning on the Enable MSI-X option under the PCI Express/PCI Capabilities tab in the parameter editor. If you turn on the Enable MSI-X option, you should implement the MSI-X table structures at the memory space pointed to by the BARs as a part of your Application Layer.
The MSI-X Capability Structure contains information about the MSI-X Table and PBA Structure. For example, it contains pointers to the bases of the MSI-X Table and PBA Structure, expressed as offsets from the addresses in the function's BARs. The Message Control register within the MSI-X Capability Structure also contains the MSI-X Enable bit, the Function Mask bit, and the size of the MSI-X Table. For a picture of the MSI-X Capability Structure, refer to Figure 86.
MSI-X interrupts are standard Memory Writes, therefore Memory Write ordering rules apply.
Example:
MSI-X Vector | MSI-X Upper Address | MSI-X Lower Address | MSI-X Data |
---|---|---|---|
0 | 0x00000001 | 0xAAAA0000 | 0x00000001 |
1 | 0x00000001 | 0xBBBB0000 | 0x00000002 |
2 | 0x00000001 | 0xCCCC0000 | 0x00000003 |
PBA Table | PBA Entries |
---|---|
Offset 0 | 0x0 |
If the application needs to generate an MSI-X interrupt (vector 1), it reads the MSI-X Table information, generates an MWr TLP through the Avalon® -ST interface, and sets the corresponding PBA bit (bit[1]) in a similar fashion as for MSI generation.
The generated TLP is sent to address 0x00000001_BBBB0000 with data 0x00000002. When the MSI-X has been sent, the application can clear the associated PBA bit.
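The lookup in this example can be sketched as follows. The table constant mirrors the example rows above; the dictionary layout and function name are illustrative, not part of the IP.

```python
# Vector -> (Upper Address, Lower Address, Data), mirroring the example table.
MSIX_TABLE = {
    0: (0x00000001, 0xAAAA0000, 0x00000001),
    1: (0x00000001, 0xBBBB0000, 0x00000002),
    2: (0x00000001, 0xCCCC0000, 0x00000003),
}

def msix_mwr(vector: int):
    """Return the (64-bit address, dword payload) of the Memory Write
    TLP that signals the given MSI-X vector."""
    upper, lower, data = MSIX_TABLE[vector]
    return (upper << 32) | lower, data

addr, data = msix_mwr(1)
assert addr == 0x00000001BBBB0000 and data == 0x00000002
```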
4.6.3.1. Implementing MSI-X Interrupts
1. Host software sets up the MSI-X interrupts in the Application Layer by completing the following steps:
   a. Host software reads the Message Control register at offset 0x050 to determine the MSI-X Table size. The number of table entries is <value read> + 1. The maximum table size is 2048 entries. Each 16-byte entry is divided into 4 fields as shown in the figure below. The MSI-X table can be accessed on any configured BAR. The base address of the MSI-X table must be aligned to a 4 KB boundary.
   b. The host sets up the MSI-X table. It programs the MSI-X address, data, and mask bits for each entry as shown in the figure below.
   Figure 40. Format of MSI-X Table
   c. The host calculates the address of the <n>th entry using the following formula: nth_address = base address[BAR] + 16<n>
2. When the Application Layer has an interrupt, it drives an interrupt request to the IRQ Source module.
3. The IRQ Source sets the appropriate bit in the MSI-X PBA table. The PBA can use qword or dword accesses. For qword accesses, the IRQ Source calculates the address of the <m>th bit using the following formulas: qword address = <PBA base addr> + 8(floor(<m>/64)); qword bit = <m> mod 64
   Figure 41. MSI-X PBA Table
4. The IRQ Processor reads the entry in the MSI-X table.
   - If the interrupt is masked by the Vector_Control field of the MSI-X table, the interrupt remains in the pending state.
   - If the interrupt is not masked, the IRQ Processor sends a Memory Write Request to the TX slave interface. It uses the address and data from the MSI-X table. If the Message Upper Address = 0, the IRQ Processor creates a three-dword header. If the Message Upper Address > 0, it creates a four-dword header.
5. The host interrupt service routine detects the TLP as an interrupt and services it.
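The entry-address and PBA formulas above can be captured in two small helpers. This is a sketch of exactly those formulas, nothing more; the function names are illustrative.

```python
def msix_entry_address(bar_base: int, n: int) -> int:
    """Address of the nth 16-byte MSI-X table entry:
    nth_address = base address[BAR] + 16 * n."""
    return bar_base + 16 * n

def pba_qword_location(pba_base: int, m: int):
    """For qword PBA accesses: address of the qword holding pending
    bit m (pba_base + 8 * floor(m/64)) and the bit position within
    that qword (m mod 64)."""
    return pba_base + 8 * (m // 64), m % 64

assert msix_entry_address(0x00004000, 3) == 0x00004030
assert pba_qword_location(0x00005000, 70) == (0x00005008, 6)
```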
4.7. Error Interface
This is an optional interface in the Intel P-Tile Avalon® -ST IP for PCI Express that allows the Application Layer to report errors to the IP core and vice versa. Specifically, the Application Layer can report the different types of errors defined by the app_error_info_i signal to the IP. For Advanced Error Reporting (AER), the Application Layer can provide the information to log the TLP header and the error log request via the app_err_* interface.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
serr_out_o | O |
Indicates that a system error is detected. RP mode: A one-clock-cycle pulse on this signal indicates that a device in the hierarchy reports any of the following errors and the associated enable bit is set in the Root Control register: ERR_COR, ERR_FATAL, ERR_NONFATAL. This signal is also asserted when an internal error is detected. The source of the error is logged in the Root Port Error Status registers in the Port Configuration and Status registers. EP mode: Asserted when the P-Tile PCIe Hard IP sends a correctable/non-fatal/fatal error message. BP mode: Transaction Layer or Data Link Layer errors detected by the Hard IP core trigger this signal. Detailed information is logged in the Bypass Mode Error Status registers in the Port Configuration and Status registers. |
coreclkout_hip | EP/RP/BP |
hip_enter_err_mode_o | O | Asserted when the Hard IP enters the error mode. This usually happens when the Hard IP detects an uncorrectable RAM ECC error. When this signal is asserted, you should discard all received TLPs. | coreclkout_hip | EP/RP/BP |
app_err_valid_i | I | A one-cycle pulse on this signal indicates that the data on app_err_info, app_err_hdr, and app_err_func_num are valid in that cycle and app_err_hdr_i will be valid during the following four cycles. | coreclkout_hip | EP/RP |
app_err_hdr_i[31:0] | I |
This bus contains the header and TLP prefix information for the error TLP. The 128-bit header and 32-bit TLP prefix are sent to the Hard IP over five cycles (32 bits of information are sent in each clock cycle). Cycle 1 : header[31:0] Cycle 2 : header[63:32] Cycle 3 : header[95:64] Cycle 4 : header[127:96] Cycle 5 : TLP prefix |
coreclkout_hip | EP/RP |
app_err_info_i[12:0] | I | This error bus carries the following information: |
coreclkout_hip | EP/RP |
x16/x8: app_err_func_num_i[2:0] x4: NA |
I | This bus contains the function number for the function that asserts the error valid signal. | coreclkout_hip | EP/RP |
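The five-cycle transfer on app_err_hdr_i can be modeled as follows (a minimal sketch, not RTL; the helper name is ours):

```python
def app_err_hdr_beats(header_128: int, tlp_prefix_32: int):
    """Split a 128-bit TLP header plus a 32-bit TLP prefix into the
    five 32-bit beats driven on app_err_hdr_i, header[31:0] first."""
    beats = [(header_128 >> (32 * i)) & 0xFFFFFFFF for i in range(4)]
    beats.append(tlp_prefix_32 & 0xFFFFFFFF)  # cycle 5 carries the prefix
    return beats
```

The list order matches the cycle order in the table above: header[31:0], header[63:32], header[95:64], header[127:96], then the TLP prefix.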
4.7.1. Completion Timeout Interface
The P-Tile IP for PCIe features a Completion timeout mechanism to keep track of Non-Posted requests sent by the user application and the corresponding Completions received. When the P-Tile IP detects a Completion timeout, it notifies the user application by asserting the cpl_timeout_o signal.
When a Completion timeout happens, the user application can use the Avalon® -MM Completion Timeout Interface (for each port) to access the Completion timeout FIFO in the Hard IP to get more detailed information about the event and update the AER capability registers if required. After the completion timeout FIFO becomes empty, the IP core deasserts the cpl_timeout_o signal.
The cpl_timeout_avmm interface is synchronized to the cpl_timeout_avmm_clk_i clock.
Example:
When cpl_timeout_o is asserted, the user application can issue an Avalon® -MM Read to retrieve information from the Completion Timeout FIFO. Then, it can issue an Avalon® -MM Write to write 1 to bit[0] of the CONTROL register to advance to the next entry.
Signal Name | Direction | Description | Clock domain | EP/RP/BP |
---|---|---|---|---|
cpl_timeout_o | O |
Indicates the event that the completion TLP for a request has not been received within the expected time window. The IP core asserts this signal as long as the completion timeout FIFO in the Hard IP is not empty. You can obtain more details about the completion timeout event by looking at the signals on the completion timeout Avalon® -MM interface (listed below). |
coreclkout_hip | EP/RP/BP |
cpl_timeout_avmm_read_i | I | Avalon® -MM read enable. | cpl_timeout_avmm_clk_i | EP/RP/BP |
cpl_timeout_avmm_readdata_o[7:0] | O | Avalon® -MM read data outputs. | cpl_timeout_avmm_clk_i | EP/RP/BP |
cpl_timeout_avmm_readdata_valid_o | O | This signal qualifies the cpl_timeout_avmm_readdata_o signals into the Application Layer. | cpl_timeout_avmm_clk_i | EP/RP/BP |
cpl_timeout_avmm_write_i | I | Avalon® -MM write enable. | cpl_timeout_avmm_clk_i | EP/RP/BP |
cpl_timeout_avmm_writedata_i[7:0] | I | Avalon® -MM write data inputs. | cpl_timeout_avmm_clk_i | EP/RP/BP |
cpl_timeout_avmm_addr_i[20:0] | I |
Avalon® -MM address inputs. [20:3] : Reserved. Tie them to 0. [2:0] : Address for the FIFO register. Refer to the address map table below for more details. |
cpl_timeout_avmm_clk_i | EP/RP/BP |
cpl_timeout_avmm_waitrequest_o | O | When asserted, this signal indicates the IP core is not ready to take any request. | cpl_timeout_avmm_clk_i | EP/RP/BP |
cpl_timeout_avmm_clk_i | I |
Avalon® -MM clock. 50 MHz - 125 MHz (Range) 100 MHz (Recommended) |
EP/RP/BP |
Address | Name | Access Type | Description |
---|---|---|---|
0x0 | STATUS | RO |
[7:2] : Reserved [1] : Completion timeout FIFO full [0] : Completion timeout FIFO empty |
0x1 | CONTROL | WO |
[7:1] : Reserved [0] : Read (pops data from the FIFO). You must read all the information for the timed-out request before writing 1 to bit 0 of the CONTROL register. Writing to bit 0 of the CONTROL register makes the next entry appear. |
0x2 | VF | RO |
[7:0] : vfunc_num[7:0] Virtual Function number for the VF that initiates the non-posted transaction for which the completion timeout is observed. |
0x3 | PF | RO |
[7] : vfunc_active [6] : Reserved [5:3] : func_num[2:0] Physical Function number for the PF that initiates the non-posted transaction for which the completion timeout is observed. [2:0] : vfunc_num[10:8] Virtual Function number (most significant 3 bits) for the VF that initiates the non-posted transaction for which the completion timeout is observed. |
0x4 | LEN1 | RO |
[7:0] : cpl_lenn[7:0] Transfer length in bytes (least significant 8 bits) of the expected completion that timed out for the non-posted transaction. For a split completion, it indicates the number of bytes remaining to be delivered when the completion timed out. The maximum length is the Maximum Read Request Size (e.g., 4 KB = 2^12 bytes). |
0x5 | LEN2 | RO |
[7:4] : Reserved [3:0] : cpl_lenn[11:8] Transfer length in bytes (most significant 4 bits) of the expected completion that timed out for the non-posted transaction. For a split completion, it indicates the number of bytes remaining to be delivered when the completion timed out. The maximum length is the Maximum Read Request Size (e.g., 4 KB = 2^12 bytes). |
0x6 | TAG1 | RO |
[7:0] : cpl_tag[7:0] Tag ID (least significant 8 bits) of the expected completion that timed out for the non-posted transaction. |
0x7 | TAG2 | RO |
[7:5] : cpl_tc[2:0] Traffic class of the expected completion that timed out for the non-posted transaction. [4:3] : cpl_attr[1:0] Attribute of the expected completion that timed out for the non-posted transaction. ID based ordering is not supported. [4] -> Relaxed ordering, [3] -> No Snoop [2] : Reserved [1:0]: cpl_tag[9:8] Tag ID (most significant 2 bits) of the expected completion that timed out for the non-posted transaction. |
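Pulling one timed-out request record out of the FIFO means combining fields that span the VF, PF, LEN1/LEN2, and TAG1/TAG2 registers. A decoding sketch, assuming the byte values at offsets 0x2 to 0x7 have already been read over the Avalon-MM interface (the function and dictionary layout are ours):

```python
def decode_cpl_timeout(regs: dict) -> dict:
    """Decode one completion-timeout FIFO entry from the byte-wide
    registers at offsets 0x2 (VF) through 0x7 (TAG2)."""
    vf = ((regs[0x3] & 0x07) << 8) | regs[0x2]      # vfunc_num[10:0]
    pf = (regs[0x3] >> 3) & 0x07                    # func_num[2:0]
    vf_active = bool(regs[0x3] >> 7)                # vfunc_active
    length = ((regs[0x5] & 0x0F) << 8) | regs[0x4]  # cpl_lenn[11:0], bytes
    tag = ((regs[0x7] & 0x03) << 8) | regs[0x6]     # cpl_tag[9:0]
    tc = (regs[0x7] >> 5) & 0x07                    # cpl_tc[2:0]
    attr = (regs[0x7] >> 3) & 0x03                  # cpl_attr[1:0]
    return {"vf_active": vf_active, "pf": pf, "vf": vf,
            "length": length, "tag": tag, "tc": tc, "attr": attr}
```

After decoding, the application writes 1 to bit 0 of the CONTROL register to pop the entry and expose the next one.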
4.8. Hot Plug Interface (RP Only)
Hot plug support allows a device to be added to or removed from a system during runtime. The Hot Plug Interface in the P-Tile IP for PCIe allows an Intel FPGA using this IP to provide this capability safely.
This section describes the signals reported by the on-board hot plug components in the Downstream Port. This interface is available only if the Slot Status Register of the PCI Express Capability Structure is enabled.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
sys_atten_button_pressed_i | I | Attention Button Pressed. Indicates that the system attention button was pressed, and sets the Attention Button Pressed bit in the Slot Status Register. | coreclkout_hip | RP |
sys_pwr_fault_det_i | I | Power Fault Detected. Indicates the power controller detected a power fault at this slot. | coreclkout_hip | RP |
sys_mrl_sensor_chged_i | I | MRL Sensor Changed. Indicates that the state of the MRL sensor has changed. | coreclkout_hip | RP |
sys_pre_det_chged_i | I | Presence Detect Changed. Indicates that the state of the card presence detector has changed. | coreclkout_hip | RP |
sys_cmd_cpled_int_i | I | Command Completed Interrupt. Indicates that the Hot Plug controller completed a command. | coreclkout_hip | RP |
sys_pre_det_state_i | I |
Indicates whether or not a card is present in the slot. 0 : slot is empty. 1 : card is present in the slot. |
coreclkout_hip | RP |
sys_mrl_sensor_state_i | I |
MRL Sensor State. Indicates the state of the manually operated retention latch (MRL) sensor. 0 : MRL is closed. 1 : MRL is open. |
coreclkout_hip | RP |
sys_eml_interlock_engaged_i | I | Indicates whether the system electromechanical interlock is engaged, and controls the state of the electromechanical interlock status bit in the Slot Status Register. | coreclkout_hip | RP |
sys_aux_pwr_det_i | I |
Auxiliary Power Detected. Used to report to the host software that auxiliary power (Vaux) is present. Refer to the Device Status Register in the PCI Express Capability Structure. |
coreclkout_hip | RP |
4.9. Power Management Interface
Software programs the device into a D-state by writing to the Power Management Control and Status register in the PCI Power Management Capability Structure. The power management output signals indicate the current power state. The IP core supports the two mandatory power states: D0 (full power) and D3 (preparation for a loss of power). It does not support the optional D1 and D2 low-power states.
The correspondence between the device power states (D states) and link power states (L states) is as follows:
Device Power State | Link Power State |
---|---|
D0 | L0 |
D1 (not supported) | L1 |
D2 (not supported) | L1 |
D3 | L1, L2/L3 Ready |
P-Tile does not support ASPM.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
pm_state_o[2:0] | O | Indicates the current power state. |
coreclkout_hip | EP/RP/BP |
x16/x8: pm_dstate_o[31:0] x4: pm_dstate_o[3:0] | O | Power management D-state for each function. |
Async | EP/RP/BP |
x16/x8: apps_pm_xmt_pme_i[7:0] x4: NA |
I | The application logic asserts this signal for one cycle to wake up the Power Management Capability (PMC) state machine from a D1, D2, or D3 power state. Upon wake-up, the IP core sends a PM_PME message. | coreclkout_hip | EP/BP |
apps_pm_xmt_turnoff_i | I | This signal is a request from the Application Layer to generate a PM_Turn_Off message. The Application Layer must assert this signal for one clock cycle. The IP core does not return an acknowledgement or grant signal. The Application Layer must not pulse the same signal again until the previous message has been transmitted. | coreclkout_hip | RP |
app_init_rst_i | I | The Application Layer uses this signal to request a hot reset to downstream devices. The hot reset request will be sent when a single-cycle pulse is applied to this pin. | coreclkout_hip | RP |
app_req_retry_en_i | I | When this signal is asserted, the PCIe Controller responds to Configuration TLPs with a CRS (Configuration Retry Status) if it has not already responded to a Configuration TLP with non-CRS status since the last reset. You can use this signal to hold off on enumeration. This input is not used for Root Ports. | coreclkout_hip | EP/BP |
4.10. Configuration Output Interface
The Transaction Layer configuration output (tl_cfg) bus provides a subset of the information stored in the Configuration Space. Use this information in conjunction with the app_err* signals to understand TLP transmission problems.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
tl_cfg_ctl_o[15:0] | O | Multiplexed data output from the register specified by tl_cfg_add_o[4:0]. The detailed information for each field in this bus is defined in the following table. | coreclkout_hip | EP/RP/BP |
tl_cfg_add_o[4:0] | O | This address bus contains the index indicating which Configuration Space register information is being driven onto the tl_cfg_ctl_o[15:0] bits. | coreclkout_hip | EP/RP/BP |
x16/x8: tl_cfg_func_o[2:0] x4: NA | O | Specifies the function whose Configuration Space register values are being driven out on tl_cfg_ctl_o[15:0] (3'b000: PF0, 3'b001: PF1, and so on). |
coreclkout_hip | EP/RP/BP |
The table below provides the tl_cfg_add_o[4:0] to tl_cfg_ctl_o[15:0] mapping.
tl_cfg_add_o[4:0] | tl_cfg_ctl_o[15:8] | tl_cfg_ctl_o[7:0] |
---|---|---|
5'h00 |
[15]: memory space enable [14]: IDO completion enable [13]: perr_en [12]: serr_en [11]: fatal_err_rpt_en [10]: nonfatal_err_rpt_en [9]: corr_err_rpt_en [8]: unsupported_req_rpt_en |
Device control: [7]: bus master enable [6]: extended tag enable [5:3]: maximum read request size [2:0]: maximum payload size |
5'h01 |
[15]: IDO request enable [14]: No Snoop enable [13]: Relaxed Ordering enable [12:8]: Device number |
bus number |
5'h02 |
[15]: pm_no_soft_rst [14]: RCB control [13]: Interrupt Request (IRQ) disable [12:8]: PCIe Capability IRQ message number |
[7:5]: reserved [4]: system power control [3:2]: system attention indicator control [1:0]: system power indicator control |
5'h03 | Number of VFs [15:0] | |
5'h04 |
[15]: reserved [14]: AtomicOP Egress Block field (cfg_atomic_egress_block) [13:9]: ATS Smallest Translation Unit (STU)[4:0] [8]: ATS cache enable |
[7]: ARI forward enable [6]: Atomic request enable [5:3]: TPH ST mode [2:1]: TPH enable [0]: VF enable |
5'h05 | [15:12]: auto negotiation link speed [11:1]: Index of Start VF [10:0] [0]: reserved |
5'h06 | MSI Address [15:0] | |
5'h07 | MSI Address [31:16] | |
5'h08 | MSI Address [47:32] | |
5'h09 | MSI Address [63:48] | |
5'h0A | MSI Mask [15:0] | |
5'h0B | MSI Mask [31:16] | |
5'h0C |
[15]: cfg_send_f_err [14]: cfg_send_nf_err [13]: cfg_send_cor_err [12:8]: AER IRQ message number |
[7]: Enable extended message data for MSI (cfg_msi_ext_data_en) [6]: MSI-X func mask [5]: MSI-X enable [4:2]: Multiple MSI enable [1]: 64-bit MSI [0]: MSI enable |
5'h0D | MSI Data [15:0] | |
5'h0E | AER uncorrectable error mask [15:0] | |
5'h0F | AER uncorrectable error mask [31:16] | |
5'h10 | AER correctable error mask [15:0] | |
5'h11 | AER correctable error mask [31:16] | |
5'h12 | AER uncorrectable error severity [15:0] | |
5'h13 | AER uncorrectable error severity [31:16] | |
5'h14 | [15:8]: ACS Egress Control Register (cfg_acs_egress_ctrl_vec) |
[7]: ACS function group enable (cfg_acs_func_grp_en) [6]: ACS direct translated P2P enable (cfg_acs_p2p_direct_tranl_en) [5]: ACS P2P egress control enable (cfg_acs_egress_ctrl_en) [4]: ACS upstream forwarding enable (cfg_acs_up_forward_en) [3]: ACS P2P completion redirect enable (cfg_acs_p2p_compl_redirect_en) [2]: ACS P2P request redirect enable (cfg_acs_p2p_req_redirect_en) [1]: ACS translation blocking enable (cfg_acs_at_blocking_en) [0]: ACS source validation enable (RP) (cfg_acs_validation_en) |
5'h15 |
[15]: reserved [14]: 10-bit tag requester enable (cfg_10b_tag_req_en) [13]: VF 10-bit tag requester enable (cfg_vf_10b_tag_req_en) [12]: PRS_RESP_FAILURE (cfg_prs_response_failure) [11]: PRS_UPRGI (cfg_prs_uprgi) [10]: PRS_STOPPED (cfg_prs_stopped) [9]: PRS_RESET (cfg_prs_reset) [8]: PRS_ENABLE (cfg_prs_enable) |
[7:3]: reserved [2:0]: ARI function group (cfg_ari_func_grp) |
5'h16 | PRS_OUTSTANDING_ALLOCATION (cfg_prs_outstanding_allocation) [15:0] | |
5'h17 | PRS_OUTSTANDING_ALLOCATION (cfg_prs_outstanding_allocation) [31:16] | |
5'h18 |
[15:10]: reserved [9]: Disable autonomous generation of LTR clear message (cfg_disable_ltr_clr_msg) [8]: LTR mechanism enable (cfg_ltr_m_en) |
[7]: Infinite credits for Posted header [6]: Infinite credits for Posted data [5]: Infinite credits for Completion header [4]: Infinite credits for Completion data [3]: End-end TLP prefix blocking (cfg_end2end_tlp_pfx_blck) [2]: PASID enable (cfg_pf_pasid_en) [1]: Execute permission enable (cfg_pf_passid_execute_perm_en ) [0]: Privileged mode enable (cfg_pf_passid_priv_mode_en) |
5'h19 |
[15:9]: reserved [8]: Slot control attention button pressed enable (cfg_atten_button_pressed_en) |
[7]: Slot control power fault detect enable (cfg_pwr_fault_det_en) [6]: Slot control MRL sensor changed enable (cfg_mrl_sensor_chged_en) [5]: Slot control presence detect changed enable (cfg_pre_det_chged_en) [4]: Slot control hot plug interrupt enable (cfg_hp_int_en) [3]: Slot control command completed interrupt enable (cfg_cmd_cpled_int_en) [2]: Slot control DLL state change enable (cfg_dll_state_change_en) [1]: Slot control accessed (cfg_hp_slot_ctrl_access) [0]: PF’s SERR# enable (cfg_br_ctrl_serren) |
5'h1A | LTR maximum snoop latency register (cfg_ltr_max_latency[15:0]) | |
5'h1B | LTR maximum no-snoop latency register (cfg_ltr_max_latency[31:16]) | |
5'h1C | [15:8]: enabled Traffic Classes (TCs) (cfg_tc_enable[7:0]) |
[5:0]: auto negotiation link width 6’h01 = x1 6’h02 = x2 6’h04 = x4 6’h08 = x8 6’h10 = x16 |
5'h1D | MSI Data[31:16] | |
5'h1E | N/A | |
5'h1F | N/A |
The information on the Configuration Output (tl_cfg) bus is time-division multiplexed (TDM).
- When tl_cfg_func[2:0] = 3'b000, tl_cfg_ctl[15:0] drives out the PF0 Configuration Space register values.
- Then, tl_cfg_func[2:0] increments to 3'b001.
- When tl_cfg_func[2:0] = 3'b001, tl_cfg_ctl[15:0] drives out the PF1 Configuration Space register values.
- This pattern repeats to cover all enabled PFs.
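A monitor in the application logic can reassemble the per-function register shadow from the TDM stream. A sketch, assuming each tuple is one cycle's sample of (tl_cfg_func, tl_cfg_add, tl_cfg_ctl) — the helper and its data shapes are illustrative, not part of the IP:

```python
def capture_tl_cfg(samples):
    """Reassemble per-function Configuration Space shadow registers
    from TDM samples of (tl_cfg_func, tl_cfg_add, tl_cfg_ctl)."""
    shadow = {}
    for func, addr, ctl in samples:
        # Later samples of the same (func, addr) simply refresh the value.
        shadow.setdefault(func, {})[addr] = ctl & 0xFFFF
    return shadow
```

Per the mapping table above, the maximum payload size of function f would then be `shadow[f][0x00] & 0x7`, and the bus master enable bit `(shadow[f][0x00] >> 7) & 1`.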
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
dl_timer_update_o | O |
Active high pulse that asserts whenever the current link speed, link width, or maximum payload size changes. When any of these parameters changes, the IP's internal Replay/Ack-Nak timers revert to their internally calculated default values. To override these default values, reprogram the Port Logic register when these events occur. |
coreclkout_hip | EP/RP/BP |
4.11. Configuration Intercept Interface (EP Only)
The Configuration Intercept Interface (CII) allows the application logic to detect the occurrence of a Configuration (CFG) request on the link and to modify its behavior.
The application logic should detect a CFG request at the rising edge of cii_req_o. Due to the latency of the EMIB, cii_req_o can be deasserted many cycles after the deassertion of cii_halt_i. Using this interface, the application logic can:
- Delay the processing of a CFG request by the controller. This allows the application to perform any housekeeping task first.
- Overwrite the data payload of a CfgWr request. The application logic can also overwrite the data payload of a CfgRd completion TLP.
This interface also allows you to implement the Intel Vendor Specific Extended Capability (VSEC) registers. All configuration accesses targeting the Intel VSEC registers (addresses 0xD00 to 0xFFF) are automatically mapped to this interface and can be monitored via this interface.
If you are not using this interface, tie cii_halt_p0/1 to logic 0.
Signal Name | Direction | Description | Clock domain | EP/RP/BP |
---|---|---|---|---|
cii_req_o | O |
Indicates the CFG request is intercepted and all the other CII signals are valid. |
coreclkout_hip | EP |
cii_hdr_poisoned_o | O |
The poisoned bit in the received TLP header on the CII. |
coreclkout_hip | EP |
cii_hdr_first_be_o[3:0] | O |
The first dword byte enable field in the received TLP header on the CII. |
coreclkout_hip | EP |
cii_func_num_o[2:0] | O |
The function number in the received TLP header on the CII. |
coreclkout_hip | EP |
cii_wr_o | O |
Indicates that cii_dout_p0/1 is valid. This signal is asserted only for a configuration write request. |
coreclkout_hip | EP |
cii_addr_o[9:0] | O |
The double word register address in the received TLP header on the CII. |
coreclkout_hip | EP |
cii_dout_o[31:0] | O |
Received TLP payload data from the link partner to your application client. The data is in little endian format. The first received payload byte is in [7:0]. |
coreclkout_hip | EP |
cii_override_en_i | I |
Override enable. When the application logic asserts this input, the PCIe Hard IP overrides the CfgWr payload or CfgRd completion using the data supplied by the application logic on cii_override_din. |
coreclkout_hip | EP |
cii_override_din_i[31:0] | I |
Override data.
|
coreclkout_hip | EP |
cii_halt_i | I |
Flow control input signal. When cii_halt_p0/1 is asserted, the PCIe Hard IP halts the processing of CFG requests for the PCIe configuration space registers. |
coreclkout_hip | EP |
4.12. Hard IP Reconfiguration Interface
The Hard IP reconfiguration interface is an Avalon® -MM slave interface with a 21-bit address and an 8-bit data bus, also referred to as the User Avalon® -MM Interface. You can use this interface to dynamically modify the value of configuration registers. Note that changes made to the configuration registers of the Hard IP via this interface are lost after a warm or cold reset, as these registers revert to their default values.
In Root Port mode, the application logic uses the Hard IP reconfiguration interface to access its PCIe configuration space to perform link control functions (such as Hot Reset, link disable, or link retrain).
In TLP Bypass mode, the Hard IP forwards the received Type0/1 Configuration request TLPs to the application logic, which must respond with Completion TLPs with a status of Successful Completion (SC), Unsupported Request (UR), Configuration Request Retry Status (CRS), or Completer Abort (CA). If a received Configuration request TLP needs to update a PCIe configuration space register, the application logic needs to use the Hard IP reconfiguration interface to access that PCIe configuration space register.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
hip_reconfig_clk | I |
Reconfiguration clock 50 MHz - 125 MHz (Range) 100 MHz (Recommended) |
EP/RP/BP | |
hip_reconfig_readdata_o[7:0] | O | Avalon® -MM read data outputs | hip_reconfig_clk | EP/RP/BP |
hip_reconfig_readdatavalid_o | O | Avalon® -MM read data valid. When asserted, the data on hip_reconfig_readdata_o[7:0] is valid. | hip_reconfig_clk | EP/RP/BP |
hip_reconfig_write_i | I | Avalon® -MM write enable | hip_reconfig_clk | EP/RP/BP |
hip_reconfig_read_i | I | Avalon® -MM read enable | hip_reconfig_clk | EP/RP/BP |
hip_reconfig_address_i[20:0] | I | Avalon® -MM address | hip_reconfig_clk | EP/RP/BP |
hip_reconfig_writedata_i[7:0] | I | Avalon® -MM write data inputs | hip_reconfig_clk | EP/RP/BP |
hip_reconfig_waitrequest_o | O | When asserted, this signal indicates that the IP core is not ready to respond to a request. | hip_reconfig_clk | EP/RP/BP |
dummy_user_avmm_rst | I | Reset signal. You can tie it to ground or leave it floating when using the Hard IP Reconfiguration Interface. | EP/RP/BP |
Reading and Writing to the Hard IP Reconfiguration Interface
Reading from the Hard IP reconfiguration interface of the P-Tile Avalon® -ST IP for PCI Express retrieves the current value at a specific address. Writing to the reconfiguration interface changes the data value at a specific address. Intel recommends that you perform read-modify-writes when writing to a register, because two or more features may share the same reconfiguration address.
Modifying the PCIe configuration registers directly affects the behavior of the PCIe device.
4.12.1. Address Map for the User Avalon-MM Interface
The User Avalon® -MM interface provides access to the configuration registers and the IP core registers. This interface includes an 8-bit data bus and a 21-bit address bus (which carries byte addresses). You can access these registers in two ways:
- Using the direct User Avalon® -MM interface (byte access)
- Using the Debug (DBI) register access (dword access). This method is useful when you need to read/write an entire 32 bits at one time (Counter/Lane Margining, etc.)
Registers | User Avalon® -MM Offsets | Comments |
---|---|---|
Physical function 0 | 0x0000 | Refer to Appendix A for more details of the PF configuration space. This PF is available for x16, x8 and x4 cores. |
Physical function 1 | 0x1000 | Refer to Appendix A for more details of the PF configuration space. This PF is available for x16 and x8 cores only. |
Physical function 2 | 0x2000 | Refer to Appendix A for more details of the PF configuration space. This PF is available for x16 and x8 cores only. |
Physical function 3 | 0x3000 | Refer to Appendix A for more details of the PF configuration space. This PF is available for x16 and x8 cores only. |
Physical function 4 | 0x4000 | Refer to Appendix A for more details of the PF configuration space. This PF is available for x16 and x8 cores only. |
Physical function 5 | 0x5000 | Refer to Appendix A for more details of the PF configuration space. This PF is available for x16 and x8 cores only. |
Physical function 6 | 0x6000 | Refer to Appendix A for more details of the PF configuration space. This PF is available for x16 and x8 cores only. |
Physical function 7 | 0x7000 | Refer to Appendix A for more details of the PF configuration space. This PF is available for x16 and x8 cores only. |
User Avalon-MM Port Configuration Register | 0x104068 | Refer to User Avalon-MM Port Configuration Register (Offset 0x104068) for more details. |
Debug (DBI) Register | 0x104200 to 0x104204 | Refer to Using the Debug Register Interface Access (Dword Access) for more details. |
4.12.1.1. User Avalon-MM Port Configuration Register (Offset 0x104068)
Bits | Register Description | Default Value | Access |
---|---|---|---|
[31:29] | Reserved | 0x0 | RO |
[28:18] | Select the virtual function number. | 0x0 | RW/RO |
[17] | To access the virtual function registers, this bit should be set to one. | 0x0 | RW/RO |
[16:2] | Reserved | 0x0 | RO |
[1] |
Reserved. Clear this bit for access to standard PCIe* configuration registers. |
0x0 | RW/RO |
[0] | If set, it allows access to Intel VSEC registers. | 0x0 | RW/RO |
4.12.2. Configuration Registers Access
4.12.2.1. Using Direct User Avalon-MM Interface (Byte Access)
Targeting PF Configuration Space Registers
The user application needs to specify the offsets of the targeted PF registers.
For example, if the application wants to read the MSI Capability Register of PF0, it will issue a Read with address 0x0050 to target the MSI Capability Structure of PF0.
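Since each PF occupies a 0x1000 window in the address map above (PF0 at 0x0000, PF1 at 0x1000, and so on), the byte address of a PF register can be sketched as (the helper name is ours):

```python
def pf_reg_addr(pf: int, offset: int) -> int:
    """User Avalon-MM byte address of a PF configuration register:
    each PF occupies a 0x1000-byte window, PF0 starting at 0x0000."""
    assert 0 <= pf <= 7, "P-Tile supports up to eight PFs"
    return pf * 0x1000 + offset
```

For example, the MSI Capability Register at offset 0x50 of PF2 would be read at address 0x2050.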
Targeting VF Configuration Space Registers
The user application must first specify the VF number of the targeted configuration register by programming the User Avalon-MM Port Configuration Register at offset 0x104068 accordingly. For example, to access the registers of VF3:
- Issue a user Avalon® -MM Write request with address 0x104068 and data 0xE (vf_num[28:18] = 3, vf_select[17] = 1, vsec[0] = 0).
- Issue a user Avalon® -MM Read request with address 0xB0 to access VF3 registers.
Targeting VSEC Registers
The user application needs to program the VSEC field (0x104068 bit[0]) first. Then, all accesses from the user Avalon® -MM interface starting at offset 0xD00 are translated to VSEC configuration space registers.
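The field packing of the Port Configuration Register at 0x104068 can be sketched as follows. This is a hypothetical helper built from the bit table above; note that because the interface data bus is 8 bits wide, the 32-bit value would in practice be written one byte at a time:

```python
def port_cfg_word(vf_num: int, vf_select: bool, vsec: bool) -> int:
    """Pack the 32-bit User Avalon-MM Port Configuration Register value:
    vf_num -> [28:18], vf_select -> [17], vsec -> [0]."""
    assert 0 <= vf_num < (1 << 11)
    return (vf_num << 18) | (int(vf_select) << 17) | int(vsec)
```

For VF3 with vf_select set, the full value is 0xE0000; byte 2 of that value (bits [23:16]) is 0x0E, which is one plausible reading of the "data 0xE" write in the VF example above, assuming byte-wise writes.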
4.12.2.2. Using the Debug Register Interface Access (Dword Access)
DEBUG_DBI_ADDR register is located at user Avalon® -MM offsets 0x104204 to 0x104207 (corresponding to byte 0 to byte 3).
Names | Bits | R/W | Descriptions |
---|---|---|---|
d_done | 31 | RO | 1: indicates debug DBI read/write access done |
d_write | 30 | R/W |
1: write access 0: read access |
d_warm_reset | 29 | RO |
1: normal operation 0: warm reset is on-going |
d_vf | 28:18 | R/W | Specify the virtual function number. |
d_vf_select | 17 | R/W | To access the virtual function registers, set this bit to one. |
d_pf | 16:14 | R/W | Specify the physical function number. |
reserved | 13:12 | R/W | Reserved |
d_addr | 11:2 | R/W | Specify the DW address for the P-Tile Hard IP DBI interface. |
d_shadow_select | 1 | R/W |
Reserved. Clear this bit for access to standard PCIe configuration registers. |
d_vsec_select | 0 | R/W | If set, this bit allows access to Intel VSEC registers. |
DEBUG_DBI_DATA register is located at user Avalon® -MM offsets 0x104200 to 0x104203 (corresponding to byte 0 to byte 3).
Names | Bits | R/W | Descriptions |
---|---|---|---|
d_data | 31:0 | R/W | Read or write data for the P-Tile Hard IP register access. |
To perform a write:
- Use the user_avmm interface to access 0x104200 to 0x104203 to write the data first.
- Use the user_avmm interface to access 0x104204 to 0x104206 to set the address and control bits.
- Use the user_avmm interface to write to 0x104207 to enable the write bit (bit[30]).
- Use the user_avmm interface to access 0x104207 bit[31] to poll if the write is complete.
To perform a read:
- Use the user_avmm interface to access 0x104204 to 0x104206 to set the address and control bits.
- Use the user_avmm interface to write to 0x104207 to enable the read bit (bit[30]).
- Use the user_avmm interface to access 0x104207 bit[31] to poll if the read is complete.
- Use the user_avmm interface to access 0x104200 to 0x104203 to read the data.
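The DBI write sequence can be sketched in Python. The avmm_wr/avmm_rd callbacks are hypothetical stand-ins for the byte-wide user Avalon-MM bus transactions; field positions follow the DEBUG_DBI_ADDR table above:

```python
def dbi_write(avmm_wr, avmm_rd, pf: int, dw_addr: int, data: int) -> None:
    """Model of a DBI dword write: avmm_wr(addr, byte) issues one byte
    write, avmm_rd(addr) returns one byte read."""
    # 1. Write the 32-bit payload to DEBUG_DBI_DATA (0x104200-0x104203).
    for i in range(4):
        avmm_wr(0x104200 + i, (data >> (8 * i)) & 0xFF)
    # 2. Pack DEBUG_DBI_ADDR: d_write->[30], d_pf->[16:14], d_addr->[11:2].
    ctrl = (1 << 30) | ((pf & 0x7) << 14) | ((dw_addr & 0x3FF) << 2)
    for i in range(3):                       # address/control bytes 0-2
        avmm_wr(0x104204 + i, (ctrl >> (8 * i)) & 0xFF)
    avmm_wr(0x104207, (ctrl >> 24) & 0xFF)   # byte 3 carries d_write (bit 30)
    # 3. Poll d_done (bit 31, i.e. bit 7 of byte 3) until the access is done.
    while not (avmm_rd(0x104207) & 0x80):
        pass
```

A read follows the same shape without step 1, then retrieves the result from 0x104200 to 0x104203 once d_done is set.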
4.13. PHY Reconfiguration Interface
The PHY reconfiguration interface is an optional Avalon® -MM slave interface with a 26‑bit address and an 8‑bit data bus. Use this bus to read the value of PHY registers. Refer to Table 110 for details on addresses and bit mappings for the PHY registers that you can access using this interface.
These signals are present when you turn on Enable PHY reconfiguration on the Top-Level Settings tab using the parameter editor.
Please note that the PHY reconfiguration interface is shared among all the PMA quads.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
xcvr_reconfig_clk | I |
Reconfiguration clock 50 MHz - 125 MHz (Range) 100 MHz (Recommended) |
EP/RP/BP | |
xcvr_reconfig_readdata[7:0] | O | Avalon® -MM read data outputs | xcvr_reconfig_clk | EP/RP/BP |
xcvr_reconfig_readdatavalid | O | Avalon® -MM read data valid. When asserted, the data on xcvr_reconfig_readdata[7:0] is valid. | xcvr_reconfig_clk | EP/RP/BP |
xcvr_reconfig_write | I | Avalon® -MM write enable | xcvr_reconfig_clk | EP/RP/BP |
xcvr_reconfig_read | I | Avalon® -MM read enable. This interface is not pipelined. You must wait for the return of the xcvr_reconfig_readdata[7:0] from the current read before starting another read operation. | xcvr_reconfig_clk | EP/RP/BP |
xcvr_reconfig_address[25:0] | I |
Avalon® -MM address [25:21] are used to indicate the Quad. 5'b00001 : Quad 0 5'b00010 : Quad 1 5'b00100 : Quad 2 5'b01000 : Quad 3 [20:0] are used to indicate the offset address. |
xcvr_reconfig_clk | EP/RP/BP |
xcvr_reconfig_writedata[7:0] | I | Avalon® -MM write data inputs | xcvr_reconfig_clk | EP/RP/BP |
xcvr_reconfig_waitrequest | O | When asserted, this signal indicates that the PHY is not ready to respond to a request. | xcvr_reconfig_clk | EP/RP/BP |
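Per the address table above, bits [25:21] of xcvr_reconfig_address are a one-hot Quad select and bits [20:0] carry the register offset. A sketch of the address construction (the helper name is ours):

```python
def xcvr_reconfig_addr(quad: int, offset: int) -> int:
    """Build the 26-bit PHY reconfiguration address: [25:21] holds a
    one-hot Quad select (Quad 0 -> 5'b00001), [20:0] holds the offset."""
    assert 0 <= quad <= 3 and 0 <= offset < (1 << 21)
    return (1 << (21 + quad)) | offset
```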
Reading from the PHY Reconfiguration Interface
Reading from the PHY reconfiguration interface of the P-Tile Avalon® -ST IP for PCI Express retrieves the current value at a specific address.
4.14. Page Request Service (PRS) Interface (EP Only)
When an Endpoint determines that it requires access to a page for which the ATS translation is not available, it sends a Page Request message to request that the page be mapped into system memory.
The PRS interface allows the monitoring of when PRS events happen, what functions these PRS events belong to, and what types of events they are.
The PRS interface is only available in EP mode, and with TLP Bypass disabled.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
prs_event_valid_i | I | This signal qualifies prs_event_func_i and prs_event_i. There is a single-cycle pulse for each PRS event. | coreclkout_hip | EP |
prs_event_func_i[2:0] | I | The function number for the PRS event. | coreclkout_hip | EP |
prs_event_i[1:0] | I |
00 : Indicate that the function has received a PRG response failure. 01: Indicate that the function has received a response with Unexpected Page Request Group Index. 10: Indicate that the function has completed all previously issued page requests and that it has stopped requests for additional pages. Only valid when the PRS enable bit is clear. 11: reserved. |
coreclkout_hip | EP |
The figure below shows the timing diagram for the PRS event interface when the application layer of function 0 signals reception of a PRG response, and the application layer of function 1 signals that it has stopped requesting additional pages.
5. Advanced Features
5.1. PCIe Port Bifurcation and PHY Channel Mapping
The PCIe* controller IP contains a set of port bifurcation muxes to remap the four controller PIPE lane interfaces to the shared 16 PCIe* PHY lanes. The table below shows the relationship between PHY lanes and the port mapping.
Bifurcation Mode | Port 0 (x16) | Port 1 (x8) | Port 2 (x4) | Port 3 (x4) |
---|---|---|---|---|
1 x16 | 0 - 15 | NA | NA | NA |
2 x8 | 0 - 7 | 8 - 15 | NA | NA |
4 x4 | 4 - 7 | 8 - 11 | 0 - 3 | 12 - 15 |
5.2. Virtualization Support
- Single root I/O virtualization (SR-IOV)
- VirtIO
5.2.1. SR-IOV Support
The P-Tile IP for PCIe supports SR-IOV. The endpoint port controllers in the IP support up to eight physical functions (PF) and 2048 virtual functions (VF) per SR-IOV endpoint. The VF configuration space registers are hardened in the P-Tile. The specific VF-based work queues and interrupt tables must be implemented in the FPGA fabric by the user application.
For more details on the configuration space registers required for virtualization support, refer to Configuration Space Registers for Virtualization.
5.2.1.1. SR-IOV Supported Features List
Feature | Support |
---|---|
SR-IOV | Supported in x16/x8 controller EP mode. Not supported in RP mode (x4). |
MSI | Supported by PFs only; not supported by VFs. No Per-Vector Masking (PVM); if you need PVM, you must use MSI-X. Note: When SR-IOV is enabled, either MSI or MSI-X must be enabled. |
MSI-X | Supported by all PFs. For SR-IOV, PFs and VFs are always MSI-X capable. Note: VFs share a common Table Size. VF Table BIR/Offset and PBA BIR/Offset are fixed at compile time. Note: When SR-IOV is enabled, either MSI or MSI-X must be enabled. |
Function Level Reset (FLR) | Supported by all PFs/VFs. Required for all SR-IOV functions. |
Extended Tags | Supported by all PFs/VFs. The Extended Tags feature allows the TLP Tag field to be 8 bits wide, thus supporting 256 tags. Note that the application is restricted to a maximum of 256 outstanding tags, at any given time, for all functions combined. The application logic is responsible for implementing the tag generation/tracking functions. This feature is reflected in the Extended Tag Field Supported bit of the Device Capabilities register; by default, this bit is set to 1 in every physical function enabled in the Intel FPGA P-Tile IP for PCI Express. |
10-bit Tags | Supported by all PFs/VFs. Refer to Tag Allocation for more details. |
AER | PFs are always AER capable. AER is not implemented for VFs. |
Active-State Power Management (ASPM) Optionality Compliance | Supported by all PFs/VFs. Only used to indicate that ASPM is not supported. |
Atomic Ops | Requester capability is supported by all PFs/VFs. Completer capability is supported. Compare and Swap (CAS) AtomicOps are also supported and can handle up to 128-bit operands. |
Internal Error Reporting | Supported by all PFs (because all PFs are AER capable). Not supported for VFs (because VFs do not support AER). |
TLP Processing Hints | 2-bit Processing Hint and 8-bit Steering Tag are supported by all PFs/VFs. TPH Prefixes are not supported. You can optionally choose to enable the TPH Requestor capability; however, the IP is always TPH Completer capable. |
ID-Based Ordering | Supported by all PFs/VFs. However, the IP core does NOT perform the reordering; the Application Layer must do this. The IP core only provides the IDO Request and Completion Enable bits in the Device Control 2 register, which give the application permission to set the Attr bits in Requests and Completions that it transmits. Note: Reordering capability on the RX side may be limited by your bypass queue. On the TX side, the IP core does not set the IDO bits on internally generated TLPs. |
Relaxed Ordering | Implemented on the RX side; this feature is always active. On the TX side, reordering is done by the application. |
Alternative Routing ID Interpretation (ARI) | The EP (PFs/VFs) is always ARI capable. This is a device-level option (all lanes or none will support ARI). In addition, the RP is always ARI capable (the ARI Forwarding Supported bit is always 1). |
Address Translation Service (ATS) | Supported by all EP PFs/VFs. |
Page Request Service Interface (PRI) | Supported by all EP PFs/VFs. |
User Extensions (Customer VSEC) | Supported by all PFs/VFs. |
Gen3 Receiver Impedance (3.0 ECN) | Supported |
Device Serial Number | Supported |
Completion Timeout Ranges (Device Capabilities 2) | All ranges are supported. |
Data Link Layer Active Reporting Capability (Link Capabilities) | Always supported in RP mode, but not in EP mode. |
Surprise Down Error Reporting Capability (Link Capabilities) | Supported |
PM-PCI Power Management | Only D0/D3 states are supported. |
ASPM (L0s/L1) | Not supported |
Process Address Space ID (PASID) | Supported |
TLP Prefix | Supported, mainly for PASID |
Latency Tolerance Reporting (LTR) | Supported (only for PASID) |
Access Control Services | Supported |
5.2.1.2. Implementation
The VF configuration space is implemented in P-Tile logic, and does not require FPGA fabric resources.
Accessing VF PCIe Information:
Due to the limited number of pins between P-Tile and the FPGA fabric, the PCIe configuration space for VFs is not directly available to the user application. Instead, you can:
- Monitor specific VF registers using the Configuration Intercept Interface (for more details, refer to section Configuration Intercept Interface (EP Only)).
- Read/write specific VF registers using the Hard IP Reconfiguration Interface (for more details, refer to Targeting VF Configuration Space Registers in section Using Direct User Avalon-MM Interface (Byte Access)).
Identifying VFs:
VF IDs are calculated within P-Tile. The sideband signals rx_st_vf_num_o and rx_st_vf_active_o accompany each TLP to the user application to identify the associated VF within the PF.
BDF Assignments:
When SR-IOV is enabled, the ARI capability is always enabled.
The P-Tile IP for PCIe automatically calculates the completer/requester ID on the Transmit side.
The user application must provide the VF and PF information in the Header as shown below:
(For X16, sn is either s0 or s1. For X8, sn is s0).
- tx_st_hdr_sn[127]: must be set to 0
- tx_st_hdr_sn[83]: tx_st_vf_active
- tx_st_hdr_sn[82:80]: tx_st_func_num[2:0]
- tx_st_hdr_sn[95:84]: tx_st_vf_num[11:0]
In the following example, VF3 of PF1 is receiving and sending a request:
For the Receive TLP:
rx_st_func_num_o = 1h indicating that a VF associated with PF1 is making the request.
rx_st_vf_num_o = 3h, and rx_st_vf_active_o = 1 indicating that VF3 of PF1 is the active VF.
For the Transmit TLP:
- tx_st_hdr_sn[83] = 1h
- tx_st_hdr_sn[82:80] = 1h
- tx_st_hdr_sn[95:84] = 3h
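The bit placements above can be exercised with a small packing sketch. The function name is hypothetical; only the bit positions and field widths come from the list above.

```python
# Illustrative packing of the routing fields into the 128-bit tx_st_hdr_sn word,
# using the bit positions listed above: bit 127 must be 0, [83] = vf_active,
# [82:80] = func_num, [95:84] = vf_num.

def pack_tx_hdr(vf_active: int, func_num: int, vf_num: int) -> int:
    hdr = 0                            # bit 127 stays 0
    hdr |= (vf_active & 0x1) << 83     # tx_st_vf_active
    hdr |= (func_num & 0x7) << 80      # tx_st_func_num[2:0]
    hdr |= (vf_num & 0xFFF) << 84      # tx_st_vf_num[11:0]
    return hdr

# VF3 of PF1 sending a request, as in the example above:
hdr = pack_tx_hdr(vf_active=1, func_num=1, vf_num=3)
assert (hdr >> 127) & 0x1 == 0
assert (hdr >> 83) & 0x1 == 1
assert (hdr >> 80) & 0x7 == 1
assert (hdr >> 84) & 0xFFF == 3
```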
5.2.1.2.1. VF Error Flag Interface (for x16/x8 Cores Only)
Because VFs have no AER support, they are still required to generate Non-Fatal error messages. The IP does not generate any error messages itself; it is up to the user application logic to generate the appropriate messages when specific error conditions occur.
Signal Name | Direction | Description | Clock Domain | EP/RP/BP |
---|---|---|---|---|
X16: vf_err_poisonedwrreq_s0/1/2/3_o X8: vf_err_poisonedwrreq_s0/1_o | O | Indicates a Poisoned Write Request is received. | coreclkout_hip | EP |
X16: vf_err_poisonedcompl_s0/1/2/3_o X8: vf_err_poisonedcompl_s0/1_o | O | Indicates a Poisoned Completion is received. | coreclkout_hip | EP |
X16: vf_err_ur_posted_s0/1/2/3_o X8: vf_err_ur_posted_s0/1_o | O | Indicates the IP core received a Posted UR request. | coreclkout_hip | EP |
X16: vf_err_ca_postedreq_s0/1/2/3_o X8: vf_err_ca_postedreq_s0/1_o | O | Indicates the IP core received a Posted CA request. | coreclkout_hip | EP |
X16: vf_err_vf_num_s0/1/2/3_o[10:0] X8: vf_err_vf_num_s0/1_o[10:0] | O | Indicates the VF number for which the error is detected. | coreclkout_hip | EP |
X16: vf_err_func_num_s0/1/2/3_o[2:0] X8: vf_err_func_num_s0/1_o[2:0] | O | Indicates the physical function number associated with the VF that has the error. | coreclkout_hip | EP |
vf_err_overflow_o | O | Indicates a VF error FIFO overflow and the loss of an error report. The overflow can happen when coreclkout_hip runs slower than the default frequency; at the default frequency, the overflow does not happen. | coreclkout_hip | EP |
user_sent_vfnonfatalmsg_s0_i | I | Indicates the user application sent a non-fatal error message in response to a detected error. | coreclkout_hip | EP |
user_vfnonfatalmsg_vfnum_s0_i[10:0] | I | Indicates the VF number for which the error message was generated. This bus is valid when user_sent_vfnonfatalmsg_s0_i is high. | coreclkout_hip | EP |
user_vfnonfatalmsg_func_num_s0_i[2:0] | I | Indicates the PF number associated with the VF with the error. This bus is valid when user_sent_vfnonfatalmsg_s0_i is high. | coreclkout_hip | EP |
5.2.1.3. VF to PF Mapping
VF to PF mapping always starts from the lowest possible PF number. For instance, if the IP has 2 PFs, wherein PF0 has 64 VFs and PF1 has 16 VFs, VF1 to VF64 are mapped to PF0, and VF65 to VF80 are mapped to PF1.
Number of PFs | Number of VFs per PF (PF0/PF1/PF2/PF3/PF4/PF5/PF6/PF7) | Total VFs |
---|---|---|
1 | 8 | 8 |
1 | 16 | 16 |
1 | 32 | 32 |
1 | 64 | 64 |
1 | 128 | 128 |
1 | 256 | 256 |
1 | 512 | 512 |
2 | 16/16 | 32 |
2 | 32/32 | 64 |
2 | 128/128 | 256 |
2 | 256/256 | 512 |
2 | 32/0 | 32 |
2 | 0/32 | 32 |
2 | 64/0 | 64 |
2 | 0/64 | 64 |
2 | 128/0 | 128 |
2 | 0/128 | 128 |
2 | 256/0 | 256 |
2 | 0/256 | 256 |
2 | 512/0 | 512 |
2 | 0/512 | 512 |
2 | 1024/0 | 1024 |
2 | 0/1024 | 1024 |
2 | 2048/0 | 2048 |
2 | 0/2048 | 2048 |
4 | 128/0/0/0 | 128 |
4 | 0/128/0/0 | 128 |
4 | 256/0/0/0 | 256 |
4 | 0/256/0/0 | 256 |
4 | 1024/0/0/0 | 1024 |
4 | 0/1024/0/0 | 1024 |
8 | 256/0/0/0/0/0/0/0 | 256 |
8 | 0/256/0/0/0/0/0/0 | 256 |
For example, the row that shows the combination of four PFs, 256 VFs, and the notation 256/0/0/0 in the Number of VFs per PF column indicates that all 256 VFs are mapped to PF0, while no VF is mapped to PF1, PF2 or PF3.
SR-IOV permutations allow any PF to be assigned the initial VF allocation.
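The mapping rule can be sketched as a small lookup, assuming the 1-based VF numbering used in the example above; the function name is illustrative.

```python
# Sketch of the VF-to-PF mapping rule described above: VFs are numbered from 1
# and assigned to PFs starting from the lowest possible PF number.

def pf_for_vf(vf_num: int, vfs_per_pf) -> int:
    """Return the PF that owns a given (1-based) VF number."""
    base = 0
    for pf, count in enumerate(vfs_per_pf):
        if base < vf_num <= base + count:
            return pf
        base += count
    raise ValueError(f"VF{vf_num} exceeds the configured total of {base} VFs")

# PF0 with 64 VFs and PF1 with 16 VFs, as in the example above:
assert pf_for_vf(64, [64, 16]) == 0   # VF64 is the last VF of PF0
assert pf_for_vf(65, [64, 16]) == 1   # VF65 is the first VF of PF1
```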
5.2.1.4. Function Level Reset (FLR)
Use the FLR interface to reset individual SR-IOV functions. The PCIe* Hard IP supports FLR for both PFs and VFs. If the FLR targets a specific VF, received packets for that VF are no longer valid. The flr_* interface signals are provided at the application interface for this purpose. Assertion of the flr_rcvd* signal indicates that an FLR was received for a particular PF/VF. The application logic must perform its FLR routine and send the completion status back on the flr_completed* interface. The Hard IP waits for the flr_completed* status before re-enabling the function. Until then, the Hard IP responds to all transactions targeting the function being reset with Completions carrying an Unsupported Request (UR) status.
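The handshake can be modeled behaviorally. The class below is an illustrative software sketch of the sequencing described above, not the hardware interface itself; the names mirror the flr_rcvd*/flr_completed* signals but are invented here.

```python
# Behavioral sketch of the FLR handshake: a function enters reset on flr_rcvd,
# answers requests with UR completions while in reset, and is re-enabled once
# the application reports flr_completed.

class FlrTracker:
    def __init__(self):
        self.in_reset = set()          # (pf, vf) pairs currently under FLR

    def flr_rcvd(self, pf: int, vf: int = -1) -> None:
        self.in_reset.add((pf, vf))    # function enters reset (vf=-1 for a PF FLR)

    def flr_completed(self, pf: int, vf: int = -1) -> None:
        self.in_reset.discard((pf, vf))  # application routine done; re-enable

    def completion_status(self, pf: int, vf: int = -1) -> str:
        # Requests to a function in FLR are answered with UR completions;
        # otherwise a Successful Completion (SC) is assumed for illustration.
        return "UR" if (pf, vf) in self.in_reset else "SC"
```

For the PF2 example above: after `flr_rcvd(2)` the tracker reports `"UR"` for PF2 until `flr_completed(2)` is called.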
The following figure shows the timing diagram for an FLR event targeting a PF (PF2 in this example):
Here is the timing diagram for an FLR event targeting a VF:
5.2.2. VirtIO Support
5.2.2.1. VirtIO Supported Features List
- VirtIO devices are implemented as PCI Express devices.
- Supports the VirtIO capability structures for 8 PFs and 2K VFs per EP.
- Configuration Intercept Interface in the P-Tile IP for PCIe (EP mode only) is provided for VirtIO transport.
- Five VirtIO device configuration structures are supported:
- Common configuration
- Notifications
- ISR Status
- Device-specific configuration (optional)
- PCI configuration access
- Location of each structure is specified using a vendor-specific PCI capability located in the PCI configuration space of the device.
- VirtIO capability structure uses little-endian format.
- All fields of the VirtIO capability structure are read-only for the driver by default.
- Supports FLR for PFs and VFs.
- Supports x16 and x8 cores.
- MSI is not supported with VirtIO.
5.2.2.2. Overview
The VirtIO PCI configuration access capability creates an alternative access method to the common configuration, notifications, ISR, and device-specific configuration structure regions. This interface provides a means for the driver to access the VirtIO device region of Physical Functions (PFs) or Virtual Functions (VFs).
VirtIO is an industry standard for software-based virtualization that is supported natively by Linux. In VirtIO, software implements the virtualization stack, whereas in the case of SR-IOV, this stack is implemented mostly in hardware.
Below is the block diagram of the Soft IP which implements the VirtIO capability for PFs and VFs. This Soft IP block is automatically included when the VirtIO feature is enabled in the IP Parameter Editor.
5.2.2.3. Parameters
For a detailed discussion of the VirtIO-related parameters, refer to the section VirtIO Parameters in the Parameters chapter.
5.2.2.4. VirtIO PCI Configuration Access Interface
To access a VirtIO device region, pci_cfg_data provides a window of size cap.length (1, 2 or 4 bytes) into the given cap.bar (0x0 - 0x5) at offset cap.offset (a multiple of cap.length). The detailed interface mapping for the user application logic is shown in the following table.
As for the VirtIO device, upon detecting a driver write access to pci_cfg_data, the user application side's VirtIO device must execute a write access at cap.offset at the BAR selected by cap.bar using the first cap.length bytes from pci_cfg_data. Moreover, upon detecting a driver read access to pci_cfg_data, the user application side's VirtIO device must execute a read access of length cap.length at cap.offset at the BAR selected by cap.bar and store the first cap.length bytes in pci_cfg_data.
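The window semantics can be modeled in software. The class below is a behavioral sketch of the read/write forwarding described above; modeling the BAR regions as plain byte arrays is an assumption for illustration only.

```python
# Behavioral model of the VirtIO PCI configuration access window: the driver
# programs cap.bar / cap.offset / cap.length, then an access to pci_cfg_data
# is forwarded as a cap.length-byte access to the selected BAR region.

class VirtioCfgWindow:
    def __init__(self, bars):
        self.bars = bars               # dict: BAR index -> bytearray (assumed model)
        self.bar, self.offset, self.length = 0, 0, 4

    def write_pci_cfg_data(self, data: bytes) -> None:
        # Driver write: first cap.length bytes land at cap.offset of cap.bar.
        end = self.offset + self.length
        self.bars[self.bar][self.offset:end] = data[:self.length]

    def read_pci_cfg_data(self) -> bytes:
        # Driver read: cap.length bytes fetched from cap.offset of cap.bar.
        return bytes(self.bars[self.bar][self.offset:self.offset + self.length])
```

For example, setting `bar=0, offset=4, length=2` and writing two bytes to the window stores them at bytes 4-5 of BAR0, and a subsequent window read returns the same two bytes.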
Name | Direction | Description | Clock Domain |
---|---|---|---|
virtio_pcicfg_vfaccess_o | O | Indicates the driver access is to a VF. The corresponding Virtual Function is identified from the value of virtio_pcicfg_vfnum_o. | coreclkout_hip |
virtio_pcicfg_vfnum_o[VFNUM_WIDTH-1:0] | O | Indicates the Virtual Function number, within the current Physical Function, that the driver's write or read access is targeting. Validated by virtio_pcicfg_vfaccess_o and by a driver write or read access to pci_cfg_data. | coreclkout_hip |
virtio_pcicfg_pfnum_o[PFNUM_WIDTH-1:0] | O | Indicates the Physical Function number that the driver's write or read access is targeting. Validated by a driver write or read access to pci_cfg_data. | coreclkout_hip |
virtio_pcicfg_bar_o[7:0] | O | Indicates the BAR holding the PCI configuration access structure. The driver sets the BAR to access by writing to cap.bar. Values 0x0 to 0x5 specify a BAR belonging to the function, located beginning at 10h in the PCI Configuration Space. The BAR can be either 32-bit or 64-bit. Validated by a driver write or read access to pci_cfg_data. The corresponding PF or VF is identified from the value of virtio_pcicfg_p/vfnum_o. | coreclkout_hip |
virtio_pcicfg_length_o[31:0] | O | Indicates the length of the structure. The length may include padding, fields unused by the driver, or future extensions. The driver sets the size of the access by writing 1, 2 or 4 to cap.length. Validated by a driver write or read access to pci_cfg_data. The corresponding PF or VF is identified from the value of virtio_pcicfg_p/vfnum_o. | coreclkout_hip |
virtio_pcicfg_baroffset_o[31:0] | O | Indicates where the structure begins relative to the base address associated with the BAR. The driver sets the offset within the BAR by writing to cap.offset. Validated by a driver write or read access to pci_cfg_data. The corresponding PF or VF is identified from the value of virtio_pcicfg_p/vfnum_o. | coreclkout_hip |
virtio_pcicfg_cfgdata_o[31:0] | O | Indicates the data for the BAR access. pci_cfg_data provides a window of size cap.length into the given cap.bar at offset cap.offset. Validated by a driver write or read access to pci_cfg_data. The corresponding PF or VF is identified from the value of virtio_pcicfg_p/vfnum_o. | coreclkout_hip |
virtio_pcicfg_cfgwr_o | O | Indicates a driver write access to pci_cfg_data. The corresponding PF or VF is identified from the value of virtio_pcicfg_p/vfnum_o. | coreclkout_hip |
virtio_pcicfg_cfgrd_o | O | Indicates a driver read access to pci_cfg_data. The corresponding PF or VF is identified from the value of virtio_pcicfg_p/vfnum_o. | coreclkout_hip |
virtio_pcicfg_appvfnum_i[VFNUM_WIDTH-1:0] | I | Indicates the Virtual Function number, within the current Physical Function, that the application config data storage is for. Validated by virtio_pcicfg_rdack_i. | coreclkout_hip |
virtio_pcicfg_apppfnum_i[PFNUM_WIDTH-1:0] | I | Indicates the Physical Function number that the application config data storage is for. Validated by virtio_pcicfg_rdack_i. | coreclkout_hip |
virtio_pcicfg_rdack_i | I | Indicates an application read-access acknowledgment to store the config data in pci_cfg_data. A reasonable ack latency is typically no more than 10 cycles. The corresponding Virtual Function is identified from the value of virtio_pcicfg_appvfnum_i. | coreclkout_hip |
virtio_pcicfg_rdbe_i[3:0] | I | Indicates the application-enabled bytes within virtio_pcicfg_data_i. Validated by virtio_pcicfg_rdack_i. The corresponding Virtual Function is identified from the value of virtio_pcicfg_appvfnum_i. | coreclkout_hip |
virtio_pcicfg_data_i[31:0] | I | Indicates the application data to be stored in the PCI Configuration Access data registers. Validated by virtio_pcicfg_rdack_i and virtio_pcicfg_rdbe_i. The corresponding Virtual Function is identified from the value of virtio_pcicfg_appvfnum_i. | coreclkout_hip |
5.2.2.5. Registers
The following VirtIO capability structure register references apply to each PF and VF. Addresses shown are register addresses.
Capability | Start Byte Address | Last Byte Address | DW Count |
---|---|---|---|
Type0 | 0x00 | 0x3F | 16 |
PM (PF only) | 0x40 | 0x47 | 2 |
VirtIO Common Configuration | 0x48 | 0x57 | 4 |
VirtIO Notifications | 0x58 | 0x6B | 5 |
Reserved | 0x6C | 0x6F | 1 |
PCIe | 0x70 | 0xA3 | 13 |
Reserved | 0xA4 | 0xAF | 3 |
MSIX | 0xB0 | 0xBB | 3 |
VirtIO ISR Status | 0xBC | 0xCB | 4 |
VirtIO Device-Specific Configuration | 0xCC | 0xDB | 4 |
VirtIO PCI Configuration Access | 0xDC | 0xEF | 5 |
Reserved | 0xF0 | 0xFF | 4 |
Address | Name | Description |
---|---|---|
VirtIO Common Configuration Capability Structure | ||
012 | Common Configuration Capability Register | Capability ID, next capability pointer, capability length |
013 | BAR Indicator Register | Lower 8 bits indicate which BAR holds the structure |
014 | BAR Offset Register | Indicates starting address of the structure within the BAR |
015 | Structure Length Register | Indicates length of structure |
VirtIO Notifications Capability Structure | ||
016 | Notifications Capability Register | Capability ID, next capability pointer, capability length |
017 | BAR Indicator Register | Lower 8 bits indicate which BAR holds the structure |
018 | BAR Offset Register | Indicates starting address of the structure within the BAR |
019 | Structure Length Register | Indicates length of structure |
01A | Notify Off Multiplier | Multiplier for queue_notify_off |
VirtIO ISR Status Capability Structure | ||
02F | ISR Status Capability Register | Capability ID, next capability pointer, capability length |
030 | BAR Indicator Register | Lower 8 bits indicate which BAR holds the structure |
031 | BAR Offset Register | Indicates starting address of the structure within the BAR |
032 | Structure Length Register | Indicates length of structure |
VirtIO Device-Specific Capability Structure (Optional) | ||
033 | Device Specific Capability Register | Capability ID, next capability pointer, capability length |
034 | BAR Indicator Register | Lower 8 bits indicate which BAR holds the structure |
035 | BAR Offset Register | Indicates starting address of the structure within the BAR |
036 | Structure Length Register | Indicates length of structure |
VirtIO PCI Configuration Access Structure | ||
037 | PCI Configuration Access Capability Register | Capability ID, next capability pointer, capability length |
038 | BAR Indicator Register | Lower 8 bits indicate which BAR holds the structure |
039 | BAR Offset Register | Indicates starting address of the structure within the BAR |
03A | Structure Length Register | Indicates length of structure |
03B | PCI Configuration Data | Data for BAR access |
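The register addresses in the table above appear to be DW (32-bit word) addresses: multiplying each by 4 reproduces the byte addresses of the capability layout table earlier in this section. This relationship is an inference from the two tables; a quick cross-check:

```python
# Cross-check: DW register addresses x 4 should match the byte-address map above.
def dw_to_byte(dw_addr: int) -> int:
    return dw_addr * 4

assert dw_to_byte(0x012) == 0x48   # VirtIO Common Configuration start
assert dw_to_byte(0x016) == 0x58   # VirtIO Notifications start
assert dw_to_byte(0x02F) == 0xBC   # VirtIO ISR Status start
assert dw_to_byte(0x033) == 0xCC   # VirtIO Device-Specific Configuration start
assert dw_to_byte(0x037) == 0xDC   # VirtIO PCI Configuration Access start
```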
5.2.2.5.1. VirtIO Common Configuration Capability Register (Address: 0x012)
The capability register identifies that this is a vendor-specific capability. It also identifies the structure type.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:24 | Configuration Type | RO | 0x01 |
23:16 | Capability Length | RO | 0x10 |
15:8 | Next Capability Pointer | RO | 0x58 |
7:0 | Capability ID | RO | 0x09 |
5.2.2.5.2. VirtIO Common Configuration BAR Indicator Register (Address: 0x013)
The BAR Indicator field holds the values 0x0 to 0x5, specifying a Base Address Register (BAR) belonging to the function located beginning at 10h in PCI Configuration Space. The BAR is used to map the structure into the memory space. Any other value is reserved for future use.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:24 | Padding | RO | 0x00 |
23:16 | Padding | RO | 0x00 |
15:8 | Padding | RO | 0x00 |
7:0 | BAR Indicator | RO | Settable through Platform Designer |
5.2.2.5.3. VirtIO Common Configuration BAR Offset Register (Address: 0x014)
This register indicates where the structure begins relative to the base address associated with the BAR. The alignment requirements of the offset are indicated in each structure-specific section.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | BAR Offset | RO | Settable through Platform Designer |
5.2.2.5.4. VirtIO Common Configuration Structure Length Register (Address: 0x015)
The length register indicates the length of the structure. The length may include padding, fields unused by the driver, or future extensions.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | Structure Length | RO | Settable through Platform Designer |
5.2.2.5.5. VirtIO Notifications Capability Register (Address: 0x016)
The capability register identifies that this is a vendor-specific capability. It also identifies the structure type.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:24 | Configuration Type | RO | 0x02 |
23:16 | Capability Length | RO | 0x14 |
15:8 | Next Capability Pointer | RO | 0xBC |
7:0 | Capability ID | RO | 0x09 |
5.2.2.5.6. VirtIO Notifications BAR Indicator Register (Address: 0x017)
The BAR Indicator field holds the values 0x0 to 0x5, specifying a Base Address Register (BAR) belonging to the function located beginning at 10h in PCI Configuration Space. The BAR is used to map the structure into memory space. Any other value is reserved for future use.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:24 | Padding | RO | 0x00 |
23:16 | Padding | RO | 0x00 |
15:8 | Padding | RO | 0x00 |
7:0 | BAR Indicator | RO | Settable through Platform Designer |
5.2.2.5.7. VirtIO Notifications BAR Offset Register (Address: 0x018)
This register indicates where the structure begins relative to the base address associated with the BAR. The alignment requirements of the offset are indicated in each structure-specific section.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | BAR Offset | RO | Settable through Platform Designer |
5.2.2.5.8. VirtIO Notifications Structure Length Register (Address: 0x019)
The length register indicates the length of the structure. The length may include padding, fields unused by the driver, or future extensions.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | Structure Length | RO | Settable through Platform Designer |
5.2.2.5.9. VirtIO Notifications Notify Off Multiplier Register (Address: 0x01A)
The notify off multiplier register indicates the multiplier for queue_notify_off in the structure.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | Multiplier for queue_notify_off | RO | Settable through Platform Designer |
5.2.2.5.10. VirtIO ISR Status Capability Register (Address: 0x02F)
The capability register identifies that this is a vendor-specific capability. It also identifies the structure type.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:24 | Configuration Type | RO | 0x03 |
23:16 | Capability Length | RO | 0x10 |
15:8 | Next Capability Pointer | RO | If Device-Specific Capability is present, then points to 0xCC, else points to 0xDC. |
7:0 | Capability ID | RO | 0x09 |
5.2.2.5.11. VirtIO ISR Status BAR Indicator Register (Address: 0x030)
The BAR Indicator field holds the values 0x0 to 0x5, specifying a Base Address Register (BAR) belonging to the function located beginning at 10h in PCI Configuration Space. The BAR is used to map the structure into memory space. Any other value is reserved for future use.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:24 | Padding | RO | 0x00 |
23:16 | Padding | RO | 0x00 |
15:8 | Padding | RO | 0x00 |
7:0 | BAR Indicator | RO | Settable through Platform Designer |
5.2.2.5.12. VirtIO ISR Status BAR Offset Register (Address: 0x031)
This register indicates where the structure begins relative to the base address associated with the BAR. The alignment requirements of the offset are indicated in each structure-specific section.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | BAR Offset | RO | Settable through Platform Designer |
5.2.2.5.13. VirtIO ISR Status Structure Length Register (Address: 0x032)
The length register indicates the length of the structure. The length may include padding, fields unused by the driver, or future extensions.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | Structure Length | RO | Settable through Platform Designer |
5.2.2.5.14. VirtIO Device Specific Capability Register (Address: 0x033)
The capability register identifies that this is a vendor-specific capability. It also identifies the structure type.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:24 | Configuration Type | RO | 0x04 |
23:16 | Capability Length | RO | 0x10 |
15:8 | Next Capability Pointer | RO | If this capability is present, then points to 0xDC. |
7:0 | Capability ID | RO | 0x09 |
5.2.2.5.15. VirtIO Device Specific BAR Indicator Register (Address: 0x034)
The BAR Indicator field holds the values 0x0 to 0x5 specifying a Base Address register (BAR) belonging to the function located beginning at 10h in PCI Configuration Space. The BAR is used to map the structure into memory space. Any other value is reserved for future use.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:24 | Padding | RO | 0x00 |
23:16 | Padding | RO | 0x00 |
15:8 | Padding | RO | 0x00 |
7:0 | BAR Indicator | RO | Settable through Platform Designer |
5.2.2.5.16. VirtIO Device Specific BAR Offset Register (Address: 0x035)
This register indicates where the structure begins relative to the base address associated with the BAR. The alignment requirements of the offset are indicated in each structure-specific section.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | BAR Offset | RO | Settable through Platform Designer |
5.2.2.5.17. VirtIO Device Specific Structure Length Register (Address: 0x036)
The length register indicates the length of the structure. The length may include padding, fields unused by the driver, or future extensions.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | Structure Length | RO | Settable through Platform Designer |
5.2.2.5.18. VirtIO PCI Configuration Access Capability Register (Address: 0x037)
The capability register identifies that this is a vendor-specific capability. It also identifies the structure type.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:24 | Configuration Type | RO | 0x05 |
23:16 | Capability Length | RO | 0x14 |
15:8 | Next Capability Pointer | RO | 0x00 |
7:0 | Capability ID | RO | 0x09 |
5.2.2.5.19. VirtIO PCI Configuration Access BAR Indicator Register (Address: 0x038)
The BAR Indicator field holds the values 0x0 to 0x5 specifying a Base Address register (BAR) belonging to the function located beginning at 10h in PCI Configuration Space. The BAR is used to map the structure into memory space. Any other value is reserved for future use.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:24 | Padding | RO | 0x00 |
23:16 | Padding | RO | 0x00 |
15:8 | Padding | RO | 0x00 |
7:0 | BAR Indicator | RW | Settable through Platform Designer |
5.2.2.5.20. VirtIO PCI Configuration Access BAR Offset Register (Address: 0x039)
This register indicates where the structure begins relative to the base address associated with the BAR. The alignment requirements of the offset are indicated in each structure-specific section.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | BAR Offset | RW | Settable through Platform Designer |
5.2.2.5.21. VirtIO PCI Configuration Access Structure Length Register (Address: 0x03A)
The length register indicates the length of the structure. The length may include padding, fields unused by the driver, or future extensions.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | Structure Length | RW | Settable through Platform Designer |
5.2.2.5.22. VirtIO PCI Configuration Access Data Register (Address: 0x03B)
The PCI configuration data register indicates the data for BAR access.
Bit Location | Description | Access Type | Default Value |
---|---|---|---|
31:0 | PCI Configuration Data | RW | Settable through Platform Designer |
5.3. TLP Bypass Mode
Use TLP Bypass mode when the IP must function as one of the following:
- The upstream port or the downstream port of a switch.
- A custom implementation of a Transaction Layer to meet specific user requirements.
IP Mode | Port Mode |
---|---|
x16 | UP, DN |
x8 | UP/UP, UP/DN, DN/UP, DN/DN |
x4 | UP/UP/UP/UP, DN/DN/DN/DN |
5.3.1. Overview
When the TLP Bypass feature is enabled, the P-Tile Avalon® -ST IP does not process received TLPs internally but outputs them to the user application. This allows the application to implement a custom Transaction Layer.
Note that in TLP Bypass mode, the PCIe Hard IP does not generate or check the ECRC, and does not remove the ECRC from a received TLP that carries one.
The P-tile Avalon® -ST IP in TLP Bypass mode still includes some of the PCIe configuration space registers related to link operation (refer to the Configuration Space Registers chapter for the list of registers).
In TLP Bypass mode, P-Tile still supports the autonomous Hard IP feature: before the FPGA fabric enters user mode, the Hard IP responds to configuration accesses with Completions carrying a Configuration Request Retry Status (CRS).
However, in TLP bypass mode, CvP init and update are not supported.
5.3.2. Register Settings for the TLP Bypass Mode
When TLP Bypass mode is enabled, some error detection is still performed in the Physical and Data Link Layers inside the Hard IP. Per the PCIe specification, the Hard IP must report these errors in the configuration space registers (in the AER Capability Structure). The P-Tile IP for PCIe includes two registers, TLPBYPASS_ERR_EN and TLPBYPASS_ERR_STATUS, to report errors detected while in TLP Bypass mode.
TLPBYPASS_ERR_EN and TLPBYPASS_ERR_STATUS are part of the configuration and status register.
5.3.2.1. TLPBYPASS_ERR_EN (Address 0x104194)
This register allows you to enable or disable error reporting. When an enable bit is cleared, the corresponding TLPBYPASS_ERR_STATUS bit is not set when the error is detected.
Name | Bits | Reset Value | Access Mode | Register Description |
---|---|---|---|---|
Reserved | [31:20] | 12'b0 | RO | Reserved |
k_cfg_uncor_internal_err_sts_en | [19] | 1'b1 | RW | Enable error indication on serr_out_o for Uncorrectable Internal Error. |
k_cfg_corrected_internal_err_sts_en | [18] | 1'b1 | RW | Enable error indication on serr_out_o for Corrected Internal Error. |
k_cfg_rcvr_overflow_err_sts_en | [17] | 1'b1 | RW | Enable error indication on serr_out_o for Receiver Overflow Error. |
k_cfg_fc_protocol_err_sts_en | [16] | 1'b1 | RW | Enable error indication on serr_out_o for Flow Control Protocol Error. |
k_cfg_mlf_tlp_err_sts_en | [15] | 1'b1 | RW | Enable error indication on serr_out_o for Malformed TLP Error. |
k_cfg_surprise_down_err_sts_en | [14] | 1'b1 | RW | Enable error indication on serr_out_o for Surprise Down Error. |
k_cfg_dl_protocol_err_sts_en | [13] | 1'b1 | RW | Enable error indication on serr_out_o for Data Link Protocol Error. |
k_cfg_replay_number_rollover_err_sts_en | [12] | 1'b1 | RW | Enable error indication on serr_out_o for REPLAY_NUM Rollover Error. |
k_cfg_replay_timer_timeout_err_st_en | [11] | 1'b1 | RW | Enable error indication on serr_out_o for Replay Timer Timeout Error. |
k_cfg_bad_dllp_err_sts_en | [10] | 1'b1 | RW | Enable error indication on serr_out_o for Bad DLLP Error. |
k_cfg_bad_tlp_err_sts_en | [9] | 1'b1 | RW | Enable error indication on serr_out_o for Bad TLP Error. |
k_cfg_rcvr_err_sts_en | [8] | 1'b1 | RW | Enable error indication on serr_out_o for Receiver Error. |
Reserved | [7:1] | 7'b0 | RO | Reserved |
k_cfg_ecrc_err_sts_en | [0] | 1'b1 | RW | Enable error indication on serr_out_o for ECRC Error. |
5.3.2.2. TLPBYPASS_ERR_STATUS (Address 0x104190)
When an error is detected, Intel recommends that you read the PF0 AER register inside P-Tile to get detailed information about the error. To clear the previous error status, you need to clear TLPBYPASS_ERR_STATUS and the corresponding correctable and uncorrectable error status registers in the AER capability structure. After doing that, you can get the new error update from this register.
Name | Bits | Reset Value | Access Mode | Register Description |
---|---|---|---|---|
Reserved | [31:20] | 12’b0 | RO | Reserved |
cfg_uncor_internal_err_sts | [19] | 1'b0 | W1C | Uncorrectable Internal Error |
cfg_corrected_internal_err_sts | [18] | 1'b0 | W1C | Corrected Internal Error |
cfg_rcvr_overflow_err_sts | [17] | 1'b0 | W1C | Receiver Overflow Error |
cfg_fc_protocol_err_sts | [16] | 1'b0 | W1C | Flow Control Protocol Error |
cfg_mlf_tlp_err_sts | [15] | 1'b0 | W1C | Malformed TLP Error |
cfg_surprise_down_err_sts | [14] | 1'b0 | W1C | Surprise Down Error. Available in downstream mode only. |
cfg_dl_protocol_err_sts | [13] | 1'b0 | W1C | Data Link Protocol Error |
cfg_replay_number_rollover_err_sts | [12] | 1'b0 | W1C | REPLAY_NUM Rollover Error |
cfg_replay_timer_timeout_err_sts | [11] | 1'b0 | W1C | Replay Timer Timeout Error |
cfg_bad_dllp_err_sts | [10] | 1'b0 | W1C | Bad DLLP Error |
cfg_bad_tlp_err_sts | [9] | 1'b0 | W1C | Bad TLP Error |
cfg_rcvr_err_sts | [8] | 1'b0 | W1C | Receiver Error |
Reserved | [7:1] | 7'b0 | RO | Reserved |
cfg_ecrc_err_sts | [0] | 1'b0 | W1C | ECRC Error |
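The TLPBYPASS_ERR_STATUS bits above are write-1-to-clear (W1C). A minimal Python model of that semantics (illustrative only, not P-Tile RTL) shows why you must write a 1 to each pending bit to clear it:

```python
# Illustrative model of W1C register behavior: writing 1 to a bit clears
# that status bit; writing 0 leaves it unchanged.

def w1c_write(status: int, write_value: int) -> int:
    """Apply a write-1-to-clear update to a status register value."""
    return status & ~write_value

# Hardware sets bit 15 (Malformed TLP) and bit 9 (Bad TLP).
status = (1 << 15) | (1 << 9)

# Clearing only the Malformed TLP bit leaves Bad TLP still pending.
status = w1c_write(status, 1 << 15)
assert status == (1 << 9)

# Writing all 1's clears every pending error at once.
status = w1c_write(status, 0xFFFFFFFF)
assert status == 0
```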
5.3.3. User Avalon -MM Interface
For more details on the signals in this interface, refer to the section Hard IP Reconfiguration Interface.
In TLP Bypass mode, the majority of the PCIe standard registers are implemented in the user's logic outside of the P-Tile Avalon® -ST IP. However, the following capabilities remain implemented inside the Hard IP:
- Power management capability
- PCI Express capability
- Secondary PCI Express capability
- Data link feature extended capability
- Physical layer 16.0GT/s extended capability
- Lane margining at the receiver extended capability
- Advanced error reporting capability
The application can only access PCIe controller registers through the User Avalon® -MM interface.
Capability | Comments |
---|---|
Power Management Capability | Need to write back since it is required to trigger a PCI-PM entry. |
PCI Express Capability | All the PCIe capabilities, control and status registers are for configuring the device. Write-back is required. |
Secondary PCI Express Capability | Secondary PCIe Capability is required for configuring the device. |
Data Link Feature Extended Capability | Data Link Capability is device specific. |
Physical Layer 16.0 GT/s Extended Capability | Physical Layer 16G Capability is device specific. |
Lane Margining at the Receiver Extended Capability | Margining Extended Capability is device specific. |
Advanced Error Reporting Capability | Write-back to error status registers is required for TLP Bypass. |
5.3.4. Avalon -ST Interface
For more details on the signals in this interface, refer to the section Avalon-ST Interface.
5.3.4.1. Configuration TLP
The P-Tile IP forwards any received Type0/1 Configuration TLP to the Avalon® -ST RX streaming interface. The user logic is responsible for responding with a Completion TLP carrying a Completion code of Successful Completion (SC), Unsupported Request (UR), Configuration Request Retry Status (CRS), or Completer Abort (CA).
If a Configuration TLP needs to update a register in the PCIe configuration space in the P-Tile PCIe Hard IP, you need to use the User Avalon® -MM interface.
The application must prevent link programming side effects, such as entry into low-power states, before sending the Completion associated with the request. The application logic can check the TX FIFO empty flag in tx_cdts_limit_o after the Completion enters the TX streaming interface to confirm that the TLP has been sent. For more details on the User Avalon® -MM interface, refer to the section Hard IP Reconfiguration Interface (User Avalon® -MM Interface).
5.3.4.2. Transmit Interface
All TLPs transmitted by the application through the TX streaming interface are sent out as-is, without any tracking for completion. The P-Tile IP for PCIe does not perform any check on the TLPs. Your application logic is responsible for sending TLPs that comply with the PCIe specifications.
5.3.4.3. Receive Interface
All TLPs received by the IP are forwarded to the application through the RX streaming interface (except Malformed TLPs).
Refer to the appendix Packets Forwarded to the User Application in TLP Bypass Mode for detailed information.
The Hard IP detects all PCIe protocol errors that determine whether a received TLP is good, and communicates them to the user logic so it can take the appropriate error logging and escalation actions. The IP does not generate any error messages internally, since this is the responsibility of the user logic.
5.3.4.4. Malformed TLP
In TLP Bypass mode, a malformed TLP is dropped in the P-Tile IP for PCIe and its event is logged in the AER capability registers. P-Tile also notifies you of this event by asserting the serr_out_o signal.
Refer to the PCI Express Base Specification for the definition of a malformed TLP.
5.3.4.5. ECRC
In TLP bypass mode, the ECRC is not generated or stripped by the P-Tile Avalon® -ST IP for PCIe.
6. Testbench
This chapter introduces the testbench for an Endpoint design example and a test driver module. You can create this design example using the design flows described in the Quick Start Guide chapter of the Intel FPGA P-Tile Avalon streaming IP for PCI Express Design Example User Guide.
The testbench in this design example simulates up to a Gen4 x16 variant.
When configured as an Endpoint variation, the testbench instantiates a design example with a P-Tile Endpoint and a Root Port BFM containing a second P-Tile (configured as a Root Port) to interface with the Endpoint. The Root Port BFM provides the following functions:
- A configuration routine that sets up all the basic configuration registers in the Endpoint. This configuration allows the Endpoint application to be the target and initiator of PCI Express transactions.
- A Verilog HDL procedure interface to initiate PCI Express* transactions to the Endpoint.
This testbench simulates the scenario of a single Root Port talking to a single Endpoint.
The testbench uses a test driver module, altpcietb_bfm_rp_gen4_x16.sv, to initiate the configuration and memory transactions. At startup, the test driver module displays information from the Root Port and Endpoint Configuration Space registers, so that you can correlate to the parameters you specified using the Parameter Editor.
Your Application Layer design may need to handle the following scenarios, which are not possible to create with the Intel testbench and the Root Port BFM or which result from limitations of the example design:
- It is unable to generate or receive Vendor Defined Messages. Some systems generate Vendor Defined Messages. The Hard IP block simply passes these messages on to the Application Layer. Consequently, you should make the decision, based on your application, whether to design the Application Layer to process them.
- It can only handle received read requests that are less than or equal to the Maximum payload size option specified on the Device tab of the PCI Express/PCI Capabilities section in the Parameter Editor. Many systems are capable of handling larger read requests that are then returned in multiple completions.
- It always returns a single completion for every read request. Some systems split completions on every 64-byte address boundary.
- It always returns completions in the same order the read requests were issued. Some systems generate the completions out-of-order.
- It is unable to generate zero-length read requests that some systems generate as flush requests following some write transactions. The Application Layer must be capable of generating the completions to the zero-length read requests.
- It uses fixed credit allocation.
- It does not support parity.
- It does not support multi-function designs.
- It incorrectly responds to Type 1 vendor-defined messages with CplD packets.
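As a concrete illustration of the completion-splitting limitation above, some systems return a read in multiple completions, splitting at every 64-byte address boundary, while this BFM always returns a single completion. A short Python sketch (illustrative only, not part of the BFM) of that boundary splitting:

```python
# Split a (address, length) read into pieces that never cross a 64-byte
# address boundary, mimicking systems that split completions there.

def split_at_64b(addr: int, length: int) -> list:
    """Return (address, length) pieces, none crossing a 64-byte boundary."""
    pieces = []
    while length > 0:
        chunk = min(length, 64 - (addr % 64))  # bytes left in this 64B window
        pieces.append((addr, chunk))
        addr += chunk
        length -= chunk
    return pieces

# A 128-byte read starting at 0x30 crosses two boundaries -> three pieces.
assert split_at_64b(0x30, 128) == [(0x30, 16), (0x40, 64), (0x80, 48)]
```

An Application Layer that only expects single completions, as this BFM generates, may therefore misbehave on hardware that splits at these boundaries.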
6.1. Endpoint Testbench
The example design and testbench are dynamically generated based on the configuration that you choose for the P-Tile IP for PCIe. The testbench uses the parameters that you specify in the Parameter Editor in Intel® Quartus® Prime.
This testbench simulates up to a ×16 PCI Express link using the serial PCI Express interface. The testbench design does not allow more than one PCI Express link to be simulated at a time. The following figure presents a high-level view of the design example.
The top-level of the testbench instantiates the following main modules:
- altpcietb_bfm_rp_gen4_x16.sv: This is the Root Port PCIe* BFM.
  Directory path: <project_dir>/intel_pcie_ptile_ast_0_example_design/pcie_ed_tb/ip/pcie_ed_tb/dut_pcie_tb_ip/intel_pcie_ptile_tbed_<ver>/sim
- pcie_ed_dut.ip: This is the Endpoint design with the parameters that you specify.
  Directory path: <project_dir>/intel_pcie_ptile_ast_0_example_design/ip/pcie_ed
- pcie_ed_pio0.ip: This module is a target and initiator of transactions for the PIO design example.
  Directory path: <project_dir>/intel_pcie_ptile_ast_0_example_design/ip/pcie_ed
- pcie_ed_sriov0.ip: This module is a target and initiator of transactions for the SR-IOV design example.
  Directory path: <project_dir>/intel_pcie_ptile_ast_0_example_design/ip/pcie_ed
In addition, the testbench has routines that perform the following tasks:
- Generates the reference clock for the Endpoint at the required frequency.
- Provides a PCI Express reset at start up.
The SR-IOV design example testbench supports up to two Physical Functions (PFs) and 32 Virtual Functions (VFs) per PF.
For more details on the PIO design example testbench and SR-IOV design example testbench, refer to the Intel FPGA P-Tile Avalon® streaming IP for PCI Express Design Example User Guide.
6.2. Test Driver Module
The test driver module, intel_pcie_ptile_tbed_hwtcl.v, instantiates the top-level BFM, altpcietb_bfm_top_rp.v.
The top-level BFM completes the following tasks:
- Instantiates the driver and monitor.
- Instantiates the Root Port BFM.
- Instantiates the serial interface.
The configuration module, altpcietb_g3bfm_configure.v, performs the following tasks:
- Configures and assigns the BARs.
- Configures the Root Port and Endpoint.
- Displays comprehensive Configuration Space, BAR, MSI, MSI-X, and AER settings.
6.3. Root Port BFM
The basic Root Port BFM provides a Verilog HDL task‑based interface to request transactions to issue on the PCI Express link. The Root Port BFM also handles requests received from the PCI Express link. The following figure shows the major modules in the Root Port BFM.
These modules implement the following functionality:
- BFM Log Interface, altpcietb_g3bfm_log.v and altpcietb_bfm_rp_gen3_x8.sv: The BFM Log Interface provides routines for writing commonly formatted messages to the simulator standard output and optionally to a log file. It also provides controls that stop simulations on errors.
- BFM Read/Write Request Functions, altpcietb_bfm_rp_gen3_x8.sv: These functions provide the basic BFM calls for PCI Express read and write requests.
- BFM Configuration Functions, altpcietb_g3bfm_configure.v : These functions provide the BFM calls to request a configuration of the PCI Express link and the Endpoint Configuration Space registers.
- BFM shared memory, altpcietb_g3bfm_shmem.v: This module provides the Root Port BFM shared memory.
It implements the following functionality:
- Provides data for TX write operations
- Provides data for RX read operations
- Receives data for RX write operations
- Receives data for received completions
- BFM Request Interface, altpcietb_g3bfm_req_intf.v: This interface provides the low-level interface between the altpcietb_g3bfm_rdwr and altpcietb_g3bfm_configure procedures or functions and the Root Port RTL Model. This interface stores a write-protected data structure containing the sizes and values programmed in the BAR registers of the Endpoint. It also stores other critical data used for internal BFM management.
- altpcietb_g3bfm_rdwr.v: This module contains the low-level read and write tasks.
- Avalon‑ST Interfaces, altpcietb_g3bfm_vc_intf_ast_common.v: These interface modules handle the Root Port interface model. They take requests from the BFM request interface and generate the required PCI Express transactions. They handle completions received from the PCI Express link and notify the BFM request interface when requests are complete. Additionally, they handle any requests received from the PCI Express link, and store or fetch data from the shared memory before generating the required completions.
In the PIO design example, the apps_type_hwtcl parameter is set to 3. The tests run under this parameter value are defined in ebfm_cfg_rp_ep_rootport, find_mem_bar and downstream_loop. The configuration step sets up the following:
- Root port memory allocation
- Root port configuration space (base limit, bus number, etc.)
- Endpoint configuration (BAR, Bus Master enable, maxpayload size, etc.)
The functions find_mem_bar and downstream_loop in altpcietb_bfm_rp_gen3_x8.sv return the BAR implemented and perform the memory Write and Read accesses to the BAR, respectively.
6.3.1. BFM Memory Map
The BFM shared memory is 2 MB. It maps to the first 2 MB of I/O space and also to the first 2 MB of memory space. When the Endpoint application generates an I/O or memory transaction in this range, the BFM reads or writes the shared memory.
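A trivial Python model (illustrative only) of this decode, where any address in the first 2 MB of either space targets the shared memory:

```python
# Model of the BFM shared-memory decode: the first 2 MB of both I/O
# and memory space map onto the 2 MB BFM shared memory.

BFM_SHMEM_SIZE = 2 * 1024 * 1024  # 2 MB

def targets_shared_memory(addr: int) -> bool:
    """True if an I/O or memory transaction at addr hits BFM shared memory."""
    return 0 <= addr < BFM_SHMEM_SIZE

assert targets_shared_memory(0x1FFFFF)       # last byte of the 2 MB window
assert not targets_shared_memory(0x200000)   # first byte above it
```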
6.3.2. Configuration Space Bus and Device Numbering
Enumeration assigns the Root Port interface device number 0 on internal bus number 0. Use the ebfm_cfg_rp_ep procedure to assign the Endpoint to any device number on any bus number (greater than 0). The specified bus number is the secondary bus in the Root Port Configuration Space.
6.3.3. Configuration of Root Port and Endpoint
Before you issue transactions to the Endpoint, you must configure the Root Port and Endpoint Configuration Space registers.
The ebfm_cfg_rp_ep procedure in altpcietb_g3bfm_configure.v executes the following steps to initialize the Configuration Space:
- Sets the Root Port Configuration Space to enable the Root Port to send transactions on the PCI Express link.
- Sets the Root Port and Endpoint PCI Express Capability Device Control registers as follows:
- Disables Error Reporting in both the Root Port and Endpoint. The BFM does not have error handling capability.
- Enables Relaxed Ordering in both Root Port and Endpoint.
- Enables Extended Tags for the Endpoint if the Endpoint has that capability.
- Disables Phantom Functions, Aux Power PM, and No Snoop in both the Root Port and Endpoint.
- Sets the Max Payload Size to the value that the Endpoint supports because the Root Port supports the maximum payload size.
- Sets the Root Port Max Read Request Size to 4 KB because the example Endpoint design supports breaking the read into as many completions as necessary.
- Sets the Endpoint Max Read Request Size equal to the Max Payload Size because the Root Port does not support breaking the read request into multiple completions.
- Assigns values to all the Endpoint BAR registers. The BAR addresses are assigned by the algorithm outlined below.
- I/O BARs are assigned smallest to largest starting just above the ending address of the BFM shared memory in I/O space and continuing as needed throughout a full 32-bit I/O space.
- The 32-bit non-prefetchable memory BARs are assigned smallest to largest, starting just above the ending address of the BFM shared memory in memory space and continuing as needed throughout a full 32-bit memory space.
- The value of the addr_map_4GB_limit input to the ebfm_cfg_rp_ep procedure controls the assignment of the 32-bit prefetchable and 64-bit prefetchable memory BARs. The default value of addr_map_4GB_limit is 0.
If the addr_map_4GB_limit input to the ebfm_cfg_rp_ep procedure is set to 0, then the ebfm_cfg_rp_ep procedure assigns the 32‑bit prefetchable memory BARs largest to smallest, starting at the top of 32-bit memory space and continuing as needed down to the ending address of the last 32-bit non-prefetchable BAR.
However, if the addr_map_4GB_limit input is set to 1, the address map is limited to 4 GB. The ebfm_cfg_rp_ep procedure assigns 32-bit and 64-bit prefetchable memory BARs largest to smallest, starting at the top of the 32-bit memory space and continuing as needed down to the ending address of the last 32-bit non-prefetchable BAR.
- If the addr_map_4GB_limit input to the ebfm_cfg_rp_ep procedure is set to 0, then the ebfm_cfg_rp_ep procedure assigns the 64-bit prefetchable memory BARs smallest to largest, starting at the 4 GB address and assigning memory in ascending order above the 4 GB limit throughout the full 64-bit memory space.
If the addr_map_4GB_limit input to the ebfm_cfg_rp_ep procedure is set to 1, the ebfm_cfg_rp_ep procedure assigns the 32-bit and the 64-bit prefetchable memory BARs largest to smallest, starting at the 4 GB address and assigning memory in descending order below the 4 GB address as needed, down to the ending address of the last 32-bit non-prefetchable BAR.
The above algorithm cannot always assign values to all BARs when there are a few very large (1 GB or greater) 32-bit BARs. Although assigning addresses to all BARs may be possible, a more complex algorithm would be required to effectively assign these addresses. However, such a configuration is unlikely to be useful in real systems. If the procedure is unable to assign the BARs, it displays an error message and stops the simulation.
- Based on the above BAR assignments, the ebfm_cfg_rp_ep procedure assigns the Root Port Configuration Space address windows to encompass the valid BAR address ranges.
- The ebfm_cfg_rp_ep procedure enables master transactions, memory address decoding, and I/O address decoding in the Endpoint PCIe* control register.
The ebfm_cfg_rp_ep procedure also sets up a bar_table data structure in BFM shared memory that lists the sizes and assigned addresses of all Endpoint BARs. This area of BFM shared memory is write-protected. Consequently, application logic write accesses to this area cause a fatal simulation error.
BFM procedure calls to generate full PCIe* addresses for read and write requests to particular offsets from a BAR use this data structure. This procedure allows the testbench code that accesses the Endpoint application logic to use offsets from a BAR and avoid tracking specific addresses assigned to the BAR. The following table shows how to use those offsets.
Offset (Bytes) | Description |
---|---|
+0 | PCI Express address in BAR0 |
+4 | PCI Express address in BAR1 |
+8 | PCI Express address in BAR2 |
+12 | PCI Express address in BAR3 |
+16 | PCI Express address in BAR4 |
+20 | PCI Express address in BAR5 |
+24 | PCI Express address in Expansion ROM BAR |
+28 | Reserved |
+32 | BAR0 read back value after being written with all 1’s (used to compute size) |
+36 | BAR1 read back value after being written with all 1’s |
+40 | BAR2 read back value after being written with all 1’s |
+44 | BAR3 read back value after being written with all 1’s |
+48 | BAR4 read back value after being written with all 1’s |
+52 | BAR5 read back value after being written with all 1’s |
+56 | Expansion ROM BAR read back value after being written with all 1’s |
+60 | Reserved |
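The read-back values at offsets +32 through +56 let driver code compute each BAR's size. A short Python helper shows the standard calculation; it is illustrative only and assumes a 32-bit memory BAR, whose low 4 bits are attribute bits rather than address bits.

```python
# Compute a 32-bit memory BAR's size from the value read back after the
# BAR is written with all 1's: mask off the 4 attribute bits, then take
# the two's complement of the remaining base-address mask.

def bar_size_from_readback(readback: int) -> int:
    """Size in bytes of a 32-bit memory BAR, given its all-1's readback."""
    base_mask = readback & 0xFFFFFFF0        # clear memory attribute bits
    return (~base_mask + 1) & 0xFFFFFFFF     # two's complement -> size

# A 1 MB memory BAR reads back as 0xFFF00000 (low attribute bits zero).
assert bar_size_from_readback(0xFFF00000) == 0x100000
```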
The configuration routine does not configure any advanced PCI Express capabilities such as the AER capability.
Besides the ebfm_cfg_rp_ep procedure in altpcietb_bfm_rp_gen3_x8.sv, routines to read and write Endpoint Configuration Space registers directly are available in the Verilog HDL include file. After the ebfm_cfg_rp_ep procedure runs, the PCI Express I/O and Memory Spaces have the layout shown in the following three figures. The memory space layout depends on the value of the addr_map_4GB_limit input parameter. The following figure shows the resulting memory space map when the addr_map_4GB_limit is 1.
The following figure shows the resulting memory space map when the addr_map_4GB_limit is 0.
The following figure shows the I/O address space.
6.3.4. Issuing Read and Write Transactions to the Application Layer
The Root Port Application Layer issues read and write transactions by calling one of the ebfm_bar procedures in altpcietb_g3bfm_rdwr.v. The following procedures and functions are available in that Verilog HDL include file:
- ebfm_barwr: writes data from BFM shared memory to an offset from a specific Endpoint BAR. This procedure returns as soon as the request has been passed to the VC interface module for transmission.
- ebfm_barwr_imm: writes a maximum of four bytes of immediate data (passed in a procedure call) to an offset from a specific Endpoint BAR. This procedure returns as soon as the request has been passed to the VC interface module for transmission.
- ebfm_barrd_wait: reads data from an offset of a specific Endpoint BAR and stores it in BFM shared memory. This procedure blocks waiting for the completion data to be returned before returning control to the caller.
- ebfm_barrd_nowt: reads data from an offset of a specific Endpoint BAR and stores it in the BFM shared memory. This procedure returns as soon as the request has been passed to the VC interface module for transmission, allowing subsequent reads to be issued in the interim.
These routines take as parameters a BAR number to access the memory space and the BFM shared memory address of the bar_table data structure that was set up by the ebfm_cfg_rp_ep procedure. (Refer to Configuration of Root Port and Endpoint.) Using these parameters simplifies the BFM test driver routines that access an offset from a specific BAR and eliminates calculating the addresses assigned to the specified BAR.
The Root Port BFM does not support accesses to Endpoint I/O space BARs.
6.4. BFM Procedures and Functions
The BFM includes procedures, functions, and tasks to drive Endpoint application testing. It also includes procedures to run the chaining DMA design example.
The BFM read and write procedures read and write data to BFM shared memory, Endpoint BARs, and specified configuration registers. The procedures and functions are implemented in Verilog HDL. They support issuing memory and configuration transactions on the PCI Express link.
6.4.1. ebfm_barwr Procedure
The ebfm_barwr procedure writes a block of data from BFM shared memory to an offset from the specified Endpoint BAR. The length can be longer than the configured MAXIMUM_PAYLOAD_SIZE. The procedure breaks the request up into multiple transactions as needed. This routine returns as soon as the last transaction has been accepted by the VC interface module.
Syntax: ebfm_barwr(bar_table, bar_num, pcie_offset, lcladdr, byte_len, tclass)

Argument | Description |
---|---|
bar_table | Address of the Endpoint bar_table structure in BFM shared memory. The bar_table structure stores the address assigned to each BAR so that the driver code does not need to be aware of the actual assigned addresses, only the application-specific offsets from the BAR. |
bar_num | Number of the BAR used with pcie_offset to determine the PCI Express address. |
pcie_offset | Address offset from the BAR base. |
lcladdr | BFM shared memory address of the data to be written. |
byte_len | Length, in bytes, of the data written. Can be 1 to the minimum of the bytes remaining in the BAR space or BFM shared memory. |
tclass | Traffic class used for the PCI Express transaction. |
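Because byte_len can exceed the configured MAXIMUM_PAYLOAD_SIZE, the procedure issues multiple transactions. A minimal Python model of that chunking (not the BFM code itself, just an illustration of the splitting behavior):

```python
# Model how a long write is split into transactions no larger than the
# maximum payload size; the last transaction carries the remainder.

def split_write(byte_len: int, max_payload: int) -> list:
    """Return the payload length of each transaction for a byte_len write."""
    lengths = []
    while byte_len > 0:
        chunk = min(byte_len, max_payload)
        lengths.append(chunk)
        byte_len -= chunk
    return lengths

# A 600-byte write with a 256-byte max payload needs three transactions.
assert split_write(600, 256) == [256, 256, 88]
```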
6.4.2. ebfm_barwr_imm Procedure
The ebfm_barwr_imm procedure writes up to four bytes of data to an offset from the specified Endpoint BAR.
Syntax: ebfm_barwr_imm(bar_table, bar_num, pcie_offset, imm_data, byte_len, tclass)

Argument | Description |
---|---|
bar_table | Address of the Endpoint bar_table structure in BFM shared memory. The bar_table structure stores the address assigned to each BAR so that the driver code does not need to be aware of the actual assigned addresses, only the application-specific offsets from the BAR. |
bar_num | Number of the BAR used with pcie_offset to determine the PCI Express address. |
pcie_offset | Address offset from the BAR base. |
imm_data | Data to be written. This argument is reg [31:0]. The bits written depend on byte_len: a length of 1 writes bits [7:0], 2 writes bits [15:0], 3 writes bits [23:0], and 4 writes bits [31:0]. |
byte_len | Length of the data to be written in bytes. Maximum length is 4 bytes. |
tclass | Traffic class to be used for the PCI Express transaction. |
6.4.3. ebfm_barrd_wait Procedure
The ebfm_barrd_wait procedure reads a block of data from the offset of the specified Endpoint BAR and stores it in BFM shared memory. The length can be longer than the configured maximum read request size; the procedure breaks the request up into multiple transactions as needed. This procedure waits until all of the completion data is returned and places it in shared memory.
Syntax: ebfm_barrd_wait(bar_table, bar_num, pcie_offset, lcladdr, byte_len, tclass)

Argument | Description |
---|---|
bar_table | Address of the Endpoint bar_table structure in BFM shared memory. The bar_table structure stores the address assigned to each BAR so that the driver code does not need to be aware of the actual assigned addresses, only the application-specific offsets from the BAR. |
bar_num | Number of the BAR used with pcie_offset to determine the PCI Express address. |
pcie_offset | Address offset from the BAR base. |
lcladdr | BFM shared memory address where the read data is stored. |
byte_len | Length, in bytes, of the data to be read. Can be 1 to the minimum of the bytes remaining in the BAR space or BFM shared memory. |
tclass | Traffic class used for the PCI Express transaction. |
6.4.4. ebfm_barrd_nowt Procedure
The ebfm_barrd_nowt procedure reads a block of data from the offset of the specified Endpoint BAR and stores the data in BFM shared memory. The length can be longer than the configured maximum read request size; the procedure breaks the request up into multiple transactions as needed. This routine returns as soon as the last read transaction has been accepted by the VC interface module, allowing subsequent reads to be issued immediately.
Syntax: ebfm_barrd_nowt(bar_table, bar_num, pcie_offset, lcladdr, byte_len, tclass)

Argument | Description |
---|---|
bar_table | Address of the Endpoint bar_table structure in BFM shared memory. |
bar_num | Number of the BAR used with pcie_offset to determine the PCI Express address. |
pcie_offset | Address offset from the BAR base. |
lcladdr | BFM shared memory address where the read data is stored. |
byte_len | Length, in bytes, of the data to be read. Can be 1 to the minimum of the bytes remaining in the BAR space or BFM shared memory. |
tclass | Traffic class to be used for the PCI Express transaction. |
6.4.5. ebfm_cfgwr_imm_wait Procedure
The ebfm_cfgwr_imm_wait procedure writes up to four bytes of data to the specified configuration register. This procedure waits until the write completion has been returned.
Syntax: ebfm_cfgwr_imm_wait(bus_num, dev_num, fnc_num, regb_ad, regb_ln, imm_data, compl_status)

Argument | Description |
---|---|
bus_num | PCI Express bus number of the target device. |
dev_num | PCI Express device number of the target device. |
fnc_num | Function number in the target device to be accessed. |
regb_ad | Byte-specific address of the register to be written. |
regb_ln | Length, in bytes, of the data written. Maximum length is four bytes. The regb_ln and regb_ad arguments cannot cross a DWORD boundary. |
imm_data | Data to be written. This argument is reg [31:0]. The bits written depend on regb_ln: a length of 1 writes bits [7:0], 2 writes bits [15:0], 3 writes bits [23:0], and 4 writes bits [31:0]. |
compl_status | Completion status for the configuration transaction. This argument is reg [2:0] and returns the completion status as specified in the PCI Express specification: 3'b000 = SC (Successful Completion), 3'b001 = UR (Unsupported Request), 3'b010 = CRS (Configuration Request Retry Status), 3'b100 = CA (Completer Abort). |
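The DWORD-boundary restriction on regb_ad and regb_ln can be checked with a small illustrative Python helper (not part of the BFM): the access is legal only if the bytes addressed all fall within one 4-byte DWORD.

```python
# Check whether a byte address/length pair spills across a DWORD
# (4-byte) boundary, which the ebfm_cfg* procedures do not allow.

def crosses_dword_boundary(regb_ad: int, regb_ln: int) -> bool:
    """True if the access spans two DWORDs and is therefore illegal."""
    return (regb_ad % 4) + regb_ln > 4

assert not crosses_dword_boundary(0x62, 2)  # bytes 2-3 of one DWORD: legal
assert crosses_dword_boundary(0x63, 2)      # spills into the next DWORD
```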
6.4.6. ebfm_cfgwr_imm_nowt Procedure
The ebfm_cfgwr_imm_nowt procedure writes up to four bytes of data to the specified configuration register. This procedure returns as soon as the VC interface module accepts the transaction, allowing other writes to be issued in the interim. Use this procedure only when successful completion status is expected.
Syntax: ebfm_cfgwr_imm_nowt(bus_num, dev_num, fnc_num, regb_ad, regb_ln, imm_data)

Argument | Description |
---|---|
bus_num | PCI Express bus number of the target device. |
dev_num | PCI Express device number of the target device. |
fnc_num | Function number in the target device to be accessed. |
regb_ad | Byte-specific address of the register to be written. |
regb_ln | Length, in bytes, of the data written. Maximum length is four bytes. The regb_ln and regb_ad arguments cannot cross a DWORD boundary. |
imm_data | Data to be written. This argument is reg [31:0]. The bits written depend on regb_ln: a length of 1 writes bits [7:0], 2 writes bits [15:0], 3 writes bits [23:0], and 4 writes bits [31:0]. |
6.4.7. ebfm_cfgrd_wait Procedure
The ebfm_cfgrd_wait procedure reads up to four bytes of data from the specified configuration register and stores the data in BFM shared memory. This procedure waits until the read completion has been returned.
Syntax: ebfm_cfgrd_wait(bus_num, dev_num, fnc_num, regb_ad, regb_ln, lcladdr, compl_status)

Argument | Description |
---|---|
bus_num | PCI Express bus number of the target device. |
dev_num | PCI Express device number of the target device. |
fnc_num | Function number in the target device to be accessed. |
regb_ad | Byte-specific address of the register to be read. |
regb_ln | Length, in bytes, of the data read. Maximum length is four bytes. The regb_ln and regb_ad arguments cannot cross a DWORD boundary. |
lcladdr | BFM shared memory address where the read data should be placed. |
compl_status | Completion status for the configuration transaction. This argument is reg [2:0] and returns the completion status as specified in the PCI Express specification: 3'b000 = SC (Successful Completion), 3'b001 = UR (Unsupported Request), 3'b010 = CRS (Configuration Request Retry Status), 3'b100 = CA (Completer Abort). |
6.4.8. ebfm_cfgrd_nowt Procedure
The ebfm_cfgrd_nowt procedure reads up to four bytes of data from the specified configuration register and stores the data in the BFM shared memory. This procedure returns as soon as the VC interface module has accepted the transaction, allowing other reads to be issued in the interim. Use this procedure only when successful completion status is expected and a subsequent read or write with a wait can be used to guarantee the completion of this operation.
Location |
||
---|---|---|
Syntax |
ebfm_cfgrd_nowt(bus_num, dev_num, fnc_num, regb_ad, regb_ln, lcladdr) |
|
Arguments |
bus_num |
PCI Express bus number of the target device. |
dev_num |
PCI Express device number of the target device. |
|
fnc_num |
Function number in the target device to be accessed. |
|
regb_ad |
Byte-specific address of the register to be read. |
|
regb_ln |
Length, in bytes, of the data read. Maximum length is four bytes. The regb_ln and regb_ad arguments cannot cross a DWORD boundary. |
|
lcladdr |
BFM shared memory address where the read data should be placed. |
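The no-wait variant is intended for pipelining several reads before a final fencing read, as in this sketch. The scratch addresses are assumptions:

```verilog
// Hedged sketch: issue two pipelined reads with ebfm_cfgrd_nowt, then
// use ebfm_cfgrd_wait as a fence to guarantee the earlier reads completed.
reg [2:0] compl_status;
initial begin
  ebfm_cfgrd_nowt(1, 1, 0, 12'h010, 4, SCRATCH0);  // BAR0, no wait
  ebfm_cfgrd_nowt(1, 1, 0, 12'h014, 4, SCRATCH1);  // BAR1, no wait
  ebfm_cfgrd_wait(1, 1, 0, 12'h000, 4, SCRATCH2, compl_status);  // fence
end
```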
6.4.9. BFM Configuration Procedures
All Verilog HDL arguments are type integer and are input‑only unless specified otherwise.
6.4.9.1. ebfm_cfg_rp_ep Procedure
The ebfm_cfg_rp_ep procedure configures the Root Port and Endpoint Configuration Space registers for operation.
Location |
|
|
---|---|---|
Syntax |
ebfm_cfg_rp_ep(bar_table, ep_bus_num, ep_dev_num, rp_max_rd_req_size, display_ep_config, addr_map_4GB_limit) |
|
Arguments |
bar_table |
Address of the Endpoint bar_table structure in BFM shared memory. This routine populates the bar_table structure. The bar_table structure stores the size of each BAR and the address values assigned to each BAR. The address of the bar_table structure is passed to all subsequent read and write procedure calls that access an offset from a particular BAR. |
ep_bus_num |
PCI Express bus number of the target device. This number can be any value greater than 0. The Root Port uses this as the secondary bus number. |
|
ep_dev_num |
PCI Express device number of the target device. This number can be any value. The Endpoint is automatically assigned this value when it receives the first configuration transaction. |
|
rp_max_rd_req_size |
Maximum read request size in bytes for reads issued by the Root Port. This parameter must be set to the maximum value supported by the Endpoint Application Layer. If the Application Layer only supports reads of the MAXIMUM_PAYLOAD_SIZE, then this can be set to 0 and the read request size is set to the maximum payload size. Valid values for this argument are 0, 128, 256, 512, 1,024, 2,048 and 4,096. |
|
display_ep_config |
When set to 1, many of the Endpoint Configuration Space registers are displayed after they have been initialized. This causes some additional reads of registers that are not normally accessed during the configuration process, such as the Device ID and Vendor ID. |
|
addr_map_4GB_limit |
When set to 1, the address map of the simulation system is limited to 4 GB. Any 64-bit BARs are assigned below the 4 GB limit. |
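A representative bring-up call from the BFM test driver might look like the following sketch; BAR_TABLE is an assumed shared-memory address constant supplied by the testbench:

```verilog
// Hedged sketch: configure the Root Port and Endpoint for operation.
initial begin
  // secondary bus 1, device 1, 512-byte max read request size,
  // display the Endpoint config space, no 4 GB address-map limit
  ebfm_cfg_rp_ep(BAR_TABLE, 1, 1, 512, 1, 0);
end
```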
6.4.9.2. ebfm_cfg_decode_bar Procedure
The ebfm_cfg_decode_bar procedure analyzes the information in the BAR table for the specified BAR and returns details about the BAR attributes.
Location |
|
|
---|---|---|
Syntax |
ebfm_cfg_decode_bar(bar_table, bar_num, log2_size, is_mem, is_pref, is_64b) |
|
Arguments |
bar_table |
Address of the Endpoint bar_table structure in BFM shared memory. |
bar_num |
BAR number to analyze. |
|
log2_size |
This argument is set by the procedure to the log base 2 of the size of the BAR. If the BAR is not enabled, this argument is set to 0. |
|
is_mem |
The procedure sets this argument to indicate if the BAR is a memory space BAR (1) or I/O Space BAR (0). |
|
is_pref |
The procedure sets this argument to indicate if the BAR is a prefetchable BAR (1) or non-prefetchable BAR (0). |
|
is_64b |
The procedure sets this argument to indicate if the BAR is a 64-bit BAR (1) or 32-bit BAR (0). This is set to 1 only for the lower numbered BAR of the pair. |
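After ebfm_cfg_rp_ep has populated the BAR table, the attributes of an individual BAR can be queried as in this sketch (BAR_TABLE is again an assumed address constant):

```verilog
// Hedged sketch: decode the attributes of BAR0.
integer log2_size, is_mem, is_pref, is_64b;
initial begin
  ebfm_cfg_decode_bar(BAR_TABLE, 0, log2_size, is_mem, is_pref, is_64b);
  if (log2_size == 0)
    $display("BAR0 is not enabled");
  else
    $display("BAR0 size = 2^%0d bytes", log2_size);
end
```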
6.4.10. BFM Shared Memory Access Procedures
These procedures and functions support accessing the BFM shared memory.
6.4.10.1. Shared Memory Constants
Constant |
Description |
---|---|
SHMEM_FILL_ZEROS |
Specifies a data pattern of all zeros |
SHMEM_FILL_BYTE_INC |
Specifies a data pattern of incrementing 8-bit bytes (0x00, 0x01, 0x02, etc.) |
SHMEM_FILL_WORD_INC |
Specifies a data pattern of incrementing 16-bit words (0x0000, 0x0001, 0x0002, etc.) |
SHMEM_FILL_DWORD_INC |
Specifies a data pattern of incrementing 32-bit DWORDs (0x00000000, 0x00000001, 0x00000002, etc.) |
SHMEM_FILL_QWORD_INC |
Specifies a data pattern of incrementing 64-bit qwords (0x0000000000000000, 0x0000000000000001, 0x0000000000000002, etc.) |
SHMEM_FILL_ONE |
Specifies a data pattern of all ones |
6.4.10.2. shmem_write Task
The shmem_write procedure writes data to the BFM shared memory.
Location |
||
---|---|---|
Syntax |
shmem_write(addr, data, leng) |
|
Arguments |
addr |
BFM shared memory starting address for writing data |
data |
Data to write to BFM shared memory. This parameter is implemented as a 64‑bit vector. leng is 1–8 bytes. Bits 7 down to 0 are written to the location specified by addr; bits 15 down to 8 are written to the addr+1 location, etc. |
|
leng |
Length, in bytes, of data written |
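The byte-lane ordering described above can be illustrated with a short sketch; the address is arbitrary:

```verilog
// Hedged sketch: writing 4 bytes of 64'h0000_0000_DDCC_BBAA starting at
// address 'h100 places 8'hAA at 'h100, 8'hBB at 'h101, 8'hCC at 'h102,
// and 8'hDD at 'h103 (bits [7:0] go to addr, bits [15:8] to addr+1, etc.).
initial begin
  shmem_write(21'h00100, 64'h0000_0000_DDCC_BBAA, 4);
end
```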
6.4.10.3. shmem_read Function
The shmem_read function reads data from the BFM shared memory.
Location |
||
---|---|---|
Syntax |
data:= shmem_read(addr, leng) |
|
Arguments |
addr |
BFM shared memory starting address for reading data |
leng |
Length, in bytes, of data read |
|
Return |
data |
Data read from BFM shared memory. This parameter is implemented as a 64-bit vector. leng is 1–8 bytes. If leng is less than 8 bytes, only the corresponding least significant bits of the returned data are valid. Bits 7 down to 0 are read from the location specified by addr; bits 15 down to 8 are read from the addr+1 location, etc. |
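Reading back the same region follows the mirror-image byte ordering, as in this sketch:

```verilog
// Hedged sketch: read back 4 bytes from address 'h100. Only data[31:0]
// is valid for a 4-byte read; the upper 32 bits should be ignored.
reg [63:0] data;
initial begin
  data = shmem_read(21'h00100, 4);
end
```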
6.4.10.4. shmem_display Verilog HDL Function
The shmem_display Verilog HDL function displays a block of data from the BFM shared memory.
Location |
||
---|---|---|
Syntax |
Verilog HDL: dummy_return:=shmem_display(addr, leng, word_size, flag_addr, msg_type); |
|
Arguments |
addr |
BFM shared memory starting address for displaying data. |
leng |
Length, in bytes, of data to display. |
|
word_size |
Size of the words to display. Groups individual bytes into words. Valid values are 1, 2, 4, and 8. |
|
flag_addr |
Adds a <== flag to the end of the display line containing this address. Useful for marking specific data. Set to a value greater than 2**21 (size of BFM shared memory) to suppress the flag. |
|
msg_type |
Specifies the message type to be displayed at the beginning of each line. See the BFM Log and Message Procedures section for more information about message types. Set to one of the message type constants defined in that section. |
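A typical call is sketched below: displaying a block as 32-bit words with the address flag suppressed by passing a value above the 2 MB (2**21 byte) shared-memory size. The assignment to a throwaway return variable is an assumption about how the Verilog HDL function is invoked:

```verilog
// Hedged sketch: display 64 bytes starting at 'h100, grouped as
// 32-bit words, with the <== flag suppressed.
integer dummy_return;
initial begin
  dummy_return = shmem_display(21'h00100, 64, 4, 22'h3F_FFFF, EBFM_MSG_INFO);
end
```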
6.4.10.5. shmem_fill Procedure
The shmem_fill procedure fills a block of BFM shared memory with a specified data pattern.
Location |
||
---|---|---|
Syntax |
shmem_fill(addr, mode, leng, init) |
|
Arguments |
addr |
BFM shared memory starting address for filling data. |
mode |
Data pattern used for filling the data. Should be one of the constants defined in section Shared Memory Constants. |
|
leng |
Length, in bytes, of data to fill. If the length is not a multiple of the incrementing data pattern width, then the last data pattern is truncated to fit. |
|
init |
Initial data value used for incrementing data pattern modes. This argument is reg [63:0]. The necessary least significant bits are used for the data patterns that are smaller than 64 bits. |
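The fill procedure can be exercised as in this sketch; the address and length are arbitrary:

```verilog
// Hedged sketch: fill 16 bytes at address 'h200 with incrementing
// 32-bit DWORDs starting from 'h10 (i.e., 'h10, 'h11, 'h12, 'h13).
initial begin
  shmem_fill(21'h00200, SHMEM_FILL_DWORD_INC, 16, 64'h10);
end
```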
6.4.10.6. shmem_chk_ok Function
The shmem_chk_ok function checks a block of BFM shared memory against a specified data pattern.
Location |
||
---|---|---|
Syntax |
result:= shmem_chk_ok(addr, mode, leng, init, display_error) |
|
Arguments |
addr |
BFM shared memory starting address for checking data. |
mode |
Data pattern used for checking the data. Should be one of the constants defined in section Shared Memory Constants. |
|
leng |
Length, in bytes, of data to check. |
|
init |
Initial data value used for incrementing data pattern modes. This argument is reg [63:0]. The necessary least significant bits are used for the data patterns that are smaller than 64 bits. |
|
display_error |
When set to 1, this argument displays the data failing comparison on the simulator standard output. |
|
Return |
Result |
Result is 1-bit: 1 if the data compared successfully against the specified pattern, 0 otherwise. |
|
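A fill-then-check pair is the typical usage, as in this sketch. Because the same pattern arguments are passed to both calls, the check is expected to pass; any mismatch would be displayed because display_error is set to 1:

```verilog
// Hedged sketch: fill a 32-byte region with incrementing bytes, then
// verify it with shmem_chk_ok using the same pattern arguments.
reg result;
initial begin
  shmem_fill(21'h00300, SHMEM_FILL_BYTE_INC, 32, 64'h0);
  result = shmem_chk_ok(21'h00300, SHMEM_FILL_BYTE_INC, 32, 64'h0, 1);
end
```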
6.4.11. BFM Log and Message Procedures
These procedures provide support for displaying messages in a common format, suppressing informational messages, and stopping simulation on specific message types.
The following constants define the type of message and their values determine whether a message is displayed or simulation is stopped after a specific message. Each displayed message has a specific prefix, based on the message type in the following table.
You can suppress the display of certain message types. The default values determining whether a message type is displayed are defined in the following table. To change the default message display, modify the display default value with a procedure call to ebfm_log_set_suppressed_msg_mask.
Certain message types also stop simulation after the message is displayed. The following table shows the default value determining whether a message type stops simulation. You can specify whether simulation stops for particular messages with the procedure ebfm_log_set_stop_on_msg_mask.
All of these log message constants are type integer.
Constant (Message Type) |
Description |
Mask Bit No |
Display by Default |
Simulation Stops by Default |
Message Prefix |
---|---|---|---|---|---|
EBFM_MSG_DEBUG |
Specifies debug messages. |
0 |
No |
No |
DEBUG: |
EBFM_MSG_INFO |
Specifies informational messages, such as configuration register values, starting and ending of tests. |
1 |
Yes |
No |
INFO: |
EBFM_MSG_WARNING |
Specifies warning messages, such as tests being skipped due to the specific configuration. |
2 |
Yes |
No |
WARNING: |
EBFM_MSG_ERROR_INFO |
Specifies additional information for an error. Use this message to display preliminary information before an error message that stops simulation. |
3 |
Yes |
No |
ERROR: |
EBFM_MSG_ERROR_CONTINUE |
Specifies a recoverable error that allows simulation to continue. Use this error for data comparison failures. |
4 |
Yes |
No |
ERROR: |
EBFM_MSG_ERROR_FATAL |
Specifies an error that stops simulation because the error leaves the testbench in a state where further simulation is not possible. |
N/A |
Yes, cannot suppress |
Yes, cannot suppress |
FATAL: |
EBFM_MSG_ERROR_FATAL_TB_ERR |
Used for BFM test driver or Root Port BFM fatal errors. Specifies an error that stops simulation because the error leaves the testbench in a state where further simulation is not possible. Use this error message for errors that occur due to a problem in the BFM test driver module or the Root Port BFM, that are not caused by the Endpoint Application Layer being tested. |
N/A |
Yes, cannot suppress |
Yes, cannot suppress |
FATAL: |
6.4.11.1. ebfm_display Verilog HDL Function
The ebfm_display procedure or function displays a message of the specified type to the simulation standard output and also the log file if ebfm_log_open is called.
A message can be suppressed, simulation can be stopped, or both, based on the default settings of the message type and the value of the bit mask when each of the procedures listed below is called. You can call one or both of these procedures, depending on which messages you want displayed and whether you want simulation to stop for specific messages.
- When ebfm_log_set_suppressed_msg_mask is called, the display of the message might be suppressed based on the value of the bit mask.
- When ebfm_log_set_stop_on_msg_mask is called, the simulation can be stopped after the message is displayed, based on the value of the bit mask.
Location |
||
---|---|---|
Syntax |
Verilog HDL: dummy_return:=ebfm_display(msg_type, message); |
|
Argument |
msg_type |
Message type for the message. Should be one of the constants defined in Table 106. |
message |
The message string is limited to a maximum of 100 characters. Also, because Verilog HDL does not allow variable length strings, this routine strips off leading characters of 8’h00 before displaying the message. |
|
Return |
always 0 |
Applies only to the Verilog HDL routine. |
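A typical call is sketched below; assigning the always-zero return value to a throwaway variable is the usual Verilog HDL idiom for invoking the function:

```verilog
// Hedged sketch: emit an informational message through the BFM log.
integer unused;
initial begin
  unused = ebfm_display(EBFM_MSG_INFO, "Starting BAR0 memory test");
end
```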
6.4.11.2. ebfm_log_stop_sim Verilog HDL Function
The ebfm_log_stop_sim procedure stops the simulation.
Location |
||
---|---|---|
Syntax |
Verilog HDL: return:=ebfm_log_stop_sim(success); |
|
Argument |
success |
When set to a 1, this process stops the simulation with a message indicating successful completion. The message is prefixed with SUCCESS. Otherwise, this process stops the simulation with a message indicating unsuccessful completion. The message is prefixed with FAILURE. |
Return |
Always 0 |
This value applies only to the Verilog HDL function. |
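An end-of-test call is sketched below; test_failed is an assumed testbench flag, not part of the BFM:

```verilog
// Hedged sketch: end the simulation, reporting SUCCESS or FAILURE.
integer unused;
initial begin
  if (test_failed)
    unused = ebfm_log_stop_sim(0);  // message prefixed with FAILURE
  else
    unused = ebfm_log_stop_sim(1);  // message prefixed with SUCCESS
end
```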
6.4.11.3. ebfm_log_set_suppressed_msg_mask Task
The ebfm_log_set_suppressed_msg_mask procedure controls which message types are suppressed.
Location |
||
---|---|---|
Syntax |
ebfm_log_set_suppressed_msg_mask (msg_mask) |
|
Argument |
msg_mask |
This argument is reg [EBFM_MSG_ERROR_CONTINUE: EBFM_MSG_DEBUG]. A 1 in a specific bit position of the msg_mask causes messages of the type corresponding to the bit position to be suppressed. |
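For example, DEBUG and INFO messages occupy mask bits 0 and 1 per the message type table above, so both can be suppressed as in this sketch:

```verilog
// Hedged sketch: suppress DEBUG (bit 0) and INFO (bit 1) messages.
initial begin
  ebfm_log_set_suppressed_msg_mask(5'b00011);
end
```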
6.4.11.4. ebfm_log_set_stop_on_msg_mask Verilog HDL Task
The ebfm_log_set_stop_on_msg_mask procedure controls which message types stop simulation. This procedure alters the default behavior of the simulation when errors occur as described in the BFM Log and Message Procedures.
Location |
||
---|---|---|
Syntax |
ebfm_log_set_stop_on_msg_mask(msg_mask) |
Argument |
msg_mask |