Low Latency 40G for ASIC Proto Ethernet Intel® FPGA IP User Guide

ID 683221
Date 11/10/2022
Public

8. Debugging the Link

The following steps should help you identify and resolve common problems that occur when bringing up a Low Latency 40G for ASIC Proto Ethernet core link:

  1. Establish word lock. The RX lanes should be able to achieve word lock even in the presence of extreme bit error rates. If the IP core is unable to achieve word lock, check the transceiver clocking and data rate configuration. Check for cabling errors such as the reversal of the TX and RX lanes. Check the clock frequency monitors (the KHZ_TX and KHZ_RX PHY registers) in the Control and Status registers.

    To check for word lock: Clear the FRM_ERR register by writing the value 1, followed by the value 0, to the SCLR_FRM_ERR register at offset 0x324. Then read the FRM_ERR register at offset 0x323. If the value is zero, the core has word lock. If the value is non-zero, the status is indeterminate.

  2. If you have problems with word lock, check the EIO_FREQ_LOCK register at address 0x321. The values in this register indicate the status of the recovered clock. In normal operation, all the bits should be asserted. A non-asserted (value-0) or toggling bit for any lane indicates a clock recovery problem on that lane. Clock recovery difficulties are typically caused by the following problems:
    • Bit errors
    • Failure to establish the link
    • Incorrect clock inputs to the IP core
  3. Check the PMA FIFO levels by selecting the appropriate bits in the EIO_FLAG_SEL register and reading the values in the EIO_FLAGS register. During normal operation, the TX and RX FIFOs should be nominally filled. A TX FIFO that is either empty or full typically indicates a problem with clock frequencies. The RX FIFO should never be full, although an empty RX FIFO can be tolerated.
  4. Establish lane integrity—When operating properly, the lanes should not experience bit errors at a rate greater than roughly one per hour per day. Bit errors within data packets are identified as FCS errors. Bit errors in control information, including IDLE frames, generally cause errors in XL/CGMII decoding.
  5. Verify packet traffic—The Ethernet protocol includes automatic lane reordering so the higher levels should follow the PCS. If the PCS is locked, but higher level traffic is corrupted, there may be a problem with the remote transmitter virtual lane tags.
  6. Tuning—You can adjust transceiver analog parameters to improve the bit error rate. IDLE traffic is representative for analog purposes.
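
The register checks in steps 1 and 2 can be scripted. The following Python sketch is illustrative only: it assumes a hypothetical pair of callables, read_reg(offset) and write_reg(offset, value), that access the IP core's Control and Status registers using the offsets quoted above, and it assumes EIO_FREQ_LOCK carries one status bit per lane in bits [3:0]. The FakeRegs class is an in-memory stand-in for trying the helpers without hardware; it is not part of any Intel tool.

```python
from typing import Callable, Dict, List, Tuple

# Register offsets quoted in the debugging steps.
EIO_FREQ_LOCK = 0x321
FRM_ERR = 0x323
SCLR_FRM_ERR = 0x324

# Assumption: one frequency-lock bit per lane, in bits [3:0].
NUM_LANES = 4
LANE_MASK = (1 << NUM_LANES) - 1

ReadFn = Callable[[int], int]
WriteFn = Callable[[int, int], None]


def frm_err_clear(read_reg: ReadFn, write_reg: WriteFn) -> bool:
    """Clear FRM_ERR (write 1, then 0, to SCLR_FRM_ERR) and read it back.

    True means FRM_ERR read as zero, so the core has word lock.
    False means the status is indeterminate, not a confirmed failure.
    """
    write_reg(SCLR_FRM_ERR, 1)
    write_reg(SCLR_FRM_ERR, 0)
    return read_reg(FRM_ERR) == 0


def lanes_without_freq_lock(read_reg: ReadFn, samples: int = 8) -> List[int]:
    """Return lanes whose EIO_FREQ_LOCK bit read 0 in at least one sample.

    Sampling several times catches a bit that toggles as well as one
    that is permanently deasserted. The sample count is arbitrary.
    """
    always_set = LANE_MASK
    for _ in range(samples):
        always_set &= read_reg(EIO_FREQ_LOCK)
    return [lane for lane in range(NUM_LANES) if not (always_set >> lane) & 1]


class FakeRegs:
    """Hypothetical in-memory register file used only for dry runs."""

    def __init__(self, values: Dict[int, int]) -> None:
        self.values = dict(values)
        self.writes: List[Tuple[int, int]] = []

    def read(self, offset: int) -> int:
        return self.values.get(offset, 0)

    def write(self, offset: int, value: int) -> None:
        self.writes.append((offset, value))
        self.values[offset] = value
```

For example, with regs = FakeRegs({FRM_ERR: 0, EIO_FREQ_LOCK: 0b1011}), frm_err_clear(regs.read, regs.write) returns True and lanes_without_freq_lock(regs.read) returns [2]. On real hardware, replace the FakeRegs methods with whatever register access mechanism your system provides.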

In addition, your IP core can experience loss of signal on the Ethernet link after it is established. In this case, the TX functionality is unaffected, but the RX functionality is disrupted. The following symptoms indicate a loss of signal on the Ethernet link:

  • The IP core deasserts the rx_pcs_ready signal, indicating the IP core has lost alignment marker lock.
  • The IP core deasserts the RX PCS fully aligned status bit (bit [0]) of the RX_PCS_FULLY_ALIGNED_S register at offset 0x326. This change is linked to the change in value of the rx_pcs_ready signal.
  • If Enable link fault generation is turned on, the IP core sets local_fault_status to the value of 1.
  • The IP core triggers the RX digital reset process.
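
The alignment status described above can also be polled from software. The following Python sketch is illustrative only: read_reg(offset) is a hypothetical callable that reads a Control and Status register at the quoted offset, and the timeout and polling interval are arbitrary values chosen for the example, not figures from this guide.

```python
import time
from typing import Callable

# Offset quoted above; bit [0] is the RX PCS fully aligned status bit.
RX_PCS_FULLY_ALIGNED_S = 0x326


def rx_pcs_fully_aligned(read_reg: Callable[[int], int]) -> bool:
    """Return True if bit [0] of RX_PCS_FULLY_ALIGNED_S is set."""
    return bool(read_reg(RX_PCS_FULLY_ALIGNED_S) & 0x1)


def wait_for_alignment(
    read_reg: Callable[[int], int],
    timeout_s: float = 1.0,   # assumption: arbitrary example timeout
    poll_s: float = 0.01,     # assumption: arbitrary polling interval
) -> bool:
    """Poll until the RX PCS reports full alignment or the timeout expires.

    Useful after a loss of signal, while the RX digital reset process
    runs and the link attempts to realign. Returns False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if rx_pcs_fully_aligned(read_reg):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A False result from wait_for_alignment only indicates that alignment was not observed within the chosen window; return to the debugging steps above to locate the cause.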