AN 737: SEU Detection and Recovery in Intel® Arria® 10 Devices

ID 683064
Date 10/21/2021
Document Table of Contents

4.3.1. CRC_ERROR Pin Behavior

The Intel® Arria® 10 fast EDCRC feature runs all the column-based check-bits engine in parallel. When an SEU is detected, the column-based check-bits asserts the CRC_ERROR, the detected frame location is then passed to the frame-based check-bits to further localize the affected bit. This process causes the CRC_ERROR pin to assert twice. Column-based check-bits assert the first CRC_ERROR pulse and followed by the frame-based check-bits asserting the second pulse.

In Intel® Arria® 10, as soon as an SEU is detected, the CRC_ERROR is asserted high and remains high until the EMR is ready to be read. You can unload the EMR data as soon as the CRC_ERROR pin goes low. Once EMR data is unloaded, can determine the error type and the affected location. With these information you can decide how your system should respond to the specific SEU event.

Figure 7. Fast EDCRC Process Flow Chart
Figure 8. Timing Diagram for Column-Based Check-BitsIf the error is correctable, in most cases, there is a second pulse in a single SEU event .There are cases where the error is uncorrectable when the CRC_ERROR pin asserts 2 pulses, refer to Correctable and Uncorrectable Error for complete correctable and uncorrectable error cases. The complete EMR is only available at the falling edge of the second pulse.

In the rare event of an uncorrectable and un-locatable error, the CRC_ERROR signal is asserted only once. There is no second pulse assertion by frame-based check-bits due to the uncorrectable error location cannot be located. The statistical likelihood of uncorrectable multi-bit SEU is less than one in 10,000 years for a device in typical environmental conditions.

Figure 9. Timing Diagram for Column-Based or Frame-Based Check-Bits

Example of CRC_ERROR pin behavior for column-based/frame-based check-bits with a single pulse observed in one SEU event.