Single event upsets (SEUs) are rare and unintended changes in the internal memory
elements of an FPGA caused by cosmic radiation. The memory state change is a soft error with
no permanent damage but the FPGA may operate erroneously until background scrubbing fixes the
Because of the low chance of occurrence, your design may not require SEU
mitigation. However, if your system includes multiple FPGAs and requires very high reliability
and availability, consider using mitigation techniques to detect and recover from SEU
Quartus® Prime software offers several features to detect, correct,
and characterize the effects of SEU on your designs. Additionally,
FPGAs contain dedicated circuitry to help detect and correct
SEU mitigation features can benefit the system by:
Ensuring the system functions properly at all times
Preventing a system malfunction caused by an SEU event
Handling the SEU event if it is critical to the system
Table 1. SEU Mitigation Areas and Approaches for
SEU Mitigation Approach
Error detection and correction
Enable the error detection and correction (EDC) feature to detect
CRAM SEU events and automatically correct the CRAM contents.
Memory block error correction code
Take advantage of the error correction code (ECC) feature and the
special layout design of the
memory blocks to reduce SEU failures in time (FIT) rate to almost zero.
Use hierarchy tagging, together with sensitivity processing and
fault injection, to report SEU and constrain error injection to specific portions of
your design logic.
Triple modular redundancy
Use triple modular redundancy (TMR) technique on critical logic
such as state machines to improve hardware fault tolerance.
The feature will be available in a future
1.2. Configuration RAM
FPGAs use memory in user logic (bulk memory and registers) and in
configuration RAM (CRAM). The
Quartus® Prime Programmer loads the CRAM with your
design (.sof file). During device configuration, the CRAM configures all
FPGA logic and routing.
If an SEU strikes a CRAM bit that is not in use, the effect can be harmless. However, if the
affected CRAM bit is in use for critical internal signal routing or lookup table logic bits,
the device may experience a functional error.
devices contain three
types of memory blocks: Embedded SRAM (eSRAM) blocks, M20K blocks, and memory logic array
blocks (MLABs). The M20K blocks and eSRAM blocks support ECC. The ECC feature detects and
corrects data errors at the output of the memory.
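As a minimal illustration of how an error correction code detects and corrects bit flips, the following Python sketch implements a textbook extended Hamming (8,4) code. This is not the code that the M20K or eSRAM blocks implement. The function names and the 4-bit data width are assumptions chosen for brevity. The sketch only shows the general principle: check bits locate a single flipped bit so the decoder can correct it, and an extra parity bit lets the decoder flag a double-bit error that it cannot correct.

```python
def encode(nibble: int) -> int:
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]
    bits = [0] * 8                      # index 0 holds the overall parity bit
    # Data bits occupy the non-power-of-two positions 3, 5, 6, 7.
    bits[3], bits[5], bits[6], bits[7] = d
    # Each check bit covers the positions whose index has that bit set.
    bits[1] = bits[3] ^ bits[5] ^ bits[7]
    bits[2] = bits[3] ^ bits[6] ^ bits[7]
    bits[4] = bits[5] ^ bits[6] ^ bits[7]
    # Overall parity makes the total number of ones even.
    bits[0] = sum(bits[1:]) & 1
    return sum(b << i for i, b in enumerate(bits))


def decode(word: int):
    """Return (data, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    bits = [(word >> i) & 1 for i in range(8)]
    # The syndrome is the XOR of the positions of all set bits (1..7).
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos]:
            syndrome ^= pos
    overall = sum(bits) & 1             # 0 when the overall parity still holds
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single-bit error at `syndrome`
        # (syndrome 0 means the overall parity bit itself flipped).
        bits[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped.
        status = "uncorrectable"
    data = bits[3] | (bits[5] << 1) | (bits[6] << 2) | (bits[7] << 3)
    return data, status


if __name__ == "__main__":
    word = encode(0b1011)
    print(decode(word))                 # clean read
    print(decode(word ^ (1 << 5)))      # one flipped bit
    print(decode(word ^ 0b00000110))    # two flipped bits
```

The three calls in the usage section correspond to a clean read, a corrected single-bit upset, and a flagged double-bit upset.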
Note: When you engage the ECC feature, you cannot use the byte enable and coherent read
Table 2. ECC for M20K and eSRAM Blocks
In ×32-wide simple dual-port mode
In ×64-wide simple dual-port mode.
32-bit word error detection and correction:
The ECC cannot guarantee detection or correction of non-adjacent two-bit (or more)
64-bit word error detection or correction:
Flags indicating memory status
The status flags are part of the regular outputs from the memory block.
When you engage ECC, the M20K memory runs slower than in non-ECC simple
dual-port mode. To achieve higher performance than non-pipelined ECC mode, at the
expense of one cycle of latency, enable the optional ECC pipeline registers before the output
Triple modular redundancy (TMR) is an established SEU mitigation
technique for improving hardware fault tolerance. Use TMR if your system cannot suffer
downtime caused by an SEU.
A TMR design has three identical instances of hardware with voting hardware at the output.
If an SEU affects one of the hardware instances, the voting logic selects the majority output.
This operation masks malfunctioning hardware.
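The voting behavior described above can be modeled in a few lines of Python. This is a behavioral sketch only, not synthesizable logic and not the output of any TMR tool. The function names and the use of integers as output buses are assumptions made for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three replica outputs."""
    return (a & b) | (a & c) | (b & c)


def faulty_replica(a: int, b: int, c: int):
    """Return the index (0, 1, or 2) of the single replica that disagrees
    with the voted output, or None if all agree or more than one differs."""
    voted = majority_vote(a, b, c)
    mismatches = [i for i, out in enumerate((a, b, c)) if out != voted]
    return mismatches[0] if len(mismatches) == 1 else None


if __name__ == "__main__":
    good = 0b1010
    upset = good ^ 0b1100               # replica 2 hit by an upset
    print(bin(majority_vote(good, good, upset)))
    print(faulty_replica(good, good, upset))
```

Because the vote is taken bit by bit, a single corrupted replica never changes the voted output, and the mismatch index tells the system which instance to scrub.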
With TMR, your design does not suffer downtime in the case of a single SEU:
When the system detects a faulty module, the system scrubs the error by reprogramming the
The error detection and correction time is many orders of magnitude less than the mean
time between failures (MTBF) of SEU events.
The system can repair a soft interrupt before another SEU affects another instance in the
The disadvantage of TMR is its hardware cost: it requires three times the hardware of a
non-TMR design, plus the voting logic. To minimize the hardware cost, implement TMR for only the
most critical parts of your design.
You can automate generation of TMR designs by automatically replicating designated functions
and synthesizing the required voting logic. For example,
offers a tool that automates TMR synthesis.
The soft error rate (SER) or SEU reliability is expressed in Failures In Time
(FIT)—the number of failures you can expect in one billion operation hours.
For example, a design with 5,000 FIT experiences a mean of 5,000 SEU events in 10^9
hours (114,155.25 years). Because SEU events are statistically independent, FIT is additive.
If a single FPGA has 5,000 FIT, then ten FPGAs have 50,000 FIT (50,000 failures in 114,155.25
Another reliability measurement is the mean time to failure (MTTF), which is
the reciprocal of the FIT or 1/FIT. For a FIT of 5,000 in standard units of failures per
billion hours, MTTF is 1/(5,000/1Bh) = 1 billion/5,000 = 200,000 hours = 22.83 years.
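The FIT and MTTF arithmetic above can be checked with a short Python sketch. The function names are illustrative assumptions, and the 8,760-hour year is the value implied by the 114,155.25-year figure in the text.

```python
HOURS_PER_BILLION = 1e9     # FIT is defined per one billion operating hours
HOURS_PER_YEAR = 8760       # 365-day year, consistent with 114,155.25 years


def system_fit(device_fits):
    """FIT is additive, so a system's FIT is the sum of its devices' FIT."""
    return sum(device_fits)


def mttf_hours(fit: float) -> float:
    """Mean time to failure in hours for a given FIT."""
    return HOURS_PER_BILLION / fit


def mttf_years(fit: float) -> float:
    """Mean time to failure in years for a given FIT."""
    return mttf_hours(fit) / HOURS_PER_YEAR


if __name__ == "__main__":
    single = 5000
    print(mttf_hours(single), round(mttf_years(single), 2))
    ten_devices = system_fit([single] * 10)
    print(ten_devices, mttf_hours(ten_devices))
```

For a single 5,000 FIT device this reproduces the 200,000 hours (22.83 years) given above. For ten such devices the combined 50,000 FIT corresponds to an MTTF of 20,000 hours.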
SEU events follow a Poisson distribution, so the time between failures follows an exponential
distribution whose mean is the mean time between failures (MTBF). For more information about
failure rate calculation, refer to the
Intel® FPGA Reliability Report (available upon request).
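Under the Poisson model, the probability of seeing at least one SEU within a given operating time follows from the exponential CDF, 1 - exp(-rate x time). The sketch below is a minimal illustration of that formula. The function name is an assumption, and the result is only as good as the constant-rate assumption behind the FIT figure.

```python
import math

HOURS_PER_BILLION = 1e9


def prob_at_least_one_seu(fit: float, hours: float) -> float:
    """Exponential CDF: probability of one or more SEU events in `hours`."""
    rate_per_hour = fit / HOURS_PER_BILLION
    # expm1 keeps precision when rate_per_hour * hours is very small.
    return -math.expm1(-rate_per_hour * hours)


if __name__ == "__main__":
    # Probability of at least one event within one MTTF (200,000 hours at 5,000 FIT).
    print(round(prob_at_least_one_seu(5000, 200_000), 3))
```

Operating for exactly one MTTF gives a probability of about 0.632 (1 - 1/e), which is a reminder that the MTTF is a mean, not a guaranteed failure-free interval.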
Neutron SEU incidence varies by altitude, latitude, and other environmental factors. The
Quartus® Prime software provides SEU FIT reports based on
compiles for sea level in Manhattan, New York. The JESD89A specification defines the test
Tip: You can convert the data
to other locations and altitudes using calculators, such as the one at seutest.com. You can
adjust the SEU rates in your project by including the relative neutron flux in your project's
devices feature on-chip EDC circuitry to detect soft
errors. If you enable the internal scrubbing feature, the device
corrects an error caused by an SEU event if the error is correctable.
Table 3. Detection and Correction of Error Types
Single bit error
Double adjacent errors
Multiple bit errors
The feature will be available in a future
Quartus® Prime release.
Note: For information about the embedded memory ECC feature, refer to the related
When it detects an SEU error, the
device stores the error information in the error message queue. The queue can store up to four
different messages. Each error message records the sector address, type, and location of the
The SEU_ERROR signal goes high whenever the
error message queue contains one or more error messages. The signal stays high as long as an
error message remains in the queue. The SEU_ERROR signal goes low only
when the SEU error message queue is empty—after you shift out all the error messages. You must
set the SEU_ERROR pin function to observe the SEU_ERROR pin behavior.
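The relationship between the error message queue and the SEU_ERROR signal can be pictured with a small behavioral model in Python. The class and field names are hypothetical. In particular, the handling of a fifth error while the queue is full (the model rejects it) is an assumption made only to keep the model well defined. The text above does not state what the device does in that case.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class SeuMessage:
    sector_address: int
    error_type: str
    location: int


class SeuErrorQueue:
    CAPACITY = 4  # the queue can store up to four messages

    def __init__(self):
        self._messages = deque()

    def report(self, message: SeuMessage) -> bool:
        """Store a detected error; returns False if the queue is full.
        Rejecting the message on overflow is an assumption of this model."""
        if len(self._messages) >= self.CAPACITY:
            return False
        self._messages.append(message)
        return True

    @property
    def seu_error(self) -> bool:
        """Models SEU_ERROR: high while any message remains queued."""
        return len(self._messages) > 0

    def shift_out(self):
        """Retrieve the oldest message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None


if __name__ == "__main__":
    queue = SeuErrorQueue()
    queue.report(SeuMessage(sector_address=3, error_type="single-bit", location=17))
    print(queue.seu_error)      # high: one message pending
    queue.shift_out()
    print(queue.seu_error)      # low: queue drained
```

The point of the model is the last two lines: the signal drops only after every pending message has been shifted out, regardless of whether scrubbing has already corrected the underlying error.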
To retrieve the error message queue contents, use these tools:
3 For single
bit error with internal scrubbing,
error location provides the
multiple bit errors or single bit error without internal
bit [23:0] returns 0.
devices support automatic CRAM
error correction without reloading the original CRAM contents from an external copy of the
original programming bit-stream.
You can also choose to perform scrubbing using partial reconfiguration by reloading the
impacted sector. Although scrubbing corrects the SEU error, the SEU error message queue keeps
the SEU error message until you retrieve it.
The internal scrubbing feature automatically corrects single-bit errors.
Intel® recommends that you turn on internal scrubbing. If you do not
enable internal scrubbing, the device turns off the SEU mitigation feature for a sector
after an error occurs in the sector. The device then stops detecting both correctable
and uncorrectable SEUs in the affected sector.
If you enable the internal scrubbing feature, you must still plan your
recovery sequence. Although the scrubbing feature can restore the CRAM array to the intended
configuration, a latency period exists between detection and correction of the soft error.
During this latency period, the device may
be operating with errors.
You can specify portions of the design as high priority sectors for internal scrubbing. The
EDC circuitry detects and corrects
errors that occur in the high priority sectors before detecting and correcting errors
in other sectors.
To specify areas for high priority internal scrubbing, use the
Quartus® Prime Logic Lock region and design partition features.
On the Quartus® Prime menu, select Assignments > Logic Lock Regions Window.
In the Logic Lock Regions Window, create a region and place it within a design
Add your critical design modules, entities, or group of logic to preserve and lock them to the region.
On the Quartus® Prime menu, select Assignments > Assignment Editor.
In the Assignment Editor window, assign Priority SEU Area to the design partition where you place the Logic Lock region.
Alternatively, you can include the following instruction in
the project's Quartus settings file (.qsf):
set_instance_assignment -name PRIORITY_SEU_AREA ON -to <partition name>
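If you have several partitions to mark, you could generate the assignment lines with a small script instead of typing them by hand. The Python sketch below is one way to do that. The function name and the partition names are hypothetical placeholders, and the only Quartus syntax used is the PRIORITY_SEU_AREA assignment shown above.

```python
def priority_seu_assignments(partitions):
    """Return one PRIORITY_SEU_AREA assignment line per partition name."""
    return [
        f"set_instance_assignment -name PRIORITY_SEU_AREA ON -to {name}"
        for name in partitions
    ]


if __name__ == "__main__":
    # Hypothetical partition names; replace with the partitions in your project.
    for line in priority_seu_assignments(["ctrl_fsm_partition", "watchdog_partition"]):
        print(line)
```

You can paste the printed lines into the project's .qsf file after verifying that each name matches a design partition in your project.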
The Quartus® Prime software sets the internal scrubbing schedule of the priority
sectors to "as fast as possible". The internal scrubbing schedule for other sectors
follows the project's Minimum SEU interval global