Steps to follow when dealing with an ECC correctable error event logged in the System Event Log (SEL)
An ECC correctable error event indicates that the number of correctable errors on a given Dual In-line Memory Module (DIMM) exceeded a threshold within a given timeframe.
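The threshold behavior described above can be pictured with a minimal Python sketch. It is an illustration only, not Intel's firmware logic: the class name `CorrectableErrorMonitor`, the threshold of 3 errors, the 60-second window, and the DIMM label `DIMM_A1` are all hypothetical, and the real threshold and timeframe are platform-specific.

```python
from collections import defaultdict, deque


class CorrectableErrorMonitor:
    """Hypothetical per-DIMM counter of correctable errors in a sliding time window."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold            # assumed value, platform-specific in practice
        self.window_seconds = window_seconds  # assumed value, platform-specific in practice
        self._events = defaultdict(deque)     # DIMM label -> timestamps of corrected errors

    def record(self, dimm, timestamp):
        """Record one corrected error; return True if the DIMM now exceeds the threshold."""
        events = self._events[dimm]
        events.append(timestamp)
        # Discard errors that fall outside the time window ending at this timestamp.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) > self.threshold


monitor = CorrectableErrorMonitor(threshold=3, window_seconds=60)
results = [monitor.record("DIMM_A1", t) for t in (0, 10, 20, 30, 100)]
print(results)  # [False, False, False, True, False]
```

In this sketch the fourth error within 60 seconds exceeds the threshold, which corresponds to the point at which an event would be logged. The later error at 100 seconds starts a fresh window, so isolated corrected errors do not trigger an event.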
Notes
Error Correction Code (ECC) correctable errors are corrected automatically. Depending on the Reliability, Availability, and Serviceability (RAS) configuration of the memory, the Integrated Memory Controller (IMC) may take the affected DIMM offline.
Event definitions differ between Intel server platforms. Refer to the System Event Log Troubleshooting Guide for your server platform.
Intel recommends downloading and updating the system BIOS to the latest version available for your server platform.
If the system is an Intel® Data Center System certified for Nutanix* Enterprise Cloud Platform, visit the Nutanix* Life Cycle Manager page. For a list of hardware and firmware compatibility, visit the Nutanix* Hardware and Firmware compatibility page.