What am I seeing?
An IERR is a catastrophic error reported by the processor but generally caused by devices outside of the processor core (e.g., memory, PCIe).
- Processor execution has stalled, typically because of an event outside the processor.
- This issue is often accompanied by a CATERR event that can be cross-referenced for additional information.
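The cross-referencing described above can be sketched in Python. This is a minimal illustration, assuming the SEL has been exported to text with pipe-separated fields (ID, date, time, sensor, event, state). The sample lines, the function names `parse_sel` and `pair_ierr_with_caterr`, and the 60-second window are all hypothetical and not taken from any Intel tool.

```python
from datetime import datetime, timedelta

# Hypothetical SEL export: "id | date | time | sensor | event | state".
# These lines are invented for illustration; real SEL text varies by tool and platform.
SAMPLE_SEL = """\
1 | 04/27/2020 | 12:30:01 | Memory #0x02 | Correctable ECC | Asserted
2 | 04/27/2020 | 12:34:55 | Processor #0x01 | CATERR | Asserted
3 | 04/27/2020 | 12:34:56 | Processor #0x01 | IERR | Asserted
4 | 04/28/2020 | 08:00:00 | Processor #0x01 | CATERR | Asserted
"""


def parse_sel(text):
    """Return (timestamp, description) tuples from pipe-separated SEL text."""
    entries = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 5:
            continue  # skip lines that do not match the assumed layout
        try:
            stamp = datetime.strptime(
                f"{fields[1]} {fields[2]}", "%m/%d/%Y %H:%M:%S"
            )
        except ValueError:
            continue  # skip lines whose date or time cannot be parsed
        entries.append((stamp, " | ".join(fields[3:])))
    return entries


def pair_ierr_with_caterr(entries, window_seconds=60):
    """For each IERR entry, list the CATERR entries logged within the window."""
    window = timedelta(seconds=window_seconds)
    ierrs = [e for e in entries if "IERR" in e[1].upper()]
    caterrs = [e for e in entries if "CATERR" in e[1].upper()]
    return [
        (ierr, [c for c in caterrs if abs(c[0] - ierr[0]) <= window])
        for ierr in ierrs
    ]


if __name__ == "__main__":
    for ierr, related in pair_ierr_with_caterr(parse_sel(SAMPLE_SEL)):
        print("IERR:", ierr[0], ierr[1])
        for caterr in related:
            print("  related CATERR:", caterr[0], caterr[1])
```

With the sample data, the IERR at 12:34:56 is paired with the CATERR logged one second earlier, while the CATERR from the following day falls outside the window. The window size is a judgment call and may need adjusting for a real log.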
How to fix it:
Follow these steps in order:
- Review the System Event Log (SEL) for error-correcting code (ECC) events. Defective memory can trigger an IERR.
- Review the SEL for any PCIe events. Malfunctioning PCIe devices can trigger an IERR.
- Ensure that operating system (OS) drivers are up to date, both for the server and for any recently added hardware devices. Out-of-date OS drivers can trigger an IERR.
- Check the OS logs for any Machine Check Architecture (MCA) entries that may indicate a hardware fault that could have triggered the IERR.
- Confirm that you have the latest BIOS for the server system.
- Go to Baseboard Management Controller Web Console > Configuration > Memory Configuration > PPR Type and set the Post Package Repair (PPR) type to Hard.
- If the logs point to one or more specific memory modules as the likely cause, reseat those modules and monitor the server for 24 hours.
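The log review in the steps above can be partly automated. The following Python sketch sorts log lines into the three categories the steps mention (ECC events, PCIe events, and machine check entries) using simple keyword patterns. The patterns, the `triage` function, and the sample lines are assumptions made for illustration; actual SEL and OS log wording depends on the platform, firmware, and operating system.

```python
import re

# Keyword patterns for the categories named in the troubleshooting steps.
# These are illustrative guesses, not an official list of event strings.
PATTERNS = {
    "memory_ecc": re.compile(r"\bECC\b", re.IGNORECASE),
    "pcie": re.compile(r"\bPCI(e| Express)?\b", re.IGNORECASE),
    "machine_check": re.compile(r"machine check|\bMCA\b|\bMCE\b", re.IGNORECASE),
}

# Hypothetical log lines, invented for this example.
SAMPLE_LOG = [
    "Memory #0x02 | Uncorrectable ECC | Asserted",
    "Critical Interrupt #0x05 | PCIe Fatal Error | Asserted",
    "kernel: mce: [Hardware Error]: Machine check events logged",
    "Processor #0x01 | IERR | Asserted",
]


def triage(lines):
    """Group log lines by the first-pass category they appear to belong to."""
    buckets = {name: [] for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                buckets[name].append(line.strip())
    return buckets


if __name__ == "__main__":
    for category, matches in triage(SAMPLE_LOG).items():
        print(f"{category}: {len(matches)} matching line(s)")
        for match in matches:
            print("  ", match)
```

A keyword match only flags a line for closer reading. It does not confirm that the component is faulty, so the manual steps above still apply.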
Related topics:
- My server crashes and shows this error: Processor CPU Machine Chk
- For firmware updates and troubleshooting tips
- System Event Log Troubleshooting Guides for Intel® Server Boards