The following events are reported:
Memory Mem P0D1 Th Trip | Critical Overtemperature
Memory Mem P1D1 Th Trip | Critical Overtemperature
Each S9200WK node has two dies per CPU to achieve the maximum core count:
CPU 0
P0D0 - processor 0, die 0
P0D1 - processor 0, die 1
CPU 1
P1D0 - processor 1, die 0
P1D1 - processor 1, die 1
Therefore, the message "Memory Mem P1D1 Th Trip" means that Processor 1, Die 1 overheated and a thermal trip (Th Trip) event occurred.
Note: It does not mean that the memory DIMM in slot D1 is defective.
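The naming scheme described above can be decoded mechanically when reviewing an exported event log. The following is a minimal sketch, not a vendor tool: the function name decode_thermal_trip and the regular expression are hypothetical, and the sketch assumes that every sensor name follows the P<processor>D<die> form shown in the two events above.

```python
import re

# Assumed pattern: sensor names look like "Mem P1D1 Th Trip",
# where the digit after P is the processor and the digit after D is the die.
SENSOR_RE = re.compile(r"Mem P(\d+)D(\d+) Th Trip")


def decode_thermal_trip(event_line):
    """Return (processor, die) for a memory thermal-trip event, else None."""
    match = SENSOR_RE.search(event_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The two event lines quoted in this article.
events = [
    "Memory Mem P0D1 Th Trip | Critical Overtemperature",
    "Memory Mem P1D1 Th Trip | Critical Overtemperature",
]

for line in events:
    decoded = decode_thermal_trip(line)
    if decoded is not None:
        processor, die = decoded
        print(f"Processor {processor}, die {die}: thermal trip")
```

Lines that do not match the assumed pattern return None, so unrelated events in the same log are skipped rather than misread.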
Memory modules (DIMMs) are sensitive to excessive heat, which can make the server unstable; therefore, a thermal trip occurs when the internal server temperature is too high.
Check the following and ensure the server node is receiving adequate cooling:
For more information, refer to the following documents: