Fault Management and Error Reporting
The SDM firmware can detect errors, faults, and warnings on the PMBus throughout the initialization and monitor states. The firmware analyzes each error and puts it into the Error Message Queue (EMQ). During configuration, the CONFIG_STATUS mailbox command notifies you about the error.
In master mode, while running in the monitor state, the SDM firmware queries the voltage regulator with a STATUS_BYTE command every 500 ms. A returned STATUS_BYTE value that is not equal to zero indicates an error, fault, or warning within the voltage regulator. The firmware reports the error through the EMQ and asserts the SEU_ERROR pin to notify you of this error.
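The master-mode polling described above can be sketched as a simple loop. This is only an illustration, not the SDM firmware implementation: `poll_status_byte`, `read_byte`, `report_error`, and `sleep` are hypothetical names, and the bus access is passed in as a callable so the sketch stays independent of any particular PMBus driver.

```python
from typing import Callable

STATUS_BYTE_CMD = 0x78   # PMBus STATUS_BYTE command code
POLL_INTERVAL_S = 0.5    # 500 ms polling interval described in the text


def poll_status_byte(
    read_byte: Callable[[int], int],       # hypothetical: reads one byte for a command code
    report_error: Callable[[int], None],   # hypothetical: stands in for EMQ/SEU_ERROR reporting
    sleep: Callable[[float], None],        # e.g. time.sleep on a real system
    max_polls: int,
) -> int:
    """Poll STATUS_BYTE max_polls times and return how many reads were nonzero."""
    nonzero_reads = 0
    for _ in range(max_polls):
        status = read_byte(STATUS_BYTE_CMD) & 0xFF
        if status != 0:
            # Any nonzero value signals an error, fault, or warning.
            report_error(status)
            nonzero_reads += 1
        sleep(POLL_INTERVAL_S)
    return nonzero_reads
```

Passing `time.sleep` as `sleep` gives the real 500 ms cadence, while a no-op callable lets you exercise the loop against recorded STATUS_BYTE values.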
In slave mode, the SDM firmware asserts the PWRMGT_ALERT signal whenever an error occurs. The external PMBus master must then initiate the Alert Response Address (ARA) flow to handshake with the FPGA and read the error from the firmware.
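As a rough illustration of the ARA handshake from the master's side, the following sketch simulates the generic SMBus alert response flow in memory: the master reads from the Alert Response Address (0x0C), the alerting slave answers with its own address and releases the alert line, and the master then services that slave. The `SimulatedSlave` class, the `service_alert` function, and the `read_error` callback are hypothetical. How the error is actually read from the FPGA after the handshake is not specified here, so that step is left to the caller.

```python
from typing import Callable, Dict, Optional, Tuple

ALERT_RESPONSE_ADDRESS = 0x0C  # SMBus Alert Response Address (7-bit)


class SimulatedSlave:
    """In-memory stand-in for a slave that can pull the shared alert line low."""

    def __init__(self, address: int) -> None:
        self.address = address
        self.alerting = False

    def respond_to_ara(self) -> Optional[int]:
        # An alerting slave answers the ARA read with its own address in
        # bits 7:1 of the returned byte, then releases the alert line.
        if not self.alerting:
            return None
        self.alerting = False
        return (self.address << 1) & 0xFF


def service_alert(
    slaves: Dict[int, SimulatedSlave],
    read_error: Callable[[int], int],  # hypothetical: reads the error from the given slave
) -> Optional[Tuple[int, int]]:
    """Run one ARA cycle; return (slave address, error value) or None if no alert."""
    alerting = sorted(addr for addr, slave in slaves.items() if slave.alerting)
    if not alerting:
        return None
    # On a real bus, arbitration lets the lowest alerting address win.
    response = slaves[alerting[0]].respond_to_ara()
    address = response >> 1
    return address, read_error(address)
```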
The STATUS_BYTE Polling
The STATUS_BYTE polling is an optional feature. To change the setting of the STATUS_BYTE polling, refer to the Specifying Power Management and VID Parameters and Option section and Table: Power Management and VID Parameters.
The following table lists the condition that each bit of the returned STATUS_BYTE value indicates.
|Command|Bit|Description|
|STATUS_BYTE (78h)|7|Busy, unable to respond|
||6|Off, not enabled|
||5|Output overvoltage fault occurred|
||4|Output overcurrent fault occurred|
||3|Input undervoltage fault occurred|
||2|Temperature fault or warning occurred|
||1|Communication, memory, or logic fault occurred|
||0|A fault not listed above occurred|
Each bit returned by STATUS_BYTE indicates a different error in the voltage regulator, and the firmware reports each of them to the EMQ. For example, a STATUS_BYTE value of 0x6 (b'0000_0110) indicates that the voltage regulator has asserted both a communication, memory, or logic fault and a temperature fault or warning. The firmware then adds two entries to the EMQ, one for each error or fault that occurred.
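The decoding step can be sketched as follows. The sketch assumes the standard PMBus STATUS_BYTE bit assignment (bit 7 for busy down to bit 0 for a fault not otherwise listed), which is consistent with the 0x6 example above. `STATUS_BYTE_FLAGS` and `decode_status_byte` are illustrative names, not part of any Intel API.

```python
from typing import List

# Bit position -> condition, assuming the standard PMBus STATUS_BYTE layout.
STATUS_BYTE_FLAGS = {
    7: "Busy, unable to respond",
    6: "Off, not enabled",
    5: "Output overvoltage fault occurred",
    4: "Output overcurrent fault occurred",
    3: "Input undervoltage fault occurred",
    2: "Temperature fault or warning occurred",
    1: "Communication, memory, or logic fault occurred",
    0: "A fault not listed above occurred",
}


def decode_status_byte(value: int) -> List[str]:
    """Return one description per set bit, from bit 7 down to bit 0."""
    return [
        description
        for bit, description in STATUS_BYTE_FLAGS.items()
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # 0x06 has bits 2 and 1 set, so two entries are produced.
    for entry in decode_status_byte(0x06):
        print(entry)
```

Each returned description corresponds to one entry that the firmware would place in the EMQ for that reading.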
The Importance of Safety Limits Settings in the Voltage Regulator
You must program the non-volatile memory (NVM) in the voltage regulator correctly so that error flags are not asserted under the expected operating conditions.
For Intel Agilex® 7 SmartVID devices, VCC and VCCP operate within the 0.70 V to 0.90 V voltage range. The following example settings work for this voltage range. You may revise the settings based on your system requirements.
VOUT_OV_WARN_LIMIT to VID_MAX 927 mV
VOUT_OV_FAULT_LIMIT to VID_MAX 930 mV
VOUT_MAX to VID_MAX 950 mV
VOUT_UV_WARN_LIMIT to VID_MIN 690 mV
VOUT_UV_FAULT_LIMIT to VID_MIN 680 mV
Set the limits wider than the expected operating conditions, but within the absolute maximum ratings for the device. For more information, refer to the Intel Agilex® 7 FPGAs and SoCs Device Data Sheet: F-Series and I-Series.
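A quick sanity check of a set of limits before programming the regulator NVM might look like the sketch below. It assumes the ordering implied by the example values (undervoltage fault below undervoltage warning below VID_MIN, and VID_MAX below overvoltage warning below overvoltage fault below VOUT_MAX). `check_limits` is a hypothetical helper, and the optional `abs_max_mv` argument must come from the device data sheet; no value is assumed here.

```python
from typing import Dict, List, Optional


def check_limits(
    limits_mv: Dict[str, int],
    vid_min_mv: int,
    vid_max_mv: int,
    abs_max_mv: Optional[int] = None,  # take from the data sheet; unchecked if omitted
) -> List[str]:
    """Return a list of problems; an empty list means the ordering is consistent."""
    chain = [
        ("VOUT_UV_FAULT_LIMIT", limits_mv["VOUT_UV_FAULT_LIMIT"]),
        ("VOUT_UV_WARN_LIMIT", limits_mv["VOUT_UV_WARN_LIMIT"]),
        ("VID_MIN", vid_min_mv),
        ("VID_MAX", vid_max_mv),
        ("VOUT_OV_WARN_LIMIT", limits_mv["VOUT_OV_WARN_LIMIT"]),
        ("VOUT_OV_FAULT_LIMIT", limits_mv["VOUT_OV_FAULT_LIMIT"]),
        ("VOUT_MAX", limits_mv["VOUT_MAX"]),
    ]
    problems = []
    # Each value must be strictly below the next one in the chain.
    for (low_name, low), (high_name, high) in zip(chain, chain[1:]):
        if low >= high:
            problems.append(
                f"{low_name} ({low} mV) must be below {high_name} ({high} mV)"
            )
    if abs_max_mv is not None and limits_mv["VOUT_MAX"] > abs_max_mv:
        problems.append(
            f"VOUT_MAX ({limits_mv['VOUT_MAX']} mV) exceeds the absolute maximum ({abs_max_mv} mV)"
        )
    return problems


# Example settings from the text, in millivolts.
EXAMPLE_LIMITS_MV = {
    "VOUT_OV_WARN_LIMIT": 927,
    "VOUT_OV_FAULT_LIMIT": 930,
    "VOUT_MAX": 950,
    "VOUT_UV_WARN_LIMIT": 690,
    "VOUT_UV_FAULT_LIMIT": 680,
}
```

Calling `check_limits(EXAMPLE_LIMITS_MV, vid_min_mv=700, vid_max_mv=900)` checks the example settings against the 0.70 V to 0.90 V operating range.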