Troubleshooting
This topic covers troubleshooting for the data streams optimizer.
Problem | Possible Cause / Solution |
---|---|
Environment file parsing error due to misformatted JSON. | Cause: The file contains one or more invalid JSON characters. Solution: Escape JSON-specific characters, such as double quotes (`\"`) and backslashes (`\\`). |
Workload exit status is 127. | Cause: Permission issue. Solution: Make sure you have “execute” permissions for the workload validation script. |
Capsule generation script exit status is 127. | Cause: Permission issue. Solution: Make sure you have “execute” permissions for the subregion capsule script (subregion_capsule.py). |
“no module named …” | Cause: Prerequisites are not satisfied. Solution: Follow the steps in the prerequisites section of README.md, located in the tools/tcc_data_streams_optimizer folder. |
Failed to reconnect via SSH after reboot. | Cause: IP address changes after reboot. Solution: Use a static IP address for the target system or use the full hostname to establish the SSH connection. |
“Failed to generate capsule.” | Cause: Subregion capsule tool issue. Solution: Check that the instructions in ${TCC_TOOLS_PATH}/capsule were executed correctly, and verify the paths in the environment file. |
The data streams optimizer hangs after the “Rebooting <hostname>” output message during target reboot. | Cause: Most likely a “Broken pipe” error caused by an unexpected exit from the SSH session to the target system. Solutions:
After trying a solution, rerun the tuning flow from the beginning. |
On 11th Gen Intel® Core™ processors, a system hang may occur intermittently when running the reboot command. | Cause: If the system detects hardware errors, the Functional Safety (FuSa) feature, PCIe Interrupt Error Handling (IEH), may attempt an additional system reset, which can get stuck at postcode 0x0b7f. Solution: Perform a hard reset to regain control of the system. As a temporary workaround for the hang after reboot, disable IEH in the BIOS menu: Intel Advanced Menu/PCH-IO Configuration/IEH Mode = Bypass Mode. |
The MRL application freezes on “Start validation.” | Cause: Application may freeze due to a high volume of interrupts. Solution: Increase the --outliers argument to 400 or higher. It is an optional argument, so you may need to add it to the command. |
The MRL application cannot detect the processor automatically. | Cause: Your processor name does not correspond with any processor in the list of known processors. Solution: Add the --processor {TGL-U|TGL-H|EHL} argument to the workload command in the requirements file. With this option, you can specify your processor manually. Using unsupported processors can cause errors and odd results. |
The MRL application shows “Unable to mmap memory”. | Cause: Some drivers block access to /dev/mem. Solution: Unload the stmmac_pci and stmmac drivers if you are using a TSN device, or the igc driver if you are using an I225 device. |
The data streams optimizer applied a tuning configuration to your system, but you want to reset your system to default settings. | Solution 1: Follow the steps in Generate a Configuration for Your Workload to reset your system to the original state. This restores the original tuning settings, so you can continue using the data streams optimizer. Solution 2: Disable the Data Streams Optimizer, Software SRAM, and Intel® TCC Mode options in the BIOS. After reboot, all tuning options return to their default state and the Data Streams Optimizer feature is turned off in the BIOS, so you cannot use the data streams optimizer. If you enable the Data Streams Optimizer switch in the BIOS again, the last tuning configuration is re-applied, because it is still contained in the capsule. To erase the tuning completely, use Solution 1. For more information, see Apply Reset Configuration. |
After disabling RTCM, the system freezes or the following error occurs: “Could not set up firmware update: Invalid argument. ERROR: Failed to apply buffer capsule”. Unexpected performance results occur, the fwupdate software does not work so capsules are not applied, or enabling and disabling RTCM does not work. Some cores are offline according to the lscpu command output. Output example: “Off-line CPU(s) list: 1-3.” | Cause: Combining RTCM and the data streams optimizer may result in offline cores and a number of different errors. Possible solutions: Solution 1: Follow the steps in Generate a Configuration for Your Workload to reset your system to the original state. Solution 2: Reflash the BIOS. Solution 3:
Now your system is ready to use the data streams optimizer with RTCM enabled. |
Register check failed. | Cause: The configuration was not applied properly, and the register check detected unexpected values. Solution:
|
The data streams optimizer on Windows* OS fails during capsule apply. | Cause: Windows* OS cannot find the private certificate needed to sign the driver. Solution: Log in to Windows* OS with your user account. Private certificates are loaded automatically at login. |
On Intel® Xeon® W-11000E Series processors, if the producer or consumer device is attached to the root port 0:01.0, the data streams optimizer fails to find any tuning configurations. | Cause: Error in the configuration search algorithm. Solution: As a workaround, specify 0:1c.0 as the producer or consumer, respectively, in the requirements file. |
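
Two of the problems in the table, misformatted JSON in the environment file and missing “execute” permissions on a script, can be checked before starting a tuning run. The following is a minimal sketch using only the Python standard library. It is not part of the data streams optimizer, and the helper names (`escape_json_value`, `check_environment_file`, `is_executable`) and the sample string are hypothetical, chosen for illustration only.

```python
import json
import os


def escape_json_value(raw: str) -> str:
    """Return `raw` as a JSON string literal.

    json.dumps escapes double quotes and backslashes, so the result
    can be pasted as a value into a JSON file.
    """
    return json.dumps(raw)


def check_environment_file(path: str) -> str:
    """Parse a JSON file; return "ok" or a message locating the first error."""
    try:
        with open(path, encoding="utf-8") as handle:
            json.load(handle)
    except json.JSONDecodeError as err:
        return f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    return "ok"


def is_executable(path: str) -> bool:
    """Report whether `path` is a regular file the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    # Hypothetical value containing both a backslash and double quotes.
    print(escape_json_value('C:\\work\\"my dir"'))
```

Running `check_environment_file` on the environment file and `is_executable` on the workload validation script or subregion_capsule.py before starting the tuning flow may surface these two problems earlier than a failed run would. If `is_executable` returns `False`, `chmod +x <script>` grants the missing permission.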