Developer Guide

  • 2021.3
  • 11/18/2021
  • Public

Troubleshooting

This topic covers troubleshooting for the data streams optimizer.
Problem
Possible Cause / Solution
Environment file parsing error due to misformatted JSON.
Cause: File contains one or more invalid JSON characters.
Solution: Escape JSON-specific characters, such as double quotes and backslashes:
  • For a double quote ("), replace it with the escape sequence \"
  • For a backslash (\), replace it with the escape sequence \\
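The escaping rules above can be checked with a short script before running the tool. This is a minimal sketch using Python's standard json module; the key name example_path and the sample value are hypothetical and are not taken from the real environment file.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical value containing backslashes and double quotes.
raw_value = 'C:\\tools\\capsule "v1"'

# json.dumps escapes double quotes and backslashes automatically.
encoded = json.dumps({"example_path": raw_value})
print(encoded)

# An unescaped backslash followed by "c" is not a valid JSON escape sequence.
broken = r'{"example_path": "C:\capsule"}'
print(is_valid_json(encoded), is_valid_json(broken))
```

Generating the file with a JSON library, rather than editing it by hand, avoids most escaping mistakes.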
Workload exit status is 127.
Cause: Permission issue.
Solution: Make sure you have “execute” permissions for the workload validation script.
Capsule generation script exit status is 127.
Cause: Permission issue.
Solution: Make sure you have “execute” permissions for the subregion capsule script (subregion_capsule.py).
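For both permission issues above, the execute bit can be checked and set from a script. The sketch below is an illustration only: it sets the owner execute bit on a temporary file created for the demonstration. In practice you would pass the path of your workload validation script or subregion_capsule.py.

```python
import os
import stat
import tempfile


def ensure_executable(path: str) -> bool:
    """Add the owner execute bit if it is missing.

    Returns True if the file mode was changed, False if the file
    was already executable by its owner.
    """
    mode = os.stat(path).st_mode
    if mode & stat.S_IXUSR:
        return False
    os.chmod(path, mode | stat.S_IXUSR)
    return True


# Demonstration on a temporary file (stand-in for a real script).
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name
os.chmod(demo_path, 0o644)

print(ensure_executable(demo_path))  # mode changed on the first call
print(ensure_executable(demo_path))  # already executable on the second call
os.remove(demo_path)
```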
“no module named …”
Cause: Prerequisites are not satisfied.
Solution: Follow the steps in the prerequisites section of README.md, located in the tools/tcc_date_streams_optimizer folder.
Failed to reconnect via SSH after reboot.
Cause: IP address changes after reboot.
Solution: Use a static IP address for the target system or use the full hostname to establish the SSH connection.
“Failed to generate capsule.”
Cause: Subregion capsule tool issue.
Solution: Check that the instructions from ${TCC_TOOLS_PATH}/capsule were executed correctly, and check the paths in the environment file.
The data streams optimizer hangs after the “Rebooting <hostname>” output message during target reboot.
Cause: Most likely a “Broken pipe” error caused by an unexpected exit from the SSH session to the target system.
Solutions:
  • Fix the “Broken pipe” issue (may be SSH settings or network issue).
    1. Fix the connection issue by reviewing your IP addresses, connection settings, and cable connections.
    2. Review SSH settings.
  • Extend the reconnection timeout by increasing RECONNECT_ATTEMPTS in the target connection settings file.
  • Use shutdown -r 1 instead of reboot in the target_reboot.sh script.
After trying a solution, rerun the tuning flow from the beginning.
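To see whether the target comes back within a given number of attempts, a reconnection loop can be sketched as follows. This is not the tool's implementation: the functions ssh_port_open and wait_until are hypothetical helpers, and the way the tool interprets RECONNECT_ATTEMPTS is an assumption. The probe only tests whether TCP port 22 accepts connections; it does not authenticate.

```python
import socket
import time
from typing import Callable


def ssh_port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until(probe: Callable[[], bool], attempts: int, delay: float) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Simulated target that becomes reachable on the third probe.
responses = iter([False, False, True])
print(wait_until(lambda: next(responses), attempts=5, delay=0.0))
```

Against a real target, the probe would be something like lambda: ssh_port_open("target-hostname"), with a delay of several seconds between attempts.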
On 11th Gen Intel® Core™ processors, a system hang may occur intermittently when running the reboot command.
Cause: If the system detects hardware errors, the Functional Safety (FuSa) feature, PCIe Interrupt Error Handling (IEH), may attempt an additional system reset which can get stuck at postcode 0x0b7f.
Solution: Hard reset to regain control of the system.
Temporary workaround for the system hang after reboot: Disable IEH in the BIOS menu by setting Intel Advanced Menu/PCH-IO Configuration/IEH Mode to Bypass Mode.
The MRL application freezes on “Start validation.”
Cause: Application may freeze due to a high volume of interrupts.
Solution: Increase the --outliers argument to 400 or higher. It is an optional argument, so you may need to add it to the command.
The MRL application cannot detect the processor automatically.
Cause: Your processor name does not correspond with any processor in the list of known processors.
Solution: Add the --processor {TGL-U|TGL-H|EHL} argument to the workload command in the requirements file. With this option, you can specify your processor manually. Using unsupported processors can cause errors and odd results.
The MRL application shows “Unable to mmap memory.”
Cause: Some drivers block access to /dev/mem.
Solution: Unload the stmmac_pci and stmmac drivers if you are using a TSN device, or the igc driver if you are using an I225 device.
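Before unloading anything, it can help to confirm which of these drivers are actually loaded. The sketch below parses text in the format of /proc/modules, where the first field of each line is the module name. The helper name loaded_blocking_drivers and the sample text are illustrative assumptions.

```python
from pathlib import Path
from typing import List

# Drivers named in the solution above, in the order the source lists them.
BLOCKING_DRIVERS = ("stmmac_pci", "stmmac", "igc")


def loaded_blocking_drivers(proc_modules_text: str) -> List[str]:
    """Return the drivers from BLOCKING_DRIVERS that appear in the module list."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in BLOCKING_DRIVERS if name in loaded]


# Sample text in /proc/modules format (illustrative values).
sample = (
    "igc 151552 0 - Live 0x0000000000000000\n"
    "snd_hda_intel 53248 3 - Live 0x0000000000000000\n"
)
print(loaded_blocking_drivers(sample))

# On a Linux target, read the real module list if it is available.
modules_file = Path("/proc/modules")
if modules_file.exists():
    print(loaded_blocking_drivers(modules_file.read_text()))
```

Each driver reported here can then be unloaded, typically with modprobe -r run as root.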
The data streams optimizer applied a tuning configuration to your system, but you want to reset your system to default settings.
Solution 1 sets your system back to the original tuning settings, so you can continue using the data streams optimizer. Solution 2 sets all tuning options to the default state and turns off the Data Streams Optimizer feature in the BIOS, so you cannot use the data streams optimizer. If you enable the Data Streams Optimizer switch in the BIOS again, the BIOS re-applies the last tuning configuration, because that information is still contained in the capsule. To completely erase the tuning, use Solution 1.
Solution 1: Follow steps from Generate a Configuration for Your Workload to reset your system to the original state.
Solution 2: Disable the Data Streams Optimizer, Software SRAM, and Intel® TCC Mode options in the BIOS. After reboot, your system will be reset. For more information, see Apply Reset Configuration.
After disabling RTCM, the system freezes or the following error occurs: “Could not set up firmware update: Invalid argument. ERROR: Failed to apply buffer capsule”.
Unexpected performance results occur, fwupdate software does not work so capsules are not applied, or enabling/disabling RTCM does not work.
Some cores are offline according to the lscpu command output. Output example: “Off-line CPU(s) list: 1-3.”
Cause: Combining RTCM and data streams optimizer may result in offline cores and a number of different errors.
Possible solutions:
Solution 1: Follow steps from Generate a Configuration for Your Workload to reset your system to the original state.
Solution 2: Reflash the BIOS.
Solution 3:
  1. Disable the Data Streams Optimizer option in the BIOS. Reboot.
  2. Disable RTCM. Reboot.
  3. Re-run the data streams optimizer with the “SoftwareSRAM” compatibility option set in the requirements file.
    Note: The performance effect of the data streams optimizer will not be visible, because tuning configurations will not be applied by the BIOS. Stop the tuning process after the first applied configuration.
  4. Enable the Data Streams Optimizer option in the BIOS. Reboot.
  5. Enable RTCM. Reboot.
Your system is now ready to use the data streams optimizer with RTCM enabled.
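To confirm that no cores remain offline after recovery, the lscpu output can be checked for the “Off-line CPU(s) list” field mentioned in the problem description. The following is a minimal sketch; the helper name offline_cpus is hypothetical, and the sample text is modeled on the output example above.

```python
import re
from typing import List


def offline_cpus(lscpu_output: str) -> List[int]:
    """Return CPU numbers from the 'Off-line CPU(s) list' field, or [] if absent."""
    match = re.search(
        r"^\s*Off-line CPU\(s\) list:\s*(\S+)", lscpu_output, re.MULTILINE
    )
    if not match:
        return []
    cpus: List[int] = []
    # The field is a comma-separated list of numbers and ranges, such as "1-3,5".
    for part in match.group(1).split(","):
        if "-" in part:
            start, end = part.split("-")
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


sample = "On-line CPU(s) list: 0\nOff-line CPU(s) list: 1-3\n"
print(offline_cpus(sample))
```

On the target system, the input could come from subprocess.run(["lscpu"], capture_output=True, text=True).stdout. An empty result means lscpu reported no offline cores.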
Register check failed.
Cause: The configuration was not applied properly, and the registers contain unexpected values.
Solution:
  1. Confirm that the Intel® TCC Mode and Data Streams Optimizer options are enabled in the BIOS.
  2. During capsule generation, use the same keys for signing and verification. Using different sets of keys causes security violations.
The data streams optimizer on Windows* OS has an issue with Capsule Apply.
Cause: Windows* OS cannot find the private certificate to sign the driver.
Solution: Log in to Windows* OS with your user account. Private certificates will be loaded automatically.
On Intel® Xeon® W-11000E Series processors, if the producer or consumer device is attached to the root port 0:01.0, the data streams optimizer fails to find any tuning configurations.
Cause: Error in configuration search algorithm.
Solution: As a workaround, you can specify 0:1c.0 as a producer or consumer, respectively, in the requirements file.
