Developer Guide

  • 2021.3
  • 11/18/2021
  • Public

Step 3: Preproduction: Generate a Tuning Config

In this step, you will use the data streams optimizer preproduction workflow to generate a tuning configuration that meets the MRL latency requirement while optimizing power consumption. The demo focuses on tuning the core-from-PCIe stream (MMIO reads) and power consumption.
This demo does not measure power consumption, but power consumption can be measured using tools outside of Intel® TCC Tools, such as Intel® SoC Watch. Get Intel® SoC Watch from VTune™ Profiler 2021.7.0.
The data streams optimizer requires the following input files: environment file, requirements file, and workload validation script. For this demo, you will use provided samples of these files.
The output examples shown here are for illustration only. Your output may vary.

Preproduction Workflow

These steps assume a host-target environment.
  1. On the host system, confirm that the host system and target system have a password-less connection.
    1. Generate a key with an empty passphrase:
      ssh-keygen -t rsa
      Press Enter when the system prompts you during the "Generating public/private rsa key pair" step.
    2. Copy the key to the target system:
      ssh-copy-id <user>@<target>
      If you do not have ssh-copy-id on your host system, use the command:
      cat .ssh/id_rsa.pub | ssh <user>@<target> 'cat >> .ssh/authorized_keys'
    3. Verify that a password is no longer required, for example:
      ssh <user>@<target> ls
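The password-less check above can also be scripted. The following is a minimal sketch, not part of Intel® TCC Tools: the function names are hypothetical, and it assumes an OpenSSH client on the host. It relies on the OpenSSH `BatchMode=yes` option, which makes `ssh` fail instead of prompting for a password.

```python
import subprocess


def build_ssh_check_command(user, target, timeout_s=10):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never fall back to interactive prompts
        "-o", f"ConnectTimeout={timeout_s}",  # give up if the target is unreachable
        f"{user}@{target}",
        "true",                               # trivial remote command
    ]


def has_passwordless_ssh(user, target, timeout_s=10):
    """Return True if key-based login works without any prompt."""
    result = subprocess.run(
        build_ssh_check_command(user, target, timeout_s),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

For example, `has_passwordless_ssh("root", "<target>")` should return `True` once the key has been copied to the target system.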
  2. On the target system, confirm that the Data Streams Optimizer setting in system firmware is enabled. For details, see Data Streams Optimizer Setting.
  3. On the target system, confirm that the real-time configuration driver (tcc_buffer) is loaded.
    /usr/share/tcc_tools/scripts/setup_ssram/control_tcc_driver.sh enable
  4. On the host system, source the environment file to set up environment variables:
    source ~/intel/tcc_tools/latest/env/vars.sh
  5. Go to the tools directory:
    cd ${TCC_TOOLS_PATH}
  6. Review the target connection settings file:
    1. Open the target connection settings file. This command example uses nano, but you can use any text editor.
      nano ./host_scripts/target_connection_settings.sh
    2. Modify the following fields (see example file below):
      Field Name   Description
      HOSTNAME     Replace hostname with the IP address or hostname of the target system.
      USER         Replace user with root.
      SSH_EXTRA    Add any additional SSH command line options.
      This file contains the hostname and username to connect to the target board via SSH.
      Target connection settings file example:
      HOSTNAME='hostname'
      USER='user'
      SSH_EXTRA=''
    3. Save and close the file.
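As a quick sanity check on the edited settings file, the sketch below parses `KEY='value'` assignments and reports fields that still hold the sample placeholder values. It is an illustration only: the helper names are hypothetical, and it assumes the file contains only simple shell assignments like the example above.

```python
import shlex

# Placeholder values taken from the sample settings file shown above.
SAMPLE_VALUES = {"HOSTNAME": "hostname", "USER": "user"}


def parse_connection_settings(text):
    """Parse simple shell assignments (KEY='value') into a dict."""
    settings = {}
    for line in text.splitlines():
        # shlex handles the quoting and drops '#' comments.
        for token in shlex.split(line, comments=True):
            key, sep, value = token.partition("=")
            if sep:
                settings[key] = value
    return settings


def unedited_fields(settings):
    """Return the fields that still contain the sample placeholder."""
    return [key for key, sample in SAMPLE_VALUES.items()
            if settings.get(key) == sample]
```

An empty list from `unedited_fields` suggests that both HOSTNAME and USER were changed from their sample values. It does not confirm that the new values are correct.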
  7. Review the sample requirements file:
    The sample requirements file below (single_corepcierd_1.json) assumes Intel Atom® x6000E Series Processors.
    If you are using 11th Gen Intel® Core™ Processors, use single_corepcierd_2.json instead.
    If you are using Intel® Xeon® W-11000E Series Processors, use single_corepcierd_0.json.
    1. Open the sample requirements file. This command example uses nano, but you can use any text editor.
      nano ./demo/requirements/single_corepcierd_1.json
    2. For your reference, note the "command" field. This is the same script you used in the previous step, Step 2: Run MRL on Untuned System. The tool will run the script to validate whether the tuning configuration meets the MRL latency requirement.
    3. Verify that the value in the --device field matches the name of the PCIe device, for example, I225 or TSN. You can use the same value you used in Step 2: Run MRL on Untuned System. To check which device you have, run:
      lspci | grep -E 'Ethernet controller: Intel Corporation'
      The following example output shows that the PCIe device is the integrated TSN controller (4b32).
      00:1e.4 Ethernet controller: Intel Corporation Device 4b32 (rev 11)
      The current sample requirements file assumes an integrated Time-Sensitive Networking controller and uses the argument TSN for the device field in the JSON file. Update this for your setup as needed.
    4. Verify that the producer field matches your PCI device address. The example in the previous step shows that the address is 00:1e.4.
    5. Save and close the file.
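To avoid copying the PCI address by hand, the lspci line can be parsed programmatically. This is a minimal sketch with a hypothetical helper name. It assumes the default lspci output format shown above (bus:device.function without a PCI domain prefix) and does not read or modify the requirements file.

```python
import re

# Matches lines such as:
# 00:1e.4 Ethernet controller: Intel Corporation Device 4b32 (rev 11)
LSPCI_LINE = re.compile(
    r"^(?P<address>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+"
    r"(?P<description>.*?)"
    r"(?:\s+\(rev (?P<revision>[0-9a-f]+)\))?$"
)


def parse_lspci_line(line):
    """Return address, description, and revision, or None if no match."""
    match = LSPCI_LINE.match(line.strip())
    if match is None:
        return None
    return match.groupdict()
```

The `address` value is what you would compare against the producer field. The `revision` value is `None` when the line has no `(rev ...)` suffix.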
  8. On the host system, run the preproduction tool to search for a tuning configuration.
    python3 tcc_data_streams_optimizer_preprod.py search --environment ./demo/environment/sample_environment_uefi.json --requirements ./demo/requirements/single_corepcierd_1.json
    If you are using a system with Slim Bootloader, you must choose the SBL environment file:
    python3 tcc_data_streams_optimizer_preprod.py search --environment ./demo/environment/sample_environment_sbl.json --requirements ./demo/requirements/single_corepcierd_1.json
    For your reference, the following table contains a description of each argument.
    Option           Description
    --environment    Path to the sample environment file.
    --requirements   Path to the sample requirements file.
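The choice of environment file (UEFI or Slim Bootloader) and requirements file (by processor family) can be captured in a small helper that assembles the command line. This is a sketch under stated assumptions: the dictionary keys and the function name are hypothetical labels, and it assumes that single_corepcierd_2.json and single_corepcierd_0.json sit in the same ./demo/requirements directory as single_corepcierd_1.json.

```python
# Environment files named in this step.
ENVIRONMENT_FILES = {
    "uefi": "./demo/environment/sample_environment_uefi.json",
    "sbl": "./demo/environment/sample_environment_sbl.json",
}

# Assumption: all three sample requirements files share one directory.
REQUIREMENTS_FILES = {
    "atom_x6000e": "./demo/requirements/single_corepcierd_1.json",
    "core_11th_gen": "./demo/requirements/single_corepcierd_2.json",
    "xeon_w_11000e": "./demo/requirements/single_corepcierd_0.json",
}


def build_search_command(firmware, processor):
    """Return the preproduction search command as an argument list."""
    return [
        "python3", "tcc_data_streams_optimizer_preprod.py", "search",
        "--environment", ENVIRONMENT_FILES[firmware],
        "--requirements", REQUIREMENTS_FILES[processor],
    ]
```

Passing the returned list to `subprocess.run` from the tools directory would start the search. An unknown firmware or processor label raises `KeyError` rather than running with a wrong file.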
  9. Confirm that you see output similar to the example below. The output shows that the tool first checks for dependencies, such as input files and ability to connect to the target via SSH.
    Processing environment file: demo/environment/sample_environment_uefi.json ...
    Environment file parsed.
    Processing requirement file: demo/requirements/single_corepcierd_1.json ...
    Requirement: `00:1e.4` to `Core3` [9/4]
    Requirement file parsed.
    Creating output folder: /home/test/intel/tcc_tools/2021.3.1/tools/target_name_1/single_corepcierd_1_<date>
    The tool finds the first suitable tuning configuration.
    The tool prints a list of messages to the log file. The messages describe the affected settings. The level of detail in these messages balances the need to provide useful information vs. the need to protect Intel proprietary information. For more information about each message, see Tuning Configurations.
    The tool generates a capsule of the configuration. The capsule is used to apply the configuration to the target. The target reboots.
    Connecting to target_name_1...
    Connection to database tuning_ehl.db - successful.
    Searching for suitable tuning configuration...
    Stream 0:1e:4:0 -> Core3: configuration 1 out of 2
    Generating the capsule(s) of the configuration to tune the system - for the settings of this configuration see /home/test/intel/tcc_tools/latest/tools/target_name_1/single_corepcierd_1_20210830094855/dso_log
    Capsule(s) were generated.
    Applying capsule(s) for target...
    Capsule(s) were applied for target.
  10. Wait for the target board to reboot. It may take 1 minute or more. While the target is rebooting, the tool will attempt to connect to the target repeatedly based on the <reconnection_attempts> field in the environment file. Output example:
    Rebooting target machine...
    -----------------------------
    -- Starting reboot process --
    -----------------------------
    Connecting to <target_hostname> for reboot...
    Reboot...
    Attempt 1. Waiting for <target_hostname> respond...
    ssh: connect to host <target_hostname> port 22: Connection timed out
    Connection failed!
    Attempt 2. Waiting for <target_hostname> respond...
    ssh: connect to host <target_hostname> port 22: Connection timed out
    Connection failed!
    Attempt 3. Waiting for <target_hostname> respond...
    Reboot success!
    After reconnecting to the target, the tool runs the workload validation script.
    Starting validation script: bash ./host_scripts/target_runner.sh python3 /usr/share/tcc_tools/tools/demo/workloads/bin/mrl_validation_script.py.
    Return code: 1
    Found CPU affinity for core 3
    Running test ... Done.
    Test is complete!
    Results saved in data_mmio_read_latency_us.csv data_mmio_read_latency_ticks.csv data_avg_inst_count.csv
    Enabling userspace access to performance counters
    Removing stmmac_pci
    Memory regions: ['6001360000', '600136f000']
    Using memory region 0 with address 6001360000
    Start validation
    Validation stopped
    Restoring stmmac_pci
    Validation is finished. Please wait for results processing.
    Validation information:
    device: TSN
    address: 6001360000
    core: 3
    iterations: 10000000
    processor: EHL
    Latency must be less than 9.0 us.
    Statistics:
                |Min         |Max         |Avg         |Median
    ----------------------------------------------------------------
    Microseconds|0.745       |10.203      |1.005       |0.987
    ================================================================
    Deadline    |Iterations  |Passed      |Failed
    ---------------------------------------------------
    9.0 us      |10000000    |9997248     |2752
    ===================================================
    Failed: at least one iteration failed
    ERROR: Program is exited with non-zero code 1.
    Validation script FAILED
    For details see the log file at: /home/<user>/intel/tcc_tools/latest/tools/target_name_1/single_corepcierd_1_<date>/log
    If you see VALIDATION ERROR with a path to a log, this log is located on your target board.
    If the validation script fails, the tool repeats the tuning flow. It finds another suitable tuning configuration or exits if none are found. In this case, the tool finds another configuration:
    Searching for suitable tuning configuration...
    Stream 0:1e:4:0 -> Core3: configuration 2 out of 2
    Generating the capsule(s) of the configuration to tune the system - for the settings of this configuration see /home/<user>/intel/tcc_tools/latest/tools/<target_hostname>/single_corepcierd_1_<date>/dso_log
    Capsule(s) were generated.
    Applying capsule(s) for target...
    Capsule(s) were applied for target.
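To see how far a failed configuration was from passing, the deadline row of the validation output can be parsed and turned into a failure rate. The sketch below is illustrative only: the function name is hypothetical, and it assumes the row layout shown in the example output (`<deadline> us |<iterations> |<passed> |<failed>`).

```python
import re

# Matches a row such as: 9.0 us |10000000 |9997248 |2752
DEADLINE_ROW = re.compile(
    r"^\s*(?P<deadline_us>\d+(?:\.\d+)?)\s*us\s*\|"
    r"\s*(?P<iterations>\d+)\s*\|"
    r"\s*(?P<passed>\d+)\s*\|"
    r"\s*(?P<failed>\d+)\s*$"
)


def parse_deadline_row(output):
    """Find the deadline row in validation output; return None if absent."""
    for line in output.splitlines():
        match = DEADLINE_ROW.match(line)
        if match:
            return {
                "deadline_us": float(match["deadline_us"]),
                "iterations": int(match["iterations"]),
                "passed": int(match["passed"]),
                "failed": int(match["failed"]),
            }
    return None
```

With the first configuration in this demo, 2752 of 10000000 iterations missed the 9.0 us deadline, which is a failure rate of about 0.0275 percent. The validation still fails because every iteration must pass.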
  11. Wait for the target board to reboot. It may take 1 minute or more. While the target is rebooting, the tool will attempt to connect to the target repeatedly based on the <reconnection_attempts> field in the environment file. Output example:
    Rebooting target machine...
    -----------------------------
    -- Starting reboot process --
    -----------------------------
    Connecting to <target_hostname> for reboot...
    Reboot...
    Attempt 1. Waiting for <target_hostname> respond...
    ssh: connect to host <target_hostname> port 22: Connection timed out
    Connection failed!
    Attempt 2. Waiting for <target_hostname> respond...
    ssh: connect to host <target_hostname> port 22: Connection timed out
    Connection failed!
    Attempt 3. Waiting for <target_hostname> respond...
    Reboot success!
    After reconnecting to the target, the tool runs the workload validation script. Now the script shows that the maximum latency measurement meets the deadline.
    Starting validation script: python3 /usr/share/tcc_tools/tools/demo/workloads/bin/mrl_validation_script.py.
    Validation script output:
    Found CPU affinity for core 3
    Running test ... Done.
    Test is complete!
    Results saved in data_mmio_read_latency_us.csv data_mmio_read_latency_ticks.csv data_avg_inst_count.csv
    Enabling userspace access to performance counters
    Removing stmmac_pci
    Memory regions: ['6001360000', '600136f000']
    Using memory region 0 with address 6001360000
    Start validation
    Validation stopped
    Restoring stmmac_pci
    Validation is finished. Please wait for results processing.
    Validation information:
    device: TSN
    address: 6001360000
    core: 3
    iterations: 10000000
    processor: EHL
    Latency must be less than 9.0 us.
    Statistics:
                |Min         |Max         |Avg         |Median
    ----------------------------------------------------------------
    Microseconds|0.691       |1.796       |0.759       |0.757
    ================================================================
    Deadline    |Iterations  |Passed      |Failed
    ---------------------------------------------------
    9.0 us      |10000000    |10000000    |0
    ===================================================
    Success: all iterations passed.
    Validation script PASSED
    After the validation passes, the tool shows a brief description of the tuning configuration, generates a tuning configuration file, and exits. Output example:
    Configuration for target_name_1 found.
    Tuning configuration disabled power management in addition to the options disabled by Intel® TCC Mode to meet your strict latency requirements, but may negatively impact power or best-effort performance.
    Creating tuning configuration...
    See /home/<user>/intel/tcc_tools/latest/tools/<target_name_1>/single_corepcierd_1_<date>/tuning_configuration.json for configuration details.
    Tuning configuration was created.
    Path to output file: /home/<user>/intel/tcc_tools/latest/tools/target_name_1/single_corepcierd_1_<date>/tuning_configuration.json
    For more information, see the log file: /home/<user>/intel/tcc_tools/latest/tools/target_name_1/single_corepcierd_1_<date>/dso_log
    Application exit
  12. After the application exits, confirm that the tool generated the tuning configuration file, tuning_configuration.json, in the following directory:
    cd <target_hostname>/single_corepcierd_1_<date>
    ls -la
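The same confirmation can be scripted. The sketch below looks for tuning_configuration.json in the output folders under a target directory and checks that the file parses as JSON. The function names are hypothetical, it makes no assumption about the contents of the file, and it assumes the output folder names follow the single_corepcierd_1_<date> pattern shown in this step.

```python
import json
from pathlib import Path


def find_tuning_configurations(target_dir, prefix="single_corepcierd_1_"):
    """Return paths to tuning_configuration.json files, sorted by folder name."""
    pattern = f"{prefix}*/tuning_configuration.json"
    return sorted(Path(target_dir).glob(pattern))


def load_json_or_none(path):
    """Return the parsed JSON document, or None if missing or malformed."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, json.JSONDecodeError):
        return None
```

Output folders from runs that did not produce a configuration file are skipped, because the glob pattern only matches folders that contain the file.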
  13. You can see the differences between the untuned system in Step 2: Run MRL on Untuned System and the tuned system in step 3 (this step). While the tuned system met the desired latency, the untuned system in step 2 disabled fewer power management features.
In this demo, only the tuned system (step 3) met the deadline. The data streams optimizer selected a configuration that disabled additional power management features specifically to keep the fabric awake for low-latency MMIO transactions. The resulting configuration achieved lower latency at the cost of higher power consumption compared to the system with Intel® TCC Mode enabled. In a real-world use case, you can perform additional analysis outside of Intel® TCC Tools to determine if your system requirements, like power consumption, are met.
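For a side-by-side view of the two validation runs in this step, the Statistics rows can be parsed and compared against the deadline. This is a minimal sketch with hypothetical function names. The two input rows are copied from the example output for configuration 1 and configuration 2 above, and the strict comparison follows the "Latency must be less than 9.0 us" line in that output.

```python
def parse_statistics_row(line):
    """Parse 'Microseconds|min |max |avg |median' into a dict of floats."""
    label, *cells = [cell.strip() for cell in line.split("|")]
    if label != "Microseconds" or len(cells) != 4:
        raise ValueError(f"unexpected statistics row: {line!r}")
    names = ("min", "max", "avg", "median")
    return dict(zip(names, (float(cell) for cell in cells)))


def meets_deadline(stats, deadline_us):
    """True if the worst-case latency is strictly below the deadline."""
    return stats["max"] < deadline_us


# Rows copied from the example output in this step.
config_1 = parse_statistics_row("Microseconds|0.745 |10.203 |1.005 |0.987")
config_2 = parse_statistics_row("Microseconds|0.691 |1.796 |0.759 |0.757")

for name, stats in (("configuration 1", config_1), ("configuration 2", config_2)):
    verdict = "meets" if meets_deadline(stats, 9.0) else "misses"
    print(f"{name}: max {stats['max']} us {verdict} the 9.0 us deadline")
```

Only the maximum is compared because a single late iteration fails the validation. The average and median are far below the deadline in both runs.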

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.