Introduction
This article describes how to debug or further configure your Intel® persistent memory devices with ipmctl. ipmctl is an open source tool maintained by Intel and is available for download on GitHub*. With ipmctl, you can select operating modes, create goals, provision capacities, create regions, and much more. The most common ipmctl calls are described in our Quick Start Guide.
This article assumes you have basic knowledge of ipmctl and persistent memory programming concepts. If you’re just getting started, check out the Quick Start Guide first, and come back to this article for debugging assistance.
Discover Configuration
Show Topology
To see available resources, use the show topology command, which displays both the Intel® Optane™ DC persistent memory modules and DDR4 dual in-line memory modules (DIMMs) discovered in the system by enumerating the SMBIOS Type 17 tables. For more information, refer to the ACPI Specification v6.0, or to the Advanced Configuration and Power Interface Tables section of this article for NFIT table information.
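For readers who want to script this step, here is a minimal Python sketch. It assumes the topology listing is invoked as ipmctl show -topology, that ipmctl is on the PATH, and that the script runs with root privileges. The helper names topology_cmd and run_ipmctl are hypothetical and not part of ipmctl.

```python
import subprocess


def topology_cmd():
    """Return the argv for the topology listing (assumed syntax)."""
    return ["ipmctl", "show", "-topology"]


def run_ipmctl(argv):
    """Run an ipmctl command and return its text output.

    Raises FileNotFoundError if ipmctl is not installed, and
    subprocess.CalledProcessError if it exits with a non-zero status.
    """
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return completed.stdout
```

On a system with persistent memory installed, print(run_ipmctl(topology_cmd())) would print the discovered modules. Keeping the argv builder separate from the runner makes the command easy to inspect or log before it is executed.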
Platform Configuration Details
You can learn many details about your configuration from looking at the platform configuration details (PCD) with the following command:
The tables that are shown when this command is run are:
- Configuration Header
- Current Config
- Interleave Information
- Identification Information x6
- Conf Input
- Conf Output
- Partition Size Change
- Interleave Information
- Identification Information x6
- Label Storage Area—Current Index
- Label Storage Area—Labels
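The PCD query described above can be sketched in Python as follows. This is an illustration under stated assumptions: it assumes the command takes the form ipmctl show -dimm [DimmID] -pcd, and that the optional values Config and LSA narrow the output to the configuration tables or the label storage area. Check both against your installed man pages. The function name pcd_cmd is hypothetical.

```python
def pcd_cmd(dimm_id=None, section=None):
    """Build the argv for a platform configuration data (PCD) dump.

    dimm_id: optional DIMM identifier such as "0x0001"; None targets all DIMMs.
    section: optional "Config" or "LSA" to limit the output (assumed values).
    """
    if section not in (None, "Config", "LSA"):
        raise ValueError(f"unknown PCD section: {section!r}")
    argv = ["ipmctl", "show", "-dimm"]
    if dimm_id is not None:
        argv.append(dimm_id)
    argv.append("-pcd")
    if section is not None:
        argv.append(section)
    return argv
```

For example, pcd_cmd("0x0001", "LSA") builds a request for only the label storage area tables of one module.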
Advanced Configuration and Power Interface Tables
The following Advanced Configuration and Power Interface (ACPI) tables are available:
- NFIT: The nonvolatile dual in-line memory module (NVDIMM) Firmware Interface Table
- PCAT: The Platform Capabilities Table
- PMTT: The Platform Memory Topology Table
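As a rough sketch, the three tables listed above could be requested in a loop. The code assumes each table is shown with ipmctl show -system followed by the table name. The helper names are hypothetical.

```python
ACPI_TABLES = ("NFIT", "PCAT", "PMTT")


def acpi_table_cmd(table):
    """Build the argv that displays one ACPI table (assumed syntax)."""
    if table not in ACPI_TABLES:
        raise ValueError(f"unsupported ACPI table: {table!r}")
    return ["ipmctl", "show", "-system", table]


def all_acpi_table_cmds():
    """Return one argv per supported ACPI table, in listing order."""
    return [acpi_table_cmd(table) for table in ACPI_TABLES]
```

Validating the table name before building the command keeps a typo from reaching the tool.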
Shortened versions of the output of each command can be seen below:
NFIT
PCAT
PMTT
Health Monitoring
Show DIMM Information
The show -dimm command displays the Intel Optane DC persistent memory modules discovered in the system and verifies that software can communicate with them. Among other information, this command outputs each DIMM’s ID, capacity, health state, and firmware version:
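If you capture this output in a script, a small parser makes the health state easy to check. The sketch below assumes the default text output is a pipe-delimited table with a header row, that one column is named HealthState, and that a healthy module reports the value Healthy. All three are assumptions to verify against your own output. The function names are hypothetical.

```python
def parse_pipe_table(text):
    """Parse pipe-delimited text with a header row into a list of dicts.

    Lines without a '|' (blank lines, separator rules) are skipped, as are
    rows whose cell count does not match the header.
    """
    rows = []
    header = None
    for line in text.splitlines():
        if "|" not in line:
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells
        elif len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def unhealthy_dimms(rows, health_column="HealthState", healthy_value="Healthy"):
    """Return the rows whose health column differs from the healthy value."""
    return [row for row in rows if row.get(health_column) != healthy_value]
```

The column name and healthy value are parameters so that the helper can be adjusted without changing the parser.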
Sensor Health States
ipmctl can report the health state of the sensors located on each persistent memory module. The sensors referenced in this article include MediaTemperature and PercentageRemaining.
Use the following command to see sensor health for a specific module. Health values for all modules can be seen by not specifying a DimmID.
Percentage Life Remaining
The remaining life of a persistent memory module is based on the number of reads/writes left in its lifetime. Use the following command to see the percentage of life remaining on each module. In the example below, you can see that DIMM 0x0101 has 45 percent life remaining, and the rest have 100 percent.
Just as this call shows the PercentageRemaining sensor value for each DIMM, you can replace PercentageRemaining with any other sensor type to see its values.
For this example, I injected an error on DIMM 0x0101 to set its PercentageRemaining to 45 percent. You can read more about error injection in the Debugging section.
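The sensor queries in this section can be sketched with a single builder. The code assumes the form ipmctl show -sensor [Sensor] -dimm [DimmID], where omitting the sensor shows all sensors and omitting the DimmID shows all modules. The function name sensor_cmd is hypothetical.

```python
def sensor_cmd(sensor=None, dimm_id=None):
    """Build the argv for a sensor query (assumed syntax).

    sensor: optional sensor name such as "PercentageRemaining"; None means all.
    dimm_id: optional DIMM identifier such as "0x0101"; None means all DIMMs.
    """
    argv = ["ipmctl", "show", "-sensor"]
    if sensor is not None:
        argv.append(sensor)
    argv.append("-dimm")
    if dimm_id is not None:
        argv.append(dimm_id)
    return argv
```

For instance, sensor_cmd("PercentageRemaining") corresponds to the remaining-life query across all modules, while sensor_cmd(dimm_id="0x0101") asks for every sensor on one module.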
Change Sensor Thresholds
Each sensor has a preset threshold that defines its Normal range. You can also set your own threshold on a module, called the NonCriticalThreshold. For example, if you set the MediaTemperature NonCriticalThreshold to a value lower than the limit of the Normal range, you get a warning whenever the temperature rises above the value you specified. Each sensor's threshold can be set with the following command:
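A hedged sketch of the threshold behavior follows. It assumes the threshold is set with ipmctl set -sensor (Sensor) -dimm [DimmID] NonCriticalThreshold=(value), which matches older ipmctl releases; newer releases may name the property differently. The function names and the example value of 80 are hypothetical.

```python
def set_threshold_cmd(sensor, threshold, dimm_id=None):
    """Build the argv that sets a sensor's NonCriticalThreshold (assumed syntax)."""
    if not isinstance(threshold, int):
        raise TypeError("threshold must be an integer")
    argv = ["ipmctl", "set", "-sensor", sensor, "-dimm"]
    if dimm_id is not None:
        argv.append(dimm_id)
    argv.append(f"NonCriticalThreshold={threshold}")
    return argv


def is_warning(reading, non_critical_threshold):
    """Return True when a reading has risen above the user-set threshold."""
    return reading > non_critical_threshold
```

With a MediaTemperature threshold of 80, is_warning(81, 80) is true and is_warning(80, 80) is false, which mirrors the "went above" wording in the text.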
Performance
Show Sensor Performance Per DIMM
Performance indicators can be viewed per DIMM, per indicator, or all at once. To see all the performance indicators of a single DIMM, use this command:
Here is the full list of performance indicators:
- DimmID: The Intel Optane DC persistent memory module identifier.
- MediaReads: Number of 64-byte reads from media on the Intel Optane DC persistent memory module since the last alternating current (AC) cycle.
- MediaWrites: Number of 64-byte writes to media on the Intel Optane DC persistent memory module since the last AC cycle.
- ReadRequests: Number of DDRT read transactions that the Intel Optane DC persistent memory module has serviced since the last AC cycle.
- WriteRequests: Number of DDRT write transactions that the Intel Optane DC persistent memory module has serviced since the last AC cycle.
- TotalMediaReads: Number of 64-byte reads from the media on the Intel Optane DC persistent memory module over its lifetime.
- TotalMediaWrites: Number of 64-byte writes to media on the Intel Optane DC persistent memory module over its lifetime.
- TotalReadRequests: Number of DDRT read transactions that the Intel Optane DC persistent memory module has serviced over its lifetime.
- TotalWriteRequests: Number of DDRT write transactions that the Intel Optane DC persistent memory module has serviced over its lifetime.
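Because the media counters above are counts of 64-byte accesses, they convert directly to bytes. The following sketch shows the arithmetic and a simple average throughput between two samples. It assumes the per-DIMM query takes the form ipmctl show -dimm [DimmID] -performance, that the counter values have already been converted to integers, and that no AC cycle occurred between the two samples. The function names are hypothetical.

```python
MEDIA_ACCESS_BYTES = 64  # each MediaReads/MediaWrites count is one 64-byte access


def performance_cmd(dimm_id=None):
    """Build the argv for a performance query (assumed syntax)."""
    argv = ["ipmctl", "show", "-dimm"]
    if dimm_id is not None:
        argv.append(dimm_id)
    argv.append("-performance")
    return argv


def media_bytes(count):
    """Convert a media read or write count into bytes."""
    return count * MEDIA_ACCESS_BYTES


def media_throughput(count_before, count_after, seconds):
    """Average bytes per second between two samples of a media counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return media_bytes(count_after - count_before) / seconds
```

For example, a MediaReads counter that grows from 1000 to 3000 over 2 seconds corresponds to 2000 × 64 = 128,000 bytes, or 64,000 bytes per second.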
Debugging
Discover Errors
The following commands help you debug errors on your modules. To view the error log, use the show command with the -error option.
If an error is present, the output will be similar to:
The -error option accepts either Thermal or Media, and the severity level can be either High or Low.
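A minimal sketch of an error-log query with input validation is shown below. It assumes the form ipmctl show -error (Thermal|Media) -dimm [DimmID], and that the severity is passed as a Level=High or Level=Low property. The property name Level is an assumption to confirm in your man pages, and the function name error_log_cmd is hypothetical.

```python
ERROR_TYPES = ("Thermal", "Media")
ERROR_LEVELS = ("High", "Low")


def error_log_cmd(error_type, dimm_id=None, level=None):
    """Build the argv for an error log query (assumed syntax)."""
    if error_type not in ERROR_TYPES:
        raise ValueError(f"error type must be one of {ERROR_TYPES}")
    if level is not None and level not in ERROR_LEVELS:
        raise ValueError(f"level must be one of {ERROR_LEVELS}")
    argv = ["ipmctl", "show", "-error", error_type, "-dimm"]
    if dimm_id is not None:
        argv.append(dimm_id)
    if level is not None:
        argv.append(f"Level={level}")
    return argv
```

For example, error_log_cmd("Thermal", level="High") requests the high-severity thermal errors for all modules.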
Inject an Error
For testing purposes, you may want to inject a mock error onto your persistent memory modules. Injectable errors include: Temperature, Poison, PoisonType, PackageSparing, PercentageRemaining, FatalMediaError, and DirtyShutdown. It is important to note that this command is only available when error injection is enabled on the Intel Optane DC persistent memory module in the BIOS. Examples of each of these can be seen in the ipmctl-inject-error man pages.
To change the PercentageRemaining:
To change the Temperature (Celsius) variable:
To clear injected errors, specify which injection property (Temperature, Poison, PoisonType, PackageSparing, PercentageRemaining, FatalMediaError, or DirtyShutdown), and add Clear=1. For example, the first call clears all DIMMs of any injected Temperature changes:
# ipmctl set -dimm Clear=1 Temperature=1
This call clears only DIMM 0x1001 of the injected PercentageRemaining change:
# ipmctl set -dimm 0x1001 PercentageRemaining=10 Clear=1
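The inject and clear calls share one shape, so they can be sketched with a single builder. The code follows the ipmctl set -dimm form used in the two calls above; newer ipmctl releases may expose injection under a different verb, so treat the syntax as an assumption. The function name inject_cmd is hypothetical.

```python
INJECTABLE = (
    "Temperature", "Poison", "PoisonType", "PackageSparing",
    "PercentageRemaining", "FatalMediaError", "DirtyShutdown",
)


def inject_cmd(prop, value, dimm_id=None, clear=False):
    """Build the argv that injects, or with clear=True clears, a mock error."""
    if prop not in INJECTABLE:
        raise ValueError(f"not an injectable property: {prop!r}")
    argv = ["ipmctl", "set", "-dimm"]
    if dimm_id is not None:
        argv.append(dimm_id)
    argv.append(f"{prop}={value}")
    if clear:
        argv.append("Clear=1")
    return argv
```

Joining the result of inject_cmd("PercentageRemaining", 10, "0x1001", clear=True) with spaces reproduces the second call shown above.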
Diagnose Further Problems
Use the start diagnostic command to see a quick health overview of your persistent memory modules. After the -diagnostic flag, you can specify any of the following tests. If none is specified, all tests run.
- Quick - This test verifies that the Intel Optane DC persistent memory module host mailbox is accessible and that basic health indicators can be read and are currently reporting acceptable values.
- Config - This test verifies that the BIOS platform configuration matches the installed hardware, and the platform configuration conforms to best-known practices.
- Security - This test verifies that all Intel Optane DC persistent memory modules have a consistent security state. It is a best practice to enable security on all Intel Optane DC persistent memory modules, rather than just some.
- FW - This test verifies that all Intel Optane DC persistent memory modules of a given model have consistent FW installed and other FW modifiable attributes are set in accordance with best practices.
Note that the test does not have a means of verifying that the installed FW is the optimal version for a given Intel Optane DC persistent memory module model, just that it has been consistently applied across the system.
For example, the following command runs all the diagnostic tests on DIMM 0x0001:
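As an illustration, the diagnostic call can be built with the test name validated against the list above. The sketch assumes the form ipmctl start -diagnostic [Test] -dimm [DimmID], with all tests running when no test is named. The function name diagnostic_cmd is hypothetical.

```python
DIAGNOSTIC_TESTS = ("Quick", "Config", "Security", "FW")


def diagnostic_cmd(test=None, dimm_id=None):
    """Build the argv for a diagnostic run (assumed syntax).

    test: optional test name; None runs every test.
    dimm_id: optional DIMM identifier such as "0x0001"; None targets all DIMMs.
    """
    if test is not None and test not in DIAGNOSTIC_TESTS:
        raise ValueError(f"test must be one of {DIAGNOSTIC_TESTS}")
    argv = ["ipmctl", "start", "-diagnostic"]
    if test is not None:
        argv.append(test)
    if dimm_id is not None:
        argv += ["-dimm", dimm_id]
    return argv
```

Here diagnostic_cmd(dimm_id="0x0001") corresponds to running every test on one module, and diagnostic_cmd("Quick") limits the run to the quick health check.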
Security
Firmware Version
Show information about the firmware on one or more DIMMs:
Update Firmware
Update firmware on one or more DIMMs with the following command. To update all DIMMs, simply leave the -dimm tag off so that no DIMM is specified.
Firmware Debug Log
Dump the firmware debug log to a specified file destination using the following command:
Display CLI Version
The ipmctl command line version can easily be seen with the following command:
Conclusion
ipmctl is a powerful tool for configuring and managing Intel Optane DC persistent memory modules. This article outlines some of the most common ipmctl debugging and configuration commands for learning more about your modules. The full ipmctl API can be found in the man pages or by typing ipmctl help at any time.