Intel® Data Center Diagnostic Tool for Intel® Xeon® Processors

000058107

10/03/2023

Introduction

The Intel® Data Center Diagnostic Tool is diagnostic software that you can run on your data center platforms to:

  • Verify the functionality of all cores within an Intel® Xeon® Processor.
  • Support a regular system maintenance program.

High reliability and availability in the data center require the right tools and a commitment to maintenance. Intel believes it is an industry best practice to use maintenance tools such as these for both initial deployment and periodic testing to help ensure the best system experience.

Note:  Modern computing infrastructure brings ever-increasing demand for processing power combined with business expectations for service quality and high availability (and guarantees on service-level agreements [SLAs] in general). These expectations emphasize the need for powerful software tools that can help predict, identify, and minimize unexpected system faults that might compromise service quality or uptime. 

System requirements

The Intel® Data Center Diagnostic Tool is an application available for both Linux* and Windows* operating systems. It can be installed and run on many current Linux* distributions and Windows* versions; see Installation.

Starting with version 558, the tool can be installed on Windows* using the provided MSI installer and run on any version of Windows* currently supported by Microsoft. Consult the Windows* Server release information to determine which versions of Windows* Server are currently available and supported.

For best coverage, run the application in the root system of a server (that is, directly on the host). It can also run inside a container or virtual machine, but some functionality may be disabled there.

Supported processors:

  • 4th Generation Intel® Xeon® Scalable Processors (formerly Sapphire Rapids)
  • 3rd Generation Intel® Xeon® Scalable Processors (formerly Ice Lake and Cooper Lake)
  • 2nd Generation Intel® Xeon® Scalable Processors (formerly Cascade Lake)
  • 1st Generation Intel® Xeon® Scalable Processors (formerly Skylake)
  • Intel® Xeon® Processor E5 v4 Family (formerly Broadwell)
  • Intel® Xeon® Processor E7 v4 Family (formerly Broadwell)

Note for developers: Intel started the Open Data Center Diagnostic Project, which opens Intel’s Data Center Diagnostic framework and provides select tests. This offers developers a consistent test development framework that invites the creativity of the open-source community to enhance cloud fleet management through the development of unique test screens and other innovative solutions. For more information and access to this framework and tests

Installation

Note

Additional details are available in the /usr/share/doc/dcdiag/README.rst (Linux*) or C:\Program Files\Intel\Data Center Diagnostic Tool\README.rst (Windows*) file included in the installation.

We recommend following the steps in the sections below to link to the repository, which ensures that you get the latest version of the Intel® Data Center Diagnostic Tool. However, if you require a downloadable binary, use the RPM file, the DEB file, or the Windows* MSI installer.

 

Debian*/Ubuntu*

To install the Intel® Data Center Diagnostic Tool software packages on Debian*-based distributions, add the Intel software package repository and install the appropriate packages.

Before copying and pasting the commands below into your console, you may want to run sudo ls and enter your password first, so that the sudo password prompt does not consume the pasted commands.

Set up the key to verify the package signatures

sudo install -m 0755 -d /etc/apt/keyrings

curl https://repositories.intel.com/dcdt/dcdiag.pub | sudo gpg --dearmor -o /etc/apt/keyrings/dcdiag.gpg

sudo chmod a+r /etc/apt/keyrings/dcdiag.gpg

Set up the repository

echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/dcdiag.gpg] https://repositories.intel.com/dcdt/debian stable main" | sudo tee /etc/apt/sources.list.d/dcdiag.list > /dev/null

Install the package

sudo apt-get update

sudo apt-get install dcdiag
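
If apt-get update reports a source or signature error, a quick sanity check of the repository setup can narrow down the cause. The following is a minimal sketch, not part of the tool: the function name repo_configured is hypothetical, and it only assumes the file locations and repository URL used in the steps above.

```shell
#!/bin/sh
# Return 0 when the given APT source file references the Intel dcdiag
# repository and names a signed-by keyring that exists and is readable.
# Default path: the list file created in the "Set up the repository" step.
repo_configured() {
    list=${1:-/etc/apt/sources.list.d/dcdiag.list}
    [ -r "$list" ] || return 1
    grep -q 'https://repositories.intel.com/dcdt/debian' "$list" || return 1
    # Extract the keyring path from the signed-by= option of the deb line.
    keyring=$(sed -n 's/.*signed-by=\([^] ]*\).*/\1/p' "$list" | head -n 1)
    [ -n "$keyring" ] && [ -r "$keyring" ]
}
```

Calling repo_configured with no argument checks the default list file; a non-zero return suggests repeating the key or repository step.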

Fedora*/CentOS*/RHEL*

To install the Intel® Data Center Diagnostic Tool software packages on a Fedora-based distribution, add the Intel software package repository and install the package.

The first time you install, YUM or DNF will prompt you to accept the signing key. Verify that the fingerprint is as follows, and then accept it:
Userid: CN=Release Key
Fingerprint: 6226 CA48 AAB6 0900 2093 C7C4 0A04 4B42 CF00 5B79

Before copying and pasting the commands below into your console, you may want to run sudo ls and enter your password first, so that the sudo password prompt does not consume the pasted commands.

Install the repository file

sudo yum install https://repositories.intel.com/dcdt/dcdiag-repo.rpm

Install the package

sudo yum install dcdiag

openSUSE*/SUSE Linux Enterprise*

Install the repository file

sudo zypper ar https://repositories.intel.com/dcdt/dcdiag.repo

Install the package

sudo zypper install dcdiag

You will be warned that repomd.xml is not signed. Respond yes to continue. You will then be given another chance to verify the package signature. Verify that the fingerprint is as follows, and then accept it:

Repository: dcdiag
Key Name: CN=Release Key
Key Fingerprint: 6226CA48 AAB60900 2093C7C4 0A044B42 CF005B79
Key Created: Tue 24 Nov 2020 01:47:38 PM PST
Key Expires: Sat 25 Nov 2023 01:47:38 PM PST
Rpm Name: gpg-pubkey-cf005b79-5fbd7f7a
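
YUM/DNF and zypper print the same fingerprint with different digit grouping, which makes a visual comparison error-prone. As a small sketch (the helper names normalize_fp and same_fp are hypothetical), the fingerprint can be normalized before comparing it with the value published in this article:

```shell
#!/bin/sh
# Normalize a key fingerprint: drop all whitespace and upper-case the hex digits.
normalize_fp() {
    printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# Return 0 when two fingerprints denote the same key, whatever their grouping.
same_fp() {
    [ "$(normalize_fp "$1")" = "$(normalize_fp "$2")" ]
}

# Fingerprint published in this article (YUM/DNF grouping).
EXPECTED_FP='6226 CA48 AAB6 0900 2093 C7C4 0A04 4B42 CF00 5B79'

# Example: compare against the fingerprint in the grouping zypper prints.
if same_fp "$EXPECTED_FP" '6226CA48 AAB60900 2093C7C4 0A044B42 CF005B79'; then
    echo "fingerprint matches"
else
    echo "fingerprint MISMATCH"
fi
```

Paste the fingerprint shown by your package manager as the second argument; accept the key only if the script reports a match.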

Microsoft Windows*

Download the MSI installer

Download the Intel® Data Center Diagnostic Tool installer file to the selected location and execute it.

Intel® Data Center Diagnostic Tool for Intel® Xeon® Processors for Windows* - Version 576 (latest)

Install the package

When using a graphical user interface or command line with no additional options provided to the installer, a User Account Control prompt will show up, requesting authorization. Verify that the installer is signed by Intel Corporation and authorize changes to the device. Once authorized, the installer proceeds with installing the tool in the default location, and exits.

Quiet installation

Use the /quiet command line switch to perform a quiet installation. This type of installation does not require any user interaction, which makes it especially useful for remote installation.

Note that a quiet installation does not trigger the User Account Control prompt, so the installer must be run from an Administrator console.

Use the /help or /? command line switch to display all available command line options for the installer.

The Intel® Data Center Diagnostic Tool is installed in a default location:

C:\Program Files\Intel\Data Center Diagnostic Tool\

 

How to test the Intel® Xeon® Processor

On Linux* systems, the system administrator can enable the Intel® Data Center Diagnostic Tool for background execution.

You can enable and start the Intel® Data Center Diagnostic Tool with the following command:

# systemctl enable --now dcdiag

You can verify that this is successful with the following command:

# systemctl status dcdiag

Sample response to the command:

● dcdiag.service - Intel® Data Center Diagnostic Tool

Loaded: loaded (/usr/lib/systemd/system/dcdiag.service; enabled; vendor preset: disabled)

Active: active (running) since Fri 2021-02-19 11:24:17 MST;

Docs: file:///usr/share/doc/dcdiag/README.rst

Main PID: 8777 (dcdiag)

CGroup: /system.slice/dcdiag.service

└─8777 /usr/bin/dcdiag --service
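
For scripted checks, parsing the human-readable status output is fragile. A minimal sketch using the standard systemctl is-enabled and is-active subcommands follows; the function name service_ready and its messages are hypothetical, and the unit name dcdiag is taken from the commands above.

```shell
#!/bin/sh
# Report whether a systemd unit is both enabled at boot and currently running.
# Returns 0 when ready, 1 when not enabled, 2 when enabled but not running.
service_ready() {
    unit=${1:-dcdiag}
    if ! systemctl is-enabled --quiet "$unit"; then
        echo "$unit: not enabled"
        return 1
    fi
    if ! systemctl is-active --quiet "$unit"; then
        echo "$unit: enabled but not running"
        return 2
    fi
    echo "$unit: enabled and running"
}
```

Run service_ready dcdiag after the enable command; any non-zero return indicates that the background service is not in the expected state.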

If the Intel® Data Center Diagnostic Tool detects any errors while it executes in the background, it logs them to the system log. You can also ask the tool whether the background scan detected any errors by using the --query argument.

# dcdiag --query
Intel® Data Center Diagnostic Tool Version 506
Test completed successfully. No issues detected.
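
The query can be wrapped for periodic monitoring. The sketch below matches on the success message shown above rather than on the exit status of dcdiag --query, because the exit status is not described in this article; the function names are hypothetical, and the script assumes dcdiag is on PATH and is run with the privileges it needs.

```shell
#!/bin/sh
# Return 0 when the text on stdin contains the success message shown above.
query_is_clean() {
    grep -q 'No issues detected'
}

# Run the background-scan query and print a one-line summary.
# Assumption: dcdiag is on PATH; only the message text is inspected.
check_background_scan() {
    if dcdiag --query 2>&1 | query_is_clean; then
        echo "dcdiag: no issues detected"
    else
        echo "dcdiag: review the system log" >&2
        return 1
    fi
}
```

A scheduler entry could call check_background_scan and raise an alert on a non-zero return.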

The background execution mode is currently not supported for the Windows* version of the tool.

The tool can also be run manually in the foreground by executing one of the following at a Linux* or Windows* command prompt:

# dcdiag

>"C:\Program Files\Intel\Data Center Diagnostic Tool\dcdiag.exe"

Note that on Windows*, the Intel® Data Center Diagnostic Tool installer does not automatically update the system’s PATH variable, so a full absolute or relative path is needed to start the tool.

The manual test runs for about 45 minutes and has high CPU utilization.

When the diagnostic completes, the system returns one of the following messages:

  • Test completed successfully. No issues detected.
  • Test completed successfully. One or more machine check errors occurred. Please check the system logs.
  • This processor is not supported by this version of the tool.

    Check the system's processor model and version. This message appears if the Intel Data Center Diagnostic Tool does not detect a production version of the supported processors. Engineering samples are not supported by this tool.

    Find help in identifying the processor.
  • Test completed. Results are inconclusive due to an outdated version of the microcode.

    The latest version of the microcode addresses known issues. Please update. On Linux*, microcode updates are usually delivered by your distribution vendor alongside security fixes and other firmware updates for various components. If your system does not have these updates enabled, we recommend that you enable them. The Linux* kernel loads the microcode automatically on every boot, and you can reload it at runtime by running the following command as root:

    echo 1 > /sys/devices/system/cpu/microcode/reload

    On Windows*, microcode updates are delivered through the standard Windows* Update channels. If your system does not have these updates enabled, we recommend that you enable them.

  • Test completed. Results are inconclusive due to the system exceeding temperature limits.

    The system is not providing enough cooling for the CPU to operate within its required temperature limits. We recommend that you check that the required cooling is operating correctly. Possible causes include faulty fans, incorrect airflow, or some other environmental issue.
     
  • Test completed. Results are inconclusive, one or more machine check errors occurred.

    Check system logs.
     
  • Test failed. Contact your system manufacturer or processor vendor for support.
     
  • If the test results show a failure, check whether your server node's processors are still under warranty:
    • If you have a Boxed Intel® Xeon® Processor still under its 3-year warranty, contact Intel Customer Support for assistance.
    • If you have a tray processor, contact your system or processor vendor or place of purchase to check whether the processor is still under warranty.

    Note: Tray processors are sold directly to system manufacturers or Intel authorized distributors. Intel does not provide direct warranty to end users for tray processors unless they come preinstalled in Intel® Data Center Blocks (Intel® DCB) server systems. Except for Intel DCB systems, the tray processor’s warranty is from the vendor or place of purchase of the processor, or of the system if the processor was preinstalled. Intel recommends purchasing from Intel Authorized Distributors, Intel Approved Suppliers, and resellers of Intel® products.

  • Be aware that Intel does not have an out-of-warranty replacement program.
  • Test failed.
    Test completed, and an error was detected on the physical processor containing /sys/devices/system/cpu/cpuXX.
    Contact your system manufacturer or processor vendor for support.
  • Test failed.
    Test is unable to determine which physical processor caused the failure.
    Contact your system manufacturer or processor vendor for support.
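
When the tool is run across many servers, the final message can be mapped to a coarse category for reporting. The following is a sketch only: the function name classify_result and the category labels are hypothetical, and it matches the message texts listed above rather than exit codes, which this article does not describe.

```shell
#!/bin/sh
# Map dcdiag's final message (read from stdin) to a coarse category.
# The patterns are the message texts listed in this article. Order matters:
# the inconclusive message that mentions machine check errors must be
# matched before the "successful with machine check errors" message.
classify_result() {
    out=$(cat)
    case $out in
        *"not supported by this version"*)  echo unsupported ;;
        *"Test failed"*)                    echo fail ;;
        *"Results are inconclusive"*)       echo inconclusive ;;
        *"machine check errors occurred"*)  echo pass-with-mce ;;
        *"No issues detected"*)             echo pass ;;
        *)                                  echo unknown ;;
    esac
}
```

One possible use is dcdiag 2>&1 | tee dcdiag.log | classify_result, where dcdiag.log is an arbitrary file name chosen to keep the full output for later review.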

*Other names and brands may be claimed as the property of others.

Version History

Date            Version  Description
July 07, 2021   540      Initial version
Aug 16, 2022    549      Bug fix
Sept 20, 2022   549      Command changed to enable the tool and verify the enabling
Jan 10, 2023    550      Include 4th gen Intel® Xeon® Processors
June 29, 2023   576      Version 576 release

 

Related topics
Intel® Xeon® Support Central Website
Warranty Guide for Intel® Processors
Intel® Data Center Diagnostic Tool for Intel® Xeon® Processors for Windows*