Intel® Cluster Checker 2019 Update 9 for Linux* - Release Notes
-------------------------------------------------------------------------------

CONTENTS
--------
1. OVERVIEW
2. NEW FEATURES
3. SYSTEM REQUIREMENTS
4. WHERE TO FIND THE RELEASE
5. INSTALLATION NOTES
6. DOCUMENTATION
7. KNOWN LIMITATIONS AND TROUBLESHOOTING
8. TECHNICAL SUPPORT
9. DISCLAIMER AND LEGAL INFORMATION

-------------------------------------------------------------------------------
1. OVERVIEW
-------------------------------------------------------------------------------
Intel® Cluster Checker verifies the configuration and performance of
Linux*-based clusters and checks the cluster's compliance with the Intel®
Select Solutions for Simulation and Modeling.

-------------------------------------------------------------------------------
1.1. RELATED PRODUCTS AND SERVICES
-------------------------------------------------------------------------------
Information about Intel® software development products is available at
http://www.intel.com/software/products.

These are some of the products related to Intel® Cluster Checker:

o The Intel® C++ and Fortran Compilers include advanced optimization and
  multithreading capabilities, highly optimized performance libraries, and
  analysis tools for creating fast, reliable multithreaded applications.
  http://www.intel.com/software/products/compilers

o The Intel® MPI Library for Linux*, the Intel® Trace Analyzer and Collector
  for Linux*, and the Intel® Math Kernel Library Cluster Edition for Linux*
  are the most awarded development tools. They help create, analyze, and
  optimize high-performance applications on clusters of Intel® processor-based
  systems.
  http://www.intel.com/software/products

-------------------------------------------------------------------------------
2. NEW FEATURES
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
2.1 WHAT'S NEW IN VERSION 2019 Update 9.0
-------------------------------------------------------------------------------
- Further enhancements and improvements to the console output and reports
  were made, including the following:
  - Additional details are provided when using the -n / --include-node
    analyzer filter parameter.
  - The output formatting reverts to that of previous versions when the new
    beta feature for node group analysis (-g / --groupfile) is not activated.
  - When using the new beta feature for node group analysis filtering, the
    groups are ordered alphabetically. Numerics are treated as string values,
    and generated node groups are printed last.
- Further improvements to the node group feature output were made, including
  the following:
  - Better debugging and error messages for configuration setup.
  - Removed superfluous spacing.
  - Drew more attention to the line listing any runtime issues.
  - Removed duplicated information in node group specific output and added
    descriptors on the node group sections.
  - Replaced the example node group configuration file with a better example.
- The '-h' help flag output has been revised.
  - The '-n' flag has a changed description in the help output. Provided
    examples in the man pages.
- Better handling of imb and osu providers.
- Added a configurable timeout to collect extensions.
  - The timeout can be set as an environment variable or in the main
    configuration file.
  - Also documented the default timeout value in the configuration file.
- Added linking via RPATH:
  - Distributed common libraries moved to a separate 'lib_common' folder.
- Update SQLite to version to fix:
  - CVE-2020-11655 and CVE-2020-11656
  - CVE-2020-13434 and CVE-2020-13435
  - CVE-2020-13630, CVE-2020-13631 and CVE-2020-13632

-------------------------------------------------------------------------------
2.2 OLDER VERSIONS
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
2.2.1 Version 2019 Update 8.0
-------------------------------------------------------------------------------
- Extensive changes made to the formatting of the output to enhance
  readability and parsing of the analysis.
  - Grouping of issues by type: functionality, performance, uniformity
  - Summary report on the number of each type of issue
  - Separation of Cluster Checker execution issues from system environmental
    issues
  - Detailed analysis and recommendations are provided in an analysis log
    file
- An option to redirect the generated reports from Cluster Checker into a
  JSON formatted file to enable separate reporting and analysis by user tools
  - To enable, include clck_json as part of the block of the cluster checker
    configuration XML file - default file /etc/clck.xml
- Extended the collection and analysis capabilities of Intel® Cluster Checker
  to include the OSU Micro-Benchmark Suite (must be downloaded and installed
  separately: http://mvapich.cse.ohio-state.edu/benchmarks/). The OSU
  Collectives Benchmark Suite includes point-to-point, blocking and
  non-blocking collectives benchmark tests that verify the functionality of
  the MPI functions.
- Introduction of new Intel® MPI blocking and non-blocking collectives
  benchmark tests for the verification of functionality of Intel® MPI. The
  results are collected and analyzed.
- Introduction of a new beta feature to perform analysis on specific node
  groups defined by the user or system administrator in a nodegroup file.
  This beta feature only applies to the analysis of the nodes, not the
  collection of data on the nodes. An example node group file is available in
  /etc/example_group_file.xml. The documentation provides further details on
  this new beta feature.
- CVE fixes for libxml2
  - CVE-2020-7595 and CVE-2019-20388

-------------------------------------------------------------------------------
2.2.2 Version 2019 Update 7.0
-------------------------------------------------------------------------------
- Improved error messaging covering:
  - when no high speed fabrics are detected for MPI
  - incorrect detection during analysis for select solutions priv plus
  - time out of collecting information using mpi.so collector
- Fix to correctly identify Intel Python 3.7.4
- Common Vulnerabilities and Exposures (CVE) for SQLite3: Updated SQLite3
  package to sqlite3.31.0 to address a group of CVEs identified in late 2019.

-------------------------------------------------------------------------------
2.2.3 Version 2019 Update 6.0
-------------------------------------------------------------------------------
- Added environment modules support for Intel® Cluster Checker.
  - The Environment Module file is found in /clck/2019.7/env/modulefile/
  - The command:
        module use /clck/2019.7/env/modulefile/
    will add clck to your module environment
  - module av will show what modules are available to be loaded via
    module load
- Added patch to address SQLite CVE-2019-9937.
- Improved clarity and response output for certain checks.
- Improved handling and analysis of data from multiple databases. Analysis is
  performed on the most recent data collected when there exist multiple
  collections of the same data types.

-------------------------------------------------------------------------------
2.2.4 Version 2019 Update 5.0
-------------------------------------------------------------------------------
- New default tests with faster execution.
- New predefined in-depth test sets made for user or admin specific analysis.
- Enhanced summary output lists brief facts on nodes and issues.
- Troubleshooting tests on prerequisites for Intel® MPI Library.
- Verifies the uniformity of the BIOS and management firmware settings for
  the Intel® Server Boards through Intel's System Configuration Utility
  (syscfg).
- Support for the latest Intel processors (Intel® Xeon® Platinum 9200
  Processor Family).
- Support for validation of Intel® Select Solutions for Simulation &
  Visualisation.
- New -t option to set data age threshold.
- Bug fixes and improvements, CVE updates.

-------------------------------------------------------------------------------
2.2.5 Version 2019 Update 4.0
-------------------------------------------------------------------------------
- Enhanced functionality for testing memory uniformity.
- Added flexibility on checks for memlock limits to InfiniBand and Intel®
  Omni-Path Architecture (Intel® OPA) checks.
- Improved support for diskless clusters.
- Improved messages and bug fixes.

-------------------------------------------------------------------------------
2.2.6 Version 2019 Update 3.5
-------------------------------------------------------------------------------
- Added support for checking of second-generation Intel® Xeon® Scalable
  Processors by privileged or non-privileged users.
- Updated support for validation of Intel® Select Solutions for Simulation
  and Modeling to include the second-generation Intel® Xeon® Scalable
  Processor solution.
- Added support for checking Intel® Optane(TM) DC Persistent Memory
  configurations and uniformity.
- Included support for second-generation Intel® Xeon® Scalable Processor with
  the Intel HPC Platform Specification.
- Added checking for the Intel® Parallel Studio XE 2019.0 runtimes.

-------------------------------------------------------------------------------
2.2.7 Version 2019 Update 2.1
-------------------------------------------------------------------------------
- Updated support for validation of Intel® Select Solutions for Simulation
  and Modeling.
- Added in support for Intel HPC Platform Specification 2018.0.
- Intel® Cluster Checker 2019 Update 2.1 includes functional and security
  updates. Users should update to the latest version.

-------------------------------------------------------------------------------
2.2.8 Version 2019 Update 2.0
-------------------------------------------------------------------------------
- Intel® Cluster Checker 2019 Update 2 includes functional and security
  updates. Users should update to the latest version.

-------------------------------------------------------------------------------
2.2.9 Version 2019 Gold
-------------------------------------------------------------------------------
- New 'clck' command simplifies execution with a single command.
- Added improved output messaging:
  - New compact summary output provided on screen.
  - Details of analysis provided in the output logfile.
  - Simplified scheme to assess issues as ‘CRITICAL’, ‘WARNING’, or
    ‘INFORMATIONAL’
- Added -R option for specifying where results are written.
- Changed -o option to specify where log output is written.
- Added performance threshold checking for Intel® Xeon Phi(TM) Processor x205
  Product Family.
- Added -X command line option, allowing a user to obtain a list of available
  framework definitions and their respective descriptions on data collected
  and analysis tests.
- Added the ability to mark two snapshots of a cluster state to identify
  changes. Currently supported with the following framework definitions:
  rpm_snapshot, hardware_snapshot, files_snapshot.
- Added a user option to collect any missing or old data before analysis.
- Added ability to collect data on a cluster that does not have pdsh if the
  Intel® MPI Library is installed.
- Added the ability to collect data without specifying a node file if nodes
  are allocated through SLURM.
- Added in support for validation of Intel® Select Solutions for Simulation
  and Modeling.
- New Intel® Cluster Checker API.

-------------------------------------------------------------------------------
2.3 OLDER RELEASE NOTES
-------------------------------------------------------------------------------
- The release notes for older versions of Cluster Checker can be found at:
  https://software.intel.com/en-us/articles/intel-cluster-checker-release-notes-and-new-features

-------------------------------------------------------------------------------
3. SYSTEM REQUIREMENTS
-------------------------------------------------------------------------------
The following sections describe hardware and software requirements.

-------------------------------------------------------------------------------
3.1. HARDWARE
-------------------------------------------------------------------------------
- Intel® Xeon® processor (Intel® 64 architecture)
- 1 GB of RAM recommended
- 160 MB of free hard disk space required for installation

-------------------------------------------------------------------------------
3.2. SOFTWARE
-------------------------------------------------------------------------------
Operating Systems:
- CentOS 7
- Red Hat* Enterprise Linux* 7
- SUSE* Linux* Enterprise Server 12
- Ubuntu* 16.04 or 17.04 (See Section 7 for known issues)

Runtimes:
- Intel® MPI Library

Note: While the full SDK versions of these components fulfill the
requirement, only the runtime library is required.

-------------------------------------------------------------------------------
4. WHERE TO FIND THE RELEASE
-------------------------------------------------------------------------------
Intel® Cluster Checker can be installed with Intel® Parallel Studio XE,
standalone via Intel® Registration Center:
https://registrationcenter.intel.com
or as a standalone package via the Intel® YUM repository.

See the Installation section of the User Guide for more information.

-------------------------------------------------------------------------------
5. INSTALLATION NOTES
-------------------------------------------------------------------------------
The default Intel® Cluster Checker install path is:
    /opt/intel/clck/2019.9

Intel® Cluster Checker is distributed as a standalone package. To install the
package, run the following commands:

    % tar -xzf -C /tmp
    % cd /tmp/
    % ./install.sh

Notes:
- Intel® Cluster Checker needs to be installed on all nodes. This can be
  accomplished either by installing into a shared directory or by installing
  a local copy on each node.
- To install a local copy on each node, repeat the package installation for
  each node.

-------------------------------------------------------------------------------
6. DOCUMENTATION
-------------------------------------------------------------------------------
This release of Intel® Cluster Checker includes the following documentation:

The Getting Started Guide walks through using Intel® Cluster Checker for the
first time.

The Intel® Cluster Checker User's Guide contains information about how to
use, configure, and extend Intel® Cluster Checker. The User's Guide describes
the basic usage models, contains information about specific configuration
options, explains how to embed Intel® Cluster Checker functionality into
other applications, shows how to add new checks to the tool, and demonstrates
how to modify existing checks.

The Intel® Cluster Checker API reference describes the API that may be used
to embed Intel® Cluster Checker functionality into other software programs.
The documentation can be found at:
https://software.intel.com/en-us/intel-cluster-checker-support/documentation

-------------------------------------------------------------------------------
7. KNOWN LIMITATIONS AND TROUBLESHOOTING
-------------------------------------------------------------------------------
The following is a list of known issues in this release.

- Data collection behavior and functionality

  o Intel MPI Benchmark (imb) providers will output false positives when run
    with MPICH.

  o imb non-blocking framework definitions are currently not behaving as
    desired on some systems.

  o When executing data collection as root, the following framework
    definitions can hang with no message, but may be terminated by a single
    Ctrl-C or by a provider timeout:
      mpi_multinode_functionality
      hpl_cluster_performance
      imb_allgather
      imb_allgatherv
      imb_allreduce
      imb_alltoall
      imb_barrier
      imb_bcast
      imb_benchmarks_blocking_collectives
      imb_benchmarks_non_blocking_collectives
      imb_gather
      imb_gatherv
      imb_iallgather
      imb_iallgatherv
      imb_iallreduce
      imb_ialltoall
      imb_ialltoallv
      imb_ibarrier
      imb_ibcast
      imb_igather
      imb_igatherv
      imb_ireduce
      imb_ireduce_scatter
      imb_iscatter
      imb_iscatterv
      imb_pingping
      imb_pingpong_fabric_performance
      imb_reduce
      imb_reduce_scatter
      imb_reduce_scatter_block
      imb_scatter
      imb_scatterv
      osu_allgather
      osu_allgatherv
      osu_allreduce
      osu_alltoall
      osu_alltoallv
      osu_barrier
      osu_bcast
      osu_benchmarks_blocking_collectives
      osu_benchmarks_non_blocking_collectives
      osu_benchmarks_point_to_point
      osu_bibw
      osu_bw
      osu_gather
      osu_gatherv
      osu_iallgather
      osu_iallgatherv
      osu_iallreduce
      osu_ialltoall
      osu_ialltoallv
      osu_ialltoallw
      osu_ibarrier
      osu_ibcast
      osu_igather
      osu_igatherv
      osu_ireduce
      osu_iscatter
      osu_iscatterv
      osu_latency
      osu_mbw_mr
      osu_reduce
      osu_reduce_scatter
      osu_scatter
      osu_scatterv
      (Ohio State University Micro Benchmarks) osu_*

  o When using the mpi.so collector option (not default) with Intel MPI
    Library 2019 update 3, there is a known issue which will cause data to
    not be collected. Please use an earlier or later version of Intel MPI
    Library.

  o Currently Cluster Checker will not collect data for the HPCG benchmarks
    correctly if Intel® MPI and MPICH (www.mpich.org) are both installed on
    the environment being tested. The test environment will look for and
    execute the Intel® MPI optimized binary for HPCG and thus reset the
    environment variables for MPICH. Discovery and handling of this
    limitation will be corrected in future versions of Intel® Cluster
    Checker.

  o The imb_pingpong_fabric_performance framework definition, when launched
    with an odd number of nodes through Slurm with MPI as the collector
    mechanism (mpi.so), will report no-data for the last server assigned to
    the Slurm job. As a workaround, use a nodefile to specifically test the
    last server, where ‘no-data’ was reported, with another server in the
    infrastructure.

  o The compute node hostname identified in the nodefile must match the
    hostname reported by either the uname or hostname utility on the compute
    node itself. Deviations in the hostnames, or use of fully qualified
    domain names in either the nodefile or the compute node, may produce
    inaccurate uniformity percentages and counts and be reported as a failure
    or warning by Cluster Checker.

  o To execute HPCG benchmarks (such as in the checks hpcg_single and
    hpcg_cluster) with the Intel® MPI Library and Intel® Math Kernel Library
    (Intel® MKL) runtimes in a non-standard install path, the libraries must
    be installed and exported in LD_LIBRARY_PATH on the system.

  o Use of the latest runtime libraries for Intel® MPI Library and Intel®
    Math Kernel Library is required to ensure compatibility with Intel®
    Cluster Checker.

  o If the temporary directory used during collection is located on a shared
    file system, the directory will not be deleted.

  o The ORCM plugin is a technical preview feature.

  o Databases located on NFS file systems mounted with the "nolock" option
    are not supported.
    Not all data from concurrent data collection instances per database will
    be written to the database and the database may become corrupted. A
    single data collector instance per database can usually be used
    successfully in this case.

  o The error "Error: disk I/O error" may be generated when accessing a
    database located on a Lustre file system. The Lustre file system must be
    mounted with the "-o flock" option.

  o The 'iozone' data provider does not execute correctly on diskless
    clusters.

  o If collecting data as root, the value of the CLCK_SHARED_TEMP_DIR
    environment variable must be set to the fully-qualified path of a
    directory accessible on all nodes.

  o When collecting data on Ubuntu*, if the installed "which" command does
    not support --skip-functions and --skip-alias, a few providers will need
    additional configuration and a few providers will not run successfully.
    The following providers must be configured with the absolute location of
    their binary:
      - cpuid
      - cpupower
      - dmesg
      - ibstat
      - lscpu
      - numactl
      - opahfirev
      - opasmaquery
    Refer to the Intel® Cluster Checker User Manual, Chapter 6 for details
    about specifying absolute binary paths for the above mentioned providers.

  o Intel® Cluster Checker uses the command "ldconfig -p" as well as the
    environment variable LD_LIBRARY_PATH to detect the presence of required
    libraries. In order for Intel® Cluster Checker to detect required
    libraries, they must be present in the LD_LIBRARY_PATH or the result of
    "ldconfig -p".
    (Applies to the Framework Definitions second-gen-xeon-sp_user,
    second-gen-xeon-sp_priv, intel_hpc_platform_compat-hpc-2018.0,
    intel_hpc_platform_sdvis-core-2018.0, and
    intel_hpc_platform_second-gen-xeon-sp-2019.0)

  o In order for Intel® Cluster Checker to detect the Intel® Distribution for
    Python*, it must be in the user’s PATH.
    (Applies to the Framework Definitions second-gen-xeon-sp_user,
    second-gen-xeon-sp_priv, intel_hpc_platform_compat-hpc-2018.0, and
    intel_hpc_platform_second-gen-xeon-sp-2019.0)

  o If Intel® Parallel Studio is sourced before the Intel® Distribution for
    Python* in the user's environment, Intel® Cluster Checker is unable to
    detect all the required libraries for Intel® MPI Library.
    (Applies to the Framework Definitions second-gen-xeon-sp_user,
    second-gen-xeon-sp_priv, intel_hpc_platform_compat-hpc-2018.0, and
    intel_hpc_platform_second-gen-xeon-sp-2019.0)

  o The detected version of Intel® MPI Library is used to determine whether
    Intel® Cluster Checker checks for Intel® Parallel Studio 2018 or 2019. If
    the Intel® MPI Library version does not match the version of the rest of
    Intel® Parallel Studio, the wrong set of libraries will be checked.
    (Applies to the Framework Definition
    intel_hpc_platform_compat-hpc-2018.0)

  o Intel® Cluster Checker can only detect the Intel® Fortran Compiler
    version with Intel® Parallel Studio 2017 or later.
    (Applies to the Framework Definitions second-gen-xeon-sp_user,
    second-gen-xeon-sp_priv, intel_hpc_platform_compat-hpc-2018.0, and
    intel_hpc_platform_second-gen-xeon-sp-2019.0)

  o In addition, there are limitations to validating Intel® Select Solutions
    compliance when running on Ubuntu. It is not recommended to use Intel®
    Cluster Checker for Intel® Select Solutions compliance when running on
    Ubuntu.

- Analysis behavior and functionality

  o Clusters containing dual port InfiniBand* adapters where the second port
    is unused should suppress the 'infiniband-port-physical-state-not-linkup'
    and 'infiniband-port-state-not-active' signs. See Chapter 4 of the User's
    Guide for more information on how to suppress signs.

  o When using the Linux* boot parameter isolcpus with an Intel® Xeon Phi(TM)
    processor using default MPI settings, MPI based applications may fail.
    If possible, change or remove the isolcpus Linux* boot parameter.
    If this is not possible and you are using the Intel® MPI Library, you can
    try setting I_MPI_PIN to off. Refer to the Intel® Cluster Checker
    reference manual for details on specifying environment variables for
    tests.

  o When run with the dgemm/dgemm_cpu_performance or
    stream/stream_memory_bandwidth_performance framework definition, the
    "stream-outlier" or "dgemm-data-is-substandard" sign may be observed,
    because the corresponding provider scripts may not yield the expected
    performance with SNC-2/SNC-4 cluster mode and Flat memory mode
    configurations for Intel® Xeon Phi(TM) processor. There may be an issue
    with the kernel itself (BZ#1479763), documented at
    https://access.redhat.com/errata/RHBA-2017:2581
    If there are no corresponding diagnoses, the signs may be suppressed.

  o The sign paraview-missing fires despite ParaView* being present on the
    system.
    (Applies to the Framework Definition
    intel_hpc_platform_sdvis-cluster-2018.0)

-------------------------------------------------------------------------------
8. TECHNICAL SUPPORT
-------------------------------------------------------------------------------
If you did not register Intel® Cluster Checker during installation, please do
so at the Intel® Software Development Products Registration Center at
http://registrationcenter.intel.com. Registration entitles you to free
technical support, product updates and upgrades for the duration of the
support term.

For information about how to find Technical Support, Product Updates, User
Forums, FAQs, tips and tricks, and other support information, please visit:
http://www.intel.com/software/products/support/

Note: If your distributor provides technical support for this product, please
contact them for support rather than Intel.

-------------------------------------------------------------------------------
9. DISCLAIMER AND LEGAL INFORMATION
-------------------------------------------------------------------------------
No license (express or implied, by estoppel or otherwise) to any intellectual
property rights is granted by this document.

Intel disclaims all express and implied warranties, including without
limitation, the implied warranties of merchantability, fitness for a
particular purpose, and non-infringement, as well as any warranty arising
from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in
development. All information provided here is subject to change without
notice. Contact your Intel representative to obtain the latest forecast,
schedule, specifications and roadmaps.

The products and services described may contain defects or errors known as
errata which may cause deviations from published specifications. Current
characterized errata are available on request.

Intel technologies’ features and benefits depend on system configuration and
may require enabled hardware, software or service activation. Learn more at
Intel.com, or from the OEM or retailer.

Copies of documents which have an order number and are referenced in this
document may be obtained by calling 1-800-548-4725 or by visiting
www.intel.com/design/literature.htm.

Intel, the Intel logo, Xeon, and Xeon Phi are trademarks of Intel Corporation
in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

© 2020 Intel Corporation.

Optimization Notice
-------------------
Intel's compilers may or may not optimize to the same degree for non-Intel
microprocessors for optimizations that are not unique to Intel
microprocessors. These optimizations include SSE2, SSE3, and SSSE3
instruction sets and other optimizations. Intel does not guarantee the
availability, functionality, or effectiveness of any optimization on
microprocessors not manufactured by Intel. Microprocessor-dependent
optimizations in this product are intended for use with Intel
microprocessors. Certain optimizations not specific to Intel
microarchitecture are reserved for Intel microprocessors. Please refer to the
applicable product User and Reference Guides for more information regarding
the specific instruction sets covered by this notice.

Notice revision #20110804