Extending Intel® Cluster Checker

Intel® Cluster Checker User Guide

ID 772051

Date 12/06/2022

Version 2021.7.2

Intel® Cluster Checker is extensible: you can add and configure functionality. This document uses a single end-to-end example to illustrate how to extend Intel® Cluster Checker. In the example, the fictional Waterfowl Industries has developed the Duck Diagnostic Tool. This program comprehensively evaluates nodes using its trade secret methodology and rates them on its patented quack scale. A node is rated between 1 and 5 quacks, 5 being best. If there is an error during the evaluation, the node prints honk.

Collect Extensions
Data Providers
Analyzer Extensions
Knowledge Base
Postprocessor Extensions

Collect Extensions

Intel® Cluster Checker provides two methods of collecting data - pdsh and Intel® MPI Library. By default, the tool uses pdsh for data collection. For more information about using collect extensions, see the Data Collection chapter.

Data Providers

Intel® Cluster Checker uses data providers to collect data from the system. For more information about data providers, see the Data Providers chapter.

Analyzer Extensions

Intel® Cluster Checker analyzer extensions bridge the gap between the database and the knowledge base. Conceptually, an analyzer extension functions as follows:

Read the data from the database.
Transform the data. For example, extract the relevant information from the raw, unstructured database content via regular expressions.
Create CLIPS instances using the transformed data.

Extensions, in the form of shared libraries, plug into the analyzer framework to perform these functions. Typically, there is one analyzer extension per data provider/CLIPS class, but this may not always be the case.

The interfaces described in this chapter are located in /opt/intel/clck/2019.x/include/analyzer or /opt/intel/oneapi/clck/<version>/include/analyzer.

Analyzer Extension Format

Extensions are implemented through the Extension and Transform classes. Conceptually, the purpose of the Transform class is to read in data from a source, process the data, and then send the data to an output. The Extension class is specialized to use the Intel® Cluster Checker SQLite or ODBC database sources as the input and create CLIPS instances as the output.

Analyzer performs the following actions on each extension:

Load the extension shared library using dlopen().
Call the constructor of the extension.
Run the data input method parse().
Call the destructor of the extension.
Unload the shared library using dlclose().

Transform Class Member Methods

Custom analyzer extensions require a custom parse function. All other functions in this section are provided within Intel® Cluster Checker for use within a custom parse function.

parse()

Pulls data from the input source (that is, the database), transforms it, and calls route(). parse() performs the data manipulation and extrapolation specific to each extension. It is a virtual function that serves as a placeholder for parsing raw data from the database into a format that can then be routed. This is the key function to write when creating custom analyzer extensions.

set_header()

Defines the CLIPS slots to be populated. The order should match the order used in route().

set_header({"node_id", "timestamp", "count", "sound"});

set_name()

Sets the internal name of the extension. This name should match the name of the shared library and is also used in the framework definition analyzer_extension tag to configure the extension to be loaded.

set_name("duck");

route()

Sends the data to the output sink; that is, creates a CLIPS instance. The order should match the order used in set_header().

route({rows[i].hostname, rows[i].timestamp, variable1, "quack"});
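The way these methods fit together can be sketched with a self-contained mock. MockTransform, MockRow, and DuckExtension below are hypothetical stand-ins written for this illustration, not the classes shipped in the analyzer headers; the real route() creates a CLIPS instance, whereas the mock only records the values it receives. The raw output format (a run of quack words) is also an assumption.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Transform base class so that this sketch
// compiles on its own. The real interface lives in the analyzer headers.
class MockTransform {
public:
    virtual ~MockTransform() = default;
    virtual void parse() = 0;  // the one method a custom extension must write

    std::string name;                              // set by set_name()
    std::vector<std::string> header;               // set by set_header()
    std::vector<std::vector<std::string>> routed;  // filled by route()

protected:
    void set_name(const std::string& n) { name = n; }
    void set_header(const std::vector<std::string>& h) { header = h; }
    // The mock records the values instead of creating a CLIPS instance.
    void route(const std::vector<std::string>& values) { routed.push_back(values); }
};

// Hypothetical row layout holding only the fields this sketch needs.
struct MockRow {
    std::string hostname;
    std::string timestamp;
    std::string stdout_text;  // assumed raw provider output, e.g. "quack quack"
};

class DuckExtension : public MockTransform {
public:
    explicit DuckExtension(std::vector<MockRow> input) : rows(std::move(input)) {
        set_name("duck");
        set_header({"node_id", "timestamp", "count", "sound"});
    }

    void parse() override {
        for (std::size_t i = 0; i < rows.size(); ++i) {
            const std::size_t count = count_quacks(rows[i].stdout_text);
            // Values are passed in the same order as the header slots.
            route({rows[i].hostname, rows[i].timestamp,
                   std::to_string(count), "quack"});
        }
    }

private:
    // Counts non-overlapping occurrences of the word "quack".
    static std::size_t count_quacks(const std::string& text) {
        std::size_t count = 0;
        for (std::size_t pos = text.find("quack"); pos != std::string::npos;
             pos = text.find("quack", pos + 5)) {
            ++count;
        }
        return count;
    }

    std::vector<MockRow> rows;
};
```

Constructing a DuckExtension from a vector of MockRow values and calling parse() leaves one recorded entry per row in routed, with the values in the same order as header.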

Transform Class Member Variables

void* clips_env

Pointer to the CLIPS knowledge base environment.

Extension Class Summary

The Extension class inherits from the Transform class and provides additional functionality and class variables. Examples include class member variables that store database data and functions that format parsed data for routing.

Custom Extensions for Framework Definitions

Framework Definitions accept native or custom extensions as long as they are specified as follows:

<configuration>
    <framework_definition>
        <analyzer_extension>
            <group>
                <entry>all_to_all</entry>
                <entry>cpu</entry>
                <entry>duck</entry>
            </group>
        </analyzer_extension>
    </framework_definition>
</configuration>

In the previous example, all_to_all and cpu are native extensions, while duck is a user-defined extension. All of these extensions must be located in the same folder, because only one extension path can be specified per Framework Definition. If no path is specified, the default location is assumed to be /opt/intel/clck/2019.x/analyzer/intel64/cpp or /opt/intel/oneapi/clck/<version>/analyzer/intel64/cpp.

Database Interface

The database base class is a general interface for reading data from the database and currently supports SQLite and ODBC. The database class also allows multiple database sources to be configured for analysis through the configuration file. Queries run over the configured data sources, and the data is selected from the first available database. The following wrapper methods are provided for accessing the database and are defined in /opt/intel/clck/2019.x/include/datastore/datastore.h or /opt/intel/oneapi/clck/<version>/include/datastore/datastore.h. These methods query the database view clck_1. If the database is provided by the user, the clck_1 view must be created manually (see the Database Schema section in the Reference for the database view).

bool select_provider_data(std::vector<std::string> providers,
  Rows& rows, std::string where_clause,
  bool mark_as_baseline);

The database rows resulting from the query are appended to the vector of rows provided by the caller in the second argument. When this function is called, an SQL query of the following form is constructed and executed over a loop of all the provider_names specified by the providers vector. (Note: The argument mark_as_baseline in all the database methods is experimental and not to be used.)

SELECT * FROM clck_1 WHERE provider=<provider_name>
  AND <where_clause>
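The per-provider loop can be pictured with the short sketch below. build_provider_queries is a hypothetical helper written for this illustration and is not part of the datastore interface. Quoting the provider name as a SQL string literal and omitting the AND part when the where clause is empty are both assumptions.

```cpp
#include <string>
#include <vector>

// Builds one query string per provider, following the documented form.
// Hypothetical helper: the quoting of the provider name and the handling
// of an empty where_clause are assumptions made for this sketch.
std::vector<std::string> build_provider_queries(
        const std::vector<std::string>& providers,
        const std::string& where_clause) {
    std::vector<std::string> queries;
    for (const std::string& provider : providers) {
        std::string query =
            "SELECT * FROM clck_1 WHERE provider='" + provider + "'";
        if (!where_clause.empty()) {
            query += " AND " + where_clause;
        }
        queries.push_back(query);
    }
    return queries;
}
```

The sketch does no escaping, so it is only suitable for provider names and clauses that are known in advance.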

A more general select method is also available.

bool select_data(const std::string query, Rows& rows,
  const std::map<std::string, int>& columns,
  bool mark_as_baseline);

As before, the database rows resulting from the query are appended to the vector of rows provided by the caller in the second argument. The difference is that the first argument may be any valid SQL SELECT query. Since not all database columns may be returned by the query, the third argument is a map of column names and their order in the SELECT query. The wrapper method selects the data from the first available database.
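One way to build the column map is to derive it from the ordered list of column names used in the SELECT clause, as in the sketch below. column_order is a hypothetical helper, and the zero-based positions are an assumption; check datastore.h for the numbering the wrapper actually expects.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Maps each selected column name to its position in the SELECT list.
// Assumption: positions are zero-based.
std::map<std::string, int> column_order(
        const std::vector<std::string>& selected) {
    std::map<std::string, int> columns;
    for (std::size_t i = 0; i < selected.size(); ++i) {
        columns[selected[i]] = static_cast<int>(i);
    }
    return columns;
}
```

For a query that begins with SELECT Hostname, Provider, Unique_timestamp, passing those three names in that order gives the map to supply as the third argument.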

For example, the following selects the latest row for each node corresponding to the duck provider (see the Database Schema section in the Reference for the database view).

clck::database::select_data(
   "SELECT * FROM clck_1 a INNER JOIN "
   "(SELECT Hostname, MAX(Unique_timestamp) AS Unique_timestamp, "
   "Provider FROM clck_1 WHERE Provider='duck' GROUP BY "
   "Hostname, Provider) b ON a.Provider=b.Provider AND "
   "a.Unique_timestamp=b.Unique_timestamp AND "
   "a.Hostname=b.Hostname", rows);

A nearly equivalent set of data can be obtained using the following function.

bool get_latest_rows_provider_data(const std::vector<std::string>& providers,
  Rows& rows, std::string hostname, bool mark_as_baseline);

Unlike the general select_data() function, get_latest_rows_provider_data() populates all the database columns rather than just the specified subset.
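To make the "latest row per node" selection concrete, the sketch below performs the same selection in memory. SampleRow and latest_rows_per_host are hypothetical names introduced for this illustration, and the integer timestamp type is an assumption; the real Rows type is defined in the datastore headers.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row holding only the columns used by the selection.
struct SampleRow {
    std::string hostname;
    std::string provider;
    long unique_timestamp;  // assumed to be an integer timestamp
    std::string output;
};

// Keeps, for each hostname, the row of the given provider with the largest
// timestamp. Results are ordered by hostname because std::map is sorted.
std::vector<SampleRow> latest_rows_per_host(const std::vector<SampleRow>& rows,
                                            const std::string& provider) {
    std::map<std::string, SampleRow> latest;
    for (const SampleRow& row : rows) {
        if (row.provider != provider) {
            continue;  // corresponds to WHERE Provider='duck'
        }
        std::map<std::string, SampleRow>::iterator it = latest.find(row.hostname);
        if (it == latest.end()) {
            latest.insert(std::make_pair(row.hostname, row));
        } else if (row.unique_timestamp > it->second.unique_timestamp) {
            it->second = row;  // corresponds to MAX(Unique_timestamp)
        }
    }
    std::vector<SampleRow> result;
    for (const std::pair<const std::string, SampleRow>& entry : latest) {
        result.push_back(entry.second);
    }
    return result;
}
```

In practice the database should do this work; the sketch is only meant to show which rows the query keeps.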

Knowledge Base and CLIPS Interface

Analyzer makes use of the CLIPS C API for interacting with the knowledge base (see the official documentation for more details).

Creating CLIPS Class Instances

Analyzer extensions populate the knowledge base by creating CLIPS instances. The format of the data expected by the knowledge base (that is, the CLIPS slots) is defined by the corresponding knowledge base class.

Parsing Database Output

Once the data is read from the database, it is available for processing. Any method available to C++, such as regular expressions, can be used to filter and transform the data into the format expected by the knowledge base.
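As an illustration of the regular expression approach, the sketch below uses the C++11 std::regex library to turn raw duck output into a rating. The output format is an assumption (a run of quack words, or the word honk on error), and parse_quack_rating is a hypothetical helper.

```cpp
#include <iterator>
#include <regex>
#include <string>

// Returns the number of "quack" words in the raw output, or -1 if the
// output contains "honk". The output format is assumed for this sketch.
int parse_quack_rating(const std::string& raw_output) {
    static const std::regex honk_re("\\bhonk\\b");
    static const std::regex quack_re("\\bquack\\b");

    if (std::regex_search(raw_output, honk_re)) {
        return -1;
    }
    // Iterate over every non-overlapping match and count them.
    std::sregex_iterator first(raw_output.begin(), raw_output.end(), quack_re);
    std::sregex_iterator last;
    return static_cast<int>(std::distance(first, last));
}
```

The returned count would then be converted to a string and passed to route() together with the hostname and timestamp of the row.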

Handling Parse Errors

Parse errors can occur when an analyzer extension reads unexpected or invalid data from the database. If the error is critical to the operation of the entire extension, it is appropriate to log an error and throw an exception. For non-critical errors, the parser should log a warning message, ignore the offending row in the database, and continue processing the rest of the rows.
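The two error paths might look like the following sketch. RawRow, Rating, and parse_ratings are hypothetical, writing to std::cerr stands in for whatever logging facility the extension uses, and treating a missing hostname as critical is an arbitrary choice made for illustration. For brevity, any output other than one to five quack words, including honk, is treated as an invalid row here; a real extension might instead route honk so that the knowledge base can flag it.

```cpp
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical input and output types for this sketch.
struct RawRow {
    std::string hostname;
    std::string output;
};

struct Rating {
    std::string hostname;
    int quacks;
};

// Returns the number of "quack" words, or -1 if the text is anything
// other than one to five of them.
int rating_or_invalid(const std::string& output) {
    std::istringstream words(output);
    std::string word;
    int count = 0;
    while (words >> word) {
        if (word != "quack") {
            return -1;
        }
        ++count;
    }
    return (count >= 1 && count <= 5) ? count : -1;
}

std::vector<Rating> parse_ratings(const std::vector<RawRow>& rows) {
    std::vector<Rating> ratings;
    for (const RawRow& row : rows) {
        if (row.hostname.empty()) {
            // Critical (by assumption): log an error and throw.
            std::cerr << "duck: error: row has no hostname" << std::endl;
            throw std::runtime_error("duck: row has no hostname");
        }
        const int quacks = rating_or_invalid(row.output);
        if (quacks < 0) {
            // Non-critical: log a warning, skip this row, keep going.
            std::cerr << "duck: warning: ignoring unexpected output from "
                      << row.hostname << std::endl;
            continue;
        }
        ratings.push_back(Rating{row.hostname, quacks});
    }
    return ratings;
}
```

Skipping a bad row instead of aborting means one misbehaving node does not hide the results of the others.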

Building Analyzer Extensions

Extensions are shared libraries and need to be built as such.

Sample extensions and a sample Makefile are available in the SDK Duck Sample*.

*Note: The duck sample does not work with Intel® Cluster Checker 2019 or Intel® Cluster Checker 2021.

GCC* 4.9 or later is required to build extensions. The Intel® C++ Compiler 15.0 or later may also be used, but GCC* 4.9 or later is still required.

Intel® Cluster Checker uses features from C++11, therefore the command line option -std=c++11 is required to build analyzer extensions.

Loading Extensions

To load an analyzer extension, add it to a custom framework definition using the following XML tags:

<configuration>
    <framework_definition name="customFWD">
        <analyzer_extension>
            <group>
                <entry>custom_extension</entry>
            </group>
        </analyzer_extension>
    </framework_definition>
</configuration>

The basename of the extension should match the internal extension name assigned by set_name(). This name is the value that should be added to the list of analyzer extensions.

Example

A complete, fully functional analyzer extension that transforms the output of the duck provider into instances of DUCK CLIPS class is located in the SDK Duck Sample*.

Knowledge Base

The knowledge base uses CLIPS rules to produce signs and diagnoses based on collected data. It is the framework through which Intel® Cluster Checker comes to conclusions about a system and thereby produces analysis. The knowledge base is documented in full in the Knowledge Base chapter.

Postprocessor Extensions

Postprocessor extensions format the results of analysis in a readable format. By default, Intel® Cluster Checker runs the summary postprocessor extension followed by the CLCK output log postprocessor extension. Postprocessor extensions can be specified in the config file using the following format:

<configuration>
    <postprocessor>
        <postproc_extensions>
            <group>
                <entry>summary</entry>
                <entry>clck_output_log</entry>
            </group>
        </postproc_extensions>
    </postprocessor>
</configuration>

They can also be included in an individual framework definition in a similar manner. For more information about customizing Framework Definitions, see the Framework Definitions chapter. The following postprocessor extensions are available:

Summary

The summary postprocessor extension displays a brief summary of the analysis results to the screen. This extension runs by default and can be specified with the entry tag using the string “summary”, as shown above.

CLCK Output Log

The CLCK output log postprocessor extension writes full analysis details to a log file. This extension runs by default and can be specified with the entry tag using the string “clck_output_log”, as shown above.
