Extending Intel® Cluster Checker
Intel® Cluster Checker is extensible, so functionality can be added and configured easily. This document uses a single end-to-end example to illustrate how to extend Intel® Cluster Checker. In the example, the fictional Waterfowl Industries has developed the Duck Diagnostic Tool. This program comprehensively evaluates nodes using its trade secret methodology and rates them on its patented quack scale. A node is rated between 1 and 5 quacks, 5 being best. If there is an error during the evaluation, the node prints honk.
Collect Extensions
Intel® Cluster Checker provides two methods of collecting data: pdsh and Intel® MPI Library. By default, the tool uses pdsh for data collection. For more information about using collect extensions, see the Data Collection chapter.
Data Providers
Intel® Cluster Checker uses data providers to collect data from the system. For more information about data providers, see the Data Providers chapter.
Analyzer Extensions
Intel® Cluster Checker analyzer extensions bridge the gap between the database and the knowledge base. Conceptually, an analyzer extension functions as follows:
Read the data from the database.
Transform the data. For example, extract the relevant information from the raw, unstructured database content via regular expressions.
Create CLIPS instances using the transformed data.
Extensions, in the form of shared libraries, plug into the analyzer framework to perform these functions. Typically, there is one analyzer extension per data provider/CLIPS class, but this is not always the case.
The interfaces described in this chapter are located in /opt/intel/clck/2019.x/include/analyzer or /opt/intel/oneapi/clck/<version>/include/analyzer.
Analyzer Extension Format
Extensions are implemented through the Extension and Transform classes. Conceptually, the purpose of the Transform class is to read in data from a source, process the data, and then send the data to an output. The Extension class is specialized to use the Intel® Cluster Checker SQLite or ODBC database sources as the input and create CLIPS instances as the output.
Analyzer performs the following actions on each extension:
Load the extension shared library using dlopen().
Call the constructor of the extension.
Run the data input method parse().
Call the destructor of the extension.
Unload the shared library using dlclose().
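The order of these steps can be illustrated with a small stand-in that logs each one. This is only a sketch: DemoExtension and run_lifecycle are hypothetical names, and the dlopen() and dlclose() steps are merely logged because they need a real shared library on disk.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an analyzer extension; the real classes come
// from the Intel(R) Cluster Checker analyzer headers.
class DemoExtension {
public:
    explicit DemoExtension(std::vector<std::string>& log) : log_(log) {
        log_.push_back("constructor");
    }
    ~DemoExtension() { log_.push_back("destructor"); }
    void parse() { log_.push_back("parse"); }

private:
    std::vector<std::string>& log_;
};

// Mirrors the order of the five steps. Steps 1 and 5 are only logged here,
// not performed, because no shared library is involved in this sketch.
std::vector<std::string> run_lifecycle() {
    std::vector<std::string> log;
    log.push_back("dlopen");       // step 1: load the shared library
    {
        DemoExtension ext(log);    // step 2: constructor
        ext.parse();               // step 3: data input method
    }                              // step 4: destructor runs at scope exit
    log.push_back("dlclose");      // step 5: unload the shared library
    return log;
}
```

The inner scope makes the point that the extension object is destroyed before the library is unloaded, so an extension should not expect any of its state to outlive a single run.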
Transform Class Member Methods
Custom analyzer extensions require a custom parse function. All other functions in this section are provided within Intel® Cluster Checker for use within a custom parse function.
parse()
Pulls data from the input source (that is, the database), transforms it, and calls route(). parse() performs the data manipulation and extrapolation specific to each extension. It is a virtual function that serves as a placeholder for parsing raw data from the database into a format that can then be routed. This is the key function to write when creating custom analyzer extensions.
set_header()
Defines the CLIPS slots to be populated. The order should match the order used in route().
set_header({"node_id", "timestamp", "count", "sound"});
set_name()
Sets the internal name of the extension. This name should match the name of the shared library and is also used in the framework definition analyzer_extension tag to configure the extension to be loaded.
set_name("duck");
route()
Sends the data to the output sink; that is, creates a CLIPS instance. The order should match the order used in set_header().
route({rows[i].hostname, rows[i].timestamp, variable1, "quack"});
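The way these methods fit together can be sketched with a stand-in base class. TransformSketch, Row, and the output field are hypothetical and exist only so the example compiles on its own; a real extension would derive from the classes declared in the analyzer headers, whose exact signatures may differ.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Transform base class; it only records what
// it is given so that the calling pattern can be shown.
class TransformSketch {
public:
    virtual ~TransformSketch() = default;
    virtual void parse() = 0;

    std::string name;
    std::vector<std::string> header;
    std::vector<std::vector<std::string>> routed;  // what route() received

protected:
    void set_name(const std::string& n) { name = n; }
    void set_header(const std::vector<std::string>& h) { header = h; }
    void route(const std::vector<std::string>& values) { routed.push_back(values); }
};

// Hypothetical row layout; "output" is an assumed field for the raw rating.
struct Row {
    std::string hostname;
    std::string timestamp;
    std::string output;
};

class Duck : public TransformSketch {
public:
    explicit Duck(std::vector<Row> rows) : rows_(std::move(rows)) {
        set_name("duck");
        set_header({"node_id", "timestamp", "count", "sound"});
    }

    void parse() override {
        for (const Row& row : rows_) {
            // Values are passed in the same order as the set_header() slots.
            route({row.hostname, row.timestamp, row.output, "quack"});
        }
    }

private:
    std::vector<Row> rows_;
};
```

Keeping the set_header() list and the route() argument list next to each other in the source makes it easier to notice when the two orders drift apart.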
Transform Class Member Variables
void* clips_env
Pointer to the CLIPS knowledge base environment.
Extension Class Summary
The Extension class inherits from the Transform class and provides additional functionality and class variables. For example, it provides class member variables to store database data and functions to format parsed data for routing.
Custom Extensions for Framework Definitions
Framework Definitions accept native or custom extensions as long as they are specified as follows:
<configuration>
  <framework_definition>
    <analyzer_extension>
      <group>
        <entry>all_to_all</entry>
        <entry>cpu</entry>
        <entry>duck</entry>
      </group>
    </analyzer_extension>
  </framework_definition>
</configuration>
In the previous example, all_to_all and cpu are native extensions, while duck is a user-defined extension. All of these extensions must be located in the same folder, because only one extension path can be specified per Framework Definition. If no path is specified, the default location is assumed: /opt/intel/clck/2019.x/analyzer/intel64/cpp or /opt/intel/oneapi/clck/<version>/analyzer/intel64/cpp.
Database Interface
The database base class is a general interface for reading data from the database and currently supports SQLite and ODBC. The database class also allows multiple database sources to be configured for analysis through the configuration file. Queries are run against the configured data sources, and the data is taken from the first available database. The following wrapper methods are provided for accessing the database and are defined in /opt/intel/clck/2019.x/include/datastore/datastore.h or /opt/intel/oneapi/clck/<version>/include/datastore/datastore.h. These methods query the database view clck_1. If the database is provided by the user, the clck_1 view must be created manually (see the Database Schema section in the Reference for the database view).
bool select_provider_data(std::vector<std::string> providers, Rows& rows, std::string where_clause, bool mark_as_baseline);
The database rows resulting from the query are appended to the vector of rows provided by the caller in the second argument. When this function is called, an SQL query of the following form is constructed and executed once for each provider name in the providers vector. (Note: The argument mark_as_baseline in all the database methods is experimental and should not be used.)
SELECT * FROM clck_1 WHERE provider=<provider_name> AND <where_clause>
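The per-provider loop can be pictured with the following sketch. build_provider_queries is a hypothetical helper, not part of the datastore API; quoting the provider name as an SQL string literal and dropping the AND part for an empty where clause are both assumptions.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds one query per provider, following the form
// shown above. It does not escape its inputs, so it is for illustration
// only and not a substitute for the datastore wrapper methods.
std::vector<std::string> build_provider_queries(
        const std::vector<std::string>& providers,
        const std::string& where_clause) {
    std::vector<std::string> queries;
    for (const std::string& provider : providers) {
        std::string query =
            "SELECT * FROM clck_1 WHERE provider='" + provider + "'";
        if (!where_clause.empty()) {
            // Assumption: an empty where clause adds no extra condition.
            query += " AND " + where_clause;
        }
        queries.push_back(query);
    }
    return queries;
}
```

Calling the helper with the providers duck and cpu and the where clause Hostname='node1' yields two queries, one per provider, each restricted to that host.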
A more general select method is also available.
bool select_data(const std::string query, Rows& rows, const std::map<std::string, int>& columns, bool mark_as_baseline);
As before, the database rows resulting from the query are appended to the vector of rows provided by the caller in the second argument. The difference is that the first argument may be any valid SQL SELECT query. Since not all database columns may be returned by the query, the third argument is a map of column names and their order in the SELECT query. The wrapper method selects the data from the first available database.
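One way to keep the SELECT list and the column map consistent is to generate both from the same list of column names, as in the sketch below. make_column_map and make_select are hypothetical helpers, and the use of zero-based positions is an assumption; check datastore.h for the numbering that select_data() expects.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: maps each selected column name to its position in
// the SELECT list. Zero-based positions are an assumption.
std::map<std::string, int> make_column_map(
        const std::vector<std::string>& columns) {
    std::map<std::string, int> positions;
    for (std::size_t i = 0; i < columns.size(); ++i) {
        positions[columns[i]] = static_cast<int>(i);
    }
    return positions;
}

// Hypothetical helper: builds the matching SELECT statement from the same
// column list so that the query and the map cannot disagree.
std::string make_select(const std::vector<std::string>& columns,
                        const std::string& where_clause) {
    std::string query = "SELECT ";
    for (std::size_t i = 0; i < columns.size(); ++i) {
        if (i > 0) {
            query += ", ";
        }
        query += columns[i];
    }
    query += " FROM clck_1 WHERE " + where_clause;
    return query;
}
```

The resulting string and map would then be passed as the first and third arguments of select_data().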
For example, the following selects the latest rows for each node corresponding to the duck provider (see the Database Schema section in the Reference for the database view).
clck::database::select_data("SELECT * FROM clck_1 a INNER JOIN (SELECT Hostname, MAX(Unique_timestamp) AS Unique_timestamp, Provider FROM clck_1 WHERE Provider='duck' GROUP BY Hostname, Provider) b ON a.Provider=b.Provider AND a.Unique_timestamp=b.Unique_timestamp AND a.Hostname=b.Hostname", rows);
A nearly equivalent set of data can be obtained using the following function.
bool get_latest_rows_provider_data(const std::vector<std::string>& providers, Rows& rows, std::string hostname, bool mark_as_baseline);
Unlike the general select_data() function, get_latest_rows_provider_data() populates all the database columns rather than just the specified subset.
Knowledge Base and CLIPS Interface
Analyzer makes use of the CLIPS C API for interacting with the knowledge base (see the official documentation for more details).
Creating CLIPS Class Instances
Analyzer extensions populate the knowledge base by creating CLIPS instances. The format of the data expected by the knowledge base (that is, the CLIPS slots) is defined by the corresponding knowledge base class.
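To make the relationship between slots and values concrete, the sketch below formats the text of a CLIPS make-instance command from a list of slot names and a list of values. make_instance_command is a hypothetical helper; this document does not describe how the framework creates instances internally, so treat this purely as an illustration of the slot/value pairing.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: pairs each slot name with its value and formats a
// CLIPS make-instance command. Values must already be valid CLIPS literals
// (for example, strings wrapped in double quotes).
std::string make_instance_command(const std::string& class_name,
                                  const std::vector<std::string>& slots,
                                  const std::vector<std::string>& values) {
    if (slots.size() != values.size()) {
        throw std::invalid_argument("slot/value count mismatch");
    }
    std::string command = "(make-instance of " + class_name;
    for (std::size_t i = 0; i < slots.size(); ++i) {
        command += " (" + slots[i] + " " + values[i] + ")";
    }
    command += ")";
    return command;
}
```

Because the pairing is purely positional, a value list in the wrong order produces a syntactically valid instance with values in the wrong slots, which is why the header and value orders must agree.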
Parsing Database Output
Once the data is read from the database, it is available for processing. Any method available to C++, such as regular expressions, can be used to filter and transform the data into the format expected by the knowledge base.
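As an illustration, the sketch below uses std::regex to extract a rating from the output of the Duck Diagnostic Tool. The output format ("<n> quacks" or "honk") is an assumption made for this example, and DuckReading and parse_duck_output are hypothetical names.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for one parsed provider output.
struct DuckReading {
    bool valid;         // false if the text matched neither pattern
    int count;          // 1..5 for a rating, 0 for honk
    std::string sound;  // "quack" or "honk"
};

// Assumed output format: "<n> quack" or "<n> quacks" with n in 1..5,
// or the single word "honk" on error. Surrounding whitespace is ignored.
DuckReading parse_duck_output(const std::string& raw) {
    static const std::regex rating(R"(\s*([1-5])\s+quacks?\s*)");
    static const std::regex error(R"(\s*honk\s*)");
    std::smatch match;
    if (std::regex_match(raw, match, rating)) {
        return {true, std::stoi(match[1].str()), "quack"};
    }
    if (std::regex_match(raw, error)) {
        return {true, 0, "honk"};
    }
    return {false, 0, ""};
}
```

Using std::regex_match rather than std::regex_search requires the whole string to match, so trailing garbage in the database row is rejected instead of silently ignored.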
Handling Parse Errors
Parse errors can occur when an analyzer extension reads unexpected or invalid data from the database. If the error is critical to the operation of the entire extension, it is appropriate to log an error and throw an exception. For non-critical errors, the parser should log a warning message, ignore the offending row in the database, and continue processing the rest of the rows.
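The two cases can be sketched as follows. RawRow, parse_count, and parse_counts are hypothetical; std::cerr stands in for whatever logging facility the extension uses, and treating an empty result set as critical is a choice made only for this example.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical row layout holding the raw rating text for one node.
struct RawRow {
    std::string hostname;
    std::string output;
};

// Returns true and sets count only if text is a whole number from 1 to 5.
bool parse_count(const std::string& text, int& count) {
    try {
        std::size_t used = 0;
        const int value = std::stoi(text, &used);
        if (used != text.size() || value < 1 || value > 5) {
            return false;
        }
        count = value;
        return true;
    } catch (const std::logic_error&) {
        // std::stoi throws invalid_argument or out_of_range on bad input.
        return false;
    }
}

std::vector<int> parse_counts(const std::vector<RawRow>& rows) {
    if (rows.empty()) {
        // Critical in this sketch: nothing to analyze, so log and throw.
        std::cerr << "error: duck: no rows to analyze\n";
        throw std::runtime_error("duck: no rows to analyze");
    }
    std::vector<int> counts;
    for (const RawRow& row : rows) {
        int count = 0;
        if (!parse_count(row.output, count)) {
            // Non-critical: warn, skip the offending row, and continue.
            std::cerr << "warning: duck: skipping unexpected output from "
                      << row.hostname << '\n';
            continue;
        }
        counts.push_back(count);
    }
    return counts;
}
```

With rows for three nodes whose outputs are "3", "oops", and "5", the function logs one warning and returns the two valid counts.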
Building Analyzer Extensions
Extensions are shared libraries and need to be built as such.
Sample extensions and a sample Makefile are available in the SDK Duck Sample*.
*Note: The duck sample does not work with Intel® Cluster Checker 2019 or Intel® Cluster Checker 2021.
GCC* 4.9 or later is required to build extensions. The Intel® C++ Compiler 15.0 or later may also be used, but GCC* 4.9 or later is still required.
Intel® Cluster Checker uses features from C++11, therefore the command line option -std=c++11 is required to build analyzer extensions.
Loading Extensions
To load an analyzer extension, add it to a custom framework definition using the following XML tags:
<configuration>
  <framework_definition name="customFWD">
    <analyzer_extension>
      <group>
        <entry>custom_extension</entry>
      </group>
    </analyzer_extension>
  </framework_definition>
</configuration>
The basename of the extension should match the internal extension name assigned by set_name(). This name is the value that should be added to the list of analyzer extensions.
Example
A complete, fully functional analyzer extension that transforms the output of the duck provider into instances of the DUCK CLIPS class is located in the SDK Duck Sample*.
Knowledge Base
The knowledge base uses CLIPS rules to produce signs and diagnoses based on collected data. It is the framework through which Intel® Cluster Checker comes to conclusions about a system and thereby produces analysis. The knowledge base is documented in full in the Knowledge Base chapter.
Postprocessor Extensions
Postprocessor extensions present the results of analysis in a readable format. By default, Intel® Cluster Checker runs the summary postprocessor extension followed by the CLCK output log postprocessor extension. Postprocessor extensions can be specified in the configuration file using the following format:
<configuration>
  <postprocessor>
    <postproc_extensions>
      <group>
        <entry>summary</entry>
        <entry>clck_output_log</entry>
      </group>
    </postproc_extensions>
  </postprocessor>
</configuration>
They can also be included in an individual framework definition in a similar manner. For more information about customizing Framework Definitions, see the Framework Definitions chapter. The following postprocessor extensions are available:
Summary
The summary postprocessor extension displays a brief summary of the analysis results to the screen. This extension runs by default and can be specified with the entry tag using the string “summary”, as shown above.
CLCK Output Log
The CLCK output log postprocessor extension writes full analysis details to a log file. This extension runs by default and can be specified with the entry tag using the string “clck_output_log”, as shown above.