Installation Guide for the Intel® Ethernet Fabric Suite Software

000059662

09/09/2021

Download Intel® Ethernet Fabric Suite Basic Package or Intel® Ethernet Fabric Suite FS Package.

This article provides instructions and information for getting started with the Intel® Ethernet Fabric Suite (Intel® EFS) software installation.

This section provides information and procedures to install the Intel® Ethernet Fabric Suite software on the Management Node or on a host node in the fabric.

You can install the software using one of the following methods:

  • TUI menus (recommended)
  • CLI commands
  • Linux* Distribution Software packages provided by Intel

It is recommended that you install the Intel EFS software on the Management Node using the Install TUI, and then use FastFabric to configure the Management Node.

Note

Without proper configuration of the Management Node, some tools or applications may not work. For example, MPI applications may require password-less SSH, and some FastFabric functions depend on proper SNMP setup. After installing the Intel® Ethernet Fabric Suite (Intel® EFS) software, it is crucial to configure the Management Node with the FastFabric TUI or CLI commands.

After the Management Node has been configured, the Basic software can be installed on all the remaining hosts using either the FastFabric TUI or a provisioning or diskless boot mechanism.

Note

If you are using a provisioning system, refer to the documentation that comes with the provisioning system.

Before starting the installation, perform the following:

  • For the list of compatible operating systems and required OS RPMs, refer to this article (Installation Prerequisites).
  • For the pre-installation requirements, see this article.
  • If you want GPUDirect* support, follow the Download and Install NVIDIA* Software (Optional) section in the Software Install Guide, including exporting NVIDIA_GPU_DIRECT.
  • Download and extract the Intel® EFS software package.
  • If you are using a customized installation via the Install CLI command, prepare your command-line options.
  • Gather the list of IP addresses and netmasks for each interface you are going to set up.
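
The last checklist item can be scripted. The following is a minimal sketch, assuming the iproute2 ip utility is available on the node. The function name parse_ipv4 is hypothetical and is not part of the Intel® EFS software. It reads the one-line-per-address output of ip -o -4 addr show and prints each interface with its address and prefix length (the netmask in CIDR form).

```shell
# Hypothetical helper, not part of the Intel EFS software.
# Prints "<interface> <address>/<prefix>" for every IPv4 address found in
# the output of `ip -o -4 addr show`, skipping the loopback interface.
parse_ipv4() {
    awk '$3 == "inet" && $2 != "lo" { print $2, $4 }'
}

# Typical use on a node (requires the iproute2 `ip` command):
#   ip -o -4 addr show | parse_ipv4
```

Keeping the parser separate from the ip call makes it possible to check the logic against saved output from another node.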

Using the INSTALL command-line options

The ./INSTALL command for the Basic and FS installations is issued from the following directories:

  • Intel Basic directory: IntelEth-Basic.DISTRO.VERSION
  • Intel FS directory: IntelEth-FS.DISTRO.VERSION

Syntax
./INSTALL [-v|-vv] -R osver [-a|-n|-U|-u|-s|-O|-N|-i comp|-e comp] [-G] [-E comp] [-D comp] [--user-space] [--without-depcheck] [--rebuild] [--force] [--answer keyword=value]
or
./INSTALL -C
or
./INSTALL -V

Options

  • No option selected: Displays the Intel® EFS Software TUI.
  • -v: Provides verbose logging. Logs to the /var/log/iefs.log file.
  • -vv: Provides very verbose debug logging. Logs to the /var/log/iefs.log file.
  • -R osver: Forces installation for a specific OS kernel version, rather than the running kernel.
  • -a: Installs all Upper Layer Protocols (ULP) and drivers with the default options.
  • -n: Installs all ULPs and drivers with the default options, but does not change the autostart options.
  • -U: Upgrades/reinstalls all presently installed ULPs and drivers with the default options, and does not change the autostart options.
  • -u: Uninstalls all ULPs and drivers with the default options.
  • -s: Enables autostart for all installed software.
  • -O: Keeps the current modified rpm configuration file.
  • -N: Uses a new default rpm configuration file.
  • -i comp: Installs the given component with the default options. This option can appear multiple times on a command line.
Important

Using this command to upgrade or downgrade an individual component against the existing FS will update all previously installed components to the version of the individual component being installed.

  • -e comp: Uninstalls the given component with the default options. This option can appear multiple times on a command line.
  • -E comp: Enables autostart of given component. This option can appear with -D or multiple times on a command line.
Note: To control which installed software is configured for autostart, combine this option with -a, -n, -i, -e, and -U options.
  • -D comp: Disables autostart of the given component. This option can appear with -E or multiple times on a command line.
Note: To control which installed software is configured for autostart, combine this option with -a, -n, -i, -e, and -U options.
  • --user-space: Skips kernel space components during installation.
  • --without-depcheck: Disables the check of OS dependencies.
  • --rebuild: Forces a rebuild of kernel module srpms.
  • --force: Forces the installation, even if the distributions do not match. Use of this option can result in undefined behaviors.
  • --answer keyword=value: Provides an answer to a question which might occur during the operation. Answers to questions that are not asked are ignored. Invalid answers result in prompting for interactive installations or use of the default for non-interactive installations.

Possible questions

ARPTABLE_TUNING Adjust kernel ARP table size for large fabrics
ROCE_ON RoCE RDMA transport
LIMITS_SEL Resource Limits Selector

  • -C: Shows the list of supported component names.
  • -V: Outputs the version number of the software.
  • -G: Installs GPU support components.

Other information

  • Supported Component (comp) Names: eth_tools, psm3, fastfabric, eth_rdma, openmpi_gcc_ofi, mpisrc, delta_debug
  • Supported Component (comp) Name Aliases: eth, mpi, psm_mpi
  • Components for -G (GPU) Installations:
    For RHEL*: iefs-kernel-updates-devel, iefs-kernel-updates-dkms, kmod-iefs-kernel-updates, iefs-kernel-updates-debuginfo, libpsm3-fi, libpsm3-fi-debuginfo, openmpi_gcc_cuda_ofi
    For SLES*: iefs-kernel-updates-devel, iefs-kernel-updates-kmp-default, iefs-kernel-updates-dkms, libpsm3-fi, openmpi_gcc_cuda_ofi
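
As an illustration of how the -i option repeats per component, the following minimal sketch assembles an ./INSTALL command line without running it. The function name build_install_cmd is hypothetical. The component names come from the supported list above, and the sketch does not check them against the output of ./INSTALL -C.

```shell
# Hypothetical helper: build (but do not run) an ./INSTALL command line
# that installs the listed components with the default options.
# Each component gets its own -i flag, because -i may appear multiple
# times on a command line.
build_install_cmd() {
    cmd="./INSTALL"
    for comp in "$@"; do
        cmd="$cmd -i $comp"
    done
    printf '%s\n' "$cmd"
}

# Example: print the command for two components from the supported list.
build_install_cmd eth_tools psm3
```

Called with no arguments, the helper prints ./INSTALL alone, which, as described above, displays the TUI rather than performing an unattended installation.
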
Install using the TUI menus

You can install both the Intel® Ethernet Fabric Suite Software Basic and FS software packages using the Intel® EFS Software menu. Before you begin, have the list of IPv4 addresses and netmasks ready for each interface you are going to set up.

Perform the following steps to install the Intel® EFS Software.

Caution

Do not interrupt an operation mid-process. Some operations may take a few minutes to complete.

 

1. Task: At the command prompt, change directory to the location of the installation software package.
   Action:
     • For Basic, type the following and press Enter: cd IntelEth-Basic.DISTRO.VERSION
     • For FS, type the following and press Enter: cd IntelEth-FS.DISTRO.VERSION
     where DISTRO.VERSION is the distribution and CPU.
2. Task: At the command prompt, start the install script.
   Action: Type ./INSTALL and press Enter.
   Note:
     • To install FS with GPU support, use ./INSTALL -G.
     • To install FS with a different root directory, use chroot.
     • When the kernel version in the chroot environment is different from the host's kernel version, use ./INSTALL -R to force the FS installation with the target OS kernel version.
3. Task: Select 1) Install/Uninstall Software.
   Action: Type 1.
4. Task: Review the items to be installed.
   Action: Accept the defaults (no action required). Type N to go to the next page.
   Note: If you need to change any item, enter the alphanumeric character associated with the item to toggle between Install or Don't Install.
5. Task: Start the installation.
   Action: Type P to perform the actions.
   Note: This may take a few minutes.
6. Prompt: Preparing OFA VERSION release for Install... Rebuild OFA SRPMs (a=all, p=prompt per SRPM, n=only as needed?) [n]:
   Action: Press Enter to accept the default.
   Note: The system will display prompts that require your response throughout the installation.
7. Prompt: For each system prompt...
   Action: Accept the defaults by pressing Enter to continue.
   Note: Some of the default processes may take a few minutes to complete.
8. Task: When the Intel® EFS Autostart Menu displays, review the items.
   Action: Intel recommends leaving all of the Autostart selections set to the default values.
   Note: If you need to change any item, enter the alphanumeric character associated with the item to toggle between Enable or Disable.
9. Task: Run the Intel® EFS Autostart operations.
   Action: Type P.
10. Prompt: For each system prompt, "Hit any key to continue..."
   Action: Press any key.
   Note: When the installation completes, you are returned to the main menu.
11. Task: Exit out of the TUI to the command prompt.
   Action: Type X.
12. Task: Reboot the server.
   Action: Type reboot and press Enter.
   Note: Do not interrupt the reboot process. Depending on your operating system, the reboot may take a few minutes.
13. Task: Verify the installation was successful.
   Action: Type iefsconfig -V and press Enter.

Install using CLI commands

You can install both the Intel® Ethernet Fabric Suite Software Basic and FS software packages using the ./INSTALL command.

The ./INSTALL command has many options, including installing single components and enabling or disabling autostart of components. This section provides instructions for the default installation, but you can append specific options to the install command for a more customized installation.

Before you begin, have the list of IPv4 addresses and netmasks ready for each interface you are going to set up.

Perform the following steps to install the default Intel® EFS Software configuration:

1. Task: At the command prompt, change directory to the location of the installation software package.
   Action:
     • For Basic, type the following and press Enter: cd IntelEth-Basic.DISTRO.VERSION
     • For FS, type the following and press Enter: cd IntelEth-FS.DISTRO.VERSION
     where DISTRO.VERSION is the distribution and CPU.
2. Task: At the command prompt, start the install script.
   Action: Type ./INSTALL -n and press Enter.
   Note:
     • To install FS with GPU support, use ./INSTALL -n -G.
     • To install FS with a different root directory, use chroot.
     • When the kernel version in the chroot environment is different from the host's kernel version, use ./INSTALL -R to force the FS installation with the target OS kernel version.
3. Task: At the command prompt, reboot the server.
   Action: Type reboot and press Enter.

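
The default CLI installation can be summarized as a short command sequence. The following is a minimal sketch that only prints the commands and does not run them. The function name default_install_cmds and its arguments are hypothetical. DISTRO.VERSION stays a placeholder, exactly as in the steps of this section, and the kernel version passed to -R must be supplied by you.

```shell
# Hypothetical helper: print the commands for a default, non-interactive
# installation. Nothing is executed.
#   $1  package type: basic or fs
#   $2  "gpu" to add -G; empty or anything else for no GPU support
#   $3  optional kernel version passed to -R (for chroot installs)
default_install_cmds() {
    case "${1:-}" in
        basic) dir="IntelEth-Basic.DISTRO.VERSION" ;;
        fs)    dir="IntelEth-FS.DISTRO.VERSION" ;;
        *)     echo "unknown package type: ${1:-}" >&2; return 1 ;;
    esac
    flags="-n"
    if [ "${2:-}" = "gpu" ]; then
        flags="$flags -G"
    fi
    if [ -n "${3:-}" ]; then
        flags="$flags -R $3"
    fi
    printf 'cd %s\n./INSTALL %s\nreboot\n' "$dir" "$flags"
}

# Example: default FS installation with GPU support.
default_install_cmds fs gpu
```

Printing the sequence first gives you a chance to review the flags before anything is installed or rebooted.
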
Install using Linux* Distribution software packages provided by Intel

The Intel® Ethernet Fabric Suite (Intel® EFS) software FS package contains the OS-specific repository for installing the Intel® EFS software.

This section provides the instructions for installing using the FS package repository.

Intel introduced virtual packages to facilitate FS installation.

  • A virtual package prefixed with ethmeta_ is a meta-package for an FS component in the INSTALL script. Installing a meta-package will install the corresponding component.
  • A virtual package prefixed with ethnode_ is an alias package for a typical FS installation on an HPC node.

Default installation options

This installation method will install Intel® EFS packages with default options. To install with different options, set the following system environment variables in advance of the installation.

  • Variable Name: ETH_ARPTABLE_TUNING
    Values:
    1 - Enable adjust kernel ARP table for large fabric (default)
    0 - Disable adjust kernel ARP table for large fabric
  • Variable Name: ETH_ROCE_ON
    Values:
    1 - Enable RoCE on supported NICs (default)
    0 - Disable RoCE on supported NICs
  • Variable Name: ETH_LIMITS_CONF
    Values:
    1 - Enable adjusted Memory Limit configuration (default)
    0 - Disable adjusted Memory Limit configuration
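
A minimal sketch of setting these variables before a repository-based installation follows. The function name efs_env_exports is hypothetical. It validates that each value is 0 or 1 and prints the matching export lines. It assumes the variables only need to be present in the environment of the shell that runs the package manager, which you should confirm for your setup (for example, sudo may not pass them through).

```shell
# Hypothetical helper: print export lines for the three installation
# variables. Each argument must be 0 (disable) or 1 (enable, the default).
#   $1 ETH_ARPTABLE_TUNING   $2 ETH_ROCE_ON   $3 ETH_LIMITS_CONF
efs_env_exports() {
    for v in "${1:-}" "${2:-}" "${3:-}"; do
        case "$v" in
            0|1) ;;
            *) echo "value must be 0 or 1: '$v'" >&2; return 1 ;;
        esac
    done
    printf 'export ETH_ARPTABLE_TUNING=%s\n' "$1"
    printf 'export ETH_ROCE_ON=%s\n' "$2"
    printf 'export ETH_LIMITS_CONF=%s\n' "$3"
}

# Example: keep ARP table tuning and limits, but disable RoCE.
efs_env_exports 1 0 1
```

To apply the settings in the current shell, the printed lines can be evaluated, for example with eval "$(efs_env_exports 1 0 1)".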

Repositories included in the Intel EFS package

The Intel EFS package contains the following repositories:
  • IEFS_PKGS: Contains all software that needs to be installed on a compute node, management node, or service node (such as a storage node).
  • IEFS_PKGS_CUDA: Contains all software that needs to be installed on a node that includes NVIDIA* cards.

Note: These two repositories cannot coexist on any node. Ensure that only one exists or is enabled.
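
The coexistence rule can be checked before installing. The following is a minimal sketch, assuming you can produce a list of enabled repository IDs, one per line, with your package manager, and that the IDs match the repository names listed above (verify this on your system). The function name check_repo_conflict is hypothetical.

```shell
# Hypothetical helper: read repository IDs (one per line) on stdin and
# fail if both the standard and the CUDA repository are present.
check_repo_conflict() {
    ids=$(cat)
    if printf '%s\n' "$ids" | grep -qx 'IEFS_PKGS' &&
       printf '%s\n' "$ids" | grep -qx 'IEFS_PKGS_CUDA'; then
        echo "conflict: IEFS_PKGS and IEFS_PKGS_CUDA are both present" >&2
        return 1
    fi
    echo "ok"
}

# Example with a literal list of IDs; prints "ok".
printf 'IEFS_PKGS\nbaseos\n' | check_repo_conflict
```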

Interoperating with the INSTALL script

Intel recommends that you do not mix yum/zypper repository-based install with script-based install. Doing so may cause unexpected behaviors. However, you can switch from one install mechanism to another.

  • Switching from script-based install to yum/zypper-based install:
    You can switch to the yum/zypper-based install at any time. No special actions are required. If Intel® EFS is already partially or fully installed with the script, the yum/zypper-based install identifies the installed packages and skips them during installation.
  • Switching from yum/zypper-based install to script-based install:
    The meta and alias packages of the yum/zypper-based install introduce extra dependencies on Intel® EFS packages. This could impact the script-based install because the script directly uses the rpm command for installation, which is sensitive to package dependencies. The meta and alias packages must therefore be removed first, and the INSTALL script handles this: using INSTALL -a, -U, or -n switches to script-based installation, and INSTALL -u removes all packages, including the meta and alias packages. Alternatively, you can manually remove the meta and alias packages with the yum/zypper command prior to starting the script-based install.
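
Before switching to the script-based install manually, it helps to see which meta and alias packages are present. The following minimal sketch filters a list of package names by the ethmeta_ and ethnode_ prefixes described earlier. The function name list_meta_alias_pkgs is hypothetical. Removing the listed packages is left to your yum or zypper workflow.

```shell
# Hypothetical helper: read installed package names (one per line) on
# stdin and print only the meta (ethmeta_*) and alias (ethnode_*) packages.
# `|| true` keeps the exit status zero when nothing matches.
list_meta_alias_pkgs() {
    grep -E '^(ethmeta_|ethnode_)' || true
}

# Typical use on an RPM-based node:
#   rpm -qa --qf '%{NAME}\n' | list_meta_alias_pkgs
```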

Repository deployment into the environment

The IntelEth-FS.<OS_VERSION>-x86_64.<VERSION>.tgz tar package contains the repository used to install the Intel® EFS software. It also includes a helper script called ethcreaterepo that checks and rebuilds kernel rpms, creates the local repository, and recommends packages to install on each compute, management, and service node.

Intel recommends using this script to create the local repository. It ensures that the kernel rpms have the correct version and that the proper repository is created for the GPU support requirement.

For example, if an ETH_PKGS_CUDA repository already exists and you want to replace it with a repository for ETH_PKGS, the script will back up and remove the ETH_PKGS_CUDA to prevent the GPU version packages from installing unintentionally.

After a local repository has been successfully created, you can transfer it to an enterprise repository, based on organization needs, to allow sharing it among nodes. The following shows the usage information for ethcreaterepo:
Usage:
ethcreaterepo [-G]
ethcreaterepo -i
ethcreaterepo --help
Create a local repo for Intel® Ethernet Fabric Suite packages.
Options:
-G create a repo with GPU Direct support (to install it must have NVidia driver installed)
-i display information about the repo it will create
--help produce full help text
Examples:
ethcreaterepo
ethcreaterepo -G
ethcreaterepo -i

After the script executes successfully, it will list the packages for installation. The example below shows the output for RHEL:

Repo IntelEth-FS was successfully created.
Please use the following component metapackages to install Intel Ethernet software
ethmeta_eth_tools : Intel Ethernet Meta Package for Eth Tools
ethmeta_fastfabric : Intel Ethernet Meta Package for FastFabric
ethmeta_mpisrc : Intel Ethernet Meta Package for MPI Source
ethmeta_openmpi_gcc_ofi : Intel Ethernet Meta Package for OpenMPI (ofi,gcc)
ethmeta_openmpi_gcc_ofi_dkms : Intel Ethernet Meta Package for OpenMPI (ofi,gcc) (DKMS version)
ethmeta_openmpi_gcc_ofi_userspace : Intel Ethernet Meta Package for OpenMPI (ofi,gcc) (user space only)
ethmeta_psm3 : Intel Ethernet Meta Package for PSM3
ethmeta_psm3_dkms : Intel Ethernet Meta Package for PSM3 (DKMS version)
ethmeta_psm3_userspace : Intel Ethernet Meta Package for PSM3 (user space only)
To facilitate installation, Intel provides the following aliases for common component combinations:
ethnode_mgmt : Useful for management node. Includes all components.
ethnode_mgmt_userspace : Useful for container. Same as eth_mgmt except it’s using user space version components.
ethnode_mgmt_dkms : DKMS version ethnode_mgmt. Requires DKMS pre-installed.
ethnode_compute : Useful for compute and login node. Includes all components except management (fastfabric)
ethnode_compute_userspace : Useful for container. Same as eth_compute except it’s using user space version components.
ethnode_compute_dkms : DKMS version ethnode_compute. Requires DKMS pre-installed.
ethnode_service : Useful for service node. Includes all components except fastfabric and mpi components.
ethnode_service_userspace : Useful for container. Same as eth_service except it’s using user space version components.
ethnode_service_dkms : DKMS version ethnode_service. Requires DKMS pre-installed.
Please run iefsconfig to config Intel Ethernet software after finished the installation
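
The alias names in this output follow a regular pattern: ethnode_, then the node role, then an optional _userspace or _dkms suffix. The following minimal sketch maps a role and variant to the alias name. The function name node_alias is hypothetical, and the result should still be checked against the list that ethcreaterepo prints on your system.

```shell
# Hypothetical helper: print the alias package for a node role and an
# optional variant.
#   $1  role: mgmt, compute, or service
#   $2  variant: empty (default), userspace, or dkms
node_alias() {
    case "${1:-}" in
        mgmt|compute|service) ;;
        *) echo "unknown role: ${1:-}" >&2; return 1 ;;
    esac
    case "${2:-}" in
        "")             printf 'ethnode_%s\n' "$1" ;;
        userspace|dkms) printf 'ethnode_%s_%s\n' "$1" "$2" ;;
        *) echo "unknown variant: $2" >&2; return 1 ;;
    esac
}

# Example: alias for a management node with DKMS pre-installed.
node_alias mgmt dkms
```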

Using IEFS Repository on Linux* OS

You can install the Intel® Ethernet Fabric Suite Software packages on Red Hat* Enterprise Linux* (RHEL*) or SUSE* Linux* Enterprise Server (SLES*) using the OS-specific repository included in the IEFS package, and its dependencies. Before you begin, have the list of IPv4 addresses and netmasks ready for each interface you are going to set up, and have your software packages ready for installation.

Perform the following steps to install the default Intel® Ethernet Fabric Suite Software configuration:

Set Up the IEFS Repository

1. Task: Create the local repository.
   Action: At the command prompt, type: ethcreaterepo.
2. Task: Create the local repository on nodes that need GPU support.
   Note: If you install GPU-supported packages on nodes without NVIDIA* cards, you may see performance degradation.
   Action: At the command prompt, type: ethcreaterepo -G.
   Note: After execution, the recommended install commands are provided.
3. Task: On each node, install the Intel® EFS Software. For a list of packages in specific Intel® EFS components, refer to Intel® EFS Software Components to Packages Mapping.
   Action: Type yum install <alias> under RHEL, or zypper install <alias> under SLES, where <alias> is the recommended alias package (based on node type).
   Note: Alternatively, you can insert the install command (based on node type) in a provision script.

Configure RDMA

4. Task: At the command prompt, start iefsconfig.
   Action: Type iefsconfig.
5. Task: Select 2) Reconfigure Eth RDMA.
   Action: Type 2.
6. Prompt: Enable RoCE RDMA transport (ROCE_ON)? [y]:
   Action: Press Enter.
7. Prompt: Resource Limits Selector (0-7) [5]:
   Action: Press Enter or type another number depending on fabric size and applications run on the fabric.
8. Task: For each interface, configure MTU and willing mode Priority Flow Control.
   Prompts and actions:
     • Configure interface <dev> now? [y]: Type y to configure an interface.
     • MTU value [9000]: Type the desired MTU value.
     • Flow Control config, recommend willing mode Priority Flow Control... Turn off Link Level Flow Control? [y]: Press Enter to turn off Link Level Flow Control.
     • Turn on firmware DCB? [y]: Press Enter to turn on firmware DCB.
9. Task: Reboot the server.
   Action: Type reboot and press Enter.
   Note: Do not interrupt the reboot process. Depending on your operating system, the reboot may take a few minutes.
10. Task: Verify that the installation was successful.
   Action: Type iefsconfig -V and press Enter.

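
The package installation step differs only in the package manager used per distribution. The following minimal sketch prints the matching command without running it. The function name repo_install_cmd is hypothetical. The distribution argument uses the values rhel and sles, which are also the ID values these distributions report in /etc/os-release.

```shell
# Hypothetical helper: print the install command for an alias package on
# a given distribution. Nothing is executed.
#   $1  distribution: rhel or sles
#   $2  alias package, for example ethnode_compute
repo_install_cmd() {
    if [ -z "${2:-}" ]; then
        echo "missing alias package" >&2
        return 1
    fi
    case "${1:-}" in
        rhel) printf 'yum install %s\n' "$2" ;;
        sles) printf 'zypper install %s\n' "$2" ;;
        *) echo "unsupported distribution: ${1:-}" >&2; return 1 ;;
    esac
}

# Example: command for a compute node on SLES.
repo_install_cmd sles ethnode_compute
```

A provisioning script could call such a helper with the node type it already knows, which keeps the alias choice in one place.
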
Install kernel module with DKMS

With Dynamic Kernel Module Support (DKMS) installed on your system, you can install the Intel® EFS kernel module with DKMS support so that when a kernel update occurs, you do not need to reinstall Intel® Ethernet Fabric Suite software. The DKMS framework will automatically rebuild the kernel module during the kernel update.

IMPORTANT

The kernel module rebuild may not work when you update to a new, major OS version. In this case, you must download the corresponding Intel EFS software and reinstall it.

Prerequisites

Install DKMS prior to performing the steps below.

Note

DKMS is not available in Linux* distributions. You will need to download or install it yourself. For example, you can install it from the following locations:

Follow the instructions described in Install Using the TUI Menus or Install Using CLI Commands to install Intel® Ethernet Fabric Suite software. When the install script detects DKMS, it will install the DKMS version of the packages. To install using the IEFS repository, follow the instructions described in Install Using Linux* Distribution Software Packages and choose the DKMS version of the package to install.
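
When installing from the IEFS repository, you choose the DKMS variant yourself, so a quick check for DKMS on the node can help. The following is a minimal sketch, assuming that the presence of a dkms command on the PATH is a sufficient indicator. How the INSTALL script itself detects DKMS is not described here and may differ. The function name dkms_variant is hypothetical.

```shell
# Hypothetical helper: print "dkms" when a dkms command is available,
# otherwise print "default".
dkms_variant() {
    if command -v dkms >/dev/null 2>&1; then
        echo "dkms"
    else
        echo "default"
    fi
}

# Example: report which package variant fits this node.
dkms_variant
```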