Intel® AI Analytics Toolkit Installation and Getting Started Guide for Amazon Web Services (AWS)

ID 741227
Updated 4/25/2022
Version Latest
Public

Introduction

The Intel® AI Analytics Toolkit (AI Kit) provides data scientists, AI engineers, and researchers with optimized, familiar Python* tools and AI frameworks to accelerate end-to-end data science and analytics pipelines on Intel® architectures.

The AI Kit is part of the oneAPI toolkits. Its components are built using performance libraries of the Intel® oneAPI Base Toolkit, namely Intel® oneAPI Math Kernel Library (oneMKL), Intel® oneAPI Data Analytics Library (oneDAL), Intel® oneAPI Deep Neural Network Library (oneDNN), and Intel® oneAPI Collective Communications Library (oneCCL), for low-level compute optimizations.

This article covers the installation steps and helps you to get started with the Intel® AI Analytics Toolkit on Amazon Web Services (AWS). For system requirements and other details, please refer to the Intel® AI Analytics Toolkit. Release notes can be found in a dedicated article.

Components of the Intel® AI Analytics Toolkit

All the AI Kit components can be installed standalone without installing the full AI Kit. To install a particular component, follow the link provided below for that component.

The AI Kit includes:

Getting Started with AWS and Setting Up an AWS Linux* Instance

This article assumes you are familiar with the AWS environment. To learn more about working with AWS, see the Amazon Elastic Compute Cloud Documentation.

Specifically, this article assumes:

AWS Instances with Intel® Xeon® Processors

Two important Intel processor features are Intel® Advanced Vector Extensions 512 (Intel® AVX-512) and Intel® Deep Learning Boost (Intel® DL Boost).

To get the best out of Intel® processors with the AI Kit, we recommend AWS EC2 instances with Intel® DL Boost for your deep learning workloads, or at least instances with Intel® AVX-512 support. Other EC2 Intel® Xeon® processor instances (without Intel® AVX-512) can also be used with the AI Kit.

Intel® Xeon® Scalable processors with Intel® DL Boost significantly increase deep learning training and inference performance over previous-generation Intel® Xeon® Scalable processors (with FP32) by leveraging the new Vector Neural Network Instructions (VNNI), which support INT8/BF16 quantization and mixed-precision models. Workloads that benefit include image recognition/segmentation, object detection, speech recognition, language translation, recommendation systems, reinforcement learning, and others.

Intel® AVX-512 is the latest x86 vector instruction set, with up to two fused multiply-add units and other optimizations for applications that are floating-point (FP) intensive.

Based on these features and the Intel® Xeon® processor generation, the table lists AWS EC2 instances with an Intel® Xeon® processor. The codenames for Intel® Xeon® processors are as follows:

  • The codename for 1st gen Intel® Xeon® processors is Skylake (SKX).
  • The codename for 2nd gen Intel® Xeon® processors is Cascade Lake (CLX).
  • The codename for 3rd gen Intel® Xeon® processors is Ice Lake (ICX).

* These instances might launch another type of processor with the Intel® AVX-512 feature.
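Because some instance types can launch on more than one processor generation, it can help to confirm which features the running instance actually exposes. The following is a minimal sketch, assuming a Linux instance where /proc/cpuinfo is available. The function name cpu_features is hypothetical; the flag names (avx512f for the AVX-512 foundation instructions, avx512_vnni for VNNI, avx512_bf16 for BF16) are the ones the Linux kernel reports.

```shell
# Report which Intel vector and deep learning features appear in a
# cpuinfo-style file.
# Usage: cpu_features [file]   (defaults to /proc/cpuinfo)
cpu_features() {
    local file="${1:-/proc/cpuinfo}"
    local flag
    for flag in avx512f avx512_vnni avx512_bf16; do
        # -w matches the flag as a whole word, -q suppresses grep's output
        if grep -qw "$flag" "$file"; then
            echo "$flag: yes"
        else
            echo "$flag: no"
        fi
    done
}
```

Calling cpu_features with no argument on a running instance reads /proc/cpuinfo directly.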

Creating an Instance

  • Log in to your AWS account.
  • Navigate to the EC2 dashboard.
  • Click the Launch Instance button.
  • Step 1 – Choose AMI: Select the AMI type to launch as an instance. Suggested AMI: Latest Amazon Linux* or Ubuntu* AMI.
  • Step 2 – Choose an Instance Type: Use the default t2.micro instance type.

Note: you can move to any step using the dashboard steps bar on top.

  • Step 3 – Configure Instance: Choose your desired (or default) VPC and a Subnet. Click the Protect against accidental termination box.
  • Step 4 – Add Storage: Set the instance storage to a minimum of 20 GB.
  • Step 5 – Add Tags: Name the instance if you want to have a unique identifier (Example: Key = “Name”, Value = “AI Kit”).
  • Step 6 – Configure Security Group: You may use the default settings or restrict them to match your security needs. Important: Make sure SSH (TCP port 22) is accessible. Add your IP address to the inbound rules.
  • Step 7 – Review and Launch: Click on Launch to launch the instance.
  • Create Key-Pair dialog: Use any existing key-pair or select the option Create a new key pair and enter a key pair name (Example: “aikit”). Click on Download key pair. A key file (.pem) will be created. Save the file (here: aikit.pem).

Note: this is the only time you will be able to download the key.

  • Click the Launch Instances button: The dashboard will display the Launch Status. Click on the Instance ID to return to the Instance Dashboard. This process may take a few minutes before the instance is running.
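The same launch can also be scripted with the AWS CLI. The sketch below is one possible equivalent of the console steps, not the procedure this article prescribes. It assumes the AWS CLI is installed and configured with credentials, and that the key pair, security group, and subnet already exist. The function name launch_aikit_instance and all IDs passed to it are hypothetical placeholders, and the root device name /dev/xvda is an assumption that depends on the AMI.

```shell
# Launch one instance with the settings chosen in the console steps.
# Arguments (placeholders you must supply):
#   $1 AMI ID, $2 key pair name, $3 security group ID, $4 subnet ID
launch_aikit_instance() {
    local ami_id="$1" key_name="$2" sg_id="$3" subnet_id="$4"
    # --disable-api-termination corresponds to "Protect against accidental
    # termination"; the device name /dev/xvda is an assumption (AMI-dependent).
    aws ec2 run-instances \
        --image-id "$ami_id" \
        --instance-type t2.micro \
        --key-name "$key_name" \
        --security-group-ids "$sg_id" \
        --subnet-id "$subnet_id" \
        --disable-api-termination \
        --block-device-mappings 'DeviceName=/dev/xvda,Ebs={VolumeSize=20}' \
        --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value="AI Kit"}]' \
        --count 1
}
```

Adding --dry-run to the command lets you check permissions and parameters without actually starting an instance.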

Connecting to the AWS Instance

Prerequisites
  • Install an SSH client
    Your Linux* computer most likely includes an SSH client by default. You can check for an SSH client by typing ssh at the command line. If your computer does not recognize the command, the OpenSSH project provides a free implementation of the full suite of SSH tools. For more information, see the OpenSSH web page. For Windows*, you can use PowerShell* or download PuTTY.
     
  • Install the AWS CLI Tools
    (Optional) If you are using a public AMI from a third party, you can use the command-line tools to verify the fingerprint. For more information about installing the AWS CLI, please refer to Getting Set Up in the AWS Command Line Interface User Guide.
     
  • Get the public DNS name of the instance
    You can get the public DNS for your instance using the Amazon EC2 console (check the Public DNS (IPv4) column; if this column is hidden, choose the Show/Hide icon and select Public DNS (IPv4)). If you prefer, you can use the describe-instances (AWS CLI) or Get-EC2Instance (AWS Tools for Windows PowerShell) command.
     
  • Locate the private key
    Get the fully qualified path to the location on your computer of the .pem file for the key pair that you specified when you launched the instance.
     
  • Enable inbound SSH traffic from your IP address to your instance
    Ensure that the security group associated with your instance allows incoming SSH traffic from your IP address. For more information, see Authorizing Network Access to Your Instances.
    Important: Your default security group does not allow incoming SSH traffic by default.
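Two of these prerequisites can be handled from the command line. The sketch below assumes the AWS CLI is installed and configured. The function names get_public_dns and allow_ssh_from are hypothetical, and the instance ID, security group ID, and CIDR you pass in are your own values.

```shell
# Print the public DNS name of an instance.
# $1: instance ID
get_public_dns() {
    aws ec2 describe-instances \
        --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].PublicDnsName' \
        --output text
}

# Allow inbound SSH (TCP port 22) from a single address range.
# $1: security group ID, $2: CIDR such as 203.0.113.7/32
allow_ssh_from() {
    aws ec2 authorize-security-group-ingress \
        --group-id "$1" \
        --protocol tcp \
        --port 22 \
        --cidr "$2"
}
```

Using a /32 CIDR limits access to one IP address, which matches the advice to open SSH only to your own address.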

Connecting Using SSH

  • In a command-line shell, change directories to the location of the private key file that you created when you launched the instance.
  • Use the chmod command to make sure that your private key file is not publicly viewable. For example, if the name of your private key file is my-key-pair.pem, use the following command:
$ chmod 400 /path/my-key-pair.pem
  • Use the ssh command to connect to the instance. You specify the private key (.pem) file and the user_name@public_dns_name. For Amazon Linux*, the user name is ec2-user. For RHEL*, the user name is ec2-user or root. For Ubuntu*, the user name is ubuntu or root. For CentOS*, the user name is centos. For Fedora*, the user name is ec2-user. For SuSE*, the user name is ec2-user or root. Otherwise, if ec2-user and root do not work, check with your AMI provider.
$ ssh -i /path/my-key-pair.pem ec2-user@ec2-198-51-100-1.compute-1.amazonaws.com
[ec2-user@ip-172-31-10-114 ~]$ cat /etc/*release*
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
Amazon Linux release 2 (Karoo)
cpe:2.3:o:amazon:amazon_linux:2
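The user name rules above can be captured in a small helper. This is only an illustrative sketch: the function names default_ssh_user and ssh_command and the family labels (amazon-linux, rhel, and so on) are hypothetical. Only the first user name listed for each distribution is returned, and root is not tried.

```shell
# Map an AMI family to the default SSH user name listed in this section.
# Prints the name on stdout; returns non-zero for families not covered here.
default_ssh_user() {
    case "$1" in
        amazon-linux|rhel|fedora|suse) echo "ec2-user" ;;
        ubuntu)                        echo "ubuntu" ;;
        centos)                        echo "centos" ;;
        *)
            echo "unknown AMI family '$1': check with your AMI provider" >&2
            return 1
            ;;
    esac
}

# Build (but do not run) the ssh command for a key file, AMI family, and host.
ssh_command() {
    local key="$1" family="$2" host="$3" user
    user=$(default_ssh_user "$family") || return 1
    echo "ssh -i $key $user@$host"
}
```

For example, ssh_command /path/my-key-pair.pem ubuntu followed by the public DNS name prints the command to connect to an Ubuntu* instance as the ubuntu user.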

Intel® AI Analytics Toolkit Installation

Once your AWS instance is up and running, you can proceed with the product installation.

To download the product:

  • Open the Get the Intel® AI Analytics Toolkit link.
  • For Windows* OS, install the Intel® AI Analytics Toolkit using the Conda Package Manager Distribution.
  • For Linux* OS, the easiest way to install the Intel® AI Analytics Toolkit is to use the Online & Offline Distribution.
  • For an AWS Linux* instance, the Online Installer is preferred. You can choose which components of the AI Kit are installed.
  • Once you select Linux*, Online & Offline Distribution, and the Online Installer type, either click the download link and upload that file to the AWS instance, or run the following commands on the instance:
$ wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/18486/l_AIKit_p_2022.1.2.135.sh
$ sudo sh ./l_AIKit_p_2022.1.2.135.sh

Note: These commands might change when new versions of AI Kit are released.

  • Another easy way to get started with the Intel® AI Analytics Toolkit, and to take advantage of the Intel® optimizations for AI frameworks and machine learning libraries in a containerized way, is to use its Docker image from Intel’s public repository on Docker Hub. To install Docker on an AWS EC2 instance, check this link.
## Pull the AI Kit Docker image after installing Docker
$ docker pull intel/oneapi-aikit
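After the image has been pulled, a container can be started from it. The sketch below assumes Docker is installed and that your user is allowed to run docker (otherwise prefix the command with sudo). The wrapper name run_aikit_container is hypothetical.

```shell
# Start an interactive, throwaway container from the AI Kit image.
# -it attaches a terminal; --rm removes the container when it exits.
# Any extra arguments are passed through as the command to run inside it.
run_aikit_container() {
    docker run -it --rm intel/oneapi-aikit "$@"
}
```

Because of --rm, files created inside the container are discarded on exit unless you mount a host directory with Docker's -v option.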

Getting Started with the Intel® AI Analytics Toolkit

Toolkit details

  • To check the installation:
$ cd /opt/intel/oneapi/
  • To activate the AI Kit environment:
$ source /opt/intel/oneapi/setvars.sh 
:: initializing oneAPI environment ... 
      -bash: BASH_VERSION = 4.2.46(2)-release 
      args: Using "$@" for setvars.sh arguments: 
:: compiler -- latest
:: dal -- latest
:: dev-utilities -- latest
:: intelpython -- latest
:: ipp -- latest
:: mkl -- latest
:: modelzoo -- latest
:: mpi -- latest
:: neural-compressor -- latest
:: pytorch -- latest
:: tbb -- latest
:: tensorflow -- latest
:: oneAPI environment initialized ::
  • The AI Kit includes Conda. You can check the available Conda environments included in the toolkit using conda env list.
$ conda env list
# conda environments:
base                * /opt/intel/oneapi/intelpython/latest 
2022.0.2              /opt/intel/oneapi/intelpython/latest/envs/2022.0.2 
pytorch               /opt/intel/oneapi/intelpython/latest/envs/pytorch 
pytorch-1.8.0         /opt/intel/oneapi/intelpython/latest/envs/pytorch-1.8.0 
tensorflow            /opt/intel/oneapi/intelpython/latest/envs/tensorflow 
tensorflow-2.6.0      /opt/intel/oneapi/intelpython/latest/envs/tensorflow-2.6.0

  • The Model Zoo for Intel® Architecture is a GitHub repository that contains links to pre-trained models, sample scripts, best practices, and step-by-step tutorials for many popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors. For example, you can try various scripts there to see the benefits of the Intel® optimized AI frameworks (TensorFlow* and PyTorch*) that come with the AI Kit.
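To confirm that the framework environments listed by conda env list are usable, you can import each framework from its environment and print its version. The sketch below assumes setvars.sh has been sourced so that conda is on the PATH. The function name module_version is hypothetical, and torch is the standard import name for PyTorch.

```shell
# Print the version of a Python module as seen from a given Conda environment.
# $1: Conda environment name, $2: importable module name
module_version() {
    local env_name="$1" module="$2"
    # "conda run -n ENV" executes the command inside ENV without activating it
    conda run -n "$env_name" python -c "import ${module}; print(${module}.__version__)"
}
```

For the environments shown above, module_version tensorflow tensorflow and module_version pytorch torch would report the installed framework versions.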

Jupyter Setup and How to Run Jupyter Notebooks

  • To use Intel® optimized TensorFlow* and PyTorch* in Jupyter, you need to add kernels for them. The Intel® Python* kernel is already included in Jupyter.
## Install Jupyter and ipykernel
$ python3 -m pip install jupyter ipykernel

## Activate Intel-optimized TensorFlow
$ conda activate tensorflow

## You may need to install these packages
$ python3 -m pip install python-dateutil packaging

## Add a kernel for the TensorFlow environment
$ python3 -m ipykernel install --name intel-tensorflow --user

## Activate Intel-optimized PyTorch and add a kernel for its environment
$ conda activate pytorch
$ python3 -m ipykernel install --name intel-pytorch --user
  • To run Jupyter Notebook, start a Jupyter session with port 8888:
$ jupyter notebook --no-browser --port=8888 
[I 11:40:14.125 NotebookApp] Serving notebooks from local directory: /home/ec2-user
[I 11:40:14.126 NotebookApp] Jupyter Notebook 6.4.10 is running at:
[I 11:40:14.126 NotebookApp] http://localhost:8888/?token=045fef4e02c7a17eba7fb7abd43866d48da911c6e11189a0
[I 11:40:14.126 NotebookApp] or http://127.0.0.1:8888/?token=045fef4e02c7a17eba7fb7abd43866d48da911c6e11189a0
[I 11:40:14.126 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 11:40:14.130 NotebookApp]

To access the notebook, open this file in a browser:
file:///home/ec2-user/.local/share/jupyter/runtime/nbserver-6322-open.html
Or copy and paste one of these URLs:
http://localhost:8888/?token=045fef4e02c7a17eba7fb7abd43866d48da911c6e11189a0
or http://127.0.0.1:8888/?token=045fef4e02c7a17eba7fb7abd43866d48da911c6e11189a0
  • Open a new terminal on your local computer and establish an SSH tunnel to the Jupyter Notebook port on the instance, where xxxx is a free local port of your choice. Your command should be similar to the one below.
$ ssh -i /path/my-key-pair.pem ec2-user@ec2-198-51-100-1.compute-1.amazonaws.com -L xxxx:localhost:8888
  • Now take the URL printed by the Jupyter Notebook session, replace 8888 with xxxx, and open it in a browser to start the Jupyter Notebook environment.
  • To run samples for Intel® optimized PyTorch* and TensorFlow* in Jupyter, you can clone Intel’s oneAPI-samples repository and check the AI-and-Analytics directory for various AI Kit related samples.
## Install git if it is not already installed
$ sudo yum install git

## Clone the repo to the instance
$ git clone https://github.com/oneapi-src/oneAPI-samples
  • When running a Jupyter Notebook, select the Jupyter kernel that matches the sample; e.g., for Intel® Python* samples, select the Python* kernel.

Note: Some samples may require you to set up other environments (mentioned in the README) for performance analysis.
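The port substitution used when opening Jupyter through the SSH tunnel can be scripted so the token does not have to be edited by hand. This is a minimal sketch: the function name tunnel_url is hypothetical, and it assumes the URL has the http://host:port/... form printed by the Jupyter Notebook session.

```shell
# Rewrite the port in a Jupyter URL to the local port used for the SSH tunnel.
# $1: URL printed by "jupyter notebook", $2: local port given to ssh -L
tunnel_url() {
    local url="$1" local_port="$2"
    # Replace only the ":<digits>" that directly follows the host name,
    # leaving the path and token untouched.
    printf '%s\n' "$url" | sed -E "s#^(https?://[^/:]+):[0-9]+#\1:${local_port}#"
}
```

For example, with a tunnel opened using -L 9999:localhost:8888, passing the printed URL and 9999 to tunnel_url yields the address to open in the local browser.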

Additional Resources