Intel® oneAPI AI Analytics Toolkit Installation and Getting started...

Aditya Sirvaiya

Introduction

The Intel® AI Analytics Toolkit (AI Kit) targets data scientists, AI engineers, and researchers through optimized, familiar Python* tools and AI frameworks to accelerate end-to-end data science and analytics pipelines on Intel® architectures.

The AI Kit is part of oneAPI toolkits. Its components are built using performance libraries of the Intel® oneAPI Base Toolkit, i.e., Intel® oneAPI Math Kernel Library (oneMKL), Intel® oneAPI Data Analytics Library (oneDAL), Intel® oneAPI Deep Neural Network Library (oneDNN), and Intel® oneAPI Collective Communications Library (oneCCL) for low-level compute optimizations.

This article covers the installation steps and helps you to get started with the Intel® AI Analytics Toolkit on Amazon Web Services (AWS). For system requirements and other details, please refer to the Intel® AI Analytics Toolkit. Release notes can be found in a dedicated article.

Components of the Intel® AI Analytics Toolkit

All AI Kit components can also be installed standalone, without installing the full AI Kit. To install a particular component, follow the link provided for that component below.

The AI Kit includes:

Intel® Distribution for Python*, including highly optimized scikit-learn* and XGBoost* libraries: Get faster Python application performance right out of the box, with minimal or no changes to your code.
Intel® Optimization for PyTorch*: The Intel® oneAPI Deep Neural Network Library (oneDNN) is included in PyTorch as the default math kernel library for deep learning.
Intel® Optimization for TensorFlow*: This version integrates primitives from oneDNN into the TensorFlow runtime for accelerated performance.
Intel® Distribution of Modin* (available through Anaconda* only): Seamlessly scale preprocessing across multiple nodes using this intelligent, distributed dataframe library with an identical API to pandas.
Model Zoo for Intel® Architecture: Access pretrained models, sample scripts, best practices, and step-by-step tutorials for many popular open source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors.
Intel® Neural Compressor: Quickly deploy low-precision inference solutions on popular deep-learning frameworks such as TensorFlow*, PyTorch*, MXNet*, and ONNX* (Open Neural Network Exchange) runtime.

Getting Started with AWS and Setting Up an AWS Linux* Instance

This article assumes you are familiar with the AWS environment. To learn more about working with AWS, see the Amazon Elastic Compute Cloud Documentation.

Specifically, this article assumes:

You have an AWS account.
You are familiar with creating instances within the AWS environment.
To learn more about launching an instance, refer to Getting Started with Amazon EC2 Linux* Instances and Launching an Instance.

AWS Instances with Intel® Xeon® Processors

Two important Intel processor features are Intel® Advanced Vector Extensions 512 (Intel® AVX-512) and Intel® Deep Learning Boost (Intel® DL Boost).

To get the best out of Intel® processors with the AI Kit, we recommend AWS EC2 instances with Intel® DL Boost for deep learning workloads, or at least instances with Intel® AVX-512 support. Other EC2 instances with Intel® Xeon® processors (without Intel® AVX-512) can also be used with the AI Kit.

Intel® Xeon® Scalable processors with Intel® DL Boost significantly increase deep learning training and inference performance over previous-generation Intel® Xeon® Scalable processors (with FP32) by leveraging the new Vector Neural Network Instructions (VNNI/INT8; supports INT8/BF16 quantization and mixed-precision models). Workloads that benefit include image recognition/segmentation, object detection, speech recognition, language translation, recommendation systems, reinforcement learning, and others.

Intel® AVX-512 is the latest x86 vector instruction set, with up to two fused multiply-add units and other optimizations for applications that are floating-point (FP) intensive.

Based on these features and the Intel® Xeon® processor generation, the table lists AWS EC2 instances with an Intel® Xeon® processor. The codenames for Intel® Xeon® processors are as follows:

The codename for 1st gen Intel® Xeon® processors is Skylake (SKX).
The codename for 2nd gen Intel® Xeon® processors is Cascade Lake (CLX).
The codename for 3rd gen Intel® Xeon® processors is Ice Lake (ICX).

* These instances might launch with another processor type that supports Intel® AVX-512.
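
On a running Linux instance, one way to confirm which of these features the launched processor actually exposes is to inspect the CPU flags in /proc/cpuinfo. The following is a minimal sketch, not an official Intel or AWS tool: the function names has_cpu_flag and report_intel_features are hypothetical, and it assumes the Linux kernel flag names avx512f (AVX-512 Foundation), avx512_vnni (VNNI), and avx512_bf16 (BF16).

```shell
# has_cpu_flag FLAG [FLAGS_STRING]   (hypothetical helper)
# Succeeds if FLAG appears as a whole word in the CPU flag list.
# FLAGS_STRING defaults to the first "flags" line of /proc/cpuinfo (Linux only).
has_cpu_flag() {
    flag=$1
    if [ $# -ge 2 ]; then
        flags=$2
    else
        flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    fi
    # Pad with spaces so that only whole-word matches count.
    case " $flags " in
        *" $flag "*) return 0 ;;
        *) return 1 ;;
    esac
}

# report_intel_features [FLAGS_STRING]   (hypothetical helper)
# Prints one "name: yes|no" line per feature of interest.
report_intel_features() {
    for f in avx512f avx512_vnni avx512_bf16; do
        if has_cpu_flag "$f" "$@"; then
            echo "$f: yes"
        else
            echo "$f: no"
        fi
    done
}
```

Calling report_intel_features with no argument on the instance reads /proc/cpuinfo; passing a flag string as the argument checks that string instead, which is convenient for trying the helper out on another machine.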

Creating an Instance

Log in to your AWS account.
Navigate to the EC2 dashboard.
Click the Launch Instance button.
Step 1 – Choose AMI: Select the AMI type to launch as an instance. Suggested AMI: Latest Amazon Linux* or Ubuntu* AMI.
Step 2 – Choose an Instance Type: Use the default t2.micro instance type.

Note: You can move to any step using the steps bar at the top of the dashboard.

Step 3 – Configure Instance: Choose your desired (or default) VPC and a Subnet. Click the Protect against accidental termination box.
Step 4 – Add Storage: Set the instance storage to a minimum of 20 GB.
Step 5 – Add Tags: Name the instance if you want to have a unique identifier (Example: Key = “Name”, Value = “AI Kit”).
Step 6 – Configure Security Group: You may use the default settings or restrict them to match your security needs. Important: Make sure SSH, Port TCP 22, is accessible. Add your IP address in Inbound rules.
Step 7 – Review and Launch: Click on Launch to launch the instance.
Create Key-Pair dialog: Use any existing key-pair or select the option Create a new key pair and enter a key pair name (Example: “aikit”). Click on Download key pair. A key file (.pem) will be created. Save the file (here: aikit.pem).

Note: this is the only time you will be able to download the key.

Click the Launch Instances button: The dashboard will display the Launch Status. Click on the Instance ID to return to the Instance Dashboard. This process may take a few minutes before the instance is running.
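
For readers who prefer the command line, the console steps above correspond roughly to a single aws ec2 run-instances call. The sketch below is an illustration under stated assumptions, not part of the original procedure: the function name launch_aikit_instance and the AWS_CLI override variable are hypothetical, the AMI, key pair, and security group IDs are placeholders you must supply, the root device name /dev/xvda is an assumption that fits Amazon Linux AMIs (Ubuntu AMIs typically use /dev/sda1), and the tag value aikit is only an example.

```shell
# launch_aikit_instance AMI_ID KEY_NAME SECURITY_GROUP_ID [INSTANCE_TYPE]
# Hypothetical wrapper around `aws ec2 run-instances`.
# AWS_CLI can be set to a different executable (assumption, useful for dry runs).
launch_aikit_instance() {
    "${AWS_CLI:-aws}" ec2 run-instances \
        --image-id "$1" \
        --instance-type "${4:-t2.micro}" \
        --key-name "$2" \
        --security-group-ids "$3" \
        --disable-api-termination \
        --block-device-mappings 'DeviceName=/dev/xvda,Ebs={VolumeSize=20}' \
        --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=aikit}]'
}

# Example call (all IDs are placeholders):
#   launch_aikit_instance ami-0123456789abcdef0 aikit sg-0123456789abcdef0
```

The --disable-api-termination flag mirrors the Protect against accidental termination box, and the block device mapping mirrors the 20 GB storage setting.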

Connecting to the AWS Instance

Prerequisites

Install an SSH client
Your Linux* computer most likely includes an SSH client by default. You can check for an SSH client by typing ssh at the command line. If your computer does not recognize the command, the OpenSSH project provides a free implementation of the full suite of SSH tools. For more information, see the OpenSSH web page. For Windows*, you can use PowerShell* or download PuTTY.
Install the AWS CLI Tools
(Optional) If you are using a public AMI from a third party, you can use the command-line tools to verify the fingerprint. For more information about installing the AWS CLI, please refer to Getting Set Up in the AWS Command Line Interface User Guide.
Get the public DNS name of the instance
You can get the public DNS for your instance using the Amazon EC2 console (check the Public DNS (IPv4) column; if this column is hidden, choose the Show/Hide icon and select Public DNS (IPv4)). If you prefer, you can use the describe-instances (AWS CLI) or Get-EC2Instance (AWS Tools for Windows PowerShell) command.
Locate the private key
Get the fully qualified path to the location on your computer of the .pem file for the key pair that you specified when you launched the instance.
Enable inbound SSH traffic from your IP address to your instance
Ensure that the security group associated with your instance allows incoming SSH traffic from your IP address. For more information, see Authorizing Network Access to Your Instances.
Important: Your default security group does not allow incoming SSH traffic by default.
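
Two of the prerequisites above, looking up the public DNS name and opening SSH to your IP address, can also be handled with the AWS CLI. The following is a minimal sketch: the function names instance_public_dns and allow_ssh_from and the AWS_CLI override variable are hypothetical, and the instance ID, security group ID, and IP address in the example calls are placeholders.

```shell
# instance_public_dns INSTANCE_ID   (hypothetical helper)
# Prints the public DNS name reported by `aws ec2 describe-instances`.
instance_public_dns() {
    "${AWS_CLI:-aws}" ec2 describe-instances \
        --instance-ids "$1" \
        --query 'Reservations[].Instances[].PublicDnsName' \
        --output text
}

# allow_ssh_from SECURITY_GROUP_ID IPV4_ADDRESS   (hypothetical helper)
# Adds an inbound rule for TCP port 22 restricted to a single address (/32).
allow_ssh_from() {
    "${AWS_CLI:-aws}" ec2 authorize-security-group-ingress \
        --group-id "$1" \
        --protocol tcp \
        --port 22 \
        --cidr "$2/32"
}

# Example calls (placeholder IDs and documentation IP address):
#   instance_public_dns i-0123456789abcdef0
#   allow_ssh_from sg-0123456789abcdef0 203.0.113.7
```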

Connecting Using SSH

In a command-line shell, change directories to the location of the private key file that you created when you launched the instance.
Use the chmod command to make sure that your private key file is not publicly viewable. For example, if the name of your private key file is my-key-pair.pem, use the following command:

$ chmod 400 /path/my-key-pair.pem

Use the ssh command to connect to the instance. You specify the private key (.pem) file and the user_name@public_dns_name. For Amazon Linux*, the user name is ec2-user. For RHEL*, the user name is ec2-user or root. For Ubuntu*, the user name is ubuntu or root. For CentOS*, the user name is centos. For Fedora*, the user name is ec2-user. For SuSE*, the user name is ec2-user or root. Otherwise, if ec2-user and root do not work, check with your AMI provider.

$ ssh -i /path/my-key-pair.pem ec2-user@ec2-198-51-100-1.compute-1.amazonaws.com
[ec2-user@ip-172-31-10-114 ~]$ cat /etc/*release*
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
Amazon Linux release 2 (Karoo)
cpe:2.3:o:amazon:amazon_linux:2
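
The user-name rules listed above can be captured in a small lookup so that the ssh target is built consistently. This is a minimal sketch: the function names default_ssh_user and ssh_target and the distribution keywords are hypothetical, and where the text allows two user names (for example ubuntu or root) the sketch simply picks the first.

```shell
# default_ssh_user DISTRO   (hypothetical helper)
# Prints the first user name listed above for the given distribution keyword.
default_ssh_user() {
    case "$1" in
        amazon-linux|rhel|fedora|suse) echo ec2-user ;;
        ubuntu) echo ubuntu ;;
        centos) echo centos ;;
        *) echo "unknown distribution: $1 (check with your AMI provider)" >&2
           return 1 ;;
    esac
}

# ssh_target DISTRO PUBLIC_DNS   (hypothetical helper)
# Prints user_name@public_dns_name for use with `ssh -i`.
ssh_target() {
    user=$(default_ssh_user "$1") || return 1
    echo "$user@$2"
}

# Example:
#   ssh -i /path/my-key-pair.pem "$(ssh_target amazon-linux ec2-198-51-100-1.compute-1.amazonaws.com)"
```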

Intel® AI Analytics Toolkit Installation

Once your AWS instance is up and running, you can proceed with the product installation.

To download the product:

Open the Get the Intel® AI Analytics Toolkit link.
For Windows* OS, install the Intel® AI Analytics Toolkit using the Conda* Package Manager distribution.
For Linux* OS, the easiest way to install the Intel® AI Analytics Toolkit is using the Online & Offline Distribution.
For an AWS Linux* instance, the Online Installer is preferred. It lets you choose which components of the AI Kit are installed.
Once you select Linux*, Online & Offline Distribution, and the Online Installer type, either click the download link and upload the downloaded file to the AWS instance, or run the following commands on the instance:

$ wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/18486/l_AIKit_p_2022.1.2.135.sh
$ sudo sh ./l_AIKit_p_2022.1.2.135.sh

Note: These commands might change when new versions of AI Kit are released.

Another easy way to get started with the Intel® AI Analytics Toolkit, and to take advantage of the Intel® optimizations for AI frameworks and machine learning libraries in a containerized way, is to use its Docker* image from Intel's public repository on Docker Hub. To install Docker on an AWS EC2 instance, check this link.

## Pull the AI Kit Docker image after installing Docker
$ docker pull intel/oneapi-aikit

Getting Started with the Intel® AI Analytics Toolkit

Toolkit details

To check the installation:

$ cd /opt/intel/oneapi/

To activate the AI Kit environment:

$ source /opt/intel/oneapi/setvars.sh
:: initializing oneAPI environment ...
      -bash: BASH_VERSION = 4.2.46(2)-release
      args: Using "$@" for setvars.sh arguments:
:: compiler -- latest
:: dal -- latest
:: dev-utilities -- latest
:: intelpython -- latest
:: ipp -- latest
:: mkl -- latest
:: modelzoo -- latest
:: mpi -- latest
:: neural-compressor -- latest
:: pytorch -- latest
:: tbb -- latest
:: tensorflow -- latest
:: oneAPI environment initialized ::

The AI Kit includes Conda. You can check the available Conda environments included in the toolkit using conda env list.

$ conda env list
# conda environments:
base                * /opt/intel/oneapi/intelpython/latest 
2022.0.2              /opt/intel/oneapi/intelpython/latest/envs/2022.0.2 
pytorch               /opt/intel/oneapi/intelpython/latest/envs/pytorch 
pytorch-1.8.0         /opt/intel/oneapi/intelpython/latest/envs/pytorch-1.8.0 
tensorflow            /opt/intel/oneapi/intelpython/latest/envs/tensorflow 
tensorflow-2.6.0      /opt/intel/oneapi/intelpython/latest/envs/tensorflow-2.6.0

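
Before activating an environment in a script, it can help to confirm that it appears in the conda env list output. The sketch below shows one way to do that; the function name env_exists is hypothetical, and it assumes the listing format shown above, where the environment name is the first whitespace-separated field of each line.

```shell
# env_exists NAME   (hypothetical helper)
# Reads `conda env list` output on stdin and succeeds only if an
# environment called exactly NAME is listed.
env_exists() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example:
#   conda env list | env_exists tensorflow && conda activate tensorflow
```

Because the comparison is exact, tensorflow-2.6.0 does not count as a match for tensorflow.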

The Model Zoo for Intel® Architecture is a GitHub repository that contains links to pre-trained models, sample scripts, best practices, and step-by-step tutorials for many popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors. For example, you can try its scripts to see the benefits of the Intel® optimized AI frameworks (TensorFlow* and PyTorch*) that come with the AI Kit.

Jupyter Setup and How to Run Jupyter Notebooks

To use Intel® optimized TensorFlow* and PyTorch* in Jupyter, you need to add the kernels for them. The Intel® Python* kernel is already included in Jupyter.

## Install jupyter and ipykernel
$ python3 -m pip install jupyter ipykernel

## Activate Intel-optimized TensorFlow
$ conda activate tensorflow

## You may need to install these packages
$ python3 -m pip install python-dateutil packaging

## Add a kernel for the TensorFlow environment
$ python3 -m ipykernel install --name intel-tensorflow --user

## Activate Intel-optimized PyTorch and add a kernel for its environment
$ conda activate pytorch
$ python3 -m ipykernel install --name intel-pytorch --user

To run Jupyter Notebook, start a Jupyter session with port 8888:

$ jupyter notebook --no-browser --port=8888
[I 11:40:14.125 NotebookApp] Serving notebooks from local directory: /home/ec2-user
[I 11:40:14.126 NotebookApp] Jupyter Notebook 6.4.10 is running at:
[I 11:40:14.126 NotebookApp] http://localhost:8888/?token=045fef4e02c7a17eba7fb7abd43866d48da911c6e11189a0
[I 11:40:14.126 NotebookApp]  or http://127.0.0.1:8888/?token=045fef4e02c7a17eba7fb7abd43866d48da911c6e11189a0
[I 11:40:14.126 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 11:40:14.130 NotebookApp]

    To access the notebook, open this file in a browser:
        file:///home/ec2-user/.local/share/jupyter/runtime/nbserver-6322-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=045fef4e02c7a17eba7fb7abd43866d48da911c6e11189a0
     or http://127.0.0.1:8888/?token=045fef4e02c7a17eba7fb7abd43866d48da911c6e11189a0

Open a new terminal on your local machine and establish an SSH connection that forwards a local port (xxxx below, any free port of your choice) to port 8888 on the instance. Your command should be similar to the one below.


$ ssh -i /path/my-key-pair.pem ec2-user@ec2-198-51-100-1.compute-1.amazonaws.com -L xxxx:localhost:8888

Now take the URL printed by the Jupyter Notebook session, replace port 8888 with xxxx, and open it in a browser on your local machine to start the Jupyter Notebook environment.
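
The port substitution described above is easy to get wrong by hand because the token is long. As a minimal sketch, the hypothetical helper local_jupyter_url below rewrites only the port that follows the host name and leaves the token untouched; it assumes the remote session uses port 8888, as in this article.

```shell
# local_jupyter_url URL LOCAL_PORT   (hypothetical helper)
# Replaces the remote port 8888 in a Jupyter URL with the local
# port used on the left-hand side of `ssh -L`.
local_jupyter_url() {
    printf '%s\n' "$1" | sed "s#^\(http://[^:/]*\):8888/#\1:$2/#"
}

# Example, assuming the tunnel was opened with `-L 9999:localhost:8888`:
#   local_jupyter_url 'http://localhost:8888/?token=<your token>' 9999
```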

To run samples for Intel® Optimization for PyTorch* and Intel® Optimization for TensorFlow* in Jupyter, you can clone Intel's oneAPI-samples repository and check the AI-and-Analytics directory for various AI Kit samples.

## Install git if it is not already installed
$ sudo yum install git

## Clone the repo to the instance
$ git clone https://github.com/oneapi-src/oneAPI-samples

When you run a notebook, select the Jupyter kernel that matches the sample. For example, select the Python* kernel for Intel® Python* samples, and the intel-tensorflow or intel-pytorch kernel added earlier for TensorFlow* or PyTorch* samples.

Note: Some samples may require you to set up other environments (described in the sample's README) for performance analysis.
