Deep Learning Training and Testing on a Single Node Intel® Xeon® Scalable Processor System Using Intel® Optimized Caffe*

Published: 10/20/2017  

Last Updated: 10/20/2017

By Beenish Zia

I. Introduction

This document provides step-by-step instructions for training and testing a deep learning model on a single-node Intel® Xeon® Scalable processor system, using the Intel® Distribution of Caffe* framework with image recognition datasets (CIFAR10, MNIST). The instructions are at a beginner level, and both training and inference take place on the same system. The steps have been verified on Intel Xeon Scalable processors as well as Intel® Xeon Phi™ processor systems, but should work on any recent Intel Xeon processor-based system. None of the software used in this document was tuned for performance.

This document is intended for beginners who have Intel Xeon processor-based hardware and want to learn how to train and test a deep learning model on a dataset using the Intel Distribution of Caffe framework. It assumes that the reader has basic Linux* knowledge and is familiar with the concepts of deep learning training. The instructions can be used as they are, or serve as the foundation for enhancements and modifications.

This document is divided into seven major sections, including the introduction. Section II details the hardware and software bill of materials used to implement and verify the training. Section III covers installing CentOS Linux* as the base operating system. Sections IV and V cover the software suites that must be installed to provide all the tools, libraries, and compilers needed for the training. Sections VI and VII list the steps needed to build Intel Distribution of Caffe, and then train and test a model with two simple datasets.

The hardware and software bill of materials used for the verified implementation is listed in Section II. Users can try a different configuration, but the configuration in Section II is recommended. Intel® Parallel Studio XE Cluster Edition provides most of the basic tools and libraries used in this document in a single package installation. Furthermore, starting with Intel Parallel Studio XE Cluster Edition shortens the learning curve for a multinode implementation of the same training and testing, because this software plays a significant role in a multinode implementation.

Similar follow-up documentation on step-by-step instructions detailing benchmarking on single-node, multinode implementation, and other frameworks implementation can be expected to be published in the future. 

II. Hardware and Software Bill of Materials

Item | Manufacturer | Model/Version
Hardware
Intel® Server Chassis | Intel | R1208WT
Intel® Server Board | Intel | S2600WT
(2x) Intel® Xeon® Scalable processor | Intel | Intel® Xeon® Gold 6148 processor
(6x) 32GB LRDIMM DDR4 | Crucial* | CT32G4LFD4266
(1x) Intel® SSD 1.2TB | Intel | S3520
Software
CentOS Linux* Installation DVD | | 7.3.1611
Intel® Parallel Studio XE Cluster Edition | | 2017.4
Intel® Distribution of Caffe* | | MKL2017
Intel® Machine Learning Scaling Library for Linux* OS | | 2017.1.016

III. Install the Linux* Operating System

This section requires the following software component: CentOS-7-x86_64-*1611.iso. The software can be downloaded from the CentOS website.

The DVD ISO was used for implementing and verifying the steps in this document, but the reader can use the Everything ISO or the Minimal ISO if preferred.

  • Insert the CentOS* 7.3.1611 install disc/USB. Boot from the drive and select Install CentOS 7.
  • Select Date and Time.
  • If necessary, select Installation Destination.
    • Select the automatic partitioning option.
    • Click Done to return home. Accept all defaults for the partitioning wizard if prompted.
  • Select Network and host name.
    • Enter “<hostname>” as the hostname.
      • Click the Apply button for the hostname to take effect.
    • Select Ethernet enp3s0f3 and click Configure to set up the external interface.
      • From the General section, check Automatically connect to this network when it’s available.
      • Configure the external interface as necessary. Save and exit.
    • Select the toggle to ON for the interface.
    • Click Done to return home.
  • Select “Software Selection”.
    • In the box labeled “Base Environment” on the left side, select “Infrastructure Server”.
    • Click Done to return home.
  • Wait until the Begin Installation button is available, which may take several minutes. Then click it to continue.
  • While waiting for the installation to finish, set the root password.
  • Click Reboot when the installation is complete.
  • Boot from the primary device.
  • Log in as root.

Note: The next steps can all be done from the command line. If you need a GUI version of CentOS, follow the steps in the Appendix.

Configure YUM*

If the public network implements a proxy server for internet access, Yellowdog Updater Modified* (YUM*) must be configured in order to use it.

  • Open the /etc/yum.conf file for editing.
  • Under the main section, append the following line:
    proxy=http://<address>:<port>
    where <address> is the address of the proxy server and <port> is the HTTP port.
  • Save the file and exit.
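
The same edit can be scripted. The following is a minimal sketch, not part of YUM itself: set_yum_proxy is a hypothetical helper, proxy.example.com and 8080 are placeholder values, and the demonstration runs against a scratch file rather than the real /etc/yum.conf.

```shell
#!/bin/sh
# set_yum_proxy FILE ADDRESS PORT
# Hypothetical helper: drops any existing proxy= line, then appends a new one.
# Assumes [main] is the only section in FILE, as in a stock yum.conf.
set_yum_proxy() {
    conf=$1; address=$2; port=$3
    tmp=$(mktemp) || return 1
    grep -v '^proxy=' "$conf" > "$tmp" || true
    printf 'proxy=http://%s:%s\n' "$address" "$port" >> "$tmp"
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Demonstration on a scratch file; the address and port are placeholders.
demo_conf=$(mktemp)
printf '[main]\ngpgcheck=1\n' > "$demo_conf"
set_yum_proxy "$demo_conf" proxy.example.com 8080
cat "$demo_conf"
```

To apply it for real, the helper would be called as root with /etc/yum.conf as the first argument. Because the old proxy line is removed first, running it twice does not duplicate the setting.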

Disable updates and extras. Certain procedures in this document require packages to be built against the kernel. A future kernel update may break the compatibility of these built packages with the new kernel, so we disable repository updates and extras to provide further longevity to this document.

This document may not be used as is when CentOS updates to the next version. To use this document after such an update, it is necessary to redefine repository paths to point to CentOS 7.3 in the CentOS vault. To disable repository updates and extras:

yum-config-manager --disable updates --disable extras
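
If CentOS 7.3 has moved to the vault, a repository definition along the following lines could be used for the redirection mentioned above. This is a sketch only: the helper name write_vault_repo, the repository id, the file name, and the vault URL layout are assumptions modeled on the CentOS-Vault.repo file that CentOS ships, and should be checked against the vault before use.

```shell
#!/bin/sh
# write_vault_repo FILE
# Hypothetical helper: writes a repo definition that points at the
# CentOS 7.3.1611 tree in the vault. Repo id and URL layout are assumptions.
write_vault_repo() {
    # The quoted 'EOF' keeps $basearch literal so that yum expands it later.
    cat > "$1" <<'EOF'
[C7.3.1611-base]
name=CentOS-7.3.1611 - Base
baseurl=http://vault.centos.org/7.3.1611/os/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
enabled=1
EOF
}

# Demonstration: write to a scratch directory instead of /etc/yum.repos.d.
demo_dir=$(mktemp -d)
write_vault_repo "$demo_dir/CentOS-Vault-7.3.repo"
cat "$demo_dir/CentOS-Vault-7.3.repo"
```

On a real system the file would go under /etc/yum.repos.d/, followed by yum clean all so that cached metadata from the old paths is discarded.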

Install EPEL*

Extra Packages for Enterprise Linux* (EPEL*) provides high-quality add-on software packages for Enterprise Linux distributions [7]. Installing it helps avoid error messages that you might otherwise see during the Caffe install and build.
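
One way to install EPEL is sketched below. It rests on the assumption that the epel-release package is available from the CentOS Extras repository, which was disabled in the previous step and is therefore enabled for this single command. The run wrapper and the DRY_RUN variable are hypothetical conveniences; by default the commands are only printed.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command by default; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumption: epel-release is shipped in the CentOS Extras repository, which
# the previous step disabled, so it is enabled for this one command only.
run yum -y --enablerepo=extras install epel-release

# List the enabled repositories so that the presence of EPEL can be confirmed.
run yum repolist enabled
```

Running the script as root with DRY_RUN=0 executes the two yum commands instead of printing them.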

Install GNU* C Compiler

Check whether the GNU Compiler Collection* (GCC*) is installed. It should be part of the Development Tools install. You can check by typing either of the following commands:

gcc --version
whereis gcc
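
The check can also be wrapped in a small script that reports a missing compiler instead of failing with "command not found". This is a minimal sketch; have_tool is a hypothetical helper name.

```shell
#!/bin/sh
# have_tool NAME: succeeds if NAME is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool gcc; then
    # Print only the first line, which carries the version number.
    gcc --version | head -n 1
else
    echo "gcc not found; install it, for example with the Development Tools group"
fi
```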

IV. Install Intel® Distribution of Caffe*

  • Install Intel distribution of Caffe prerequisites:
yum -y install git python-devel boost boost-devel cmake numpy \
gflags gflags-devel glog glog-devel protobuf \
protobuf-devel hdf5 hdf5-devel lmdb lmdb-devel leveldb leveldb-devel \
snappy-devel opencv opencv-devel
  • Install Intel® Machine Learning Scaling Library:

The Intel Machine Learning Scaling Library provides an efficient implementation of communication patterns used in deep learning (make sure to download the latest version from GitHub*; update the path below as needed):

yum -y install https://github.com/01org/MLSL/releases/download/v2017.1-Preview/intel-mlsl-devel-64-2017.1-016.x86_64.rpm

  • Append commands to source environments to the end of the system skeleton .bashrc.

The environment for the Intel Machine Learning Scaling Library may be loaded by sourcing environment scripts (EOF is End Of File):

cat >>/etc/skel/.bashrc <<EOF
#===== Intel Machine Learning Scaling Library ====
source /opt/intel/mlsl_2017.1.016/intel64/bin/mlslvars.sh
EOF
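
Running an append like the one above twice leaves duplicate lines in the file. A guarded variant is sketched here, assuming a hypothetical helper named append_block_once and using a scratch file in place of /etc/skel/.bashrc; the MLSL path is the one used in the build section of this document.

```shell
#!/bin/sh
# append_block_once FILE MARKER: append stdin to FILE unless MARKER is already in it.
append_block_once() {
    if grep -qF "$2" "$1" 2>/dev/null; then
        cat > /dev/null      # discard the block; it is already present
    else
        cat >> "$1"
    fi
}

# Demonstration on a scratch file: the second pass adds nothing.
demo_rc=$(mktemp)
for attempt in 1 2; do
    append_block_once "$demo_rc" "Intel Machine Learning Scaling Library" <<'EOF'
#===== Intel Machine Learning Scaling Library ====
source /opt/intel/mlsl_2017.1.016/intel64/bin/mlslvars.sh
EOF
done
cat "$demo_rc"
```

Note that files under /etc/skel are copied only into the home directories of accounts created afterward. A shell that is already open, such as the current root session, still has to source the script directly.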

Configure HTTP and HTTPS Proxies:

If your network implements a proxy server for Internet access, configure the HTTP and HTTPS proxies to use it.

a. Run the following command to enable the proxy for HTTP and HTTPS:

cat >>/etc/skel/.bashrc <<EOF
#====== HTTP and HTTPs proxies ========
export http_proxy=http://<address>:<port>
export https_proxy=https://<address>:<port>
EOF

V. Install Intel® Parallel Studio XE 2017 Cluster Edition

Note: This section requires the following software component:

parallel_studio_xe_2017_update4.tgz

Get the Parallel Studio XE 2017 Cluster Edition product and license file.

Get the Parallel Studio XE 2017 Cluster Edition installation guide:

If you are just going through the document for educational purposes, you can use the 30-day trial version of Intel Parallel Studio XE. However, if you plan to use the basic software installation for long-term professional use and will be building upon this basic guide, then a licensed version is recommended.

If you are saving the download to a USB drive, you might have to save it as two separate Zip* files. In that case, join them with cat parallel_*zip* > psxe_update4.zip, and then run unzip psxe_update4.zip.
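
Joining the pieces with cat depends on the shell expanding the wildcard in sorted order, so the parts must sort by name in the order they were split. The following sketch illustrates the idea on a scratch file; all file names in it are placeholders, not the real download.

```shell
#!/bin/sh
# Demonstration with a scratch file standing in for the real archive.
work=$(mktemp -d)
head -c 300000 /dev/urandom > "$work/original.bin"

# Split into 100000-byte pieces: part.aa, part.ab, part.ac.
split -b 100000 "$work/original.bin" "$work/part."

# The glob expands in sorted order, so the pieces are joined in sequence.
cat "$work"/part.* > "$work/rejoined.bin"

# cmp is silent on success, so the message prints only when the files match.
cmp "$work/original.bin" "$work/rejoined.bin" && echo "parts rejoined correctly"
```

For the real archive, comparing a checksum of the joined file with one taken before the split serves the same purpose.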

  • Install prerequisite packages:

yum -y install gtk2 redhat-lsb gcc gcc-c++ kernel-devel

  • Extract the installer:

tar -xzf parallel_studio_xe_2017_update4.tgz -C /usr/src

  • Install Intel Parallel Studio XE 2017 Cluster Edition:
    1. Start the installer

/usr/src/parallel_studio_xe_2017_update4/install.sh

2. Press Enter to continue.
3. Read the end-user license agreement. Press Space to scroll through each page and continue to the next prompt.
4. Type the word accept and press Enter.
5. Wait for the prerequisite check to finish. This check may take several minutes.
6. Follow the prompts to activate the license. Activation may take several minutes. Press Enter to continue.
7. Accept or decline involvement in the Intel® Software Improvement Program. Press Enter to continue.
8. Press Enter to begin configuring the installation.
9. Press Space to scroll, type 2, then press Enter.
10. Press 1 to deselect IA-32 architecture, then press Enter.
11. Press Enter once to proceed, then press Enter again to begin the installation.
12. If a confirmation prompt appears, type y and press Enter.

13. Wait for the installation to finish. Installation may take several minutes. When prompted, press Enter to complete the installation.

Set Up Environment Scripts

Append commands to source environments to the end of the system skeleton .bashrc.

Components of Intel Parallel Studio XE 2017 may be loaded by sourcing environment scripts:

cat >>/etc/skel/.bashrc <<EOF
#=== Intel Parallel Studio XE 2017 Update 4 ====
source /opt/intel/parallel_studio_xe_2017.*
EOF

***Post Installation***

VI. Build Intel Distribution of Caffe

You need to build the software so that the source code is converted into executable code that can run on the platform.

  • Execute the following Git* commands to obtain the latest snapshot of Intel distribution of Caffe:

git clone https://github.com/intel/caffe.git intelcaffe

source /opt/intel/mlsl_2017.1.016/intel64/bin/mlslvars.sh

  • Building from make file (if this fails, try building from the cmake file as mentioned in the next bullet):

cd intelcaffe/

Make a copy of the Makefile.config.example:

cp Makefile.config.example Makefile.config

Open Makefile.config in your favorite editor and uncomment USE_MLSL variable:

vi Makefile.config

USE_MLSL :=1
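
For unattended setups, the same change can be made without an editor. The sketch below assumes the setting appears in the file as a commented-out line such as "# USE_MLSL :=1"; uncomment_make_var is a hypothetical helper, and the demonstration runs on a scratch file rather than the real Makefile.config.

```shell
#!/bin/sh
# uncomment_make_var FILE NAME: strip the leading "#" from a "# NAME := ..." line.
uncomment_make_var() {
    tmp=$(mktemp) || return 1
    sed "s/^#[[:space:]]*\($2[[:space:]]*:=.*\)\$/\1/" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demonstration on a scratch file that mimics two commented-out settings.
demo_cfg=$(mktemp)
printf '# CPU_ONLY := 1\n# USE_MLSL :=1\n' > "$demo_cfg"
uncomment_make_var "$demo_cfg" USE_MLSL
cat "$demo_cfg"
```

Only the named variable is touched, so other commented-out options stay disabled. It is worth confirming the result with grep USE_MLSL Makefile.config before building.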

Execute the make command to build Intel distribution of Caffe:

make -j <number of cores> -k

  • Building from cmake (optional; use only if step above unsuccessful):

cd intelcaffe

mkdir build

cd build

Execute the following CMake command in order to prepare the build:

cmake .. -DBLAS=mkl -DUSE_MLSL=1 -DCPU_ONLY=1

Build Intel distribution of Caffe with multinode support and <#> as the number of cores. This step may take several minutes:

make -j <number of cores> -k
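
The core count does not have to be typed by hand. A minimal sketch follows, assuming nproc (from GNU coreutils) or getconf is available; JOBS is a hypothetical variable name, and the script only prints the resulting make command instead of running it.

```shell
#!/bin/sh
# Pick a job count for make: use nproc when available, else fall back to getconf.
if command -v nproc >/dev/null 2>&1; then
    JOBS=$(nproc)
else
    JOBS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
fi

# Print the command that would be run from the build directory.
echo "make -j $JOBS -k"
```

The same value works for both the Makefile build and the CMake build.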

Note: The build of Intel distribution of Caffe will trigger Intel® Math Kernel Library for Machine Learning to be downloaded to the intelcaffe/external/mkl/ directory and be automatically configured.

VII. Train on Intel Distribution of Caffe and Test the Training [3, 6]

CIFAR10 dataset

  • Train the system:

cd ~/intelcaffe

Get CIFAR10 data:

./data/cifar10/get_cifar10.sh

Convert CIFAR10 data into leveldb format and compute image mean:

./examples/cifar10/create_cifar10.sh

  • Test the training:

./examples/cifar10/train_quick.sh

Expect approximately 75 percent accuracy.
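
To compare a run against the expected figure, the final accuracy can be pulled out of a saved log. The sketch below assumes the log contains lines of the form "Test net output #0: accuracy = <value>", as upstream Caffe prints them. last_accuracy is a hypothetical helper, and the sample log lines are made up for the demonstration; they are not measured results.

```shell
#!/bin/sh
# last_accuracy LOGFILE: print the final "accuracy = X" value found in a Caffe log.
last_accuracy() {
    grep 'accuracy = ' "$1" | tail -n 1 | sed 's/.*accuracy = //'
}

# Demonstration on made-up log lines (illustrative values, not measured results).
demo_log=$(mktemp)
cat > "$demo_log" <<'EOF'
I0000 00:00:00.000000  1 solver.cpp:000]     Test net output #0: accuracy = 0.7134
I0000 00:00:00.000000  1 solver.cpp:000]     Test net output #1: loss = 0.8421 (* 1 = 0.8421 loss)
I0000 00:00:00.000000  1 solver.cpp:000]     Test net output #0: accuracy = 0.7512
EOF
last_accuracy "$demo_log"
```

Caffe writes its log to standard error, so a real log can be captured with ./examples/cifar10/train_quick.sh 2>&1 | tee cifar10_train.log and then passed to the helper.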

MNIST dataset

  • Train the system:

cd ~/intelcaffe

Get MNIST data:

./data/mnist/get_mnist.sh

Convert MNIST data into leveldb format:

./examples/mnist/create_mnist.sh

  • Test the training:

./examples/mnist/train_lenet.sh

Expect approximately 99 percent accuracy.

Acknowledgement

Special thanks to my colleague, Anuya Welling, for documenting the steps in her reference design, which was used extensively in writing this document. I also much appreciate all her help in resolving the various issues that I faced during the process. I would also like to acknowledge all the helpful resources available on GitHub, which were highly instrumental in validating the steps mentioned in this document.

References

  1. Intel® Scalable System Framework (Intel® SSF) Reference Design. 2017.03.31
  2. Guide to multi-node training with Intel® Distribution of Caffe*
  3. Alex’s CIFAR-10 tutorial, Caffe style
  4. Intel Product Registration Center
  5. Multi-node CIFAR10
  6. Training LeNet on MNIST with Caffe
  7. How to Enable EPEL Repository for RHEL/CentOS 7.x/6.x/5.x
  8. https://software.intel.com/en-us/ai-academy/basics

Appendix

To install the CentOS GUI, make sure you have the required Zip files on a thumb drive, and then run the following commands.

fdisk -l

# Use whichever device name fdisk reports for the USB drive (sdb1 here)
mount /dev/sdb1 /media
 
ls /media
yum -y localinstall /media/unzip-6.0-15.el7.x86_64.rpm
cat /media/CentOS-7-x86_64-Everything-1611.zip.00* >cent7.zip 
umount /media
unzip cent7.zip 

ls /media
mkdir /media/cdrom 
mount -o loop CentOS-7-x86_64-Everything-1611.iso /media/cdrom

vi /etc/yum.repos.d/CentOS-Media.repo
#change enabled=0 to 1
enabled=1

mv /etc/yum.repos.d/* /root/
mv /root/CentOS-Media.repo /etc/yum.repos.d/
# Done because the network was not connected; if the network is connected, just set the proxies and it should work without moving the CentOS repo files.
yum repolist
yum -y groupinstall "Development and Creative Workstation"; yum -y groupinstall "Development Tools"
# This takes time

Once installation is done, type startx to start GUI version of CentOS.
