Train a TensorFlow* Model on Intel® Architecture

Published: 06/21/2017  

Last Updated: 06/29/2018

Introduction

In this paper, you will learn how to train and save a TensorFlow* model, build a TensorFlow model server, and test the server using a client application. 

Prerequisites

  • Intel® Optimization for TensorFlow Installation Guide: use sources from GitHub* to build and install an instance of TensorFlow that is optimized for Intel architecture.
  • Build and Install TensorFlow Serving* on Intel Architecture: build and install TensorFlow Serving, a high-performance serving system for machine learning models designed for production environments.
  • Background reading: MNIST For ML Beginners and Serving a TensorFlow Model.

Train and Save an MNIST Model

The Modified National Institute of Standards and Technology (MNIST) database contains 60,000 training images and 10,000 testing images of handwritten digits. Because of its relative simplicity, the MNIST database is often used as an introductory dataset for demonstrating machine learning frameworks.
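If you would like to inspect the dataset yourself, the short sketch below (which assumes the TensorFlow 1.x Python APIs used throughout this tutorial and a writable /tmp/mnist_data directory) downloads MNIST and prints the split sizes. Note that TensorFlow holds out 5,000 of the 60,000 training images for validation:

from tensorflow.examples.tutorials.mnist import input_data

# Downloads the four MNIST archives to /tmp/mnist_data on first use.
mnist = input_data.read_data_sets("/tmp/mnist_data", one_hot=True)

print(mnist.train.num_examples)       # 55,000 training images
print(mnist.validation.num_examples)  # 5,000 images held out for validation
print(mnist.test.num_examples)        # 10,000 testing images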

Step 1

Open a terminal.


Step 2

Type the following commands:

cd ~/serving
bazel build //tensorflow_serving/example:mnist_saved_model
rm -rf /tmp/mnist_model
bazel-bin/tensorflow_serving/example/mnist_saved_model /tmp/mnist_model


Because we passed /tmp/mnist_model as the export directory, the trained model was saved in /tmp/mnist_model/1. Since we omitted the --training_iteration and --model_version command-line parameters when we ran mnist_saved_model, they defaulted to 1000 and 1, respectively.

The /tmp/mnist_model/1 subdirectory contains the following (a condensed export sketch follows this list):

  • saved_model.pb: the serialized tensorflow::SavedModel, which includes one or more graph definitions of the model as well as model metadata such as signatures
  • variables: files that hold the serialized variables of the graphs
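For context, the sketch below condenses what the export step in mnist_saved_model.py does. It is illustrative rather than the script's actual code: the simple softmax model and the variable names are assumptions, but the SavedModelBuilder flow is how the versioned directory and the files above get written in TensorFlow 1.x:

import os
import tensorflow as tf

export_base = "/tmp/mnist_model"
model_version = 1  # becomes the numeric subdirectory that the model server watches
export_path = os.path.join(export_base, str(model_version))

# A minimal softmax classifier standing in for the tutorial's MNIST model.
x = tf.placeholder(tf.float32, [None, 784], name="images")
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, w) + b)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training iterations would run here ...
    builder = tf.saved_model.builder.SavedModelBuilder(export_path)
    signature = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={"images": x}, outputs={"scores": y})
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        signature_def_map={"predict_images": signature})
    builder.save()  # writes saved_model.pb and the variables files listed above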

Troubleshooting Tips 

An error logged as NotFoundError in mnist_export example (issue #421 on GitHub) may appear after you issue the last command. If you encounter it, try this workaround:

  1. Open ~/serving/bazel-bin/tensorflow_serving/example/mnist_saved_model.runfiles/org_tensorflow/tensorflow/contrib/image/__init__.py.
  2. Comment out the following line by prefixing it with #, as shown:
    #from tensorflow.contrib.image.python.ops.single_image_random_dot_stereograms import single_image_random_dot_stereograms
  3. Save and close __init__.py.
  4. Enter the command:
    bazel-bin/tensorflow_serving/example/mnist_saved_model /tmp/mnist_model

In some instances, the downloaded training files may become corrupted when the script runs. This error is tracked as Not a gzipped file (issue #170) on GitHub. If necessary, download the files manually by issuing the following commands from the /tmp directory:

wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
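To confirm that the archives are intact after downloading (a quick integrity check suggested here, not part of the original tutorial), you can try reading each file with Python's gzip module; a corrupted download raises the same Not a gzipped file error:

import glob
import gzip

# Reading even a few bytes forces gzip to parse the header, which fails
# with "Not a gzipped file" if a download is corrupted.
for path in sorted(glob.glob("/tmp/*-ubyte.gz")):
    with gzip.open(path, "rb") as f:
        f.read(4)
    print(path, "OK")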


Build and Start the TensorFlow Model Server

Step 3

Build the TensorFlow model server by issuing the following command:

bazel build //tensorflow_serving/model_servers:tensorflow_model_server


Step 4

Start the TensorFlow model server by issuing the following command:

bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=mnist --model_base_path=/tmp/mnist_model/ &
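Before moving on, you can confirm that the server is listening (a quick sanity check in Python, not part of the original tutorial):

import socket

# The model server should be accepting gRPC connections on the port
# passed via --port above.
s = socket.socket()
try:
    s.connect(("localhost", 9000))
    print("model server is accepting connections on port 9000")
finally:
    s.close()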


Test the TensorFlow Model Server

Step 5

Test the server using the mnist_client utility provided in the TensorFlow Serving installation.


Step 6

Enter the following commands from the ~/serving directory:

bazel build //tensorflow_serving/example:mnist_client
bazel-bin/tensorflow_serving/example/mnist_client --num_tests=1000 --server=localhost:9000

Results similar to Figure 1 will appear.

[Screenshot of a command prompt window with client test results]

Figure 1. TensorFlow client test results
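Under the hood, mnist_client builds a gRPC PredictRequest and sends it to the model server. The condensed sketch below is based on the TensorFlow Serving 1.x example code; the zero-filled image is a stand-in for a real flattened 28 x 28 test image, and the signature and tensor names match those exported by mnist_saved_model:

import numpy
import tensorflow as tf
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2

channel = implementations.insecure_channel("localhost", 9000)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "mnist"                  # must match --model_name
request.model_spec.signature_name = "predict_images"

image = numpy.zeros(784, dtype=numpy.float32)      # stand-in for a real test image
request.inputs["images"].CopyFrom(
    tf.contrib.util.make_tensor_proto(image, shape=[1, 784]))

result = stub.Predict(request, 10.0)               # 10-second timeout
print(result.outputs["scores"])                    # softmax scores for digits 0-9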


Troubleshooting Tips

There is an error identified on GitHub as gRPC doesn't respect the no_proxy environment variable that may result in an Endpoint read failed error when you run the client application. Issue the env command to see whether the http_proxy environment variable is set. If so, temporarily unset it by issuing the following command:

unset http_proxy

Conclusion

You have now learned how to train and save a simple model based on the MNIST dataset, and how to deploy it using a TensorFlow model server. You also used the mnist_client example to run a simple machine learning inference test.

In this series of tutorials, we explored the process of building the TensorFlow machine learning framework and TensorFlow Serving, a high-performance serving system for machine learning models, optimized for Intel architecture.


References

TensorFlow is a leading deep learning and machine learning framework, which now integrates optimizations for Intel® Xeon® processors.


Additional Framework Information

TensorFlow Website

TensorFlow Optimizations on Modern Intel Architecture - introduces the specific graph optimizations, performance experiments, and details for building and installing TensorFlow with CPU optimizations.

