Deploy Intel® Neural Compressor Using AWS*



Intel® Neural Compressor is an open source Python* library designed to help quickly optimize inference solutions on popular deep learning frameworks (TensorFlow*, PyTorch*, ONNX* [Open Neural Network Exchange] Runtime, and Apache MXNet*). Intel Neural Compressor is a ready-to-run optimized solution that uses the features of 3rd generation Intel® Xeon® Scalable processors (formerly code-named Ice Lake) to improve performance.

This quick start guide provides instructions for deploying Intel Neural Compressor to Docker* containers. The containers are packaged by Bitnami* on Amazon Web Services (AWS)* for the Intel® processors.

What's Included

The Intel Neural Compressor includes the following precompiled binaries:

Intel®-Optimized Library                Minimum Version

Intel® Extension for PyTorch*

ONNX Runtime


Prerequisites

  • An AWS account with Amazon EC2*. For more information, see Get Started.
  • 3rd generation Intel Xeon Scalable processors

  1. Sign in to the AWS console.
  2. Go to your EC2 Dashboard, and then select Launch Instances.
  3. In the search box, enter Ubuntu.
  4. Select the appropriate Ubuntu* instance, and then select Next. The Choose an Amazon Machine Image (AMI) screen appears.


  5. To choose an instance:
    a. Locate the appropriate region and an instance type backed by Intel Xeon Scalable processors, and then click Select.
    b. Select Configure Instance Details. The available regions are:

             • US East (Ohio)
             • US East (N. Virginia)
             • US West (N. California)
             • US West (Oregon)
             • Asia Pacific (Mumbai)
             • Asia Pacific (Seoul)
             • Asia Pacific (Singapore)
             • Asia Pacific (Sydney)
             • Asia Pacific (Tokyo)
             • Europe (Frankfurt)
             • Europe (Ireland)
             • Europe (Paris)
             • South America (São Paulo) 
For up-to-date information, see Amazon EC2 M6i Instances.

  1. Select Next: Configure Instance Details. On the page that appears, select the appropriate Network for the VPC and the Subnet into which your instance will launch.


  2. Select Add Storage. The storage page appears.
  3. Increase the storage size to fit your needs, and then select Add a Tag. The tag page appears.
  4. (Optional) On the Add Tags tab, add the tags (notes for your own reference), and then select Configure Security Group. The Configure Security Group page appears.
  5. To create a security group:
    1. Enter a Security group name and Description.
    2. Configure the protocol and port range to allow SSH communication on port 22.
    3. Select Review and Launch. The Review Launch Instance page appears.


  6. Scroll through the information to ensure that the details match your selections.
  7. Select Launch. A key pair page appears.


  8. To choose a key pair:
    1. Choose whether to create a new key pair or use an existing one.
    2. If you have an existing key pair, choose it under Select a keypair. Otherwise, choose Create New to generate one.
    3. Select the I acknowledge check box.
    4. Select Launch Instances. The Instances page appears and shows the launch status.


  9. To connect to the instance:
    1. On the left side of the page, select the check box next to the instance.
    2. In the upper-right of the page, select Connect. The Connect to instance page appears, with information on how to connect to the instance.

    3. Select the SSH client tab, and then copy the command under Connect to your instance using its Public DNS.

    4. In a terminal window, enter the SSH command to connect to the instance, specifying the path and file name of your private key (.pem), the username for your instance, and the public DNS name. For example, with hypothetical values: ssh -i my-key.pem ubuntu@<your-instance-public-dns>


  10. To deploy Intel Neural Compressor in a Docker container:
    1. If needed, install Docker on Ubuntu.
    2. To pull the latest Intel Neural Compressor image, open a terminal window, and then enter the following command: docker pull bitnami/inc-intel:latest

Note Intel recommends using the latest image. If needed, you can find older versions in the Docker Hub Registry.

    3. To test Intel Neural Compressor, start the container with this command: sudo docker run -it --name inc bitnami/inc-intel

Note inc is the name assigned to the container started from the bitnami/inc-intel image.

For more information on the docker run command (which starts a Python* session), see Docker Run. The container is now running.

  11. To import Intel Neural Compressor into your program, enter the following statement at the Python prompt in the container: from neural_compressor.experimental import Quantization, common

For more information about using Intel Neural Compressor and its API, see Documentation.
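The Quantization class reads its tuning settings from a YAML configuration file. The fragment below is a minimal sketch of such a file; the model name, framework, and accuracy target are placeholder assumptions, not values from this guide:

```yaml
# conf.yaml — hypothetical minimal post-training quantization config
model:
  name: my_model            # placeholder model name
  framework: tensorflow     # or pytorch, onnxrt_integerops, mxnet
tuning:
  accuracy_criterion:
    relative: 0.01          # allow at most 1% relative accuracy loss
  exit_policy:
    timeout: 0              # 0 = stop at the first config meeting the criterion
```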
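Building on the import above, a typical post-training quantization flow with the experimental API looks roughly like the sketch below. The configuration file name (conf.yaml) and the model path are assumptions for illustration, not values from this guide; the import guard is only there so the sketch loads even on machines where the library is absent.

```python
# Hedged sketch of post-training quantization with Intel Neural Compressor's
# experimental API, as run inside the container's Python session.
try:
    from neural_compressor.experimental import Quantization, common
    INC_AVAILABLE = True
except ImportError:  # library not installed outside the container
    INC_AVAILABLE = False

def quantize_model(model_path, config="conf.yaml"):
    """Quantize the model at model_path using the tuning settings in config."""
    quantizer = Quantization(config)            # load the YAML tuning config
    quantizer.model = common.Model(model_path)  # wrap the framework model
    return quantizer.fit()                      # run tuning; returns the best int8 model found
```

Here quantize_model would be called with a saved model (for example, a TensorFlow SavedModel directory); quantizer.fit() runs the tuning loop and returns a quantized model object that can then be saved to disk.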


To ask Intel product experts a question, go to the Intel Collective on Stack Overflow, and then post your question with the intel-cloud tag.

For questions about the Intel Neural Compressor image on Docker Hub, see the Bitnami Community.

To file a Docker issue, see Issues.