Optimize a ResNet50 V1.5 FP32 Training Container with TensorFlow*

ID 679187
Updated 6/15/2022
Version Latest
Public

Pull Command

docker pull intel/image-recognition:tf-latest-resnet50v1-5-fp32-training

Description

This document has instructions for running ResNet50* v1.5 FP32 training using Intel® Optimization for TensorFlow*.

These ResNet50 v1.5 examples use the ImageNet dataset. Download and preprocess the ImageNet dataset using the instructions here. After running the conversion script, you should have a directory containing the ImageNet dataset in TFRecord format.
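
Before starting a long training run, it can help to confirm that the preprocessed directory actually contains TFRecord shards. The following is a minimal sketch, not part of the container. It assumes the conversion script names its shards `train-XXXXX-of-YYYYY` and `validation-XXXXX-of-YYYYY`; the function names `count_shards` and `check_dataset_dir` are hypothetical helpers introduced for illustration.

```shell
# Sketch: sanity-check a preprocessed ImageNet directory before training.
# ASSUMPTION: shards are named train-XXXXX-of-YYYYY and
# validation-XXXXX-of-YYYYY. Adjust the pattern if your files differ.

count_shards() {
  # $1 = dataset directory, $2 = shard prefix ("train" or "validation")
  find "$1" -maxdepth 1 -type f -name "$2-*-of-*" | wc -l | tr -d ' '
}

check_dataset_dir() {
  # $1 = dataset directory; returns non-zero if it is missing or has no
  # training shards.
  if [ ! -d "$1" ]; then
    echo "missing directory: $1" >&2
    return 1
  fi
  train_count=$(count_shards "$1" train)
  val_count=$(count_shards "$1" validation)
  echo "train shards: $train_count, validation shards: $val_count"
  # Training needs at least one training shard to read from.
  [ "$train_count" -gt 0 ]
}
```

Calling `check_dataset_dir "$DATASET_DIR"` prints the shard counts and returns a non-zero status when no training shards are found, so it can gate a launch script.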

Quick Start Scripts

Script name | Description
--- | ---
fp32_training_demo | Launches a short run using small batch sizes and a limited number of steps to demonstrate the training flow.
fp32_training_1_epoch | Launches a test run that trains the model for one epoch and saves checkpoint files to an output directory.
fp32_training_full | Trains the model using the full dataset and runs until convergence (90 epochs) and saves checkpoint files to an output directory. Note that this will take a considerable amount of time.
multi_instance_training_demo | Uses numactl to execute one instance per socket of a short run using small batch sizes and a limited number of steps to demonstrate the training flow.
multi_instance_training | Uses numactl to execute one instance per socket for the full training flow. Checkpoint files and logs for each instance are saved to the output directory. Note that this will take a considerable amount of time.
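
To illustrate what "one instance per socket" means with numactl, the sketch below prints one pinned launch line per socket. It is not the container's actual multi-instance script. It assumes each socket corresponds to one NUMA node with the same index, which is common but not universal, and `print_per_socket_commands` and `train.py` are hypothetical names used only for illustration.

```shell
# Sketch: print one numactl-pinned launch line per socket.
# ASSUMPTION: socket index equals NUMA node index on this machine.
# This only prints commands; it does not start any training process.

print_per_socket_commands() {
  # $1 = number of sockets, remaining arguments = command to launch
  sockets="$1"
  shift
  i=0
  while [ "$i" -lt "$sockets" ]; do
    # Bind both CPU scheduling and memory allocation to node $i.
    echo "numactl --cpunodebind=$i --membind=$i $* &"
    i=$((i + 1))
  done
  # Block until every background instance has finished.
  echo "wait"
}
```

For example, `print_per_socket_commands 2 python train.py` prints two background launch lines, bound to nodes 0 and 1, followed by `wait`. On Linux, the socket count is typically reported in the `Socket(s)` field of `lscpu`.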

Docker*

The ResNet50 v1.5 FP32 training model container includes the scripts and libraries needed to run ResNet50 v1.5 FP32 training. To run one of the model training quick start scripts using this container, you'll need to provide volume mounts for the ImageNet dataset and an output directory where checkpoint files will be written.

# Set these to real paths on the host before running.
DATASET_DIR=<path to the preprocessed imagenet dataset>
OUTPUT_DIR=<directory where checkpoint and log files will be written>

# Mount both directories at the same paths inside the container, then run
# one of the quick start scripts from the table above.
docker run \
  --env DATASET_DIR="${DATASET_DIR}" \
  --env OUTPUT_DIR="${OUTPUT_DIR}" \
  --env http_proxy="${http_proxy}" --env https_proxy="${https_proxy}" \
  --volume "${DATASET_DIR}:${DATASET_DIR}" \
  --volume "${OUTPUT_DIR}:${OUTPUT_DIR}" \
  --privileged --init -t \
  intel/image-recognition:tf-latest-resnet50v1-5-fp32-training \
  /bin/bash quickstart/<script name>.sh
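
As a sketch of how the `<script name>` placeholder gets filled in, the helper below checks a name against the quick start scripts listed above and prints the resulting `docker run` line. The function names `is_known_script` and `build_run_command` are hypothetical. The proxy variables are omitted for brevity, and the printed line is for display only: it does not quote paths, so paths containing spaces would need extra care.

```shell
# Sketch: validate a quick start script name and print the docker command.
# The helper names here are illustrative, not part of the container.

IMAGE="intel/image-recognition:tf-latest-resnet50v1-5-fp32-training"

is_known_script() {
  # Succeeds only for the script names listed in the Quick Start table.
  case "$1" in
    fp32_training_demo|fp32_training_1_epoch|fp32_training_full) return 0 ;;
    multi_instance_training_demo|multi_instance_training) return 0 ;;
    *) return 1 ;;
  esac
}

build_run_command() {
  # $1 = script name, $2 = dataset directory, $3 = output directory
  if ! is_known_script "$1"; then
    echo "unknown quick start script: $1" >&2
    return 1
  fi
  printf 'docker run --env DATASET_DIR=%s --env OUTPUT_DIR=%s ' "$2" "$3"
  printf -- '--volume %s:%s --volume %s:%s ' "$2" "$2" "$3" "$3"
  printf -- '--privileged --init -t %s ' "$IMAGE"
  printf '/bin/bash quickstart/%s.sh\n' "$1"
}
```

For instance, `build_run_command fp32_training_demo /data/imagenet /tmp/out` prints a single command line ending in `/bin/bash quickstart/fp32_training_demo.sh`, where both paths are example values.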

Documentation and Sources

Get Started
Docker* Repository
Main GitHub*
Readme
Release Notes
Get Started Guide

Code Sources
Dockerfile
Report Issue


License Agreement

LEGAL NOTICE: By accessing, downloading or using this software and any required dependent software (the “Software Package”), you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software included with the Software Package. Please refer to the license file for additional details.

