Transformer-LT MLPerf FP32 Training TensorFlow* Container

Published: 11/13/2020  

Last Updated: 06/15/2022

Pull Command

docker pull intel/language-translation:tf-latest-transformer-mlperf-fp32-training

Description

This document provides instructions for running Transformer Language FP32 training from the MLPerf* benchmark suite using Intel® Optimization for TensorFlow*. Detailed information on the MLPerf* benchmark can be found in mlperf/training.

Datasets

Choose the problem you want to run, then download the dataset for it. The steps below download the training data as an example.

First, download the dataset used for computing the BLEU score.

export DATASET_DIR=/home/<user>/transformer_data
mkdir -p $DATASET_DIR && cd $DATASET_DIR
wget https://nlp.stanford.edu/projects/nmt/data/wmt14.en-de/newstest2014.en
wget https://nlp.stanford.edu/projects/nmt/data/wmt14.en-de/newstest2014.de
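
Before moving on, it can help to confirm that both reference files arrived intact. The following is a minimal sketch and not part of the model package: the function name check_bleu_files is hypothetical, and it relies only on the two file names from the wget commands above.

```shell
# Hypothetical helper: succeeds only if both BLEU reference files
# exist and are non-empty in the directory passed as the first argument.
check_bleu_files() {
  dir="$1"
  for f in newstest2014.en newstest2014.de; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      return 1
    fi
  done
  echo "BLEU reference files found in $dir"
}

# Example call (assumes DATASET_DIR is exported as shown above):
#   check_bleu_files "$DATASET_DIR"
```

A non-zero return status suggests that one of the downloads failed or was interrupted and should be repeated.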

For the training dataset, download and untar the model package.

wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v2_3_0/transformer-mlperf-fp32-training.tar.gz
tar -xzf transformer-mlperf-fp32-training.tar.gz

export PYTHONPATH=$PYTHONPATH:/home/<user>/transformer-mlperf-fp32-training/models/common/tensorflow
export DATASET_DIR=/home/<user>/transformer_data
    
cd /home/<user>/transformer-mlperf-fp32-training/models/language_translation/tensorflow/transformer_mlperf/training/fp32/transformer
python data_download.py --data_dir=$DATASET_DIR

Running python data_download.py --data_dir=$DATASET_DIR assumes you have a Python* environment similar to the one the intel/intel-optimized-tensorflow:latest container provides. One option is to run the commands above inside that container, e.g.:

docker run -u $(id -u):$(id -g) --privileged --entrypoint /bin/bash -v /home/:/home/ -it intel/intel-optimized-tensorflow:latest
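
A small pre-flight check can catch a missing dataset directory or a wrong working directory before the download starts. This is a sketch under stated assumptions: preflight_data_download is a hypothetical helper, not something shipped in the model package, and it checks only that DATASET_DIR points to an existing directory and that data_download.py is in the current directory.

```shell
# Hypothetical pre-flight check for the data download step.
# Returns non-zero with a message on stderr if a precondition is not met.
preflight_data_download() {
  if [ -z "${DATASET_DIR:-}" ] || [ ! -d "$DATASET_DIR" ]; then
    echo "DATASET_DIR is unset or not a directory" >&2
    return 1
  fi
  if [ ! -f data_download.py ]; then
    echo "data_download.py not found in $(pwd)" >&2
    return 1
  fi
  echo "ready: python data_download.py --data_dir=$DATASET_DIR"
}

# Example call, from the transformer directory of the model package:
#   preflight_data_download && python data_download.py --data_dir=$DATASET_DIR
```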

Quick Start Scripts

Transformer Language training in the MLPerf benchmark can run either as a full training run or for a reduced number of training steps. You can also control whether evaluation is performed during training.

Script name            Description
fp32_training_demo     Runs 100 training steps on a single CPU socket.
fp32_training          Runs 200 training steps, saves checkpoints, and performs evaluation on a single CPU socket.
fp32_training_mpirun   Runs training in multi-instance mode (for example, 2 sockets in a single node), using mpirun to launch the specified number of processes.

Docker*

The model container includes the scripts and libraries needed to run Transformer Language FP32 training. To run one of the quickstart scripts using this container, you'll need to provide volume mounts for the dataset and an output directory.

DATASET_DIR=<path to the dataset>
OUTPUT_DIR=<directory where log files will be written>

docker run \
  --env DATASET_DIR=${DATASET_DIR} \
  --env OUTPUT_DIR=${OUTPUT_DIR} \
  --env http_proxy=${http_proxy} \
  --env https_proxy=${https_proxy} \
  --volume ${DATASET_DIR}:${DATASET_DIR} \
  --volume ${OUTPUT_DIR}:${OUTPUT_DIR} \
  --privileged --init -t \
  intel/language-translation:tf-latest-transformer-mlperf-fp32-training \
  /bin/bash quickstart/<script name>.sh
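
To preview the command for a given quickstart script before launching a long training run, a dry-run helper along these lines may be useful. It is only a sketch: print_docker_cmd is a hypothetical name, the proxy variables are left out for brevity, and the accepted script names are the three listed under Quick Start Scripts.

```shell
# Hypothetical dry-run helper: validates the quickstart script name and
# prints the docker command on one line without executing it.
print_docker_cmd() {
  script_name="$1"
  case "$script_name" in
    fp32_training_demo|fp32_training|fp32_training_mpirun) ;;
    *)
      echo "unknown quickstart script: $script_name" >&2
      return 1
      ;;
  esac
  if [ -z "${DATASET_DIR:-}" ] || [ -z "${OUTPUT_DIR:-}" ]; then
    echo "DATASET_DIR and OUTPUT_DIR must both be set" >&2
    return 1
  fi
  echo "docker run" \
    "--env DATASET_DIR=${DATASET_DIR}" \
    "--env OUTPUT_DIR=${OUTPUT_DIR}" \
    "--volume ${DATASET_DIR}:${DATASET_DIR}" \
    "--volume ${OUTPUT_DIR}:${OUTPUT_DIR}" \
    "--privileged --init -t" \
    "intel/language-translation:tf-latest-transformer-mlperf-fp32-training" \
    "/bin/bash quickstart/${script_name}.sh"
}

# Example call:
#   print_docker_cmd fp32_training_demo
```

When a host path passed to --volume does not exist, Docker creates it as a directory, typically owned by root, so creating OUTPUT_DIR yourself beforehand avoids permission surprises when reading the logs.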

 



License Agreement

LEGAL NOTICE: By accessing, downloading or using this software and any required dependent software (the “Software Package”), you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software included with the Software Package. Please refer to the license file for additional details.


Related Containers and Solutions

Transformer-LT MLPerf FP32 Training TensorFlow* Model Package


Product and Performance Information


Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.