Run DLRM BFloat16 Training Using a PyTorch* Model Package

Published: 12/11/2020

Download Command

wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v2_3_0/dlrm-bfloat16-training.tar.gz
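After the download finishes, the archive has to be unpacked before the quick start scripts can be used. The helper below is a minimal sketch of that step. The function name extract_package is hypothetical, and the layout of the archive (a single top-level directory) is an assumption that this page does not confirm.

```shell
# Hypothetical helper: unpack a .tar.gz model package and print the
# first path component of its first entry (assumed to be the package
# directory; the actual archive layout is not stated in this document).
extract_package() {
  pkg_archive="$1"
  pkg_dest="${2:-.}"   # destination directory, defaults to the current one

  # Extract into the destination; stop if the archive is unreadable.
  tar -xzf "$pkg_archive" -C "$pkg_dest" || return 1

  # List the archive and keep only the top-level component of entry 1.
  tar -tzf "$pkg_archive" | sed -n '1{s|/.*||;p;}'
}
```

A typical call would be `cd "$(extract_package dlrm-bfloat16-training.tar.gz)"`, which only works as intended if the archive really does contain one top-level directory.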

Description

This document has instructions for running DLRM BFloat16 training using Intel® Extension for PyTorch*.

Prepare your dataset according to the instructions.

Set DATA_PATH to point to the dataset directory when running DLRM.

 

Quick Start Scripts

Script name          Description
train_single_node    Run training with a 32K global batch size, using 4 ranks on 1 node
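To make the batch size figure concrete, the sketch below works out how many samples each rank would process per step. It assumes that "32K" means 32768 and that the global batch is split evenly across ranks. Neither assumption is stated on this page, and the function name per_rank_batch_size is hypothetical.

```shell
# Hypothetical helper: divide a global batch size evenly across ranks.
# Assumes an even split; returns an error if the division is not exact.
per_rank_batch_size() {
  prb_global="$1"
  prb_ranks="$2"

  if [ "$prb_ranks" -le 0 ]; then
    echo "number of ranks must be positive" >&2
    return 1
  fi
  if [ $((prb_global % prb_ranks)) -ne 0 ]; then
    echo "global batch size $prb_global is not divisible by $prb_ranks ranks" >&2
    return 1
  fi

  echo $((prb_global / prb_ranks))
}
```

Under those assumptions, `per_rank_batch_size 32768 4` prints 8192.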

Bare Metal

To run on bare metal, first follow the instructions until Section 4.

After installing the prerequisites, set the DATA_PATH and OUTPUT_DIR environment variables, then run a quick start script.

# Export the variables so that the quick start script can read them
export DATA_PATH=<path to the dataset>
export OUTPUT_DIR=<directory where log files will be written>
quickstart/<script name>.sh
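The steps above can be wrapped in a small function that checks the environment before launching, so that a missing dataset path fails early instead of partway through training. This is a sketch only. The function name run_quickstart and its checks are hypothetical and are not part of the model package. It also assumes that it is called from the package's top-level directory and that the quick start scripts can be run with bash.

```shell
# Hypothetical wrapper: validate DATA_PATH and OUTPUT_DIR, then run
# quickstart/<script name>.sh from the current directory.
run_quickstart() {
  rq_script="quickstart/$1.sh"

  if [ -z "${DATA_PATH:-}" ] || [ ! -d "$DATA_PATH" ]; then
    echo "DATA_PATH must point to an existing dataset directory" >&2
    return 1
  fi
  if [ -z "${OUTPUT_DIR:-}" ]; then
    echo "OUTPUT_DIR must be set" >&2
    return 1
  fi
  if [ ! -f "$rq_script" ]; then
    echo "quick start script not found: $rq_script" >&2
    return 1
  fi

  # Create the log directory if needed and expose both variables
  # to the child process that runs the script.
  mkdir -p "$OUTPUT_DIR" || return 1
  export DATA_PATH OUTPUT_DIR
  bash "$rq_script"
}
```

With DATA_PATH and OUTPUT_DIR already set, the script listed in the table above would be started with `run_quickstart train_single_node`.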

 


License Agreement

LEGAL NOTICE: By accessing, downloading or using this software and any required dependent software (the “Software Package”), you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software included with the Software Package. Please refer to the license file for additional details.


Related Containers and Solutions

DLRM BFloat16 Training TensorFlow* Container

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.