Optimize a BERT-Large FP32 Inference Model Package with TensorFlow*

Published: 10/23/2020  

Last Updated: 06/15/2022

Download Command

wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v2_3_0/bert-large-fp32-inference.tar.gz


This document has instructions for running BERT-Large FP32 inference using Intel® Optimization for TensorFlow*.

BERT-Large Data

Download and unzip the BERT-Large uncased (whole word masking) model from the Google* BERT repository. Then download the Stanford Question Answering Dataset (SQuAD) file dev-v1.1.json into the wwm_uncased_L-24_H-1024_A-16 directory that was just unzipped.

wget https://storage.googleapis.com/bert_models/2019_05_30/wwm_uncased_L-24_H-1024_A-16.zip
unzip wwm_uncased_L-24_H-1024_A-16.zip

wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json -P wwm_uncased_L-24_H-1024_A-16

Set the DATASET_DIR environment variable to the wwm_uncased_L-24_H-1024_A-16 directory when running BERT-Large inference using the SQuAD data.
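A quick check of the directory before a long run can catch a misplaced download. The sketch below is one way to do that. The function name check_dataset_dir is hypothetical, and the bert_config.json and vocab.txt file names are an assumption based on the usual layout of Google* BERT checkpoints; only dev-v1.1.json is named in the steps above.

```shell
# Hypothetical helper: verify that a directory looks like the unzipped
# BERT-Large model directory with the SQuAD dev set inside it.
check_dataset_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "Not a directory: $dir" >&2
        return 1
    fi
    # dev-v1.1.json comes from the download steps; the other two names
    # are assumed from the usual BERT checkpoint layout.
    for f in bert_config.json vocab.txt dev-v1.1.json; do
        if [ ! -f "$dir/$f" ]; then
            echo "Missing $f in $dir" >&2
            return 1
        fi
    done
    return 0
}

# Example usage, run from the directory where the zip was extracted:
# export DATASET_DIR="$(pwd)/wwm_uncased_L-24_H-1024_A-16"
# check_dataset_dir "$DATASET_DIR" && echo "Dataset directory looks complete"
```

If the check fails, the message on standard error names the first missing file, which is usually enough to tell whether the unzip or the SQuAD download went to the wrong place.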

Quick Start Scripts

Script name      Description
fp32_benchmark   This script runs BERT-Large FP32 inference.
fp32_profile     This script runs BERT-Large FP32 inference in profile mode.
fp32_accuracy    This script runs BERT-Large FP32 inference in accuracy mode.
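When the choice of script is driven by another tool or a loop, the three names can be mapped from a mode keyword. This is a minimal sketch; the function name quickstart_script_for is hypothetical, and the ./quickstart/ path follows the run command shown in the Bare Metal section.

```shell
# Hypothetical helper: map a usage mode to its quick start script path.
quickstart_script_for() {
    case "$1" in
        benchmark|profile|accuracy)
            echo "./quickstart/fp32_$1.sh"
            ;;
        *)
            echo "Unknown mode: $1 (expected benchmark, profile, or accuracy)" >&2
            return 1
            ;;
    esac
}

# Example usage, from inside the extracted model package directory:
# bash "$(quickstart_script_for accuracy)"
```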

Bare Metal

To run on bare metal, the following prerequisites must be installed in your environment:

Once the above dependencies have been installed, download and untar the model package, set environment variables, and then run a quick start script. See the BERT-Large Data and Quick Start Scripts sections above for more details on the different options.

The snippet below shows how to run a quick start script:

wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v2_3_0/bert-large-fp32-inference.tar.gz
tar -xvf bert-large-fp32-inference.tar.gz
cd bert-large-fp32-inference

# Export the variables so the quick start script can read them
export DATASET_DIR=<path to the dataset being used>
export OUTPUT_DIR=<directory where log files will be saved>

# Run a script for your desired usage
bash ./quickstart/<script name>.sh
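The run step can also be wrapped in a small function that validates its inputs first. The sketch below is an illustration, not part of the model package: run_quickstart is a hypothetical name, and creating the output directory up front is an assumption, since the quick start scripts may already do this themselves.

```shell
# Hypothetical wrapper around the run step. Call it from inside the
# extracted bert-large-fp32-inference directory.
run_quickstart() {
    script_name="$1"   # for example: fp32_benchmark
    dataset_dir="$2"
    output_dir="$3"

    if [ ! -d "$dataset_dir" ]; then
        echo "DATASET_DIR does not exist: $dataset_dir" >&2
        return 1
    fi
    if [ ! -f "./quickstart/$script_name.sh" ]; then
        echo "No such quick start script: $script_name" >&2
        return 1
    fi

    # Create the log directory before the run (assumption: harmless if
    # the script also creates it).
    mkdir -p "$output_dir" || return 1

    # Variables set on the command line are placed in the environment
    # of this one bash invocation only.
    DATASET_DIR="$dataset_dir" OUTPUT_DIR="$output_dir" \
        bash "./quickstart/$script_name.sh"
}

# Example usage:
# run_quickstart fp32_benchmark "$HOME/wwm_uncased_L-24_H-1024_A-16" "$HOME/bert_logs"
```

Passing the two paths as arguments keeps them out of the calling shell's environment, so several runs with different output directories do not interfere with each other.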

License Agreement

LEGAL NOTICE: By accessing, downloading or using this software and any required dependent software (the “Software Package”), you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software included with the Software Package. Please refer to the license file for additional details.

Related Containers and Solutions

BERT-Large FP32 Inference TensorFlow* Container

Product and Performance Information


Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.