Wide & Deep Large Dataset Int8 Inference TensorFlow* Container

Published: 12/09/2020  

Last Updated: 06/15/2022

Pull Command

docker pull intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference

Description

Datasets

The large Kaggle* Display Advertising Challenge Dataset will be used. The data is from Criteo and has a field indicating if an ad was clicked (1) or not (0), along with integer and categorical features.
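As a quick sanity check of the click field, the sketch below counts clicked and non-clicked rows in a CSV file. It assumes the label is the first comma-separated column and that the file has no header row. Both are assumptions about the file layout rather than something stated on this page, so adjust the column index if your copy differs. The helper name count_labels is hypothetical.

```shell
#!/bin/sh
# count_labels FILE: print how many rows have label 1 (clicked) and 0 (not clicked).
# Assumption: the label is in column 1 and the file has no header row.
count_labels() {
  awk -F',' '
    $1 == 1 { clicked++ }
    $1 == 0 { not_clicked++ }
    END { printf "clicked=%d not_clicked=%d\n", clicked, not_clicked }
  ' "$1"
}
```

For example, `count_labels eval.csv` prints a single summary line for the evaluation file.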

Download the large Kaggle Display Advertising Challenge Dataset from Criteo Labs.

Follow the instructions to convert the downloaded dataset to tfrecords using preprocess_csv_tfrecords.py:

  • Create the dataset directory and change into it:
    mkdir -p /home/<user>/dataset
    cd /home/<user>/dataset

Copy eval.csv and train.csv (the two files that the preprocessing command reads) into your current working directory, /home/<user>/dataset.

  • Launch Docker*
    cd /home/<user>/dataset
    docker run -it --privileged -u root:root \
               --volume /home/<user>/dataset:/dataset \
               intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference \
               /bin/bash
    
  • Now run the data preprocessing step:
    cd /dataset
    python /workspace/wide-deep-large-ds-int8-inference/models/recommendation/tensorflow/wide_deep_large_ds/dataset/preprocess_csv_tfrecords.py \
         --inputcsv-datafile eval.csv \
         --calibrationcsv-datafile train.csv \
         --outputfile-name preprocessed_eval

The preprocessed eval dataset is now stored as eval_preprocessed_eval.tfrecords in the /home/<user>/dataset directory.

  • Exit out of Docker once the dataset preprocessing completes.
    exit

Set DATASET_DIR to point to this preprocessed file when running Wide and Deep with the large dataset:

export DATASET_DIR=/home/<user>/dataset/eval_preprocessed_eval.tfrecords
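Before exporting the variable, it can help to confirm that the preprocessing step actually produced the file. The following is a minimal sketch; check_dataset is a hypothetical helper name, and it only checks that the path is a non-empty regular file, not that the TFRecords content is valid.

```shell
#!/bin/sh
# check_dataset FILE: succeed only if FILE is a regular, non-empty file.
# This does not validate the TFRecords content itself.
check_dataset() {
  if [ -f "$1" ] && [ -s "$1" ]; then
    echo "OK: $1"
  else
    echo "Missing or empty dataset file: $1" >&2
    return 1
  fi
}
```

A typical use is to call check_dataset with the path to eval_preprocessed_eval.tfrecords and export DATASET_DIR only if it succeeds.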

Quick Start Scripts

Script name and description:

  • int8_online_inference: Runs online inference (batch_size=1). The NUM_OMP_THREADS environment variable and the num-intra-threads and num-inter-threads hyperparameters can be tuned for best performance.
  • int8_accuracy: Measures the model accuracy (batch_size=1000).

Docker*

The model container includes the scripts and libraries needed to run Wide and Deep Large Dataset Int8 inference. To run one of the quickstart scripts using this container, you'll need to provide volume mounts for the dataset and an output directory.
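Bind mounts given with --volume expect absolute host paths, and Docker typically creates a missing host directory owned by root. A minimal sketch for preparing the output directory ahead of time follows; prepare_output_dir is a hypothetical helper name, not part of the container or its quickstart scripts.

```shell
#!/bin/sh
# prepare_output_dir DIR: create DIR if it does not exist and print its
# absolute path, so the result can be used directly with --volume.
prepare_output_dir() {
  mkdir -p "$1" && (cd "$1" && pwd)
}
```

For example, `OUTPUT_DIR=$(prepare_output_dir ./logs)` creates ./logs under the current directory and stores its absolute path.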

  • Running inference to check accuracy:
DATASET_DIR=<path to the dataset>
OUTPUT_DIR=<directory where log files will be written>

docker run \
  --env DATASET_DIR=${DATASET_DIR} \
  --env OUTPUT_DIR=${OUTPUT_DIR} \
  --env http_proxy=${http_proxy} \
  --env https_proxy=${https_proxy} \
  --volume ${DATASET_DIR}:${DATASET_DIR} \
  --volume ${OUTPUT_DIR}:${OUTPUT_DIR} \
  --privileged --init -t \
  intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference \
  /bin/bash quickstart/int8_accuracy.sh
  • Running online inference: Set NUM_OMP_THREADS to tune the num_omp_threads hyperparameter.
DATASET_DIR=<path to the dataset>
OUTPUT_DIR=<directory where log files will be written>
NUM_OMP_THREADS=1

docker run \
  --env DATASET_DIR=${DATASET_DIR} \
  --env OUTPUT_DIR=${OUTPUT_DIR} \
  --env NUM_OMP_THREADS=${NUM_OMP_THREADS} \
  --env http_proxy=${http_proxy} \
  --env https_proxy=${https_proxy} \
  --volume ${DATASET_DIR}:${DATASET_DIR} \
  --volume ${OUTPUT_DIR}:${OUTPUT_DIR} \
  --privileged --init -t \
  intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference \
  /bin/bash quickstart/int8_online_inference.sh \
  --num-intra-threads 1 --num-inter-threads 1
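To compare several NUM_OMP_THREADS settings, the online inference command above can be generated once per value. The sketch below only prints the commands (a dry run) so they can be reviewed before running. The function names and the candidate thread counts are illustrative assumptions, and the proxy --env lines are left out for brevity.

```shell
#!/bin/sh
# print_online_cmd THREADS: print (do not run) the online inference command
# for one NUM_OMP_THREADS value. DATASET_DIR and OUTPUT_DIR must already be set.
print_online_cmd() {
  echo "docker run" \
    "--env DATASET_DIR=${DATASET_DIR}" \
    "--env OUTPUT_DIR=${OUTPUT_DIR}" \
    "--env NUM_OMP_THREADS=$1" \
    "--volume ${DATASET_DIR}:${DATASET_DIR}" \
    "--volume ${OUTPUT_DIR}:${OUTPUT_DIR}" \
    "--privileged --init -t" \
    "intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference" \
    "/bin/bash quickstart/int8_online_inference.sh" \
    "--num-intra-threads 1 --num-inter-threads 1"
}

# sweep_online_cmds V1 V2 ...: print one command per candidate thread count.
sweep_online_cmds() {
  for threads in "$@"; do
    print_online_cmd "$threads"
  done
}
```

Calling `sweep_online_cmds 1 2 4` prints three commands, one per value. Each can then be run as is, and the logs written to OUTPUT_DIR compared.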

Documentation and Sources

Get Started
Docker* Repository
Main GitHub*
Readme
Release Notes
Get Started Guide

Code Sources
Dockerfile
Report Issue


License Agreement

LEGAL NOTICE: By accessing, downloading or using this software and any required dependent software (the “Software Package”), you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software included with the Software Package. Please refer to the license file for additional details.


Related Containers and Solutions

Wide & Deep Large Dataset Int8 Inference TensorFlow* Model Package


Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.