Pull Command
docker pull intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference
Description
Datasets
The large Kaggle* Display Advertising Challenge Dataset will be used. The data is from Criteo and has a field indicating if an ad was clicked (1) or not (0), along with integer and categorical features.
Download the large Kaggle Display Advertising Challenge Dataset from Criteo Labs.
- Download the large version of the evaluation dataset from: https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/eval.csv
- Download the large version of the training dataset from: https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/train.csv
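The two downloads above can be scripted. The following is a minimal sketch, assuming `curl` is installed; the helper names `criteo_url` and `download_split` are hypothetical and are not part of the model package.

```shell
#!/usr/bin/env bash
# Sketch only: criteo_url and download_split are hypothetical helper names.
BASE_URL="https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version"

# Print the download URL for a split ("eval" or "train").
criteo_url() {
  case "$1" in
    eval|train) printf '%s/%s.csv\n' "$BASE_URL" "$1" ;;
    *) echo "unknown split: $1" >&2; return 1 ;;
  esac
}

# Download one split into a target directory, resuming a partial file if present.
download_split() {
  local split="$1" dest="$2" url
  url="$(criteo_url "$split")" || return 1
  mkdir -p "$dest" || return 1
  curl --fail --location --continue-at - --output "$dest/$split.csv" "$url"
}
```

For example, `download_split eval /home/<user>/dataset` followed by `download_split train /home/<user>/dataset` would place both CSV files in the dataset directory used in the steps below.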
Follow the instructions to convert the downloaded dataset to tfrecords using preprocess_csv_tfrecords.py:
- Store the path to
mkdir dataset
cd /home/<user>/dataset
Copy eval.csv and train.csv into your current working directory, /home/<user>/dataset.
- Launch Docker*
cd /home/<user>/dataset
docker run -it --privileged -u root:root \
--volume /home/<user>/dataset:/dataset \
intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference \
/bin/bash
- Now run the data preprocessing step:
cd /dataset
python /workspace/wide-deep-large-ds-int8-inference/models/recommendation/tensorflow/wide_deep_large_ds/dataset/preprocess_csv_tfrecords.py \
--inputcsv-datafile eval.csv \
--calibrationcsv-datafile train.csv \
--outputfile-name preprocessed_eval
The preprocessed evaluation dataset is then stored as eval_preprocessed_eval.tfrecords in the /home/<user>/dataset directory.
- Exit out of Docker once the dataset preprocessing completes.
exit
Set DATASET_DIR to point to the preprocessed TFRecords file in this directory when running Wide and Deep using a large dataset:
export DATASET_DIR=/home/<user>/dataset/eval_preprocessed_eval.tfrecords
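Before launching a quickstart script, it can help to confirm that DATASET_DIR really points to a non-empty file. This is a minimal sketch; `check_dataset` is a hypothetical helper and is not provided by the container.

```shell
#!/usr/bin/env bash
# Sketch only: check_dataset is a hypothetical helper name.
# Returns 0 if the given path is a non-empty regular file, 1 otherwise.
check_dataset() {
  local path="$1"
  if [ -z "$path" ]; then
    echo "DATASET_DIR is not set" >&2
    return 1
  fi
  if [ ! -f "$path" ]; then
    echo "not a regular file: $path" >&2
    return 1
  fi
  if [ ! -s "$path" ]; then
    echo "file is empty: $path" >&2
    return 1
  fi
  echo "dataset OK: $path"
}
```

Calling `check_dataset "$DATASET_DIR"` after the export above reports a problem early, for instance if the variable still points to the directory and not to the .tfrecords file.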
Quick Start Scripts
Script name | Description |
---|---|
int8_online_inference | Runs online inference (batch_size=1). The NUM_OMP_THREADS environment variable and the hyperparameters num-intra-threads and num-inter-threads can be tuned for best performance. |
int8_accuracy | Measures the model accuracy (batch_size=1000). |
Docker*
The model container includes the scripts and libraries needed to run Int8 inference for Wide and Deep using a large dataset. To run one of the quickstart scripts using this container, you'll need to provide volume mounts for the dataset and an output directory.
- Running inference to check accuracy:
DATASET_DIR=<path to the dataset>
OUTPUT_DIR=<directory where log files will be written>
docker run \
--env DATASET_DIR=${DATASET_DIR} \
--env OUTPUT_DIR=${OUTPUT_DIR} \
--env http_proxy=${http_proxy} \
--env https_proxy=${https_proxy} \
--volume ${DATASET_DIR}:${DATASET_DIR} \
--volume ${OUTPUT_DIR}:${OUTPUT_DIR} \
--privileged --init -t \
intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference \
/bin/bash quickstart/int8_accuracy.sh
- Running online inference: Set NUM_OMP_THREADS to tune the num_omp_threads hyperparameter.
DATASET_DIR=<path to the dataset>
OUTPUT_DIR=<directory where log files will be written>
NUM_OMP_THREADS=1
docker run \
--env DATASET_DIR=${DATASET_DIR} \
--env OUTPUT_DIR=${OUTPUT_DIR} \
--env NUM_OMP_THREADS=${NUM_OMP_THREADS} \
--env http_proxy=${http_proxy} \
--env https_proxy=${https_proxy} \
--volume ${DATASET_DIR}:${DATASET_DIR} \
--volume ${OUTPUT_DIR}:${OUTPUT_DIR} \
--privileged --init -t \
intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference \
/bin/bash quickstart/int8_online_inference.sh \
--num-intra-threads 1 --num-inter-threads 1
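Because NUM_OMP_THREADS and the two thread flags are meant to be tuned, it can be convenient to generate the command for several candidate values and review them before running anything. The sketch below only prints commands; `print_online_inference_cmd` and `print_omp_sweep` are hypothetical helpers, the thread counts are arbitrary examples, and paths containing spaces are not handled.

```shell
#!/usr/bin/env bash
# Sketch only: both functions are hypothetical helpers.
# They print docker commands for review and do not execute anything.
IMAGE="intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference"

# Print the online inference command for one thread configuration.
# Arguments: NUM_OMP_THREADS, num-intra-threads, num-inter-threads.
print_online_inference_cmd() {
  local omp="$1" intra="$2" inter="$3"
  printf 'docker run'
  printf ' --env DATASET_DIR=%s --env OUTPUT_DIR=%s' "$DATASET_DIR" "$OUTPUT_DIR"
  printf ' --env NUM_OMP_THREADS=%s' "$omp"
  printf ' --env http_proxy=%s --env https_proxy=%s' "${http_proxy:-}" "${https_proxy:-}"
  printf ' --volume %s:%s --volume %s:%s' \
    "$DATASET_DIR" "$DATASET_DIR" "$OUTPUT_DIR" "$OUTPUT_DIR"
  printf ' --privileged --init -t %s' "$IMAGE"
  printf ' /bin/bash quickstart/int8_online_inference.sh'
  printf ' --num-intra-threads %s --num-inter-threads %s\n' "$intra" "$inter"
}

# Print one command per NUM_OMP_THREADS value, keeping both thread flags at 1.
print_omp_sweep() {
  local omp
  for omp in "$@"; do
    print_online_inference_cmd "$omp" 1 1
  done
}
```

With DATASET_DIR and OUTPUT_DIR set as above, `print_omp_sweep 1 2 4` prints three commands that differ only in NUM_OMP_THREADS, which can then be run one at a time while comparing the logs written to OUTPUT_DIR.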
Documentation and Sources
Get Started
Docker* Repository
Main GitHub*
Readme
Release Notes
Get Started Guide
Code Sources
Dockerfile
Report Issue
License Agreement
LEGAL NOTICE: By accessing, downloading or using this software and any required dependent software (the “Software Package”), you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software included with the Software Package. Please refer to the license file for additional details.
Related Containers and Solutions
Wide & Deep Large Dataset Int8 Inference TensorFlow* Model Package