Download Command
wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v2_3_0/wide-deep-large-ds-int8-inference.tar.gz
Description
Datasets
The large Kaggle* Display Advertising Challenge Dataset will be used. The data is from Criteo and has a field indicating if an ad was clicked (1) or not (0), along with integer and categorical features.
Download the large Kaggle Display Advertising Challenge Dataset from Criteo Labs:
- Download the large version of the evaluation dataset from: https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/eval.csv
- Download the large version of the training dataset from: https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/train.csv
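The two downloads above can be scripted. The following is a minimal sketch, assuming `wget` is installed; the helper names `criteo_csv_urls` and `download_criteo_csvs` are hypothetical and not part of the model package. Only the URLs come from this page.

```shell
# Hypothetical helper: print the two dataset URLs listed above, one per line.
criteo_csv_urls() {
    base_url="https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version"
    for name in eval.csv train.csv; do
        printf '%s/%s\n' "$base_url" "$name"
    done
}

# Hypothetical helper: download both files into a target directory
# (defaults to the current directory).
download_criteo_csvs() {
    dest_dir="${1:-.}"
    mkdir -p "$dest_dir" || return 1
    criteo_csv_urls | while read -r url; do
        # -c resumes a partial download; -P sets the output directory.
        wget -c -P "$dest_dir" "$url" || exit 1
    done
}
```

For example, `download_criteo_csvs /home/<user>/dataset` would place eval.csv and train.csv in the dataset directory used in the steps that follow.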
Follow these steps to convert the downloaded dataset to TFRecords using preprocess_csv_tfrecords.py:
- Create the dataset directory and change into it:
mkdir -p /home/<user>/dataset
cd /home/<user>/dataset
Copy eval.csv and train.csv into your current working directory, /home/<user>/dataset.
- Launch Docker*
cd /home/<user>/dataset
docker run -it --privileged -u root:root \
  --volume /home/<user>/dataset:/dataset \
  intel/recommendation:tf-latest-wide-deep-large-ds-int8-inference \
  /bin/bash
- Now run the data preprocessing step:
cd /dataset
python /workspace/wide-deep-large-ds-int8-inference/models/recommendation/tensorflow/wide_deep_large_ds/dataset/preprocess_csv_tfrecords.py \
  --inputcsv-datafile eval.csv \
  --calibrationcsv-datafile train.csv \
  --outputfile-name preprocessed_eval
The preprocessed eval dataset is now stored as eval_preprocessed_eval.tfrecords in the /home/<user>/dataset directory.
- Exit out of Docker once the dataset preprocessing completes.
exit
Set DATASET_DIR to point to the preprocessed TFRecords file when running Wide and Deep using a large dataset:
export DATASET_DIR=/home/<user>/dataset/eval_preprocessed_eval.tfrecords
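Before launching a quickstart script, it can help to confirm that the path really points at a non-empty file. A minimal sketch follows; `check_dataset` is a hypothetical helper name, not something shipped with the model package.

```shell
# Hypothetical helper: fail early if the dataset path is unset,
# missing, or an empty file.
check_dataset() {
    path="$1"
    if [ -z "$path" ]; then
        echo "DATASET_DIR is not set" >&2
        return 1
    fi
    # -s is true only when the file exists and has a size greater than zero.
    if [ ! -s "$path" ]; then
        echo "missing or empty TFRecords file: $path" >&2
        return 1
    fi
    echo "dataset found: $path"
}
```

Typical use would be `check_dataset "$DATASET_DIR" || exit 1` at the top of a wrapper script.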
Quick Start Scripts
Script name | Description |
---|---|
int8_online_inference | Runs online inference (batch_size=1). The NUM_OMP_THREADS environment variable and the hyperparameters num-intra-threads and num-inter-threads can be tuned for best performance. |
int8_accuracy | Measures the model accuracy (batch_size=1000). |
Bare Metal
To run on bare metal, the following prerequisites must be installed in your environment:
- Python* 3
- intel-tensorflow==1.15.2
- numactl
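The command-line prerequisites can be checked with a short script. This is a sketch only; `missing_commands` is a hypothetical helper name, and the `pip show` check in the usage note assumes intel-tensorflow was installed with pip into the active Python environment.

```shell
# Hypothetical helper: print each named command that is not on PATH.
# Prints nothing when every command is available.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}
```

For example, `missing_commands python3 numactl` lists whichever of the two is absent, and `python3 -m pip show intel-tensorflow` reports the installed version if the package is present.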
After installing the prerequisites, download and untar the model package. Set environment variables for the path to your DATASET_DIR and an OUTPUT_DIR where log files will be written, then run a quickstart script.
export DATASET_DIR=<path to the dataset>
export OUTPUT_DIR=<directory where log files will be written>
wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v2_3_0/wide-deep-large-ds-int8-inference.tar.gz
tar -xzf wide-deep-large-ds-int8-inference.tar.gz
cd wide-deep-large-ds-int8-inference
- Running inference to check accuracy:
quickstart/int8_accuracy.sh
- Running online inference: set NUM_OMP_THREADS to tune the num_omp_threads hyperparameter.
export NUM_OMP_THREADS=1
quickstart/int8_online_inference.sh \
--num-intra-threads 1 --num-inter-threads 1
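Because NUM_OMP_THREADS is meant to be tuned, one way to compare settings is to repeat the online inference run over several values. The sketch below assumes the quickstart script reads NUM_OMP_THREADS from its environment; `sweep_omp_threads` is a hypothetical helper name, and the thread counts in the usage note are illustrative rather than recommended values.

```shell
# Hypothetical helper: run a script once per NUM_OMP_THREADS value.
# Usage: sweep_omp_threads <script> <value> [<value> ...]
sweep_omp_threads() {
    script="$1"
    shift
    for n in "$@"; do
        echo "=== NUM_OMP_THREADS=$n ==="
        # The assignment prefix sets the variable for this one invocation.
        NUM_OMP_THREADS="$n" "$script" --num-intra-threads 1 --num-inter-threads 1 || return 1
    done
}
```

For example, `sweep_omp_threads quickstart/int8_online_inference.sh 1 2 4` runs the script three times, with DATASET_DIR and OUTPUT_DIR exported beforehand as described above. Each run's results can then be compared from the logs written to OUTPUT_DIR.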
Documentation and Sources
Get Started
Main GitHub* Repository
Readme
Release Notes
Get Started Guide
Code Sources
Report Issue
License Agreement
LEGAL NOTICE: By accessing, downloading or using this software and any required dependent software (the “Software Package”), you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software included with the Software Package. Please refer to the license file for additional details.
Related Containers and Solutions
Wide & Deep Large Dataset Int8 Inference TensorFlow* Container