Performance Data for Intel® AI Data Center Products
Find the latest AI benchmark performance data for Intel Data Center products, including detailed hardware and software configurations.
Pretrained models, sample scripts, best practices, and tutorials
- Intel® Developer Cloud
- Intel® AI Reference Models and Jupyter Notebooks*
- AI-Optimized CPU Containers from Intel
- AI-Optimized GPU Containers from Intel
- Open Model Zoo for OpenVINO™ toolkit
- Jupyter Notebook tutorials for OpenVINO™
- AI Performance Debugging on Intel® CPUs
Measurements were taken using:
- PyTorch* Optimizations from Intel
- TensorFlow* Optimizations from Intel
- Intel® Distribution of OpenVINO™ Toolkit
4th Generation Intel® Xeon® Scalable Processors
Intel® Xeon® Platinum 6448Y Processor (32 Cores)
Deep Learning Inference
Framework Version | Model | Usage | Precision | Throughput | Perf/Watt | Accuracy | Latency(ms) | Batch size |
---|---|---|---|---|---|---|---|---|
Intel PyTorch 1.13 | ResNet50 v1.5 | Image Recognition | int8 | 7015.80 img/s | | 75.99(%) with BS=128 | | 1 |
Intel PyTorch 1.13 | ResNet50 v1.5 | Image Recognition | bf16 | 3609.80 img/s | | 76.14(%) with BS=128 | | 1 |
Intel PyTorch 1.13 | ResNet50 v1.5 | Image Recognition | bf32 | 1153.05 img/s | | 76.13(%) with BS=128 | | 1 |
Intel PyTorch 1.13 | ResNet50 v1.5 | Image Recognition | fp32 | 894.15 img/s | | | | 64 |
Intel PyTorch 1.13 | ResNet50 v1.5 | Image Recognition | int8 | 9117.04 img/s | | | | 116 |
Intel PyTorch 1.13 | ResNet50 v1.5 | Image Recognition | bf16 | 4851.59 img/s | | | | 68 |
Intel PyTorch 1.13 | ResNet50 v1.5 | Image Recognition | bf32 | 1576.28 img/s | | | | 68 |
Intel TensorFlow 2.11 | ResNet50 v1.5 | Image Recognition | fp32 | 901.70 img/s | | 76.48(%) with BS=100 | | 1 |
Intel TensorFlow 2.11 | ResNet50 v1.5 | Image Recognition | int8 | 6243.78 img/s | | 76.02(%) with BS=101 | | 1 |
Intel TensorFlow 2.11 | ResNet50 v1.5 | Image Recognition | bf16 | 3417.28 img/s | | 76.75(%) with BS=102 | | 1 |
Intel TensorFlow 2.11 | ResNet50 v1.5 | Image Recognition | bf32 | 1120.95 img/s | | 76.47(%) with BS=103 | | 1 |
Intel TensorFlow 2.11 | ResNet50 v1.5 | Image Recognition | fp32 | 901.40 img/s | | | 22.61 | 64 |
Intel TensorFlow 2.11 | ResNet50 v1.5 | Image Recognition | int8 | 8908.44 img/s | | | 3.41 | 116 |
Intel TensorFlow 2.11 | ResNet50 v1.5 | Image Recognition | bf16 | 4606.38 img/s | | | 5.14 | 80 |
Intel TensorFlow 2.11 | ResNet50 v1.5 | Image Recognition | bf32 | 1475.14 img/s | | | | 64 |
OpenVINO | ResNet50 v1.5 | Image Recognition | fp32 | 885.78 img/s | | 76.46(%) | | |
OpenVINO | ResNet50 v1.5 | Image Recognition | int8 | 6495.75 img/s | | 76.36(%) | | |
OpenVINO | ResNet50 v1.5 | Image Recognition | bf16 | 3531.29 img/s | | 76.47(%) | | |
OpenVINO | ResNet50 v1.5 | Image Recognition | fp32 | 887.29 img/s | | | | |
OpenVINO | ResNet50 v1.5 | Image Recognition | int8 | 8562.83 img/s | | | | |
OpenVINO | ResNet50 v1.5 | Image Recognition | bf16 | 4269.57 img/s | | | | |
Intel PyTorch 1.13 | BERTLarge SQuAD1.1 seq_len=384 | Natural Language Processing | fp32 | 25.48 sent/s | | 93.15(F1) with BS=8 | | 1 |
Intel PyTorch 1.13 | BERTLarge SQuAD1.1 seq_len=384 | Natural Language Processing | int8 | 181.72 sent/s | | 92.78(F1) with BS=8 | | 1 |
Intel PyTorch 1.13 | BERTLarge SQuAD1.1 seq_len=384 | Natural Language Processing | bf16 | 114.62 sent/s | | 93.2(F1) with BS=8 | | 1 |
Intel PyTorch 1.13 | BERTLarge SQuAD1.1 seq_len=384 | Natural Language Processing | bf32 | 47.52 sent/s | | 93.15(F1) with BS=8 | | 1 |
Intel PyTorch 1.13 | BERTLarge SQuAD1.1 seq_len=384 | Natural Language Processing | fp32 | 28.20 sent/s | | | | 56 |
Intel PyTorch 1.13 | BERTLarge SQuAD1.1 seq_len=384 | Natural Language Processing | int8 | 154.42 sent/s | | | | 56 |
Intel PyTorch 1.13 | BERTLarge SQuAD1.1 seq_len=384 | Natural Language Processing | bf16 | 110.94 sent/s | | | | 16 |
Intel PyTorch 1.13 | BERTLarge SQuAD1.1 seq_len=384 | Natural Language Processing | bf32 | 45.37 sent/s | | | | 16 |
Intel TensorFlow 2.11 | BERTLarge seq_len=384 | Natural Language Processing | fp32 | 25.26 sent/s | | 92.98(F1) with BS=32 | | 1 |
Intel TensorFlow 2.11 | BERTLarge seq_len=384 | Natural Language Processing | int8 | 173.81 sent/s | | 92.32(F1) with BS=32 | | 1 |
Intel TensorFlow 2.11 | BERTLarge seq_len=384 | Natural Language Processing | bf16 | 113.56 sent/s | | 93.01(F1) with BS=32 | | 1 |
Intel TensorFlow 2.11 | BERTLarge seq_len=384 | Natural Language Processing | bf32 | 48.19 sent/s | | 93.00(F1) with BS=32 | | 1 |
Intel TensorFlow 2.11 | BERTLarge seq_len=384 | Natural Language Processing | fp32 | 26.02 sent/s | | | | 16 |
Intel TensorFlow 2.11 | BERTLarge seq_len=384 | Natural Language Processing | int8 | 162.11 sent/s | | | | 16 |
Intel TensorFlow 2.11 | BERTLarge seq_len=384 | Natural Language Processing | bf16 | 113.03 sent/s | | | | 128 |
Intel TensorFlow 2.11 | BERTLarge seq_len=384 | Natural Language Processing | bf32 | 44.77 sent/s | | | | 16 |
OpenVINO | BERTLarge | Natural Language Processing | fp32 | 30.75 sent/s | | 93.25(F1) | | 1 |
OpenVINO | BERTLarge | Natural Language Processing | int8 | 207.64 sent/s | | 92.65(F1) | | 1 |
OpenVINO | BERTLarge | Natural Language Processing | bf16 | 122.66 sent/s | | 93.29(F1) | | 1 |
OpenVINO | BERTLarge | Natural Language Processing | fp32 | 28.37 sent/s | | | | 16 |
OpenVINO | BERTLarge | Natural Language Processing | int8 | 205.7 sent/s | | | | 16 |
OpenVINO | BERTLarge | Natural Language Processing | bf16 | 121.2 sent/s | | | | 16 |
Intel PyTorch 1.13 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | fp32 | 20.88 img/s | | 20 mAP with BS=16 | | 1 |
Intel PyTorch 1.13 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | int8 | 301.04 img/s | | 19.9 mAP with BS=16 | | 1 |
Intel PyTorch 1.13 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | bf16 | 147.99 img/s | | 19.98 mAP with BS=16 | | 1 |
Intel PyTorch 1.13 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | bf32 | 21.77 img/s | | 20 mAP with BS=16 | | 1 |
Intel PyTorch 1.13 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | fp32 | 20.82 img/s | | | | 112 |
Intel PyTorch 1.13 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | int8 | 278.59 img/s | | | | 112 |
Intel PyTorch 1.13 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | bf16 | 151.04 img/s | | | | 112 |
Intel PyTorch 1.13 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | bf32 | 21.82 img/s | | | | |
Intel TensorFlow 2.11 | SSD-ResNet34 | Object Detection | fp32 | 20.81 img/s | | 22.40 mAP | | 1 |
Intel TensorFlow 2.11 | SSD-ResNet34 | Object Detection | int8 | 290.47 img/s | | 21.40 mAP | | 1 |
Intel TensorFlow 2.11 | SSD-ResNet34 | Object Detection | bf16 | 148.50 img/s | | 22.50 mAP | | 1 |
Intel TensorFlow 2.11 | SSD-ResNet34 | Object Detection | bf32 | 21.69 img/s | | 22.40 mAP | | 1 |
Intel TensorFlow 2.11 | SSD-ResNet34 | Object Detection | fp32 | 20.73 img/s | | | | 56 |
Intel TensorFlow 2.11 | SSD-ResNet34 | Object Detection | int8 | 265.92 img/s | | | | 56 |
Intel TensorFlow 2.11 | SSD-ResNet34 | Object Detection | bf16 | 142.63 img/s | | | | 56 |
Intel TensorFlow 2.11 | SSD-ResNet34 | Object Detection | bf32 | 21.60 img/s | | | | 56 |
OpenVINO | SSD-ResNet34 | Object Detection | fp32 | 20.51 img/s | | 20 mAP | | 1 |
OpenVINO | SSD-ResNet34 | Object Detection | int8 | 322.16 img/s | | 19.9 mAP | | 1 |
OpenVINO | SSD-ResNet34 | Object Detection | bf16 | 147.37 img/s | | 20 mAP | | 1 |
OpenVINO | SSD-ResNet34 | Object Detection | fp32 | 20.69 img/s | | | | 64 |
OpenVINO | SSD-ResNet34 | Object Detection | int8 | 303.29 img/s | | | | 64 |
OpenVINO | SSD-ResNet34 | Object Detection | bf16 | 144.55 img/s | | | | 64 |
Intel PyTorch 1.13 | RNNT LibriSpeech | Speech Recognition | fp32 | 43.11 fps | | 7.31 WER with BS=64 | | 1 |
Intel PyTorch 1.13 | RNNT LibriSpeech | Speech Recognition | bf16 | 213.21 fps | | 7.30 WER with BS=64 | | 1 |
Intel PyTorch 1.13 | RNNT LibriSpeech | Speech Recognition | bf32 | 94.06 fps | | 7.32 WER with BS=64 | | 1 |
Intel PyTorch 1.13 | RNNT LibriSpeech | Speech Recognition | fp32 | 312.95 fps | | | | 448 |
Intel PyTorch 1.13 | RNNT LibriSpeech | Speech Recognition | bf16 | 1345.69 fps | | | | 448 |
Intel PyTorch 1.13 | RNNT LibriSpeech | Speech Recognition | bf32 | 940.18 fps | | | | 448 |
Intel PyTorch 1.13 | ResNeXt101 32x16d ImageNet | Image Classification | fp32 | 105.21 fps | | 84.18(%) at BS=128 | | 1 |
Intel PyTorch 1.13 | ResNeXt101 32x16d ImageNet | Image Classification | int8 | 921.42 fps | | 84.05(%) at BS=128 | | 1 |
Intel PyTorch 1.13 | ResNeXt101 32x16d ImageNet | Image Classification | bf16 | 506.98 fps | | 84.18(%) at BS=128 | | 1 |
Intel PyTorch 1.13 | ResNeXt101 32x16d ImageNet | Image Classification | bf32 | 159.27 fps | | 84.18(%) at BS=128 | | 1 |
Intel PyTorch 1.13 | ResNeXt101 32x16d ImageNet | Image Classification | fp32 | 104.06 fps | | | | 64 |
Intel PyTorch 1.13 | ResNeXt101 32x16d ImageNet | Image Classification | int8 | 1361.66 fps | | | | 116 |
Intel PyTorch 1.13 | ResNeXt101 32x16d ImageNet | Image Classification | bf16 | 614.73 fps | | | | 64 |
Intel PyTorch 1.13 | ResNeXt101 32x16d ImageNet | Image Classification | bf32 | 183.57 fps | | | | 116 |
OpenVINO | ResNeXt101 32x16d ImageNet | Image Classification | fp32 | 103.2 fps | | 84.17(%) | | 1 |
OpenVINO | ResNeXt101 32x16d ImageNet | Image Classification | int8 | 922.61 fps | | 84.2(%) | | 1 |
OpenVINO | ResNeXt101 32x16d ImageNet | Image Classification | bf16 | 498.71 fps | | 84.16(%) | | 1 |
OpenVINO | ResNeXt101 32x16d ImageNet | Image Classification | fp32 | 102.46 fps | | | | 64 |
OpenVINO | ResNeXt101 32x16d ImageNet | Image Classification | int8 | 1248.03 fps | | | | 64 |
OpenVINO | ResNeXt101 32x16d ImageNet | Image Classification | bf16 | 603.51 fps | | | | 64 |
Intel PyTorch 1.13 | MaskR-CNN COCO 2017 | Object Detection | fp32 | 19.30 img/s | | | | 1 |
Intel PyTorch 1.13 | MaskR-CNN COCO 2017 | Object Detection | bf16 | 96.30 img/s | | | | 1 |
Intel PyTorch 1.13 | MaskR-CNN COCO 2017 | Object Detection | bf32 | 26.62 img/s | | | | 1 |
Intel PyTorch 1.13 | MaskR-CNN COCO 2017 | Object Detection | fp32 | 17.48 img/s | | 37.82/34.23 bbox/segm | | 112 |
Intel PyTorch 1.13 | MaskR-CNN COCO 2017 | Object Detection | bf16 | 89.11 img/s | | 37.75/34.33 bbox/segm | | 112 |
Intel PyTorch 1.13 | MaskR-CNN COCO 2017 | Object Detection | bf32 | 25.64 img/s | | 37.78/34.22 bbox/segm | | 112 |
Intel PyTorch 1.13 | DLRM Criteo Terabyte | Recommender | fp32 | 1564836.01 rec/s | | 80.27 AUC | | 128 |
Intel PyTorch 1.13 | DLRM Criteo Terabyte | Recommender | int8 | 13793657.22 rec/s | | 80.27 AUC | | 128 |
Intel PyTorch 1.13 | DLRM Criteo Terabyte | Recommender | bf16 | 6942136.72 rec/s | | 80.27 AUC | | 128 |
Intel PyTorch 1.13 | DLRM Criteo Terabyte | Recommender | bf32 | 2648795.53 rec/s | | 80.27 AUC | | 128 |
Intel TensorFlow 2.11 | Transformer MLPerf | Language Translation | fp32 | 18.64 sent/s | | 27.16 BLEU with BS=64 | | 1 |
Intel TensorFlow 2.11 | Transformer MLPerf | Language Translation | int8 | 51.51 sent/s | | 27.11 BLEU with BS=64 | | 1 |
Intel TensorFlow 2.11 | Transformer MLPerf | Language Translation | bf16 | 34.24 sent/s | | 27.13 BLEU with BS=64 | | 1 |
Intel TensorFlow 2.11 | Transformer MLPerf | Language Translation | bf32 | 18.67 sent/s | | 27.14 BLEU with BS=64 | | 1 |
Intel TensorFlow 2.11 | Transformer MLPerf | Language Translation | fp32 | 90.49 sent/s | | | | 448 |
Intel TensorFlow 2.11 | Transformer MLPerf | Language Translation | int8 | 239.95 sent/s | | | | 448 |
Intel TensorFlow 2.11 | Transformer MLPerf | Language Translation | bf16 | 217.82 sent/s | | | | 448 |
Intel TensorFlow 2.11 | Transformer MLPerf | Language Translation | bf32 | 103.14 sent/s | | | | 448 |
Intel TensorFlow 2.11 | DIEN Amazon Books Data | Recommender | fp32 | 89221.46 rec/s | | 77.18(%) with BS=128 | | 16 |
Intel TensorFlow 2.11 | DIEN Amazon Books Data | Recommender | bf16 | 104481.13 rec/s | | 77.11(%) with BS=128 | | 16 |
Intel TensorFlow 2.11 | DIEN Amazon Books Data | Recommender | bf32 | 90065.53 rec/s | | 77.19(%) with BS=128 | | 16 |
Intel TensorFlow 2.11 | DIEN Amazon Books Data | Recommender | fp32 | 359324.72 rec/s | | | | 65536 |
Intel TensorFlow 2.11 | DIEN Amazon Books Data | Recommender | bf16 | 466339.98 rec/s | | | | 65536 |
Intel TensorFlow 2.11 | DIEN Amazon Books Data | Recommender | bf32 | 376498.95 rec/s | | | | 65536 |
Intel TensorFlow 2.11 | 3D-UNet | Image Segmentation | fp32 | 2.04 samp/s | | 85.30 mean | | 1 |
Intel TensorFlow 2.11 | 3D-UNet | Image Segmentation | int8 | 9.32 samp/s | | 85.08 mean | | 1 |
Intel TensorFlow 2.11 | 3D-UNet | Image Segmentation | bf16 | 9.04 samp/s | | 85.31 mean | | 1 |
Intel TensorFlow 2.11 | 3D-UNet | Image Segmentation | bf32 | 3.12 samp/s | | 85.30 mean | | 1 |
Intel TensorFlow 2.11 | 3D-UNet | Image Segmentation | fp32 | 1.90 samp/s | | | | 6 |
Intel TensorFlow 2.11 | 3D-UNet | Image Segmentation | int8 | 10.29 samp/s | | | | 6 |
Intel TensorFlow 2.11 | 3D-UNet | Image Segmentation | bf16 | 9.35 samp/s | | | | 6 |
Intel TensorFlow 2.11 | 3D-UNet | Image Segmentation | bf32 | 3.15 samp/s | | | | 6 |
OpenVINO | 3D-UNet | Image Segmentation | fp32 | 1.99 samp/s | | 0.85 mean | | 1 |
OpenVINO | 3D-UNet | Image Segmentation | int8 | 15.5 samp/s | | 0.85 mean | | 1 |
OpenVINO | 3D-UNet | Image Segmentation | bf16 | 10.34 samp/s | | 0.85 mean | | 1 |
OpenVINO | 3D-UNet | Image Segmentation | fp32 | 1.88 samp/s | | | | 6 |
OpenVINO | 3D-UNet | Image Segmentation | int8 | 14.3 samp/s | | | | 6 |
OpenVINO | 3D-UNet | Image Segmentation | bf16 | 9.68 samp/s | | | | 6 |
Hardware and software configuration (measured January 10, 2023):
- Hardware configuration for Intel® Xeon® Platinum 6448Y processor (formerly code named Sapphire Rapids): 2 sockets, 32 cores, 225 watts, 16 x 32 GB DDR5 4800 memory, BIOS version EGSDCRB1.SYS.8901.P01.2209200243, operating system: CentOS* Stream 8, using Intel® Advanced Matrix Extensions (Intel® AMX) int8 and bf16 with Intel® oneAPI Deep Neural Network Library (oneDNN) v2.7 optimized kernels integrated into Intel® Extension for PyTorch* v1.13, Intel® Extension for TensorFlow* v2.12, and Intel® Distribution of OpenVINO™ toolkit v2022.3. Measurements may vary.
- If the dataset is not listed, a synthetic dataset was used to measure performance. Accuracy (if listed) was validated with the specified dataset.
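One way to read the throughput column is as a ratio against the fp32 row of the same framework, model, and batch size. The sketch below is illustrative only: the helper name `speedup_over_baseline` is hypothetical, and the input values are copied from the Intel TensorFlow 2.11 ResNet50 v1.5 rows at batch size 1 in the table above.

```python
def speedup_over_baseline(throughput, baseline="fp32"):
    """Divide each precision's throughput by the baseline precision's throughput.

    `throughput` maps a precision name to a throughput value; all values must
    share the same unit (img/s here). Hypothetical helper for illustration.
    """
    base = throughput[baseline]
    return {precision: value / base for precision, value in throughput.items()}


# Throughput in img/s, copied from the Intel TensorFlow 2.11 ResNet50 v1.5
# rows measured at batch size 1.
tf_resnet50_bs1 = {
    "fp32": 901.70,
    "int8": 6243.78,
    "bf16": 3417.28,
    "bf32": 1120.95,
}

speedups = speedup_over_baseline(tf_resnet50_bs1)
for precision, ratio in speedups.items():
    print(f"{precision}: {ratio:.2f}x relative to fp32")
```

The same helper applies to any group of rows that differ only in precision. Ratios across different batch sizes or frameworks are not like-for-like comparisons, and measurements may vary.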