Setup Instructions
Please make sure to follow Driver Installation to install the Gaudi driver on the system.
It is recommended to use the Optimum-Habana fp8 Benchmark Dockerfile to run the examples below.
To use the provided Dockerfile for the sample, follow the Docker Installation guide to set up the Habana runtime for Docker images.
Build and Run the Benchmark Docker instance
Get Dockerfile and Benchmark scripts
First, obtain the Dockerfile and benchmark scripts from the Gaudi-Tutorial GitHub repository using the command below.
git clone https://github.com/HabanaAI/Gaudi-tutorials.git
cd Gaudi-tutorials/PyTorch/Hugging_Face_pipelines/Benchmarking_on_Optimum-habana_with_fp8
Docker Build
To build the image from the Dockerfile, please use the command below to create the optimum-habana-text-gen image.
docker build --no-cache -t optimum-habana-text-gen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
Docker Run
After building the Docker image, users can use the command below to run a Docker instance, which will place them in the text-generation folder within the instance.
docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=ALL --privileged=true --net=host --ipc=host optimum-habana-text-gen:latest
Note
The Hugging Face model files can be large, so it is recommended to keep the Hugging Face hub folder on an external disk.
Export the HF_HOME environment variable so that it points to the Hugging Face hub folder on the external disk, and mount that disk into the Docker instance by adding the corresponding options to the docker run command.
ex: "-e HF_HOME=/mnt/huggingface -v /mnt:/mnt"
Run Benchmark with Benchmark.py
The benchmark script runs all the models with different input lengths, output lengths, and batch sizes, and generates a report comparing the measured results against the published numbers in Gaudi Model Performance.
Gaudi3
Different JSON files are provided for different Intel® Gaudi Software versions, such as 1.19 and 1.20, on Gaudi3. To benchmark on a machine with 8 Gaudi3 cards, run the command below inside the Docker instance.
python3 Benchmark.py
Gaudi2
To benchmark on a machine with 8 Gaudi2 cards, run the same Benchmark.py command but specify the Gaudi version using the GAUDI_VER environment variable.
GAUDI_VER=2 python3 Benchmark.py
HTML Report
An HTML report will be generated in a timestamped folder created at the time of execution.
The HTML report also has a perf_ratio column that compares the measured numbers with previous benchmark results.
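If the benchmark runs on a remote or headless machine, one simple way to view the generated report from a browser is to serve the results folder over HTTP; the folder name below is a placeholder for the timestamped folder created by your run:
cd <timestamped_results_folder>
python3 -m http.server 8000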
Run Benchmark without Benchmark.py
Tensor quantization statistics measurement
This step needs to be completed only once for each model with the corresponding world size values.
The hqt_output folder generated by this step will be used for the FP8 run. If changing models for the FP8 run, repeat this step to obtain the relevant hqt_output.
Here is an example to measure the tensor quantization statistics for Llama 2 or Llama 3 models:
Please note that Llama3-405B requires a minimum of 8 Gaudi3 cards.
Set the following environment variables to different values to change the parameters for the tensor quantization statistics measurement:
| Environment Variable | Values |
|---|---|
| model_name | meta-llama/Llama-2-70b-hf, meta-llama/Llama-2-7b-hf, meta-llama/Llama-3.1-405B-Instruct, meta-llama/Llama-3.1-70B-Instruct, meta-llama/Llama-3.3-70B-Instruct, meta-llama/Llama-3.1-8B-Instruct |
| world_size | 1, 2, 8 |
export model_name=meta-llama/Llama-2-70b-hf
export world_size=2
HF_DATASETS_TRUST_REMOTE_CODE=true QUANT_CONFIG=./quantization_config/maxabs_measure.json TQDM_DISABLE=1 python3 ../gaudi_spawn.py \
--use_deepspeed --world_size ${world_size} run_lm_eval.py \
-o acc_llama_quant.json \
--model_name_or_path ${model_name} \
--warmup 0 \
--use_hpu_graphs \
--use_kv_cache \
--trim_logits \
--batch_size 1 \
--bucket_size=128 \
--bucket_internal \
--trust_remote_code \
--tasks hellaswag lambada_openai piqa winogrande \
--bf16 \
--attn_softmax_bf16 \
--use_flash_attention \
--flash_attention_recompute \
--flash_attention_causal_mask
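Once the measurement run completes, the calibration statistics are dumped to the hqt_output folder in the working directory (this path is assumed from the default quantization config and may differ if the config is modified), and the lm_eval accuracy results are written to acc_llama_quant.json as specified by the -o argument above. A quick sanity check before moving on to the FP8 run:
ls ./hqt_output
python3 -m json.tool acc_llama_quant.json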
Quantize and run the fp8 model
Here is an example to quantize the model based on the previous measurements for Llama 2 or Llama 3 models:
Set the following environment variables to different values to change the parameters for the FP8 run:
| Environment Variable | Values |
|---|---|
| model_name | meta-llama/Llama-2-70b-hf, meta-llama/Llama-2-7b-hf, meta-llama/Llama-3.1-405B-Instruct, meta-llama/Llama-3.1-70B-Instruct, meta-llama/Llama-3.3-70B-Instruct, meta-llama/Llama-3.1-8B-Instruct |
| input_len | 128, 2048, etc. |
| output_len | 128, 2048, etc. |
| batch_size | 350, 1512, 1750, etc. |
| world_size | 1, 2, 8 |
Please note that Llama3-405B requires a minimum of 8 Gaudi3 cards.
Here is an example to run Llama2-70b with input token length = 128, output token length = 128, and batch size = 1750:
export model_name=meta-llama/Llama-2-70b-hf
export input_len=128
export output_len=128
export batch_size=1750
export world_size=2
After setting the environment variables, run the FP8 model using the following command:
HF_DATASETS_TRUST_REMOTE_CODE=true QUANT_CONFIG=./quantization_config/maxabs_quant.json TQDM_DISABLE=1 python3 ../gaudi_spawn.py \
--use_deepspeed --world_size ${world_size} run_generation.py \
--model_name_or_path ${model_name} \
--attn_softmax_bf16 \
--use_hpu_graphs \
--limit_hpu_graphs \
--trim_logits \
--use_kv_cache \
--use_flash_attention \
--flash_attention_recompute \
--flash_attention_causal_mask \
--bucket_size=128 \
--bucket_internal \
--attn_batch_split 2 \
--bf16 \
--batch_size ${batch_size} \
--max_new_tokens ${output_len} \
--max_input_tokens ${input_len} \
--warmup 2
Please note that Llama3-405B additionally requires the --book_source argument to achieve better performance. The Llama3.3-70B model also does not require the --attn_batch_split 2 argument.
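As a sketch of those adjustments for Llama3-405B (model name and world size taken from the table above; all other run_generation.py arguments stay as shown, with --book_source appended):
export model_name=meta-llama/Llama-3.1-405B-Instruct
export world_size=8
# Re-run the FP8 command above, appending --book_source to the run_generation.py arguments.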