| Model | # HPU | Sequence Length | Precision | Batch Size | Throughput (tokens/sec) |
|---|---|---|---|---|---|---|
| LLaMA V2 7B | 8 | 4,096 | FP8 | 1,024 | 68,464 |
| LLaMA V2 13B | 16 | 4,096 | FP8 | 256 | 58,282 |
| LLaMA V2 70B | 64 | 4,096 | FP8 | 1,024 | 54,274 |
| LLaMA V3.1 8B | 8 | 8,192 | FP8 | 128 | 36,309 |
| LLaMA V3.1 70B | 64 | 8,192 | FP8 | 128 | 43,677 |
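
Because the configurations above use different numbers of HPUs, the raw throughput figures are hard to compare directly. The sketch below normalizes each row to tokens/sec per HPU. It assumes the throughput column reports the aggregate across all HPUs in the run, which the table itself does not state. The names `ROWS` and `per_hpu_throughput` are illustrative only and do not come from the source.

```python
# Rows copied from the table above: (model, number of HPUs, tokens/sec).
ROWS = [
    ("LLaMA V2 7B", 8, 68_464),
    ("LLaMA V2 13B", 16, 58_282),
    ("LLaMA V2 70B", 64, 54_274),
    ("LLaMA V3.1 8B", 8, 36_309),
    ("LLaMA V3.1 70B", 64, 43_677),
]


def per_hpu_throughput(rows):
    """Return {model: tokens/sec per HPU}.

    Assumption: the reported throughput is the total across all HPUs,
    so dividing by the HPU count gives a per-device figure.
    """
    return {model: tokens_per_sec / hpus for model, hpus, tokens_per_sec in rows}


if __name__ == "__main__":
    for model, value in per_hpu_throughput(ROWS).items():
        print(f"{model:<16} {value:>10.1f} tokens/sec per HPU")
```

Per-device figures still are not an apples-to-apples comparison, since sequence length and batch size also differ between rows.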