Intel® Gaudi® AI Accelerators Support Llama 3.3 Release

Today Meta* announced the release of Llama 3.3, a text-only 70B instruction-tuned model that delivers enhanced performance relative to Llama 3.1 70B and Llama 3.2 90B for text-only applications. This update reflects Meta's commitment to open innovation, making cutting-edge generative AI accessible to all developers and enterprises. In a post today on X, Meta's Head of Generative AI, Ahmad Al-Dahle, stated that the new model delivers core performance at a lower cost than the Llama 3.1 405B model.

As a close partner of Meta, Intel promptly supports all new Llama releases, including Llama 3.3, with its own AI solutions. Intel recently introduced the Intel® Gaudi® 3 AI accelerator, which delivers 4x AI compute for BF16, 2x AI compute for FP8, and 2x networking bandwidth for massive system scale-out compared to its predecessor, a significant leap in performance and productivity for AI training and inference. Open ecosystem software, including PyTorch*, Hugging Face*, vLLM*, and OPEA, is optimized for Intel Gaudi, making it easy to deploy Llama 3.3 on AI systems. These hardware and software optimizations make Intel Gaudi 3 a great choice for running open LLMs such as Llama 3.3.

Performance metrics will soon be available on the Model Performance Data page for Intel® Gaudi® 3 AI Accelerators.