Intel® Gaudi® Software Version 1.21.0

ID 834275
Updated 6/4/2025

  • Create new deep learning models or migrate existing code in minutes.

  • Deliver generative AI performance with simplified development and increased productivity.

Intel® Gaudi® Software Version 1.21.0 Release

We're thrilled to announce the release of Intel® Gaudi® software version 1.21.0, packed with exciting new features, enhancements, and fixes to elevate your AI development experience on the Intel® Gaudi® accelerator platform.

New Features and Enhancements

With version 1.21.0, we've expanded support for Gaudi 3, integrating it with the Intel® Tiber™ AI Cloud. This integration opens new possibilities for cloud-based AI development, offering robust performance and scalability. Additionally, the PerfTest tool now supports Gaudi 2, with a new --basic_check switch for efficient testing across systems.

Our firmware updates bring advanced sensor capabilities and protocols for Gaudi 3, enhancing system monitoring and management. For Gaudi 3 and Gaudi 2, we've introduced a new API for power management, ensuring optimal performance and energy efficiency.

On the PyTorch side, Intel Gaudi continues to support a wide range of models using Eager mode and torch.compile. We've enabled multi-threaded graph compilation to boost performance and reduce time-to-train. Note that Lazy mode will be deprecated in future releases, so new work is best started on Eager mode with torch.compile. This release also includes new profiling methods and performance enhancements for vLLM, alongside support for features such as Automatic Prefix Caching and Pipeline Parallelism.
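As a rough illustration of the Eager mode plus torch.compile path, the sketch below picks a torch.compile backend depending on whether the Gaudi PyTorch bridge is installed. It assumes the bridge is importable as habana_frameworks, that importing habana_frameworks.torch.core registers an "hpu" device, and that the compile backend is named "hpu_backend", as described in the Intel Gaudi PyTorch documentation. The helper names are hypothetical, and the CPU fallback exists only so the sketch can run on a machine without a Gaudi card.

```python
import importlib.util


def gaudi_available() -> bool:
    """Return True when the Intel Gaudi PyTorch bridge package is installed."""
    # Assumption: the bridge ships as the top-level package "habana_frameworks".
    return importlib.util.find_spec("habana_frameworks") is not None


def compile_settings() -> dict:
    """Choose a device string and a torch.compile backend name."""
    if gaudi_available():
        # Assumption: "hpu" and "hpu_backend" are the names used by the bridge.
        return {"device": "hpu", "backend": "hpu_backend"}
    # Fallback so the example still runs without Gaudi hardware.
    return {"device": "cpu", "backend": "inductor"}


def build_compiled_model():
    """Create a tiny model and wrap it with torch.compile (hypothetical helper)."""
    import torch  # imported lazily so the helpers above work without PyTorch

    settings = compile_settings()
    if settings["device"] == "hpu":
        # Importing the bridge makes the "hpu" device known to PyTorch.
        import habana_frameworks.torch.core  # noqa: F401

    model = torch.nn.Linear(4, 2).to(settings["device"])
    # torch.compile is lazy: graph compilation happens on the first forward call.
    return torch.compile(model, backend=settings["backend"])
```

Calling build_compiled_model() and then running one forward pass triggers the actual graph compilation. Whether Eager mode must be selected explicitly (for example through an environment variable) depends on the release, so check the release notes before relying on this.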

Media and Model Support

We've added support for HEVC video decode in MediaPipe, broadening the media processing capabilities of the platform. The Intel Gaudi vLLM fork now supports a variety of new models, including DeepSeek-R1 and Codellama-34b-instruct-hf, ensuring comprehensive model coverage for diverse AI applications.

Bug Fixes and Resolved Issues

Version 1.21.0 addresses several firmware issues, improving sensor readings and utilization values for Gaudi 3. We've resolved DeepSpeed graph breaks and memory mapping issues, enhancing performance and stability. Transformer Engine fixes ensure accuracy during FP8 precision training, while various compiler and kernel improvements reduce compilation time and enhance device performance.

Known Issues and Limitations

As with any release, there are known issues to be aware of. Certain Ubuntu versions require IOMMU passthrough, and performance degradation may occur in high power mode. PyTorch users should note limitations with multiprocess worker creation and compatibility issues with PyTorch Lightning and DeepSpeed.
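For the IOMMU passthrough requirement, a quick way to see whether a host is already configured is to inspect the kernel command line. The sketch below assumes passthrough is enabled through the standard Linux boot parameter iommu=pt. The function names are hypothetical, and the release notes remain the authority on which Ubuntu versions need the setting.

```python
from pathlib import Path


def has_iommu_passthrough(cmdline: str) -> bool:
    """Return True if the kernel command line contains the iommu=pt parameter."""
    # Split on whitespace so that "intel_iommu=on" is not mistaken for "iommu=pt".
    return "iommu=pt" in cmdline.split()


def read_kernel_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running Linux kernel was booted with."""
    return Path(path).read_text()


if __name__ == "__main__":
    if has_iommu_passthrough(read_kernel_cmdline()):
        print("iommu=pt is set")
    else:
        print("iommu=pt is not set; see the release notes for how to enable it")
```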

Looking Ahead

We recommend updating to the latest version to benefit from these improvements and stay aligned with the latest model coverage. A new version, v1.22.0, is targeted for release later in 2025.

For more detailed information, please visit the Intel Gaudi release notes page. We look forward to seeing how you leverage these new capabilities in your AI projects!