Upgraded Libraries
This release has upgraded versions of several libraries, including:
- DeepSpeed v0.9.4
- PyTorch* Lightning v2.0.4
- TensorFlow* v2.12.1
Support
The release introduces:
- Support for DeepSpeed-Chat and has an example published in the Habana reference models repository.
- DeepSpeed support in Lightning.
Support for Habana mixed precision is deprecated and will be dropped in the next release. For mixed-precision support, switch to autocast.
Metrics
For debuging and profiling, you can now retrieve metrics using Metric APIs for:
- cpu_fallback
- memory_defragmentation
- recipe_cache
Reference Models
Some Intel Gaudi AI accelerator reference models were updated to use the PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES runtime environment variable that allows Intel Gaudi software to automatically handle dynamic shapes.
Added Kernel
FusedSDPA is a fused implementation of the nn.functional.scaled_dot_product_attention() API on the Intel® Gaudi® processor.
Performance Improvements
For more information on the LLM inference performance improvements, see Model Performance Data.
More Information
For more information on this version, see Release Notes.