Learn LLM Optimization Using Transformers and PyTorch* on CPUs & GPUs
Overview
Large language models (LLMs) and the applications built around them have emerged as powerful tools for understanding and generating natural language. However, optimizing these models for maximum efficiency and performance remains a significant challenge.
This session introduces a solution: optimizing LLM workloads on target hardware using the Intel® Extension for Transformers* and the Intel® Extension for PyTorch*.
The session also covers:
- An introduction to Intel Extension for Transformers and Intel Extension for PyTorch—two powerful libraries for enhancing AI workload performance on Intel platforms.
- Using API calls in the PyTorch extension to optimize LLM performance and memory use.
- Using the transformer extension's optimization features, such as model compression, Neural Speed, and NeuralChat, a framework for building customized chatbots.
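As a taste of the API-call workflow described above, the sketch below shows how a model might be handed to the PyTorch extension's `ipex.optimize` entry point for inference. This is a minimal illustration, not the session's material: the toy model, shapes, and the bfloat16 choice are assumptions, and the code falls back to plain PyTorch when `intel_extension_for_pytorch` is not installed.

```python
import torch
import torch.nn as nn

# The Intel Extension for PyTorch is optional here; without it the
# sketch still runs as ordinary PyTorch.
try:
    import intel_extension_for_pytorch as ipex
    HAVE_IPEX = True
except ImportError:
    HAVE_IPEX = False

# Toy stand-in for an LLM block (illustrative only).
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU()).eval()

if HAVE_IPEX:
    # ipex.optimize applies CPU-oriented optimizations such as operator
    # fusion and weight prepacking; dtype=torch.bfloat16 is one common
    # memory-saving choice on supported hardware.
    model = ipex.optimize(model, dtype=torch.bfloat16)

with torch.no_grad():
    out = model(torch.randn(1, 16))

print(out.shape)
```

The key design point is that optimization is a one-line, post-hoc call on an already-built model, so the same inference code runs with or without the extension.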
Skill level: Novice
Featured Software
Choose from the following download options: