Learn LLM Optimization Using Transformers and PyTorch* on CPUs & GPUs
Overview
Large language models (LLMs) and the applications built around them have emerged as powerful tools for understanding and generating natural language. However, optimizing these models for maximum efficiency and performance remains a significant challenge.
This session introduces a solution: optimizing LLM workloads on target hardware using the Intel® Extension for Transformers* and the Intel® Extension for PyTorch*.
The session also covers:
- An introduction to Intel Extension for Transformers and Intel Extension for PyTorch—two powerful libraries for enhancing AI workload performance on Intel platforms.
- Using API calls in the PyTorch extension to optimize LLM performance and memory use.
- Using the transformer extension's optimization features, such as model compression, Neural Speed, and NeuralChat, a framework for building customized chatbots.
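As a taste of the API-call workflow described above, the sketch below shows how a model might be handed to the PyTorch extension's `ipex.optimize` entry point for inference. This is a minimal illustration, not the session's material: the toy model, shapes, and the bfloat16 choice are assumptions, and the code falls back to plain PyTorch when `intel_extension_for_pytorch` is not installed.

```python
import torch
import torch.nn as nn

# The Intel Extension for PyTorch is optional here; without it the
# sketch still runs as ordinary PyTorch.
try:
    import intel_extension_for_pytorch as ipex
    HAVE_IPEX = True
except ImportError:
    HAVE_IPEX = False

# Toy stand-in for an LLM block (illustrative only).
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU()).eval()

if HAVE_IPEX:
    # ipex.optimize applies CPU-oriented optimizations such as operator
    # fusion and weight prepacking; dtype=torch.bfloat16 is one common
    # memory-saving choice on supported hardware.
    model = ipex.optimize(model, dtype=torch.bfloat16)

with torch.no_grad():
    out = model(torch.randn(1, 16))

print(out.shape)
```

The key design point is that optimization is a one-line, post-hoc call on an already-built model, so the same inference code runs with or without the extension.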
Skill level: Novice
Featured Software
Choose from the following download options: