The 4th Gen Intel® Xeon® Scalable processor introduces architectural improvements, among them Intel® Advanced Matrix Extensions (Intel® AMX), that accelerate classical machine learning and deep learning workloads. With Intel AMX, most deep learning inference models, as well as small and medium-sized deep learning training models, can run on the same platform.
Intel AMX has two primary components: tiles and tile matrix multiply (TMUL). The tiles are eight two-dimensional registers, each one kilobyte in size, that hold larger blocks of data than a conventional vector register can. TMUL is an accelerator engine attached to the tiles; its instructions multiply the matrices held in the tiles, so a single operation processes a much larger block of data than a vector instruction does.
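To make the tile and TMUL description concrete, the following is a minimal sketch in plain Python, not real AMX code. The tile count and size come from the text above. Everything else is an assumption for illustration: the 16-row by 64-byte tile shape, the helper names `elements_per_tile` and `tile_dot_int8`, and the packed layout of the second operand, which is modeled on the int8 dot-product style of TMUL instruction. Integer overflow of the accumulators is not modeled.

```python
# Illustrative model of AMX tile geometry and one TMUL-style operation.
# This runs on any Python interpreter and does not use AMX hardware.

TILE_COUNT = 8        # from the text: eight tile registers
TILE_BYTES = 1024     # from the text: one kilobyte per tile
TILE_ROWS = 16        # assumption: a full tile is 16 rows ...
TILE_ROW_BYTES = 64   # ... of 64 bytes each (16 * 64 = 1024)


def elements_per_tile(element_bytes):
    """Number of elements of the given width that fit in one full tile."""
    return TILE_BYTES // element_bytes


def tile_dot_int8(a, b, c):
    """Emulate an int8 tile dot product that accumulates into c in place.

    a: M rows of 4*K int8 values
    b: K rows of 4*N int8 values (four consecutive values per output column)
    c: M rows of N integer accumulators
    """
    m_rows, k_rows, n_cols = len(a), len(b), len(c[0])
    for m in range(m_rows):
        for n in range(n_cols):
            acc = c[m][n]
            for k in range(k_rows):
                for i in range(4):
                    acc += a[m][4 * k + i] * b[k][4 * n + i]
            c[m][n] = acc
    return c


# Full-size tiles: 16 x 64 int8 inputs, 16 x 16 accumulators.
a = [[1] * TILE_ROW_BYTES for _ in range(TILE_ROWS)]
b = [[2] * TILE_ROW_BYTES for _ in range(TILE_ROWS)]
c = [[0] * 16 for _ in range(TILE_ROWS)]
tile_dot_int8(a, b, c)

print(TILE_COUNT * TILE_BYTES)   # 8192 bytes of tile storage in total
print(elements_per_tile(1))      # 1024 int8 values per tile
print(c[0][0])                   # 128 = 64 products of 1 * 2
```

Under these assumptions, one such operation on full tiles performs 16 × 16 × 64 = 16,384 multiply-accumulates, which illustrates what computing on larger matrices in a single operation means in practice.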