Get Up to 2.75 Times the STREAM Triad Performance with Amazon EC2 m6i Instances Compared to m6a Instances with AMD EPYC Processors
As the global market grows in complexity, more organizations are collecting and analyzing data to inform their business decisions. The right data, analyzed correctly, can help businesses find and resolve issues, plan for the future, and grow their customer bases. Analyzing large datasets requires a great deal of computing power, however, so choosing the right cloud instances is critical, especially for high-performance computing (HPC). Memory is a particular concern: applications that cache data in memory depend on high memory speed and throughput.
To help companies find instances with the memory throughput they need, Intel tested the sustained memory throughput of two sets of Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instances:
- m6i instances with 3rd Gen Intel® Xeon® Scalable processors.
- m6a instances with AMD EPYC processors.
Using the STREAM Triad benchmark, which reports the sustained memory throughput of each instance in MB/s, Intel tested both instance families at 8 vCPUs and 16 vCPUs. The m6i instances delivered up to 2.75x the memory throughput of the m6a instances, which suggests they may be the better choice for memory-intensive HPC workloads.
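To make the MB/s figure concrete, here is a minimal sketch of what the Triad kernel computes and how STREAM-style throughput is derived from it. It is an illustration only, not the code Intel ran: the function names are hypothetical, the array size is far smaller than the tested one, and pure Python measures interpreter speed rather than hardware bandwidth. It assumes 8-byte double-precision elements and counts two arrays read plus one array written per pass.

```python
import time
from array import array

BYTES_PER_DOUBLE = 8  # assumption: 8-byte double-precision elements


def triad(a, b, c, scalar):
    """Triad kernel: a[i] = b[i] + scalar * c[i] for every element."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_mb_per_s(n, seconds):
    """Throughput in MB/s, counting two arrays read and one written per pass."""
    bytes_moved = 3 * BYTES_PER_DOUBLE * n
    return 1.0e-6 * bytes_moved / seconds


if __name__ == "__main__":
    n = 200_000  # illustrative size only, much smaller than the tested arrays
    a = array("d", [0.0]) * n
    b = array("d", [2.0]) * n
    c = array("d", [1.0]) * n

    start = time.perf_counter()
    triad(a, b, c, 3.0)
    elapsed = time.perf_counter() - start

    # The number printed here reflects Python overhead, not memory bandwidth.
    print(f"Triad: {triad_mb_per_s(n, elapsed):.1f} MB/s")
```

The real benchmark runs this loop in compiled, multithreaded code over arrays much larger than the processor caches, so that the result reflects main-memory bandwidth.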
Increased STREAM Triad Throughput for Smaller Instances
As Figure 1 shows, the 8-vCPU m6i instance, enabled by 3rd Gen Intel® Xeon® Scalable processors, outperformed its AMD processor-enabled counterpart by 79%, or 1.79 times the memory throughput. If your workloads are modestly sized, these 8-vCPU m6i instances are a strong option for maximizing memory throughput.
Increased STREAM Triad Throughput for Medium-Size Instances
For larger workloads, Intel tested the same instance types at the 16-vCPU size. Again, we see that the m6i instance, enabled by 3rd Gen Intel® Xeon® Scalable processors, performed significantly better than the m6a instance featuring AMD processors. This time, though, the memory throughput of the m6i instance was 2.75 times that of the m6a instance (see Figure 2).
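The two results are stated in different forms: a percentage gain at 8 vCPUs and a throughput ratio at 16 vCPUs. The short sketch below converts between the two so they can be compared directly. The helper names are hypothetical, and the only inputs are the 79% and 2.75x figures reported above.

```python
def percent_gain_to_ratio(percent_gain):
    """Convert an 'X% higher' figure into an 'N times' throughput ratio."""
    return 1.0 + percent_gain / 100.0


def ratio_to_percent_gain(ratio):
    """Convert an 'N times' throughput ratio into an 'X% higher' figure."""
    return (ratio - 1.0) * 100.0


# 8-vCPU result: 79% higher throughput
print(f"{percent_gain_to_ratio(79):.2f}x")    # prints "1.79x"

# 16-vCPU result: 2.75 times the throughput
print(f"{ratio_to_percent_gain(2.75):.0f}%")  # prints "175%"
```

On a common scale, the reported advantage grows from 1.79x at 8 vCPUs to 2.75x at 16 vCPUs.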
Conclusion
For workloads that can benefit from high levels of sustained memory throughput, such as HPC workloads, companies must invest in cloud instances that are up to the task. This testing makes it clear that the 3rd Gen Intel® Xeon® Scalable processor-backed m6i instances from AWS offer superior memory bandwidth performance compared to m6a instances with AMD processors. Especially if your organization faces continued growth in workload sizes, consider the AWS EC2 m6i instances when seeking a cloud solution for your HPC workloads.
Learn More
To get started running your HPC workloads on Amazon EC2 m6i instances featuring 3rd Gen Intel® Xeon® Scalable processors, go to https://aws.amazon.com/ec2/instance-types/m6i/.
Testing done by Intel in Jan. 2022. All configurations used Ubuntu 20.04.3 LTS with kernel 5.11.0-1022-aws in AWS us-west-2, STREAM v5.10, and the ICC 2021.2 compiler, with OMP_NUM_THREADS set equal to the number of vCPUs. AMD configs used compiler flags: -mcmodel medium -shared-intel -O3 -march=core-avx2 -DSTREAM_ARRAY_SIZE=268435456 -DNTIMES=100 -DOFFSET=0 -qopenmp -qopt-streaming-stores always -qopt-zmm-usage=high. Intel configs used compiler flags: -O3 -qopt-streaming-stores=always -qopt-zmm-usage=high -xCORE-AVX512 -qopenmp -mcmodel=large -DSTREAM_ARRAY_SIZE=268435456. m6a.2xlarge: AMD EPYC 7R13, 8 cores, 32 GB RAM, up to 12.5 Gbps network BW, up to 6.6 Gbps storage BW; m6i.2xlarge: Intel Xeon Platinum 8375C, 8 cores, 32 GB RAM, up to 10 Gbps network BW, up to 12.5 Gbps storage BW; m6a.4xlarge: AMD EPYC 7R13, 16 cores, 64 GB RAM, up to 12.5 Gbps network BW, up to 6.6 Gbps storage BW; m6i.4xlarge: Intel Xeon Platinum 8375C, 16 cores, 64 GB RAM, up to 10 Gbps network BW, up to 12.5 Gbps storage BW.
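As a rough check on the configuration above, the sketch below estimates the memory footprint implied by the -DSTREAM_ARRAY_SIZE value and shows one way to set OMP_NUM_THREADS to the vCPU count before launching a run. It assumes 8-byte double-precision elements and three arrays. The helper names are hypothetical, and no benchmark binary is invoked here.

```python
import os

STREAM_ARRAY_SIZE = 268_435_456  # from the -DSTREAM_ARRAY_SIZE flag above
BYTES_PER_DOUBLE = 8             # assumption: 8-byte double-precision elements
NUM_ARRAYS = 3                   # assumption: three arrays (a, b, and c)


def footprint_gib(elements, num_arrays=NUM_ARRAYS, elem_bytes=BYTES_PER_DOUBLE):
    """Total memory for all benchmark arrays, in GiB."""
    return elements * elem_bytes * num_arrays / 2**30


def omp_env(vcpus):
    """Copy of the current environment with OMP_NUM_THREADS set to vcpus."""
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(vcpus)
    return env


print(f"{footprint_gib(STREAM_ARRAY_SIZE):.1f} GiB")  # prints "6.0 GiB"

# One thread per vCPU, matching the test description above.
env = omp_env(os.cpu_count() or 1)
print(env["OMP_NUM_THREADS"])
```

At roughly 6 GiB in total, the arrays fit within the 32 GB of the smaller instances while remaining far larger than any processor cache, which is what lets the benchmark measure sustained memory throughput.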