EPCC: Cluster Helps Solve Exascale I/O Challenges

Intel® Optane™ persistent memory with 2nd Gen Intel® Xeon® Scalable processors accelerates HPC workloads.

Executive Summary
EPCC (Edinburgh Parallel Computing Centre) is a hotbed for scientific research using High Performance Computing (HPC). The centre has gained an international reputation for advanced capability in all aspects of HPC, High Performance Data Analytics (HPDA), and novel computing. EPCC hosts ARCHER (Advanced Research Computing High End Resource), Cirrus, and Tesseract, the United Kingdom’s Extreme Scaling service of the DiRAC (Distributed Research utilizing Advanced Computing) facility. Tesseract is a 20,000-plus core hypercube-based supercomputer built on Intel® Xeon® Scalable processors and Intel® Omni-Path Architecture (Intel® OPA) fabric. But among EPCC’s other resources is a smaller but equally important cluster that enabled breakthrough research on the NEXTGenIO project. Researchers use this cluster, built by Fujitsu on 2nd Generation Intel Xeon Scalable processors and Intel® Optane™ persistent memory, to modify and optimize codes so they can deliver the benefits of Intel Optane persistent memory for high-throughput parallel workloads.

Achieving Exascale computing requires addressing many challenges, including HPC I/O, which has lagged behind other HPC advances. EPCC led the NEXTGenIO project, funded by the European Commission, researching how to leverage byte-addressable persistent memory (B-APM) for large parallel computing workloads. Partners in NEXTGenIO include EPCC, Intel, Fujitsu, Barcelona Supercomputing Center (BSC), Technische Universität Dresden, ARM/Allinea, the European Centre for Medium-Range Weather Forecasts (ECMWF), and ARCTUR.

Intel Optane persistent memory along with 2nd Generation Intel Xeon Scalable processors delivers a flexible B-APM architecture for servers.

For the NEXTGenIO project, work at EPCC focused on the overall technical and architectural needs to maximize the potential of Intel Optane persistent memory in its several usage modes. Working with industry and academia across Europe, researchers implemented new filesystems, designed and developed data-aware schedulers, investigated checkpointing software, and integrated I/O and communication libraries, making the modifications and optimizations needed for software to benefit from Intel Optane persistent memory.

Exploring new memory hierarchies for HPC architectures in order to accelerate performance and increase system efficiencies has been a focus of supercomputing for many years. From burst buffers to SSDs in each node for local data storage, and now B-APM, system architects are moving faster, more efficient technologies closer to the CPUs.

Intel Optane persistent memory combines the traits of storage and memory into a single high-capacity module that fits into a server DRAM slot. Most 2nd Generation Intel Xeon Scalable processors recognize this technology, providing near-DRAM performance with up to 3 TB of capacity per socket. Intel Optane persistent memory offers different usage modes to optimize the technology for various types of workloads, whether they need massive volatile memory capacity, persistent data storage with near-DRAM performance, or a combination of both.


EPCC’s new research cluster, built by Fujitsu, provided the computing resources for NEXTGenIO research. In an intensive co-creation process, Fujitsu analyzed the NEXTGenIO partners’ application I/O and compute requirements, then designed and manufactured the NEXTGenIO system to overcome existing bottlenecks. The cluster houses 34 nodes, each with dual-socket 2nd Generation Intel Xeon Platinum 8260 processors, 3 TB of Intel Optane persistent memory, and 192 GB of DRAM. A 100 Gbps Intel Omni-Path Architecture fabric connects the compute nodes, while a 56 Gbps InfiniBand network attaches to an external Lustre parallel filesystem. The architecture allowed researchers to learn how to take advantage of high-capacity persistent memory across many different large parallel codes, such as OpenFOAM, a computational fluid dynamics (CFD) code, and the ECMWF’s IFS forecasting software.

“We are glad Fujitsu could contribute to the success of the NEXTGenIO project,” said Olivier Delachapelle, Fujitsu Head of Category Management-Products Sales Europe. “Solving the I/O bottleneck opens the door to significantly increased performance and scalability towards HPC Exascale. Fujitsu successfully integrated the DCPMM technology into our PRIMERGY and PRIMEQUEST product lines. This breakthrough in I/O performance therefore will also reduce time to results for many of our customers’ applications in a broad range of business sectors.”


EPCC’s new Intel Optane persistent memory cluster delivered promising benefits for the NEXTGenIO project. Researchers used benchmarks and large-scale applications to measure the performance of the cluster with Intel Optane persistent memory.

Working with the ECMWF, project partners were able to develop novel software to take advantage of the new cluster’s technologies. The ECMWF ensemble forecasting software runs 56 forecasts every day to provide nine- to 16-day weather forecasts, which are made available to member organizations. The software was modified to use Intel Optane persistent memory in app-direct mode, leveraging the technology as a persistent data store.
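The key idea of app-direct mode is that an application writes data with ordinary loads and stores into a memory-mapped region and then explicitly makes those stores durable. The following is a minimal sketch of that pattern, with assumptions: the file path and function names are hypothetical, an ordinary file stands in for a file on a DAX-mounted persistent-memory filesystem, and the flush here uses `mmap.flush()` where real app-direct code would typically use the PMDK libraries to flush CPU caches directly to the persistence domain.

```python
import mmap
import struct

# Hypothetical path; on a real system this would sit on a DAX-mounted
# Intel Optane persistent memory filesystem (e.g. under /mnt/pmem).
STORE_PATH = "/tmp/pmem_store.bin"

def persist_values(path, values):
    """Write floats through a memory mapping, then flush them durable.

    In app-direct mode the flush targets the persistence domain
    (e.g. PMDK's pmem_persist); here mmap.flush() stands in.
    """
    size = 8 * len(values)
    with open(path, "wb") as f:
        f.truncate(size)                 # reserve space to map
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), size) as mm:
            for i, v in enumerate(values):
                struct.pack_into("<d", mm, 8 * i, v)  # plain stores
            mm.flush()                   # make the stores durable

def load_values(path):
    """Map the store read-only and read the floats back."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            n = len(mm) // 8
            return [struct.unpack_from("<d", mm, 8 * i)[0]
                    for i in range(n)]

persist_values(STORE_PATH, [1.0, 2.5, -3.0])
print(load_values(STORE_PATH))  # data survives process restarts
```

Because the stores land in a persistent region rather than going through a filesystem client to remote storage, a forecasting step can write its state at memory speed and read it back after a restart, which is the behavior the ECMWF modification exploited.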

“With an OpenFOAM code with high amounts of I/O,” explained Adrian Jackson, Senior Research Fellow at EPCC, “we used Intel Optane persistent memory as a filesystem on the node. We used some of our tools to get data into and out of the node and used the persistent memory as the data store. The solver ran over 8x faster—12 percent of the original run time—compared to going to the Lustre filesystem for data.”1

Using Intel Optane persistent memory as a node-local data store, an OpenFOAM code with high amounts of I/O ran over 8x faster than with a Lustre filesystem.1

Other codes took advantage of the system’s two-level memory architecture, using the large memory capacity of Intel Optane persistent memory for data and the DRAM as cache (memory mode).

In another study, a large-scale simulation that required 20 or more nodes to run without persistent memory ran on a single node with Intel Optane persistent memory, according to Jackson. “While the simulation did not run faster, it ran more efficiently on a single node,” added Jackson.1 Such consolidation could allow large applications to run on far fewer nodes, freeing cluster resources for other work.

Several other studies were completed, including synthetic workflows and IOR benchmarks; they are described in a University of Edinburgh Research Explorer report.2 Work is ongoing to further optimize I/O for large-scale codes on HPC machines with Intel Optane persistent memory and 2nd Generation Intel Xeon Scalable processors.

Explore Related Products and Solutions

Intel® Xeon® Scalable Processors

Drive actionable insight, count on hardware-based security, and deploy dynamic service delivery with Intel® Xeon® Scalable processors.

Learn more

Intel® Optane™ Persistent Memory

Extract more actionable insights from data – from cloud and databases, to in-memory analytics, and content delivery networks.

Learn more

Intel® Omni-Path Architecture

Intel® Omni-Path Architecture (Intel® OPA) lowers system TCO while providing reliability, high performance, and extreme scalability.

Learn more

Notices and Disclaimers

Intel® technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at https://www.intel.com.

Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit https://www.intel.com/benchmarks.

Performance results are based on testing as of the date set forth in the configurations and may not reflect all publicly available security updates. See configuration disclosure for details. No product or component can be absolutely secure.

Cost reduction scenarios described are intended as examples of how a given Intel®-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

In some test cases, results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.

Product and Performance Information

1. Results provided by EPCC