NREL: Kestrel Flies with Intel® Xeon® Processors

The National Renewable Energy Laboratory acquired a supercomputer with more computing capacity and greater energy efficiency.

At a glance:

  • The U.S. Department of Energy’s National Renewable Energy Laboratory (NREL) focuses on energy efficiency and renewable energy. NREL uses high-performance computing systems to support the research done by staff and partners in industry and government.

  • With the growing demand for computing capacity, NREL acquired a new system, Kestrel, planned for production this year. The new system also has greater power efficiency than the previous system.

Executive Summary

The U.S. Department of Energy’s National Renewable Energy Laboratory (NREL), located in Golden, Colorado, focuses on energy efficiency and renewable energy. NREL began its mission during the oil crisis of the 1970s, when the federal government started to look at the need for alternative energy sources, such as renewables, and for innovative materials.

“Our research revolves around improving energy efficiency for buildings, residences, the grid, and renewable energies of all kinds—whether it’s solar, wind, hydrogen, or geothermal,” Aaron Andersen, Group Manager for Advanced Computing Operations at NREL, stated. “There’s a lot of variation, a lot of material science work that we do to identify either how to make those more cost-effective, or to make them more efficient, or ideally both. And obviously, we do a lot of work around vehicles and transportation, major efforts in hydrogen, and a few others.”

NREL uses high-performance computing (HPC) systems to support the research done by staff and partners in industry and government. The laboratory currently supports over 300 projects and 1,300 HPC users across as many as 60 institutions, according to Andersen.

NREL’s last production supercomputer, Eagle, was installed in 2019. Eagle is built on Intel® Xeon® Gold 6154 processors, offering eight petaFLOPS of performance. With growing demand for computing capacity over the years, NREL acquired a new system, Kestrel, planned for production this year.

The 44 petaFLOPS Kestrel, built with 4th Gen Intel® Xeon® processors, provides 5.5x more computing capacity with 2.2x more power efficiency than Eagle, enabling NREL to support more innovative research and materials development with better power efficiency.1
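The headline ratios can be checked with a little arithmetic. The sketch below uses only the figures quoted above (8 and 44 petaFLOPS, 2.2x power efficiency). The implied power ratio is a derived estimate, not a number from NREL or Intel, and it assumes the capacity and efficiency ratios were measured on a comparable basis.

```python
# Figures quoted in the text.
EAGLE_PEAK_PF = 8.0        # Eagle peak performance, petaFLOPS
KESTREL_PEAK_PF = 44.0     # Kestrel peak performance, petaFLOPS
PERF_PER_WATT_GAIN = 2.2   # Kestrel calculations per watt relative to Eagle

# Capacity ratio: how many times more computing capacity Kestrel offers.
capacity_ratio = KESTREL_PEAK_PF / EAGLE_PEAK_PF

# Derived estimate (an assumption, not a published figure):
# efficiency gain = capacity ratio / power ratio, so
# power ratio = capacity ratio / efficiency gain.
implied_power_ratio = capacity_ratio / PERF_PER_WATT_GAIN

print(f"Capacity ratio:      {capacity_ratio:.1f}x")
print(f"Implied power ratio: {implied_power_ratio:.1f}x")
```

Under these assumptions, the 5.5x gain in capacity would come at roughly 2.5x the power draw under full load, which is what a 2.2x improvement in calculations per watt means in practice.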

Researchers at the National Renewable Energy Laboratory work to improve energy efficiency for buildings, residences, the grid, and renewable energies of all kinds—whether it’s solar, wind, hydrogen, or geothermal.

Challenge

Renewable energy research covers a range of scientific domains: weather, materials, energy generation and consumption patterns, and geophysical phenomena, among others. NREL’s computational capabilities, therefore, need to be diverse for users to validate, model, and propose new ideas and products.

Understanding Wind and Innovating Materials

One example of such products is wind turbines, a major producer of renewable energy today. Companies like Avangrid work with NREL to study wind and weather for their projects.

“We utilize the weather research and forecasting (WRF) model, but at a very high resolution, which allows us to do wind forecasts and wind studies for most of North America,” explained Andersen. “We create datasets that designers use, whether they’re studying the placement and optimization of wind farms or looking at potential wind projects.”

Materials science takes up the bulk of NREL HPC resources. Researchers are looking for new and more efficient materials for photovoltaics, such as perovskites, and for batteries. They’re also researching membrane materials for hydrolysis and ways to make chemical processes, such as hydrogen production, more efficient.

“In the case of the wind applications and in material science,” Andersen added, “we’re running a lot more machine learning workloads, where we identify promising chemical combinations or chemical compounds that researchers then verify in the lab.”

The space of possible chemical compounds is large, whether for more efficient batteries or photovoltaics or for stronger wind turbine blades, so narrowing it down to promising candidates requires substantial computing.

Simulating Power Grids

Energy suppliers (and even homeowners) are putting more and more renewable resources into the grid. According to Andersen, with such a large quantity of these sources, the grid needs to be much more resilient, requiring more dynamic control processes. The HPC systems at NREL are used to simulate events where wind or solar is plentiful versus where it is scarce. NREL scientists also use HPC to “solve different combinations of power system states in parallel,” helping power authorities understand how to manage the grid and maintain stability. Such simulations provide insight that could allow for more resilient and robust operation of the power grid in the face of extreme events.

More Computing Needed

While Eagle has helped innovate new ideas and solutions, over the years, the demand for computing at NREL outpaced its capacity.

With users competing for limited computing time, it was time to upgrade capacity to support continuing and next-generation research.

Solution

Kestrel was built by HPE, which NREL selected through an open procurement process. It was designed and built to handle the large and diverse workloads required in renewable energy research. Its 44 petaFLOPS of performance come from 4th Gen Intel Xeon processors and NVIDIA GPUs. The machine is being delivered in two phases.

The CPU phase landed in March and is expected to be in production this summer. This partition is built on 2,304 nodes with dual 4th Gen Intel Xeon processors, 104 cores, and 256 gigabytes of memory per node, plus nearly 500 terabytes of local NVMe storage. Ten additional large memory nodes with the same processors contain 2 terabytes of memory and 12.8 terabytes of local NVMe storage.
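To give a sense of scale, the per-node figures above can be rolled up into partition totals. This is a minimal sketch: the `NodeGroup` class is a hypothetical helper, the totals are simple products of the quoted numbers rather than published specifications, and it assumes that the 2 terabytes of memory applies to each large memory node and that those nodes also have 104 cores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """A homogeneous group of compute nodes (hypothetical helper)."""
    count: int
    cores_per_node: int
    memory_gb_per_node: int

    @property
    def total_cores(self) -> int:
        return self.count * self.cores_per_node

    @property
    def total_memory_tb(self) -> float:
        # Decimal terabytes (1 TB = 1,000 GB), chosen for illustration.
        return self.count * self.memory_gb_per_node / 1000


# Standard CPU nodes, using the figures quoted in the text.
standard = NodeGroup(count=2304, cores_per_node=104, memory_gb_per_node=256)

# Large memory nodes. Assumptions: 2 TB (taken as 2,000 GB) per node, and
# 104 cores per node because the text says they use the same processors.
large_memory = NodeGroup(count=10, cores_per_node=104, memory_gb_per_node=2000)

total_cores = standard.total_cores + large_memory.total_cores
total_memory_tb = standard.total_memory_tb + large_memory.total_memory_tb

print(f"Standard nodes: {standard.total_cores:,} cores, "
      f"{standard.total_memory_tb:.1f} TB memory")
print(f"CPU partition:  {total_cores:,} cores, {total_memory_tb:.1f} TB memory")
```

With two processors and 104 cores per node, each processor contributes 52 cores.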

The second phase, the GPU partition, will be delivered later this year with 140 GPU nodes targeted at machine learning workloads.

“We’re in the early stages of acceptance testing,” Andersen explained. “We reran the full suite of benchmarks that we had for the procurement. And then we’ll simulate workloads for a number of weeks, possibly followed by some early users on the system to stress test in June or July.” 

Kestrel is also highly efficient. Its total node count is nearly identical to Eagle’s, yet it offers 5.5x the computing capacity and performs 2.2x more calculations per watt than Eagle under full load.1

A large part of that efficiency comes from the processor architecture. But Kestrel is also liquid-cooled; the NREL data center does not utilize chillers, greatly reducing the energy demand versus traditional air conditioning systems.

“Eliminating chillers is certainly a major feature for us in energy efficiency,” Andersen added. “With a PUE of 1.03, we are among the top few most energy efficient data centers in operation.”
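Power usage effectiveness (PUE) is the ratio of total facility energy to the energy consumed by the IT equipment alone, so a PUE of 1.03 means cooling and other overhead add about 3% on top of the computing load. The sketch below illustrates that relationship. The 1,000 kW IT load and the comparison PUE of 1.5 are hypothetical values chosen for illustration, not figures from NREL.

```python
def facility_power_kw(it_power_kw: float, pue: float) -> float:
    """Total facility power implied by PUE = facility power / IT power."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_power_kw * pue


def overhead_kw(it_power_kw: float, pue: float) -> float:
    """Power spent on cooling, power delivery, and other non-IT overhead."""
    return facility_power_kw(it_power_kw, pue) - it_power_kw


HYPOTHETICAL_IT_LOAD_KW = 1000.0  # illustrative IT load, not an NREL figure
NREL_PUE = 1.03                   # quoted in the text
COMPARISON_PUE = 1.5              # hypothetical less efficient facility

for label, pue in (("PUE 1.03", NREL_PUE), ("PUE 1.50", COMPARISON_PUE)):
    extra = overhead_kw(HYPOTHETICAL_IT_LOAD_KW, pue)
    print(f"{label}: {extra:.0f} kW overhead "
          f"per {HYPOTHETICAL_IT_LOAD_KW:.0f} kW of IT load")
```

For the same hypothetical IT load, the facility at PUE 1.03 spends a small fraction of the overhead that the comparison facility does.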

To power the system, along with other resources on the NREL campus, the organization uses a number of renewable resources.

Graph Neural Network Accelerates Design of New, Stable Solid-State Battery Materials

A critical barrier to the development of solid-state batteries (SSBs) is the thermodynamic instability of electrode-electrolyte interfaces. Improved SSB design requires new materials that are stable at suitable reduction and oxidation potentials. Discovering new stable materials in unexplored chemical spaces necessitates quick and accurate prediction of thermodynamic stability and efficient search strategies. Using Eagle, the NREL team developed a new approach to finding stable and functional crystal structures by using a graph neural network (GNN) to predict an upper bound to the fully relaxed energy obtained from DFT. New structures were generated via substitution of known prototypes, and a GNN was trained on a new database of close to 128,000 DFT calculations. Many DFT-validated material candidates were found to be stable and exhibit desired functional properties, such as a large electrochemical stability window and suitable reduction and oxidation potentials. —NREL Advanced Computing Annual Report 2022, page 18.2

Result

Once Kestrel is in production, it will allow NREL to launch and support many more research projects than it could in the past. One example is simulating the power grid in fine detail.

“Very often, we simulate every customer,” Andersen explained. “We basically simulate the equivalent of what data is coming off individual meters all the way down to the residential level. We can then simulate scenarios—like air conditioning demand, utility outages, or utility production—and identify what impact that has on the grid. It allows us to design better control systems that can be tuned or improved to maintain power quality and stability. That’s pretty fascinating work in our Insight Center.”

In the Insight Center, researchers can visualize data down to individual transformers to understand what loads are doing at any given facility.

Kestrel will enable more research, discovery, and innovation to help transform energy generation, efficiency, usage, and security.

Solution Summary

As the country works to reduce its use of fossil fuels and new devices are added to its power grid, renewable energy research is critical to providing new, more efficient energy resources for a more sustainable world. Kestrel will support that research and discovery with 44 petaFLOPS peak computing capability from a combination of CPU nodes for traditional HPC workloads and GPU nodes for machine learning. The CPU nodes deliver approximately 14 PF of peak performance, while the GPU nodes will provide approximately 30 PF of peak performance, including processors from Intel, AMD, and NVIDIA. Kestrel will help more scientists understand the nation’s power usage and research new materials, while industrial partners design and optimize their renewable energy projects.

Solution Ingredients

  • 2,304 nodes of 4th Gen Intel Xeon processors (104 cores per node) with 256 GB DDR5 
  • 10 large memory nodes with 4th Gen Intel Xeon processors and 2 TB DDR5
  • 140 GPU nodes, eight with 4th Gen Intel Xeon processors
  • 44 PF peak performance

Moving Towards Greater Sustainability

Data centers are working hard to achieve neutral carbon footprints through renewable energy sources, even while increasing their computational capacity. The 4th Gen Intel Xeon processor supports that effort with innovations that reduce power consumption while accelerating computing and making workloads run more efficiently.

For example, built-in accelerators in 4th Gen Intel Xeon processors, such as Intel® Deep Learning Boost (Intel® DL Boost) and Intel® Advanced Matrix Extensions (Intel® AMX), help improve performance per watt. Accelerated computing means workloads can be completed faster, consuming less energy. Additionally, built-in telemetry allows IT to continuously monitor, measure, and control power demand and carbon emissions. When non-HPC workloads move to the cloud, new tools like Granulate software (an Intel company) can optimize applications automatically and continuously for core count and power.

Liquid evaporative cooling is replacing chillers in many data centers. Intel has worked with partners to validate cooling fluid and solutions, helping to create reference documents for simpler cooling deployments. To that end, Intel offers the industry’s first immersion cooling warranty rider for select Intel Xeon processors.

From processor architecture to silicon manufacturing, Intel is constantly assessing sustainability for the company and to aid its customers. From improving performance per watt to more efficient cooling solutions, Intel continues to strive for sustainable technology solutions that will enable organizations to reduce the carbon footprint of their data centers.
