DAOS Storage Revolutionizes High-Performance Storage

Enabled by Intel® Optane™ persistent memory, Distributed Asynchronous Object Storage (DAOS) offers dramatic improvements to storage I/O to accelerate HPC, AI, analytics, and cloud projects.

DAOS

  • Traditional storage locks the I/O blocks that hold metadata when processing requests from multiple compute nodes. DAOS stores metadata at byte granularity in Intel® Optane™ persistent memory, eliminating the latency from I/O block locking.

  • Combined with NVMe SSDs, DAOS can take full advantage of PCIe direct connect speeds for microsecond storage I/O.

  • DAOS is helping push the boundaries of exascale HPC and powers some of the top configurations on the IO-500 list.

  • In DAOS deployments with low-cost NAND storage, persistent memory also mitigates the performance impact of write pressure.

  • Intel® Xeon® Scalable processors in storage clusters can improve DAOS performance with more memory channels and PCIe bandwidth.

Achieve High-Performance Storage with DAOS

DAOS is an open source, distributed storage solution based on several innovative principles. This technology uses the fast I/O and data persistence of Intel® Optane™ persistent memory in combination with any Non-Volatile Memory Express (NVMe) SSD, such as Intel® NVMe SSDs, to alleviate bottlenecks and drive storage performance in distributed environments.

Latency in Traditional Storage

In a traditional storage solution, metadata tells the operating system (OS) where data is located within a storage cluster. Any time the system accesses data for a read or write operation, it must also create or modify the corresponding metadata within an I/O block on the underlying storage media. In compute clusters, multiple nodes may need to access the same block, so traditional storage temporarily locks the block to prevent write conflicts. Repeated across millions of read/write operations, this locking adds significant storage latency and limits application I/O.

Microsecond Write Latencies with DAOS, Intel® Optane™ Persistent Memory, and Intel® NVMe SSDs

In a DAOS configuration, Intel® Optane™ persistent memory modules store metadata for the entire cluster by byte rather than by block, so there’s no need to lock the block as with traditional storage. NVMe SSDs further allow storage I/O to use the full bandwidth of the PCIe bus, a much wider data path than SATA SSDs offer.[1] As a result, DAOS can cut storage I/O latency by orders of magnitude compared to traditional storage, from milliseconds (ms) to tens of microseconds (μs).[2] Persistent memory also preserves metadata through system shutdowns or reboots and can absorb small write operations to help ensure system uptime and availability for stringent SLAs. In DAOS deployments with 3D QLC NAND storage, persistent memory can also help mitigate the performance impact of write pressure on the storage cluster.

Figure 1. The DAOS software and hardware stack.
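
The difference in lock granularity described above can be illustrated with a toy model. The sketch below is not DAOS code. It only counts how many concurrent metadata updates would fall into the same lock unit when that unit is a whole I/O block versus a single metadata entry. The 4,096-byte block size, the 64-byte entry size, and the function name contended_updates are assumptions chosen for illustration.

```python
from collections import Counter

BLOCK_SIZE = 4096  # assumed I/O block size in bytes (illustrative)
ENTRY_SIZE = 64    # assumed size of one metadata entry in bytes (illustrative)


def contended_updates(offsets, granularity):
    """Count updates that share a lock unit with at least one other update."""
    units = Counter(offset // granularity for offset in offsets)
    return sum(count for count in units.values() if count > 1)


# Eight compute nodes each update a different metadata entry.
# The entries are packed back to back, so they all sit in the first block.
offsets = [node * ENTRY_SIZE for node in range(8)]

block_level = contended_updates(offsets, BLOCK_SIZE)  # one lock per I/O block
entry_level = contended_updates(offsets, ENTRY_SIZE)  # one unit per entry

print(f"updates contending at block granularity: {block_level}")
print(f"updates contending at entry granularity: {entry_level}")
```

In this model, every update waits on the same block lock even though no two nodes touch the same entry. At entry granularity the updates do not overlap at all, which is the effect the byte-addressable metadata store is meant to achieve.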

Open Source Software and Validated DAOS Releases

In addition to the hardware layer, solution providers will need open source DAOS software to complete the stack. Developers can download and compile the code directly from GitHub. For a simpler deployment path, tested and validated binary releases are available through the community daos.io website. Intel actively works with partners and solution providers to enable their DAOS product offerings with L3 technical support.

  • Access the latest code from GitHub

  • Download validated binaries from daos.io

  • Review DAOS documentation on daos.io

  • Join the DAOS mailing list for the latest community updates

HPC and Big Data: DAOS Storage Changing the Future

Within big data and HPC clusters, compute nodes are tightly connected to storage tiers, and data scientists commonly deal with a variety of cold, warm, and hot data types. The future of storage configurations will depend on a hybrid approach where DAOS is attached to file systems that also use cost-optimized SATA drives. Academic and government labs are already seeing high compute utilization in HPC clusters, which is speeding up discovery.

For example, Washington University’s radiology research center deployed a software-defined storage system enabled by DAOS to accommodate up to 13 petabytes of storage while reducing cost by USD 1,500 per storage node.[3] Commercial HPC deployments also show strong potential, especially in the energy and healthcare sectors, which depend on HPC for AI, analytics, and simulation workloads. In the IO-500 benchmark, a ranking of the world’s fastest storage systems, half of the top 10 positions were held by DAOS configurations as of October 2021.[4]

DAOS Storage for Exascale Performance

DAOS is the file system of choice for the Argonne National Lab (ANL) Aurora supercomputer, the first planned HPC system targeting exascale compute performance, with up to 230 petabytes of DAOS-enabled storage at more than 25 TB/s of read/write bandwidth. ANL and the Texas Advanced Computing Center (TACC), another Intel partner running DAOS-enabled HPC, both ranked in the top five on the IO-500 list as of September 2020. These successes have also spurred interest from cloud service providers (CSPs) such as Google Cloud Platform, which is now looking to integrate DAOS into its cloud storage services.
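
To put the Aurora figures in perspective, the short calculation below estimates how long it would take to stream the full quoted capacity once at the quoted bandwidth. It is a back-of-the-envelope sketch only: it assumes decimal units (1 PB = 10^15 bytes, 1 TB = 10^12 bytes) and treats the 25 TB/s lower bound as a sustained rate, which real workloads may not achieve.

```python
CAPACITY_BYTES = 230 * 10**15        # 230 PB, decimal units assumed
BANDWIDTH_BYTES_PER_S = 25 * 10**12  # 25 TB/s, quoted as a lower bound

# Time to read or write the full capacity once at the quoted bandwidth.
seconds = CAPACITY_BYTES / BANDWIDTH_BYTES_PER_S
hours = seconds / 3600

print(f"{seconds:.0f} s, about {hours:.1f} hours")
```

Under these assumptions, a full pass over the storage tier takes a few hours rather than days.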

Low Read Latencies, Even in Presence of Write Pressure

DAOS can also have a positive impact on cost-optimized media that’s qualified for read-intensive workloads. When tested with Intel® QLC 3D NAND SSDs in place of NVMe SSDs, a DAOS configuration achieved five-nines (P99.999) read tail latencies between 200 and 300 μs, meaning that 99.999 percent of all requests were served in under 300 μs.[2] Under write pressure as high as 2,500 MB/s, the same test showed that DAOS could maintain file system SLAs, keeping five-nines latency below 5 ms.[2]
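
For readers unfamiliar with tail-latency percentiles, the sketch below shows one common way to compute a P99.999 value, using the nearest-rank method on a list of per-request latencies. The sample data is synthetic and invented for illustration, not taken from the test cited above, and the function name percentile_nearest_rank is hypothetical.

```python
def percentile_nearest_rank(samples, numerator, denominator):
    """Nearest-rank percentile, given as a fraction (numerator / denominator)."""
    ordered = sorted(samples)
    # Ceiling division in integer arithmetic avoids floating-point rounding.
    rank = (len(ordered) * numerator + denominator - 1) // denominator
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in microseconds: 199,998 requests between 200 and 299 us,
# plus two slow outliers at 4,000 us.
latencies_us = [200 + (i % 100) for i in range(199_998)] + [4000, 4000]

p99_999 = percentile_nearest_rank(latencies_us, 99_999, 100_000)
worst = max(latencies_us)
meets_sla = p99_999 < 300  # SLA threshold of 300 us, mirroring the text

print(f"P99.999 = {p99_999} us, worst case = {worst} us, SLA met: {meets_sla}")
```

A five-nines percentile tolerates one slow request in every 100,000, so the two outliers in this sample do not affect the P99.999 value even though they dominate the worst case.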

Intel® Xeon® Scalable Processors and DAOS Performance

Processor performance in a storage node directly affects DAOS performance. Generational improvements in the number of memory channels, bandwidth per channel, and PCIe speed (PCIe 4 vs. PCIe 3) offer a significant boost to DAOS. In an IOR benchmark test, a configuration with 3rd Gen Intel® Xeon® Scalable processors and Intel® Optane™ persistent memory 200 series achieved a 58 percent increase in write performance compared to previous-generation CPUs and persistent memory.[5] PCIe Gen 5 in future processor generations is expected to bring even higher performance levels to DAOS.
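
The effect of the PCIe generation can be approximated from the published per-lane transfer rates. The sketch below computes the theoretical one-direction bandwidth of a link from the transfer rate and the 128b/130b line encoding used by PCIe 3.0 and later. The four-lane (x4) link width is an assumption that is typical for NVMe SSDs, and the result ignores protocol overhead, so real devices deliver less.

```python
# Per-lane transfer rates in gigatransfers per second (GT/s).
TRANSFER_RATE_GT_S = {"PCIe 3.0": 8, "PCIe 4.0": 16, "PCIe 5.0": 32}
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding
LANES = 4                        # assumed x4 link, typical for an NVMe SSD


def link_bandwidth_gb_s(generation, lanes=LANES):
    """Theoretical one-direction link bandwidth in GB/s, before protocol overhead."""
    gigabits_per_s = TRANSFER_RATE_GT_S[generation] * ENCODING_EFFICIENCY * lanes
    return gigabits_per_s / 8  # 8 bits per byte


for generation in TRANSFER_RATE_GT_S:
    print(f"{generation} x{LANES}: {link_bandwidth_gb_s(generation):.2f} GB/s")
```

Each generation doubles the per-lane rate, so the same x4 device slot offers roughly twice the theoretical bandwidth on PCIe 4 as on PCIe 3.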

DAOS Conclusion: A New Path for Fast Storage I/O

DAOS offers a new path to storage I/O that keeps pace with growing compute performance, powering the most demanding use cases while leaving storage I/O headroom for everything else. Applications for AI, analytics, HPC, and even cloud computing can benefit from DAOS based on Intel® Optane™ persistent memory.

Frequently Asked Questions

What is DAOS?

Distributed Asynchronous Object Storage (DAOS) is an open source, distributed storage solution that uses Intel® Optane™ persistent memory for metadata access and retrieval, in combination with NVMe SSDs for high-performance data read/write speeds.

How is DAOS different from traditional storage?

Traditional distributed storage saves metadata in I/O blocks on the underlying media and locks the I/O block from changes anytime a compute node performs an operation that creates, updates, or deletes metadata. In a DAOS configuration, metadata is stored byte by byte on Intel® Optane™ persistent memory modules. DAOS storage clusters do not need to lock I/O blocks, which results in a dramatic reduction in storage I/O latency.

Can I use POSIX with DAOS?

Yes, you can use POSIX with DAOS. While DAOS does not use POSIX internally, we provide a POSIX interface, as well as key-value, HDF5, MPI-IO, Python, and other interfaces. See the daos.io documentation for more information on the tradeoffs of different DAOS interfaces.
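
As an illustration of what POSIX access means in practice, the sketch below uses only standard Python file operations, with no DAOS-specific calls. It assumes that a DAOS container has already been exposed as an ordinary directory, for example through the dfuse mount that DAOS provides. The helper name write_then_read is hypothetical, and a temporary directory stands in for the mount point so the example runs on any machine.

```python
import tempfile
from pathlib import Path


def write_then_read(mount_point, name="example.dat", payload=b"hello DAOS"):
    """Write a file under mount_point with ordinary POSIX calls and read it back."""
    path = Path(mount_point) / name
    path.write_bytes(payload)
    return path.read_bytes()


# A temporary directory stands in for the mount point here. On a real system,
# this would be the directory where the DAOS container is mounted (assumption).
with tempfile.TemporaryDirectory() as stand_in_mount:
    result = write_then_read(stand_in_mount)

print(result)
```

Because the application sees a normal directory, existing code that reads and writes files needs no changes. The native interfaces trade that convenience for closer control over how data is laid out and accessed.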

Performance varies by use, configuration, and other factors. Learn more at intel.com/PerformanceIndex.

Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure.

Product and Performance Information

[1] Configurations: Performance claims obtained from data sheet, sequential read/write at 128K block size for NVMe and SATA, 64K for SAS. Intel® SSD DC P3700 Series 2 TB, SAS Ultrastar SSD1600MM, Intel® SSD DC S3700 Series SATA 6 Gbps. Intel® Core™ i7-3770K CPU @ 3.50 GHz, 8 GB of system memory, Windows Server 2012, IOMeter. Random performance is collected with four workers each with 32 QD.
[2] “Achieve High-Performance Storage with DAOS,” intel.com, June 2021.
[3] “Washington University: Deploying High-Speed Storage with Intel® Technologies,” intel.com, October 2021.
[4] IO500.org as of October 2021.
[5] Test by Intel® as of October 15, 2020. Baseline configuration: Platform: S2600WFO, 1 node with 2x8260L Platinum Intel® Xeon® 2nd Gen Scalable CPUs, microcode 0x400002c, HT and turbo on, performance mode, system BIOS SE5C620.86B.02.01.0008.031920191559, PMem Firmware 01.00.00.5127, system DRAM config 12 slots / 16 GB / 2666 (192 GB total memory), system PMem Config 12 slots / 512 GB / 2666 (6 TB total PMem), 1x Intel® SATA SSD, 2x Intel® OPA100 NIC, PCH Intel® C621, OS openSUSE Leap 15.2, Kernel 5.3.18-lp152.44-default, workload DAOS 1.1.0. New configuration: Platform: WLYDCRB1, 1 node with 2x ICX-24C Intel® Xeon® 3rd Gen Scalable CPUs (Ice Lake preproduction), microcode 0x8b000260, HT and turbo on, performance mode, system BIOS WLYDCRB1.SYS.0017.D75.2007020055, PMem Firmware 02.01.00.1110, system DRAM Config 16 slots / 16 GB / 3200 run at 2933 (256 GB total memory), system PMem config 16 slots / 512 GB / 2933 (8 TB total PMem), 1x Intel® SATA SSD, 2x Intel® OPA100 NIC, PCH Intel® C621, OS openSUSE Leap 15.2, Kernel 5.3.18-lp152.44-default, workload DAOS 1.1.0.