Building a Modern AI Data Storage Pipeline

Modern storage infrastructures need an AI data pipeline that can deliver at every stage. See how information moves through the four stages of the AI data pipeline and where Intel® Optane™ SSDs and 3D NAND SSDs play a critical role.

The rapid evolution of AI solutions and Intel's technology leadership provide an opportunity for businesses to gain deeper insights and increase competitiveness by unlocking the full potential of their data. AI innovators are adopting a shared storage pipeline architecture: performance-optimized and capacity-optimized devices supporting a mixed-usage environment of AI alongside other applications, such as business analytics, reporting, and web services.

At Ingest, structured and unstructured data is received from multiple sources, and increasingly much of it arrives as real-time streams. An ingest buffer that sustains high write throughput while de-staging to capacity storage is important for keeping sources online. Data preparation steps include labeling, compression, de-duplication, transformation, and cleansing. Data prep can consume up to 80% of total pipeline time, so storage capable of supporting high-throughput mixed workloads is key to reducing it.
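To make the prep stage concrete, here is a minimal sketch of two of the steps named above, de-duplication and cleansing, over a toy record stream. The record fields, hash-based dedup strategy, and required-field check are illustrative assumptions, not part of any Intel tooling.

```python
# Illustrative sketch of data-prep steps: de-duplication and cleansing.
# Record shapes and field names are assumptions for this example only.
import hashlib

def dedupe(records):
    """Drop exact-duplicate records by hashing their serialized form."""
    seen, unique = set(), []
    for rec in records:
        digest = hashlib.sha256(repr(sorted(rec.items())).encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(rec)
    return unique

def cleanse(records, required=("id", "value")):
    """Keep only records carrying all required fields with non-null values."""
    return [r for r in records if all(r.get(k) is not None for k in required)]

raw = [
    {"id": 1, "value": 10.0},
    {"id": 1, "value": 10.0},   # exact duplicate -> dropped
    {"id": 2, "value": None},   # missing value  -> cleansed out
]
prepared = cleanse(dedupe(raw))  # -> [{"id": 1, "value": 10.0}]
```

Both steps are read-and-rewrite passes over the full data set, which is why this stage rewards storage that handles mixed read/write workloads at high throughput.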

Training is a compute-intensive phase in which training runs are completed on randomized data sets. Underperforming storage can limit the number of runs completed or the amount of data per run, reducing both compute utilization and model accuracy.
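The randomization mentioned above is typically per-run reshuffling: each training run (epoch) draws batches from the data set in a fresh random order, which is what produces the random read patterns that stress storage. A minimal sketch, with a toy data set and batch size chosen purely for illustration:

```python
# Illustrative sketch of per-epoch data randomization for training runs.
# The data set and batch size are assumptions for this example only.
import random

def batches(dataset, batch_size, seed):
    """Yield batches of the data set in a freshly shuffled order."""
    order = list(range(len(dataset)))
    random.Random(seed).shuffle(order)   # new random access order each epoch
    for start in range(0, len(order), batch_size):
        yield [dataset[i] for i in order[start:start + batch_size]]

data = list(range(10))
epoch0 = list(batches(data, batch_size=4, seed=0))
epoch1 = list(batches(data, batch_size=4, seed=1))
# Every epoch still covers the full data set, just in a different order.
```

Because each epoch touches every sample in an unpredictable order, storage that delivers high throughput under random access directly determines how many runs fit into a training budget.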

Inference involves reading trained models and the data from Ingest. Fast response times are increasingly important as expectations for real-time results become more prevalent. Through all phases of the storage pipeline, Intel Optane SSDs help improve AI outcomes and efficiency by moving data at high throughput across variable access patterns with predictable low latency. And capacity-optimized Intel PCIe 3D NAND SSDs help improve the space and operational efficiency of storing ever-increasing volumes of data.
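The inference pattern described above can be sketched as: read the trained model from storage once at startup, then score incoming records against it. The pickled linear "model" and its file name here are stand-in assumptions to show the shape of the stage, not a real serving stack.

```python
# Illustrative sketch of the inference stage: load a trained model from
# storage once, then apply it to incoming records. The linear model and
# file layout are assumptions for this example only.
import os
import pickle
import tempfile

model = {"weights": [0.5, -0.2], "bias": 1.0}   # stand-in trained model

# Persist, then reload as an inference service would at startup.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)
with open(path, "rb") as f:
    loaded = pickle.load(f)

def predict(features, m=loaded):
    """Score one incoming record: dot product of weights plus bias."""
    return sum(w * x for w, x in zip(m["weights"], features)) + m["bias"]

score = predict([2.0, 3.0])   # 0.5*2.0 - 0.2*3.0 + 1.0 = 1.4
```

The model read happens once, but every request still pulls fresh input data from the ingest path, so low, predictable read latency on that path is what keeps response times fast.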

To find out how Intel can modernize your AI storage pipeline, go to