Accelerating Data Analytics and AI

Published: 05/24/2019  

Last Updated: 12/23/2019

By Mark L Skarpness, Ziya Ma

In 2017, the average Fiber to the Home (FTTH) household generated 85 GB of Internet traffic per month, and it is expected to generate approximately 264 GB per month in 2022 [1]. For comparison, a smart car will generate 50 GB, a smart hospital 3,000 GB, a plane 40 TB, and a city safety system 50 PB, each in a single day [2]. And these predictions are for 2019; by 2022 there will be 3x more connected devices (28.5 billion [1]) than the global population, which means 3x more traffic. The quantity of data generated is difficult to comprehend, much less make actionable.
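To put these figures on a common scale, the arithmetic can be sketched in a few lines of Python. This is only an illustration: treating both household figures as monthly volumes, assuming steady compound growth between 2017 and 2022, and using decimal units (1 TB = 1,000 GB) are assumptions made here, not statements from the cited reports.

```python
# Household figures quoted above. Treating both as monthly volumes and
# assuming steady compound growth over the period are assumptions.
GB_2017 = 85.0
GB_2022 = 264.0
YEARS = 2022 - 2017

growth_factor = GB_2022 / GB_2017
annual_rate = growth_factor ** (1 / YEARS) - 1

# Daily volumes quoted above, normalized to gigabytes (decimal units assumed).
GB_PER_TB = 1_000
GB_PER_PB = 1_000_000
daily_gb = {
    "smart car": 50,
    "smart hospital": 3_000,
    "plane": 40 * GB_PER_TB,
    "city safety system": 50 * GB_PER_PB,
}

print(f"Household growth 2017-2022: {growth_factor:.1f}x "
      f"(about {annual_rate:.0%} per year)")
for source, gb in daily_gb.items():
    print(f"{source:>20}: {gb:>12,} GB/day")
```

Under these assumptions, household traffic roughly triples over five years, while a single city safety system produces about a million times the daily volume of a smart car.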

This exponential growth in volume and variety of data provides enterprises a tremendous opportunity to gain a competitive edge through analytics-driven insights. Those who turn the mountains of information into actionable intelligence will be positioned to make business operations more efficient, drive faster innovation, and deliver improved security.

With this goal in mind, Intel released a Data Analytics Reference Stack to help enterprises analyze, classify, recognize, and process large amounts of data. Using a modern system stack such as this, built on Intel® Xeon® Scalable platforms and featuring software optimizations at each layer, enterprise customers and developers can gain a significant performance boost, from hardware up to the application layer.

The Data Analytics Reference Stack is containerized software that integrates industry-leading components: Clear Linux*, Open Java Development Kit* (OpenJDK), Intel® Math Kernel Library (Intel® MKL), open source Basic Linear Algebra Subprograms (OpenBLAS), Apache Hadoop*, and Apache Spark*.

This stack is ready to use and gives application developers and architects a powerful way to store and process large amounts of data with a distributed processing framework, so they can efficiently build big-data solutions and solve domain-specific problems. Having a streamlined system stack frees users from the complexity of integrating multiple components and software versions, and delivers a stable, performant platform upon which to quickly develop, test, and deploy solutions.

The Deep Learning Reference Stack is an integrated, highly performant open source stack that's optimized for Intel® Xeon® Scalable processors. This stack includes support for Intel® Deep Learning Boost (Intel® DL Boost), which uses the Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Vector Neural Network Instructions (VNNI) and is designed to accelerate AI deep learning use cases such as image recognition, object detection, speech recognition, and language translation. This latest technology significantly increases deep learning inference performance over previous processor generations.
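To give a sense of what the VNNI instructions compute, here is a minimal pure-Python sketch of the core operation: multiplying groups of four unsigned 8-bit values by four signed 8-bit values and adding the products into a 32-bit accumulator. It emulates the arithmetic of a single 32-bit lane only. It is not how the Deep Learning Reference Stack calls the hardware, and the function names `dot4_u8s8_accumulate` and `int8_dot` are hypothetical names chosen for this illustration.

```python
def dot4_u8s8_accumulate(acc, a, b):
    """Emulate one 32-bit lane of a VNNI-style multiply-accumulate.

    a: four unsigned 8-bit values (for example, quantized activations)
    b: four signed 8-bit values (for example, quantized weights)
    acc: signed 32-bit accumulator
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("expected exactly four values per operand")
    if not all(0 <= x <= 255 for x in a):
        raise ValueError("a must hold unsigned 8-bit values")
    if not all(-128 <= w <= 127 for w in b):
        raise ValueError("b must hold signed 8-bit values")
    total = acc + sum(x * w for x, w in zip(a, b))
    # Wrap to signed 32-bit, matching the non-saturating form.
    return (total + 2**31) % 2**32 - 2**31


def int8_dot(activations, weights):
    """Dot product of two int8 vectors, processed four elements at a time."""
    if len(activations) != len(weights) or len(activations) % 4 != 0:
        raise ValueError("vectors must have equal length, a multiple of 4")
    acc = 0
    for i in range(0, len(activations), 4):
        acc = dot4_u8s8_accumulate(acc, activations[i:i + 4], weights[i:i + 4])
    return acc


activations = [12, 200, 7, 0, 255, 3, 90, 41]   # unsigned 8-bit
weights = [-3, 5, 127, -128, 1, -1, 2, 0]       # signed 8-bit
print(int8_dot(activations, weights))
```

On supporting processors, each group of four multiplies and the accumulate are performed by one vector instruction across many lanes at once, which is why quantized (8-bit) inference benefits from this hardware.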

Coupled with the AI capabilities of the Deep Learning Reference Stack and the deep learning inference capabilities built into Intel® Xeon® Scalable processors, the Data Analytics Reference Stack enables faster input, storage, and analysis of large data sets. Together, these stacks deliver higher performance for analytics and AI.

Visit the Clear Linux* Stacks page to learn more, download the Data Analytics Reference Stack and the Deep Learning Reference Stack code, and contribute feedback. To join the Clear Linux community, sign up for our developer mailing list.

We want to hear how the community uses these stacks together for analytics and AI applications. Continue to send us your ideas for further enhancements.


  1. Service Provider Visual Networking Index (VNI)
  2. Cisco Global Cloud Index 2014–2019


Ziya Ma, vice president of Architecture, Graphics and Software, and director of Data Analytics Technologies at Intel.


Mark Skarpness, vice president of System Software Products and director of Data-Centric System Stacks at Intel.
