Service providers and end users worldwide are seeing the benefits of artificial intelligence (AI) as machine learning algorithms are increasingly used to process the world’s data and enhance our digital services. Using AI to make the most of the data opportunity requires a complete workflow, from data science workstations up to the cloud and eventually out to inference devices – not only for processing data, but for moving and storing it as well.
Inference is How End Users Interact with AI
Today, we’re moving from training models to deploying them in the real world. Previously, much of the work in AI focused on training – refining the model behind the application you hope to deliver. However, end users of AI services don’t experience training. They experience inference – the rendering of the AI service.
Services often must render results rapidly to be relevant to their end users – whether those are medical professionals, research scientists, or consumers of voice recognition services. As a result, we see more inference in local servers and Internet of Things (IoT) devices at the edge, driven by the need for low-latency, real-time inference results, in addition to the inference performed in the cloud on less time-sensitive data.
Real Value from Data
At Data-Centric Innovation Day, we are excited to highlight several AI deployments delivering rapid, real-world results for a seamless user experience by drawing on the latest additions to Intel’s diverse silicon portfolio, such as 2nd Generation Intel® Xeon® Scalable processors and Intel® Optane™ DC persistent memory.
The Texas Advanced Computing Center (TACC) will use Intel® Xeon® Platinum 8200 processors to power its Frontera system, supporting multi-faceted advanced research for the National Science Foundation. At Data-Centric Innovation Day, we are excited to share that Frontera will also incorporate more than 100 terabytes of Intel Optane DC persistent memory, the first installation of the technology at this scale. This store of persistent memory in close proximity to performant compute will enable simulations, AI algorithms, and in-memory analytics of unprecedented complexity. Frontera will help to reveal what’s possible with massively parallel AI inference on high-performance computing systems. We eagerly look forward to the discoveries that Frontera will produce.
iFLYTEK is one of the most innovative companies in the People’s Republic of China and supports a variety of voice-based products in industries like communications, music, and intelligent toys. Customers will turn elsewhere if the company can’t process its daily volume of six billion voice recognition transactions expediently. iFLYTEK faces the continual challenge of expanding data center capacity to keep up with increasing customer demand, with total cost of ownership as one of their primary concerns. Adding to this challenge is iFLYTEK’s ongoing expansion into new businesses such as education and medical diagnostics.
For several years, iFLYTEK has actively migrated more of its business to Intel architecture, including 2nd Gen Intel Xeon Scalable processors with Intel® Deep Learning Boost (Intel DL Boost). The AI giant’s reliance on Intel underscores the ability of Intel solutions to deliver leading AI products to end users in a cost-effective manner.
A Comprehensive AI Portfolio
Data-Centric Innovation Day features the debuts of technologies that will be under the hood of systems running complex AI workloads alongside the traditional data center and cloud applications at which Intel Xeon Scalable processor-based systems excel.
- With the addition of Intel DL Boost – essentially an AI inference accelerator built into the CPU – 2nd Generation Intel Xeon Scalable processors have demonstrated AI inference throughput increases of 14x in comparison to Intel Xeon Scalable processors at their launch in July 2017.
- Intel Optane DC persistent memory fundamentally re-architects the storage pyramid, bringing large amounts of persistent memory closer to compute than ever before. It facilitates current AI use cases like high-content image analysis and breaks memory bottlenecks for the most complex current and future deep learning applications.
- The Intel® Optane™ DC SSD P4800X series of solid state drives offers reduced data access latency for drives under write pressure, enabling rapid data analysis even as more raw data and results are stored, and increasing data availability for mission-critical applications.
- Intel® Select Solutions for AI Inferencing take the guesswork out of AI hardware selection and deployment, assembling all of the components needed to deliver an enterprise-ready inferencing system. With 2nd Gen Intel Xeon Scalable processors with Intel DL Boost, Intel Optane SSDs, and the Intel® Distribution of OpenVINO™ toolkit, these solutions maximize the capabilities of Intel architecture for AI inference, all within a package that makes it easier for customers to select the right system to suit their inference needs.
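To see why Intel DL Boost speeds up inference, consider its core instruction, AVX-512 VNNI `vpdpbusd`: it fuses the multiply of four unsigned 8-bit activations with four signed 8-bit weights and their accumulation into a 32-bit sum – work that previously took three separate instructions – into a single operation. The sketch below is an illustrative NumPy emulation of one 32-bit accumulator lane, not actual SIMD code:

```python
import numpy as np

def vpdpbusd_emulated(acc, a_u8, b_s8):
    """Emulate one 32-bit lane of the AVX-512 VNNI vpdpbusd instruction:
    multiply four unsigned 8-bit activations by four signed 8-bit weights,
    sum the products, and add the result to a 32-bit accumulator.
    (Illustrative model only -- the real instruction does this for
    16 lanes, 64 byte pairs, in a single cycle-level operation.)"""
    products = a_u8.astype(np.int32) * b_s8.astype(np.int32)
    return acc + int(products.sum())

# Hypothetical quantized values: activations are unsigned INT8,
# weights are signed INT8, as in typical INT8-quantized CNN layers.
activations = np.array([12, 200, 7, 33], dtype=np.uint8)
weights = np.array([3, -1, 25, -8], dtype=np.int8)

acc = vpdpbusd_emulated(0, activations, weights)
print(acc)  # 12*3 + 200*(-1) + 7*25 + 33*(-8) = -253
```

Because weights and activations stay in 8 bits instead of 32, each vector register holds four times as many values per instruction, which is where much of the INT8 inference throughput gain comes from.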
The results customers are realizing with versatile, efficient, performant Intel architecture, especially when dealing with very large workloads, demonstrate again and again that real-world AI solutions require systems able to balance the need to move, process, and store larger and larger quantities of data. It's not only about whether you have the right processors, accelerators, and storage. It's how you balance the entire system between compute, acceleration, memory, memory access, and interconnect.
AI Results at the Speeds End Users Demand
Ultimately, AI service providers will succeed or fail based on the quality of experience provided to their own end customers. In use cases like those discussed here, the speed of AI inference and accuracy of results delivered will determine whether a product is relevant or irrelevant to end users. The products Intel announced at Data-Centric Innovation Day will enable complete AI systems for scalable, deployable, real-world results at the speeds end users require. For more on Intel’s product portfolio for AI, please visit www.intel.ai.
 14x inference throughput improvement on Intel® Xeon® Platinum 8280 processor with Intel® DL Boost: Tested by Intel as of 2/20/2019. 2 socket Intel® Xeon® Platinum 8280 Processor, 28 cores, HT On, Turbo ON, Total Memory 384 GB (12 slots / 32GB / 2933 MHz), BIOS: SE5C620.86B.0D.01.0271.120720180605 (ucode: 0x200004d), Ubuntu 18.04.1 LTS, kernel 4.15.0-45-generic, SSD 1x sda INTEL SSDSC2BA80 SSD 745.2GB, nvme1n1 INTEL SSDPE2KX040T7 SSD 3.7TB, Deep Learning Framework: Intel® Optimization for Caffe version 1.1.3 (commit hash: 7010334f159da247db3fe3a9d96a3116ca06b09a), ICC version 18.0.1, MKL DNN version v0.17 (commit hash: 830a10059a018cd2634d94195140cf2d8790a75a), model:
https://github.com/intel/caffe/blob/master/models/intel_optimized_models/int8/resnet50_int8_full_conv.prototxt, BS=64, synthetic data, 4 instances/2 sockets, Datatype: INT8. Vs. Tested by Intel as of July 11th 2017: 2S Intel® Xeon® Platinum 8180 CPU @ 2.50GHz (28 cores), HT disabled, turbo disabled, scaling governor set to “performance” via intel_pstate driver, 384GB DDR4-2666 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3700 Series (800GB, 2.5in SATA 6Gb/s, 25nm, MLC). Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine,compact', OMP_NUM_THREADS=56, CPU Freq set with cpupower frequency-set -d 2.5G -u 3.8G -g performance. Caffe: (http://github.com/intel/caffe/), revision f96b759f71b2281835f690af267158b82b150b5c. Inference measured with “caffe time --forward_only” command, training measured with “caffe time” command. For “ConvNet” topologies, a synthetic dataset was used. For other topologies, data was stored on local storage and cached in memory before training. Topology specs from
https://github.com/intel/caffe/tree/master/models/intel_optimized_models (ResNet-50). Intel C++ compiler ver. 17.0.2 20170213, Intel MKL small libraries version 2018.0.20170425. Caffe run with “numactl -l”.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/benchmarks.
Performance results are based on testing or projections as of 7/11/2017 to 4/1/2019 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure. Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice (Notice Revision #20110804).
The benchmark results may need to be revised as additional testing is conducted. The results depend on the specific platform configurations and workloads utilized in the testing, and may not be applicable to any particular user's components, computer system or workloads. The results are not necessarily representative of other benchmarks and other benchmark results may show greater or lesser impact from mitigations.
Intel® Advanced Vector Extensions (Intel® AVX)* provides higher throughput to certain processor operations. Due to varying processor power characteristics, utilizing AVX instructions may cause a) some parts to operate at less than the rated frequency and b) some parts with Intel® Turbo Boost Technology 2.0 to not achieve any or maximum turbo frequencies. Performance varies depending on hardware, software, and system configuration and you can learn more at http://www.intel.com/go/turbo.
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.
Intel, the Intel logo, Xeon, Optane, and OpenVINO are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. © Intel Corporation.