Intel® Memory Failure Prediction uses machine learning to send alerts about
potential memory failures before the hardware fails, reducing the impact of downtime.
Business
Meituan, founded in March 2010, is a
company based in China that offers an
online delivery and social commerce
platform providing discounts for movie
tickets, groceries, food, restaurants,
entertainment and ...health/fitness
products and services.
Figure 1. Meituan Beijing Headquarters
Challenges
• Real-time visibility into server memory health
• Predicting catastrophic server memory failures before they happen
Solution
• Intel® Memory Failure Prediction
Executive Summary
Meituan-Dianping (Meituan), a leading Chinese e-commerce platform for services,
set up Intel® Memory Failure Prediction (Intel® MFP) in a test deployment across
several thousand servers based on Intel® Xeon® Scalable Processors to help
improve the performance and reliability of its server memory, which is essential to
fast data analytics computing.
Meituan deployed Intel® MFP in its data center, integrating it into its existing
management solutions to take advantage of its memory analysis and predictive
capabilities. The aim is to analyze and model server memory-failure data in
order to predict potential failures, prevent downtime, and optimize Meituan's
current Dual Inline Memory Module (DIMM) upgrade.
The Intel® MFP deployment improved memory reliability through predictions
based on analysis of micro-level memory failure logs. Intel® MFP allowed
data center staff to migrate workloads before catastrophic memory failures could
happen, use page offlining policies to isolate unreliable memory cells or pages, or
replace failing DIMMs before they reached a terminal stage, thus reducing downtime
by responding appropriately before a server failure occurs.
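The choice between isolating pages and replacing a DIMM can be illustrated with a minimal sketch. This is not Intel® MFP's method, which relies on machine-learning models. It is a plain threshold heuristic, and the names `CorrectedError` and `suggest_action`, the record fields, and the threshold values are all hypothetical, chosen only to show how per-row error logs might map to a response.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class CorrectedError(NamedTuple):
    """One corrected-error log record (hypothetical, simplified fields)."""
    dimm: str
    bank: int
    row: int


def suggest_action(errors: Iterable[CorrectedError], dimm: str,
                   row_threshold: int = 3, spread_threshold: int = 4) -> str:
    """Suggest a response for one DIMM using illustrative thresholds.

    The thresholds are assumptions for this sketch, not vendor guidance.
    """
    # Count corrected errors per (bank, row) location on the given DIMM.
    rows = Counter((e.bank, e.row) for e in errors if e.dimm == dimm)
    if not rows:
        return "none"
    # Errors spread over many distinct rows suggest a wider fault.
    if len(rows) >= spread_threshold:
        return "replace_dimm"
    # Repeated errors in a few rows can be isolated by offlining pages.
    if any(count >= row_threshold for count in rows.values()):
        return "offline_pages"
    return "monitor"
```

For example, three corrected errors that all hit the same row would yield `"offline_pages"`, while errors scattered over four or more distinct rows would yield `"replace_dimm"`. A production system would replace these fixed thresholds with a trained model's prediction.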
“We would like to thank Intel for the Memory Failure Prediction collaboration with
Meituan,” said Rui Guo, leader of Infrastructure/Server technology at Meituan.
“The testing results indicate that, with Intel® MFP’s prediction capabilities, it could
significantly reduce server hardware failures by up to 40 percent.”
Case Study | Intel® Memory Failure Prediction Enhances Reliability And Eradicates Impact of Memory Failure
Background
Meituan-Dianping, a leading Chinese e-commerce platform
for services, offers Meituan, Dianping, Takeaway, Taxi, Mobike,
and other well-known apps to customers. Services include
catering, takeaway, taxi, bike, and hotels. There are more than
200 categories, such as tourism, film, and entertainment, and
the business covers 2,800 cities. To remain successful and
competitive, Meituan has to be able to rely on the health
of its data center infrastructure and predict failures so that
it can act proactively.
Memory failures are one of the top three hardware failures
that occur in data centers today. Using machine learning to
analyze real-time memory health data makes it possible
to predict such failures ahead of time, which ultimately
translates to a better experience for Meituan's customers.
This is why Meituan deployed Intel® MFP in a test
environment containing several thousand servers with
Intel® Xeon® Scalable Processors. The company integrated Intel® MFP
into its existing data center monitoring solution and was
able to gain greater insight into server memory health.
Intel® MFP is an ideal solution for organizations, such as
online services platforms and cloud service providers,
that rely heavily on server hardware reliability, availability,
and serviceability. The solution helps to significantly
reduce memory failure events by analyzing data and then
predicting catastrophic events before they happen.
Intel® MFP Provides Real-time Memory Health Visibility
Intel® MFP uses machine learning to analyze server memory
errors down to the DIMM, bank, column, row, and cell levels
to generate
Notices and Disclaimers
Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.
Memory failure prediction results provided through the use of Intel MFP are estimated and may vary based on differences in system hardware, software, or configuration. Results are derived using multi-dimensional models and algorithms to predict potential memory failures and do not constitute a representation or guarantee regarding memory failure.