Deep Learning with Analytic Zoo Optimizes Mastercard* Recommender AI Service

Published: 03/04/2019  

Last Updated: 03/04/2019

Co-author: Suqiang Song of Mastercard*.

This article introduces a joint initiative between Mastercard* and Intel in building users-items propensity models for a universal recommender AI service. Analytic Zoo [1] is a unified analytics and AI platform that seamlessly unites Apache Spark*, TensorFlow*, Keras*, and BigDL [2] programs into an integrated pipeline that can transparently scale out to large Apache Hadoop* and Spark clusters for distributed training or inference.

In the finance industry, a users-items propensity model estimates the probability that a consumer will buy from a particular merchant or retailer within a given industry. Such a model can be used to generate market research insights or to deliver personalized recommendations of relevant financial products or merchant deals. Using deep learning-based neural recommendation models built on Spark, the recommender system can play an essential role in improving the consumer experience, campaign performance, and the accuracy of targeted marketing offers and programs with relevant messages that encourage loyalty and rewards. This article uses a personalized marketing business use case as the running example and focuses on predicting users-items propensity from formatted credit card transactions. The use case has the following goals and requirements:

  1. The increased opportunity for higher return on investment (ROI) with offer matching and linking is shaping campaign design and marketing management strategies.
  2. For each target item (such as merchants, categories, geographical locations), estimate the propensity for all consumers to make a purchase within the next several days/weeks, and provide a ranked list of consumers as candidates. Similarly, the model can also recommend a ranked list of items for each of the consumers.
  3. The data engineering and deep learning pipeline should be able to run on top of existing enterprise Apache Hadoop clusters (with Spark services) in a limited time frame to produce the users-items propensity model.
  4. Model serving as an AI service: A universal recommender AI service that can integrate with existing applications/services at different serving contexts, such as real-time, streaming, and batch.
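
The ranking goal in item 2 can be illustrated with a minimal sketch. It assumes a trained model has already produced a propensity score for every consumer-item pair; the scores, identifiers, and function names below are hypothetical and are not part of Analytic Zoo.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical propensity scores: (consumer_id, item_id) -> purchase probability.
scores: Dict[Tuple[str, str], float] = {
    ("c1", "m1"): 0.90, ("c1", "m2"): 0.20,
    ("c2", "m1"): 0.40, ("c2", "m2"): 0.70,
    ("c3", "m1"): 0.65, ("c3", "m2"): 0.10,
}


def _top_k(pairs: List[Tuple[str, float]], top_k: int) -> List[str]:
    """Return the ids with the highest scores, best first."""
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return [identifier for identifier, _ in ranked[:top_k]]


def rank_consumers_per_item(scores, top_k: int) -> Dict[str, List[str]]:
    """For each target item, list the consumers most likely to purchase."""
    by_item = defaultdict(list)
    for (consumer, item), probability in scores.items():
        by_item[item].append((consumer, probability))
    return {item: _top_k(pairs, top_k) for item, pairs in by_item.items()}


def rank_items_per_consumer(scores, top_k: int) -> Dict[str, List[str]]:
    """For each consumer, list the items with the highest propensity."""
    by_consumer = defaultdict(list)
    for (consumer, item), probability in scores.items():
        by_consumer[consumer].append((item, probability))
    return {consumer: _top_k(pairs, top_k) for consumer, pairs in by_consumer.items()}


candidates = rank_consumers_per_item(scores, top_k=2)
recommendations = rank_items_per_consumer(scores, top_k=1)
```

Both views are derived from the same score table, which is why a single propensity model can serve campaign targeting (consumers per item) and personalization (items per consumer).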


Mastercard, as a leading global provider of payment solutions, is integrating artificial intelligence (AI) into its platform to serve its customers better. Running Analytic Zoo, which supports BigDL on Spark running on large Intel® Xeon® Scalable processor clusters, is an ideal solution that meets enterprise requirements for deep learning, as it allows users to develop and run deep learning applications in production directly on existing big data (Apache Hadoop/Spark) infrastructure. In contrast, deploying GPU-based solutions in enterprises poses many challenges (e.g., poor tool integration, expensive data duplication and movement, time-consuming and engineering-intensive setup, limited monitoring, and a steep learning curve), as they are incompatible with existing data analytics infrastructure.

Deep learning can play an important role in driving a higher ROI through marketing campaign effectiveness. For this reason, greater emphasis is placed on sharper insights into consumer behavior to connect with customers according to their interests and preferences. For instance, an offer from a merchant is most effective if it can be sent to consumers with the highest purchase potential. Conventional machine learning algorithms played a vital role in previous solutions. However, the industry is seeking a more robust solution with simplified procedures that handles model complexity, reduces labor-intensive feature engineering, and delivers greater accuracy. Recently, many deep learning-based neural recommendation models have been proposed to further improve the effectiveness of marketing campaigns.

Overview of Recommender System

A recommender system (RS) is an information filtering tool that guides users in a personalized way to discover their preferences from a large space of possible options. It is a critical tool to promote sales and services for many websites and mobile applications. For instance, 80 percent of movies watched on Netflix* came from recommendations [3], and 60 percent of video clicks came from home page recommendations on YouTube* [4]. Recent advances in deep learning-based recommender systems have gained significant attention by overcoming obstacles of conventional models and achieving high recommendation quality [5].

Recommendation models can be classified into three categories: collaborative filtering, content-based, and hybrid systems. Collaborative filtering makes recommendations by learning from historical user-item interactions, through either explicit feedback (e.g., a user’s previous ratings) or implicit feedback (e.g., purchase histories). Due to data constraints, this use case applies collaborative filtering to implicit feedback data.
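
As a minimal sketch of what implicit feedback means here, the following example derives positive user-item interactions from purchase records. The record layout, identifiers, and the `min_purchases` threshold are assumptions for illustration only; the actual transaction format is not described in this article.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Hypothetical, simplified transaction records: (consumer_id, merchant_id, amount).
transactions = [
    ("c1", "m1", 12.50), ("c1", "m1", 30.00), ("c1", "m2", 8.00),
    ("c2", "m2", 99.90), ("c3", "m1", 5.25),
]


def implicit_interactions(
    transactions: Iterable[Tuple[str, str, float]], min_purchases: int = 1
) -> Set[Tuple[str, str]]:
    """Treat a (consumer, merchant) pair as a positive interaction when the
    consumer purchased from the merchant at least `min_purchases` times."""
    counts = Counter((consumer, merchant) for consumer, merchant, _ in transactions)
    return {pair for pair, count in counts.items() if count >= min_purchases}


positives = implicit_interactions(transactions)
repeat_buyers = implicit_interactions(transactions, min_purchases=2)
```

Unlike explicit ratings, such data contains no negative signal: the absence of a purchase does not prove a lack of interest, which is why negative sampling is needed when training.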

Benchmark Traditional Machine Learning and Deep Learning

Benchmark Overview

As an integrated analytics and AI platform running natively on Spark, Analytic Zoo meets the standard requirements for enterprise deep learning applications:

  • Analyze a large amount of data on the same big data clusters where the data is stored (Hadoop Distributed File System (HDFS), Apache HBase*, Apache Hive*, etc.) rather than move or duplicate data.
  • Add deep learning capabilities to existing analytic applications and machine learning workflows rather than rebuild all of them.
  • Leverage existing big data clusters and infrastructure (resource allocation, workload management, and enterprise monitoring).
  • Reduce feature engineering workloads. Deep learning algorithms generate a large number of hidden embedding features and perform internal feature selection and optimization automatically during cross-validation at the training stage. When building the model, the algorithms focus on only a few pre-defined sliding features and custom overlap features, which removes most of the loan-to-value (LTV) pre-calculation work and saves hours of time and many resources.
  • Automate model optimization. The traditional machine learning (ML) approach relies heavily on human machine learning experts to optimize the model. Analytic Zoo provides more options for finding a robust, optimally performing configuration.
  • Incur no additional deployment or operation costs, since Analytic Zoo runs as a standard Spark program on Intel Xeon processors.
  • Enable high-level pipeline APIs, such as data frames, ML pipelines, autograd, transfer learning, and Keras/Keras2.

Considering Mastercard has run traditional machine learning for decades for similar models and has spent resources on the Spark ML ecosystem, such as Spark MLlib, the business stakeholders wanted to benchmark the two approaches and identify the differences. So, a benchmark test was conducted between traditional Spark machine learning and the BigDL models in Analytic Zoo.

Select data sets:

The dataset was collected from a specific channel over the past three years.

  • Distinct, qualified consumers: 675,000
  • Target merchants (offers or campaigns) for benchmark: 2000
  • Known transactions: 1.4 billion (53 GB of raw data)
  • Time span: 12 - 24 months of data for training and 1 - 2 months for validation

Production Environment Hadoop Cluster:

  • 9-node cluster (3 host master node (HMN) nodes, 6 Hortonworks Data Platform (HDP) nodes), with each node running in its own physical box
  • 24 hyper cores, 384 GB memory, 21 TB disk
  • Hadoop distribution: Cloudera Distributed Hadoop (CDH) 5.12.1
  • Spark version: 2.2
  • Java* Platform, Standard Edition Development Kit (JDK*) 1.8

Benchmark libraries:

  • Analytics-zoo-bigdl_0.6.0-spark_2.2.0
  • Spark MLlib 2.2.0

For the traditional machine learning approach, the Alternating Least Squares (ALS) [6] implementation in Spark MLlib was chosen.

For the deep learning approach, based on the latest research and industry practice, a Neural Collaborative Filtering (NCF) and a wide and deep (WAD) model were chosen as the two candidates for the recommender. Keras-style APIs from Analytic Zoo were also used to build deep learning models with Python* and Scala*.

Figure 1. Comparison of deep learning models with the ALS model

Deep Learning Model Elaborations

Neural Collaborative Filtering (NCF) Model

The simple, generic NCF model, first proposed by Xiangnan He et al. [7], is designed to serve as a guideline for developing deep learning methods for recommendation services, aiming to capture the non-linear relationship between users and items. Because there are a large number of unobserved instances, NCF uses negative sampling to reduce the training data size, which significantly improves learning efficiency. Traditional matrix factorization can be viewed as a special case of NCF. With Analytic Zoo, users can easily build an NCF model as shown in the following figure.

Figure 2. Sample of a Neural Collaborative Filtering (NCF) model
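
The negative sampling step can be sketched in plain Python. This is an illustrative sketch, not the Analytic Zoo implementation: for every observed (positive) interaction it draws a fixed number of items the user has not interacted with and labels them 0. The function name, the `num_neg` parameter, and the sample data are assumptions.

```python
import random
from typing import Dict, List, Set, Tuple


def sample_negatives(
    positives: Set[Tuple[str, str]], items: List[str], num_neg: int, seed: int = 0
) -> List[Tuple[str, str, int]]:
    """Build labeled training samples: each positive pair gets label 1 and is
    followed by up to `num_neg` unobserved items for the same user with label 0."""
    rng = random.Random(seed)

    # Collect the items each user has already interacted with.
    seen: Dict[str, Set[str]] = {}
    for user, item in positives:
        seen.setdefault(user, set()).add(item)

    samples: List[Tuple[str, str, int]] = []
    for user, item in sorted(positives):
        samples.append((user, item, 1))
        unobserved = [candidate for candidate in items if candidate not in seen[user]]
        count = min(num_neg, len(unobserved))
        for negative in rng.sample(unobserved, count):
            samples.append((user, negative, 0))
    return samples


observed = {("u1", "a"), ("u1", "b"), ("u2", "c")}
catalog = ["a", "b", "c", "d", "e"]
training_samples = sample_negatives(observed, catalog, num_neg=2)
```

Sampling a few negatives per positive keeps the training set proportional to the number of observed interactions rather than to the full user-item matrix.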

Wide and Deep (WAD) Model

In 2016, Heng-Tze Cheng et al. [8] proposed an app recommender system for the Google Play* store with a wide and deep (WAD) model. The wide component is a single-layer perceptron, which works as a generalized linear model. The deep component is a multilayer perceptron similar to NCF. Combining these two learning techniques enables the recommender system to capture both memorization and generalization. For this case, merchant ID and other features were used to generate the cross columns for the wide model.

Figure 3. A Wide and Deep Model diagram
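
A cross column for the wide component can be sketched as a hashed combination of categorical values. The example below is a generic illustration under stated assumptions, not the Analytic Zoo API: the feature names (`merchant_id`, `category`, `region`), the bucket count, and the choice of CRC32 as the hash function are all hypothetical.

```python
import zlib
from typing import Dict, List, Sequence


def cross_column(values: Sequence[object], num_buckets: int) -> int:
    """Combine several categorical values into one bucket index.
    CRC32 is used because it is stable across runs, unlike Python's
    built-in hash() for strings."""
    key = "_X_".join(str(value) for value in values)
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def wide_features(
    record: Dict[str, object], cross_specs: List[Sequence[str]], num_buckets: int
) -> List[int]:
    """Return one hashed bucket index per cross-column specification."""
    return [
        cross_column([record[name] for name in spec], num_buckets)
        for spec in cross_specs
    ]


# Hypothetical record and cross-column specifications.
record = {"merchant_id": "m42", "category": "grocery", "region": "NE"}
specs = [("merchant_id", "category"), ("merchant_id", "region")]
features = wide_features(record, specs, num_buckets=1000)
```

Each bucket index would address one weight in the linear (wide) part of the model, which lets it memorize specific feature co-occurrences.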

The WAD model uses SparseTensor and several layers designed explicitly for sparse data computation, such as SparseLinear and SparseJoinTable. Analytic Zoo supports both data frame and Resilient Distributed Dataset (RDD) interfaces for data preparation and training, which provides flexibility for different scenarios and compatibility from Spark 1.5 through the latest versions.

Model Evaluation

Using the evaluation utilities from Spark MLlib, the recommenders implemented with NCF and WAD were measured with the following metrics:

  • Receiver operating characteristic area under curve (ROC AUC)
  • Precision recall area under curve (PR AUC)
  • Precision and recall
  • Top 20 precision of ranked results for each customer
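
For readers unfamiliar with these metrics, the following plain-Python sketch shows how they can be computed on small inputs. It is for illustration only and does not reproduce the Spark MLlib utilities used in the benchmark; the function names are assumptions.

```python
from typing import List, Sequence, Set, Tuple


def precision_recall(labels: Sequence[int], predictions: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for binary labels and binary predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def top_k_precision(ranked_items: List[str], relevant_items: Set[str], k: int) -> float:
    """Fraction of the first k ranked items that are relevant for one customer."""
    top = ranked_items[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant_items) / len(top)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive is scored above a
    random negative (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

The pairwise AUC computation is quadratic in the number of samples, so it is only suitable for small examples; at the scale of this benchmark a distributed implementation is required.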

To compare with traditional matrix factorization algorithms, an ALS model was also trained in Spark 2.2.0 on the same data with the same optimization parameters. By comparison, the deep learning models bring significant improvements over the ALS model, as shown in the table below.

| Metric | NCF Model | WAD Model |
|---|---|---|
| Recall improvement over ALS | 29% | 26% |
| Precision improvement over ALS | 18% | 21% |
| Top 20 accuracy improvement over ALS | 14% | 16% |

Model Serving

Serving Approach

The Analytic Zoo model can be seamlessly integrated into web services and streaming frameworks such as Spark Streaming and Kafka* by using Plain Old Java Object (POJO) local Java APIs or the Scala/Python model loading APIs.

Mastercard uses the Apache NiFi [9] data pipeline framework to build its enterprise data pipeline platform. It developed customized processors that embed the deep learning and model serving process into existing enterprise data pipelines by leveraging the serving APIs from Analytic Zoo.

  • Build the model serving capability by exporting the model to scoring/prediction/recommendation services and integration points.
  • Integrate the model serving services inside the business data pipelines. For example, embed them into Spark jobs for offline, Spark Streaming jobs for streaming, the real-time “dialogue” with Kafka messaging, and so forth.
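
The idea of reusing one model across serving contexts can be sketched as follows. This is a generic illustration, not the Analytic Zoo serving API or a NiFi processor: `Scorer`, `score_batch`, `score_stream`, and `dummy_scorer` are hypothetical names, and the scorer stands in for a loaded model.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Pair = Tuple[str, str]                       # (consumer_id, item_id)
Scorer = Callable[[List[Pair]], List[float]]  # stand-in for a loaded model


def score_batch(scorer: Scorer, pairs: Iterable[Pair]) -> List[Tuple[Pair, float]]:
    """Offline context: score a complete collection of pairs at once."""
    pairs = list(pairs)
    return list(zip(pairs, scorer(pairs)))


def score_stream(
    scorer: Scorer, events: Iterable[Pair], micro_batch_size: int
) -> Iterator[Tuple[Pair, float]]:
    """Streaming context: buffer incoming events into micro-batches and
    emit scored results as each micro-batch fills up."""
    if micro_batch_size < 1:
        raise ValueError("micro_batch_size must be at least 1")
    buffer: List[Pair] = []
    for event in events:
        buffer.append(event)
        if len(buffer) == micro_batch_size:
            yield from score_batch(scorer, buffer)
            buffer = []
    if buffer:  # flush the final, partially filled micro-batch
        yield from score_batch(scorer, buffer)


def dummy_scorer(pairs: List[Pair]) -> List[float]:
    """Placeholder model that returns a constant propensity for every pair."""
    return [0.5 for _ in pairs]


requests = [("c1", "m1"), ("c2", "m1"), ("c3", "m2")]
batch_results = score_batch(dummy_scorer, requests)
stream_results = list(score_stream(dummy_scorer, iter(requests), micro_batch_size=2))
```

Keeping the scoring function independent of the transport makes it possible to embed the same model in offline jobs, streaming jobs, and message-driven services.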


This article describes our experience in building a recommender AI service based on consumer transaction history using deep learning with Analytic Zoo, which meets the deep learning requirements of enterprises. Two deep learning models (NCF and WAD) were developed and evaluated. Compared with traditional machine learning algorithms (such as LR or ALS), the deep learning models significantly improved recommendation quality and simplified the model training procedures. As an end-to-end industry example, we showed how to leverage deep learning with Analytic Zoo to build a recommender system that helps power a critical element of Mastercard’s marketing and personalization capabilities.


  1. Analytics Zoo
  2. BigDL
  3. Carlos A Gomez-Uribe and Neil Hunt. 2016. The Netflix Recommender System: Algorithms, Business Value, and Innovation. ACM Transactions on Management Information Systems (TMIS) 6, 4 (2016), 13.
  4. James Davidson, Benjamin Liebald, Junning Liu, Palash Nandy, Taylor Van Vleet, Ullas Gargi, Sujoy Gupta, Yu He, Mike Lambert, Blake Livingston, and Dasarathi Sampath. 2010. The YouTube Video Recommendation System. In Proceedings of the Fourth ACM Conference on Recommender Systems (RecSys ’10). ACM, New York, NY, USA, 293–296.
  5. Shuai Zhang, Lina Yao, and Aixin Sun. Deep learning-based Recommender System: A Survey and New Perspectives. arXiv preprint arXiv:1707.07435, 2017.
  6. Robert M. Bell and Yehuda Koren. Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights
  7. Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural Collaborative Filtering. In Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 173–182.
  8. Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, et al. 2016. Wide and Deep Learning for Recommender Systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 7–10.
  9. Apache NiFi
