Business Results

  • Up to 96% faster training in batch with K-means

  • Up to 88% faster training in batch with DBSCAN





Over 50 percent of enterprises worldwide will spend more to deliver an omnichannel experience and to stay connected intelligently with their customers.1

According to a 2021 survey, 45 percent of retailers see an increasing need to enhance personalization of their customers' experiences. AI and machine learning algorithms (such as cluster analysis) can distinguish customer groups that transact differently. Using AI and machine learning enables retailers to tailor offerings to the needs of those customer groups.2 

Retailers have a wealth of data that can help them personalize customer experiences and offers. This data ranges from purchase behavior patterns to attributes such as channel type and store location. Processing large volumes of data and delivering offers in real time requires substantial computing resources that can scale with a varying customer base.

The customer analytics market is projected to be worth $20.82 billion globally by 2028, growing at a 19.3 percent compound annual growth rate (CAGR).3

Retailers can use this AI reference kit to personalize customer experiences and offerings with more effective, faster insights that help grow consumer brand loyalty. 

Enterprises can use this kit to build and analyze customer segmentation with their customer transaction data. This is done using AI clustering algorithms that are optimized with Intel® software products to deliver faster insights. 

Banks can use customer segmentation to better understand their customers, based on parameters such as demographics, attitudes, behaviors, and customer lifetime value. With this level of detail about their customers, banks can provide more personalized products and superior customer service.

This reference kit uses the Intel® AI Analytics Toolkit for machine learning training and inference to help enable faster training and retraining of clusters. This speed of training and retraining helps deliver faster customer segmentation.  


In collaboration with Accenture*, Intel developed an AI reference kit to help retailers implement customer segmentation solutions and deliver improved, personalized customer service and offerings. This reference kit includes:

  • Training data
  • An open source, trained model
  • Libraries
  • User guides
  • Intel® AI software products 

At a Glance

  • Industry: Retail, as well as financial or banking industries
  • Task: Train two AI-based clustering algorithms to identify critical candidate transactions and customer segmentation categories
  • Dataset: Purchase transactions from a multinational retailer
  • Type of Learning: Machine learning
  • Models: K-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
  • Output: Cluster assignments for each customer segment
  • Intel AI Software Products: 
    • Intel AI Analytics Toolkit
    • Intel® Extension for Scikit-learn*
    • Intel® Optimization for XGBoost*
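To make the models and output listed above concrete, the following is a minimal sketch of K-means and DBSCAN producing cluster assignments with scikit-learn. It is not the kit's code: the synthetic data, the two feature columns (purchase count and average basket value), and the hyperparameter values are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for per-customer features (hypothetical columns:
# purchase count, average basket value). This is NOT the kit's dataset.
rng = np.random.default_rng(0)
low_spenders = rng.normal(loc=[5.0, 20.0], scale=[1.0, 3.0], size=(100, 2))
high_spenders = rng.normal(loc=[30.0, 120.0], scale=[3.0, 10.0], size=(100, 2))

# Scale the features so both columns contribute comparably to distances.
features = StandardScaler().fit_transform(np.vstack([low_spenders, high_spenders]))

# K-means needs the number of clusters up front.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

# DBSCAN infers the number of clusters from density; label -1 marks noise.
dbscan_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(features)

print("K-means cluster sizes:", np.bincount(kmeans_labels))
print("DBSCAN labels found:", sorted(set(dbscan_labels.tolist())))
```

Each entry in `kmeans_labels` and `dbscan_labels` is the cluster assignment for one row of `features`, which is the kind of output an analyst would then examine segment by segment.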

This experiment analyzes and explores two AI-based clustering algorithms to create a targeted understanding of a segment of customers. 

In practice, an analyst runs the same clustering algorithm many times on the same dataset, scanning across different hyperparameters. To reflect this workflow, the experiment measures the total time needed to generate clustering results across a grid of hyperparameters for a fixed algorithm. This measurement is defined as hyperparameter analysis. The results of each hyperparameter analysis provide the analyst with many different clusterings that they can analyze further.
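The hyperparameter analysis described above can be sketched as a timed loop over a grid for one fixed algorithm. This is an illustrative sketch only: the function name `hyperparameter_analysis`, the grid values, and the random data are assumptions, not the kit's actual benchmark.

```python
import time
from itertools import product

import numpy as np
from sklearn.cluster import KMeans


def hyperparameter_analysis(features, grid):
    """Fit K-means once per grid point; return per-run results and total time."""
    results = []
    start = time.perf_counter()
    for n_clusters, tol in product(grid["n_clusters"], grid["tol"]):
        model = KMeans(n_clusters=n_clusters, tol=tol, n_init=10, random_state=0)
        labels = model.fit_predict(features)
        results.append(
            {"n_clusters": n_clusters, "tol": tol,
             "labels": labels, "inertia": model.inertia_}
        )
    # The quantity of interest is the time for the whole grid, not a single fit.
    total_seconds = time.perf_counter() - start
    return results, total_seconds


# Hypothetical data and grid, used only to exercise the function.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 4))
grid = {"n_clusters": [2, 4, 6], "tol": [1e-3, 1e-4]}

results, total_seconds = hyperparameter_analysis(features, grid)
print(f"{len(results)} clustering runs in {total_seconds:.3f} s")
```

Timing the full grid rather than one fit mirrors how an analyst actually works: every grid point yields a separate set of cluster assignments to compare.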


Optimized Intel® AI Software Products for Better Performance

  • Intel Extension for Scikit-learn using K-means and DBSCAN algorithms
  • Intel AI Analytics Toolkit 
  • Intel Optimization for XGBoost
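As a rough illustration of how Intel Extension for Scikit-learn is typically enabled, the sketch below calls `patch_sklearn()` from the `sklearnex` package before importing the estimators. The fallback to stock scikit-learn when the extension is not installed, and the random data and hyperparameters, are assumptions added for this example and are not taken from the kit.

```python
import numpy as np

# Enable the Intel Extension for Scikit-learn when it is installed.
# Falling back to stock scikit-learn is an assumption made for this sketch.
try:
    from sklearnex import patch_sklearn

    patch_sklearn()  # must run before the scikit-learn estimators are imported
    accelerated = True
except ImportError:
    accelerated = False

from sklearn.cluster import DBSCAN, KMeans

# Hypothetical feature matrix standing in for customer transaction features.
rng = np.random.default_rng(0)
features = rng.normal(size=(300, 3))

kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
dbscan_labels = DBSCAN(eps=0.6, min_samples=5).fit_predict(features)

print("Intel extension active:", accelerated)
print("K-means clusters:", len(set(kmeans_labels.tolist())))
```

The rest of the code uses the usual scikit-learn API unchanged, so the same script runs with or without the extension.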

Performance was tested on a Microsoft Azure* Standard_D4_V5 instance running on 3rd generation Intel® Xeon® processors.

Intel has been directly upstreaming many optimizations to provide improved performance on Intel® CPUs. XGBoost, a well-known machine learning package for gradient-boosted decision trees, now includes seamless, drop-in acceleration for Intel® architectures to significantly speed up model training and improve accuracy for better predictions.


A data scientist uses clustering to narrow the scope of an analysis by identifying high-value clusters. However, clustering requires a lot of training and retraining, which makes the job tedious. Producing clusters faster accelerates the machine learning pipeline: the faster a data scientist can run an algorithm, the more experiments they can run and the more results they can gather in a limited timeframe.

A clustering algorithm can help enterprises understand buying behavior to deliver a more personalized experience, and can aid business process automation to streamline marketing operations. This reference kit and Intel software products, such as Intel AI Analytics Toolkit, help with faster processing and lower computing costs, and contribute to a decrease in the total cost of ownership.

Additional Resources

Intel AI Software Portfolio

Intel Extension for Scikit-learn

Intel AI Analytics Toolkit

Intel Optimization for XGBoost


  1. IDC FutureScape: Worldwide Retail 2022 Predictions. October 2021. IDC: The Premier Global Market Intelligence Company. Retrieved August 18, 2022,
  2. IDC MarketScape: Worldwide Retail and CPG Customer Data Platform 2022 Vendor Assessment. April 2022. IDC: The Premier Global Market Intelligence Company. Retrieved August 18, 2022, 
  3. Verified Market Research* (September 28, 2021). Customer Analytics Market Size Worth $20.82 Billion, Globally, by 2028 at 19.30% CAGR: Verified Market Research. PR Newswire. 

Product and Performance Information


Performance varies by use, configuration and other factors. Learn more at