Developer Guide and Reference

  • 2021.6
  • 04/11/2022
  • Public Content


Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed in [Ester96]. It is a non-parametric, density-based clustering algorithm: given a set of observations in some space, it groups observations that are closely packed (observations with many nearby neighbors) and marks as outliers the observations that lie alone in low-density regions (those whose nearest neighbors are too far away).

Mathematical formulation

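Given the set $X = \{ x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np}) \}$ of $n$ $p$-dimensional feature vectors (further referred to as observations), a positive floating-point number $\varepsilon$, and a positive integer $\mathit{minObservations}$, the problem is to get clustering assignments for each input observation, based on the definitions below [Ester96]:

  • An observation $x$ is a core observation if at least $\mathit{minObservations}$ input observations (including $x$) are within distance $\varepsilon$ of $x$.
  • An observation $y$ is directly reachable from $x$ if $x$ is a core observation and $y$ is within distance $\varepsilon$ of $x$.
  • An observation $y$ is reachable from $x$ if there is a path $x_1, \ldots, x_m$ with $x_1 = x$ and $x_m = y$, where each $x_{i+1}$ is directly reachable from $x_i$.
  • An observation that is not reachable from any other observation is a noise observation.
  • Two observations $x$ and $y$ are considered to be in the same cluster if there is a core observation $z$, and $x$ and $y$ are both reachable from $z$.

Each cluster gets a unique identifier, an integer from $0$ to $(\text{total number of clusters} - 1)$. Each observation is assigned the identifier of the cluster it belongs to, or $-1$ if the observation is considered to be a noise observation.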
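
The clustering rule above can be illustrated with a small brute-force sketch. This is not the oneDAL implementation. It assumes Euclidean distance, treats an observation as core when at least min_observations observations (itself included) lie within epsilon of it, and grows each cluster outward from a core observation through its neighbors. The names observation, neighbors, and dbscan_sketch are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

using observation = std::vector<double>;

// Squared Euclidean distance between two observations of equal dimension.
double squared_distance(const observation& a, const observation& b) {
    double sum = 0.0;
    for (std::size_t j = 0; j < a.size(); ++j) {
        const double d = a[j] - b[j];
        sum += d * d;
    }
    return sum;
}

// Indices of all observations within distance epsilon of x[i], including i itself.
std::vector<std::size_t> neighbors(const std::vector<observation>& x,
                                   std::size_t i,
                                   double epsilon) {
    std::vector<std::size_t> result;
    for (std::size_t k = 0; k < x.size(); ++k) {
        if (squared_distance(x[i], x[k]) <= epsilon * epsilon) {
            result.push_back(k);
        }
    }
    return result;
}

// Brute-force DBSCAN sketch with O(n^2) distance evaluations.
// Returns one label per observation: 0..k-1 for clusters, -1 for noise.
std::vector<std::int32_t> dbscan_sketch(const std::vector<observation>& x,
                                        double epsilon,
                                        std::size_t min_observations) {
    constexpr std::int32_t noise = -1;
    const std::size_t n = x.size();
    std::vector<std::int32_t> labels(n, noise);
    std::vector<bool> visited(n, false);
    std::int32_t next_cluster = 0;

    for (std::size_t i = 0; i < n; ++i) {
        if (visited[i]) continue;
        visited[i] = true;
        const auto seed = neighbors(x, i, epsilon);
        // Not a core observation: stays noise unless a core observation reaches it later.
        if (seed.size() < min_observations) continue;

        const std::int32_t cluster = next_cluster++;
        labels[i] = cluster;
        std::queue<std::size_t> frontier;
        for (auto k : seed) frontier.push(k);

        while (!frontier.empty()) {
            const std::size_t q = frontier.front();
            frontier.pop();
            // q is reachable from a core observation of this cluster.
            if (labels[q] == noise) labels[q] = cluster;
            if (visited[q]) continue;
            visited[q] = true;
            const auto around = neighbors(x, q, epsilon);
            // Only core observations extend the cluster further.
            if (around.size() >= min_observations) {
                for (auto k : around) frontier.push(k);
            }
        }
    }
    return labels;
}
```

In this sketch, an observation that lies within epsilon of core observations from two different clusters is assigned to whichever cluster reaches it first, so the labels of such border observations depend on input order.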

Programming Interface

Distributed mode

The algorithm supports distributed execution in SPMD mode (only on GPU).

Usage example

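#include "oneapi/dal/algo/dbscan.hpp"

namespace dal = oneapi::dal;

// print_table is a printing helper assumed to be defined elsewhere
// (for example, in the utilities shipped with the oneDAL examples).
void run_compute(const dal::table& data, const dal::table& weights) {
    const double epsilon = 1.0;               // neighborhood radius
    const std::int64_t min_observations = 5;  // neighborhood size required for a core observation

    // Request only the cluster assignments (responses) from the computation.
    const auto dbscan_desc =
        dal::dbscan::descriptor<float>{ epsilon, min_observations }
            .set_result_options(dal::dbscan::result_options::responses);

    const auto result = dal::compute(dbscan_desc, data, weights);

    print_table("responses", result.get_responses());
}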
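
Once the responses are available, a common next step is to count clusters and noise observations. The sketch below assumes the caller has already copied the responses into a std::vector of 32-bit integers; that copy step is not shown and is not part of the example above. The names clustering_summary and summarize_responses are hypothetical. The sketch relies only on the convention that clusters have non-negative identifiers and that noise observations are marked with -1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Aggregate view of a DBSCAN labeling.
struct clustering_summary {
    std::size_t cluster_count;  // number of distinct non-negative identifiers
    std::size_t noise_count;    // number of observations labeled as noise
};

// Counts distinct cluster identifiers and noise observations.
// Any negative value is treated as noise (the convention above uses -1).
clustering_summary summarize_responses(const std::vector<std::int32_t>& responses) {
    std::set<std::int32_t> clusters;
    std::size_t noise = 0;
    for (const std::int32_t r : responses) {
        if (r < 0) {
            ++noise;
        }
        else {
            clusters.insert(r);
        }
    }
    return clustering_summary{ clusters.size(), noise };
}
```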


