## Developer Guide and Reference

• 2021.6
• 04/11/2022

# DBSCAN

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed in [Ester96]. It is a density-based, non-parametric clustering algorithm: given a set of observations in some space, it groups together observations that are closely packed (observations with many nearby neighbors) and marks as outliers the observations that lie alone in low-density regions (those whose nearest neighbors are too far away).
| Operation | Computational methods | Programming Interface |
|---|---|---|
| | Default method | |

## Mathematical formulation

Computation
Given the set $X = \{x_1, \ldots, x_n\}$ of $n$ $p$-dimensional feature vectors (hereafter referred to as observations), a positive floating-point number `epsilon`, and a positive integer `minObservations`, the problem is to get clustering assignments for each input observation, based on the definitions below [Ester96]: two observations $x$ and $y$ are considered to be in the same cluster if there is a core observation $z$ such that $x$ and $y$ are both reachable from $z$.
Each cluster gets a unique identifier, an integer from $0$ to $(\text{total number of clusters} - 1)$. Each observation is assigned the identifier of the cluster it belongs to, or $-1$ if the observation is considered to be a noise observation.
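To make the notions of core observation, reachability, and noise concrete, the following is a minimal brute-force sketch of DBSCAN in plain C++. It is not the oneDAL implementation. It assumes the usual [Ester96] conventions: an observation is a core observation if at least `min_observations` observations (itself included) lie within Euclidean distance `epsilon` of it, clusters are numbered from 0, and noise observations are labelled -1. The names `observation`, `euclidean_distance`, `neighbors`, and `dbscan_labels` are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical type for this sketch: one observation is a p-dimensional vector.
using observation = std::vector<double>;

// Euclidean distance between two observations of equal dimension.
inline double euclidean_distance(const observation& a, const observation& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t j = 0; j < a.size(); ++j) {
        const double d = a[j] - b[j];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Indices of all observations within epsilon of x[i], including i itself.
inline std::vector<std::size_t> neighbors(const std::vector<observation>& x,
                                          std::size_t i,
                                          double epsilon) {
    std::vector<std::size_t> result;
    for (std::size_t k = 0; k < x.size(); ++k) {
        if (euclidean_distance(x[i], x[k]) <= epsilon) result.push_back(k);
    }
    return result;
}

// Returns one label per observation: a cluster id starting at 0, or -1 for noise.
inline std::vector<std::int64_t> dbscan_labels(const std::vector<observation>& x,
                                               double epsilon,
                                               std::int64_t min_observations) {
    assert(epsilon > 0.0 && min_observations > 0);
    const std::int64_t noise = -1;
    const std::int64_t unvisited = -2;
    std::vector<std::int64_t> labels(x.size(), unvisited);
    std::int64_t next_cluster = 0;

    for (std::size_t i = 0; i < x.size(); ++i) {
        if (labels[i] != unvisited) continue;
        const auto seed = neighbors(x, i, epsilon);
        if (static_cast<std::int64_t>(seed.size()) < min_observations) {
            labels[i] = noise;  // may be relabelled later as a border observation
            continue;
        }
        // x[i] is a core observation: grow a new cluster from it.
        const std::int64_t cluster = next_cluster++;
        labels[i] = cluster;
        std::queue<std::size_t> frontier;
        for (std::size_t k : seed) frontier.push(k);
        while (!frontier.empty()) {
            const std::size_t k = frontier.front();
            frontier.pop();
            if (labels[k] == noise) labels[k] = cluster;  // reachable, but not core
            if (labels[k] != unvisited) continue;
            labels[k] = cluster;
            const auto nk = neighbors(x, k, epsilon);
            // Only core observations extend the cluster further.
            if (static_cast<std::int64_t>(nk.size()) >= min_observations) {
                for (std::size_t m : nk) frontier.push(m);
            }
        }
    }
    return labels;
}
```

Calling `dbscan_labels(x, epsilon, min_observations)` on a `std::vector<observation>` returns the label vector. The neighbor search is quadratic in the number of observations, which keeps the sketch short but is only suitable for small inputs.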

## Distributed mode

The algorithm supports distributed execution in SPMD mode (only on GPU).

## Usage example

Compute
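```cpp
#include "oneapi/dal/algo/dbscan.hpp"

namespace dal = oneapi::dal;

// `print_table` is assumed to be in scope (for example, from the example utilities).
void run_compute(const dal::table& data,
                 const dal::table& weights) {
    double epsilon = 1.0;                // neighborhood radius
    std::int64_t min_observations = 5;   // neighbors required for a core observation
    const auto dbscan_desc = dal::dbscan::descriptor<float>{ epsilon, min_observations }
        .set_result_options(dal::dbscan::result_options::responses);

    const auto result = dal::compute(dbscan_desc, data, weights);

    // One cluster identifier per input observation
    print_table("responses", result.get_responses());
}
```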

## Examples

oneAPI DPC++
Batch Processing:
oneAPI C++
Batch Processing:
Python* with DPC++ support
Batch Processing:
