DBSCAN

Intel® oneAPI Data Analytics Library Developer Guide and Reference

ID 772611

Date 12/16/2022

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed in [Ester96]. It is a non-parametric, density-based clustering algorithm: given a set of observations in some space, it groups observations that are closely packed (observations with many nearby neighbors) and marks as outliers the observations that lie alone in low-density regions (whose nearest neighbors are too far away).
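
To make the description above concrete, here is a minimal brute-force sketch of the idea in plain C++. It is not the oneDAL implementation: the names `point`, `neighbors_of`, `dbscan_sketch`, `unvisited`, and `noise` are hypothetical, the label `-1` for noise is a choice of this sketch, and the neighborhood is assumed to include the point itself, which may differ from the library's convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical helper types and labels for this sketch; not part of oneDAL.
using point = std::vector<double>;
constexpr std::int32_t unvisited = -2;
constexpr std::int32_t noise = -1;

// Squared Euclidean distance between two points of equal dimension.
double squared_distance(const point& a, const point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Indices of all points within distance epsilon of points[center],
// the center itself included (an assumption of this sketch).
std::vector<std::size_t> neighbors_of(const std::vector<point>& points,
                                      std::size_t center,
                                      double epsilon) {
    std::vector<std::size_t> result;
    for (std::size_t j = 0; j < points.size(); ++j) {
        if (squared_distance(points[center], points[j]) <= epsilon * epsilon) {
            result.push_back(j);
        }
    }
    return result;
}

// Returns one label per point: a cluster id starting at 0, or `noise`.
std::vector<std::int32_t> dbscan_sketch(const std::vector<point>& points,
                                        double epsilon,
                                        std::size_t min_observations) {
    std::vector<std::int32_t> labels(points.size(), unvisited);
    std::int32_t cluster = 0;
    for (std::size_t i = 0; i < points.size(); ++i) {
        if (labels[i] != unvisited) continue;
        const auto seed = neighbors_of(points, i, epsilon);
        if (seed.size() < min_observations) {
            labels[i] = noise; // may later be claimed as a border point
            continue;
        }
        // points[i] is a core observation: grow a new cluster from it.
        labels[i] = cluster;
        std::queue<std::size_t> frontier;
        for (auto j : seed) frontier.push(j);
        while (!frontier.empty()) {
            const std::size_t j = frontier.front();
            frontier.pop();
            if (labels[j] == noise) labels[j] = cluster; // border point, not expanded
            if (labels[j] != unvisited) continue;
            labels[j] = cluster;
            const auto next = neighbors_of(points, j, epsilon);
            if (next.size() >= min_observations) {
                for (auto k : next) frontier.push(k); // j is core: keep expanding
            }
        }
        ++cluster;
    }
    return labels;
}
```

Calling `dbscan_sketch(points, epsilon, min_observations)` on a small data set returns one label per observation. The quadratic neighbor search mirrors the brute-force character of the method, but the sketch makes no attempt at the performance or memory trade-offs of the library.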

Operation	Computational methods	Programming Interface
Compute	Default method	compute(…), compute_input, compute_result

Mathematical formulation

Refer to Developer Guide: DBSCAN.

Programming Interface

All types and functions in this section are declared in the oneapi::dal::dbscan namespace and are available via inclusion of the oneapi/dal/algo/dbscan.hpp header file.

Descriptor

template<typename Float = float, typename Method = method::by_default, typename Task = task::by_default> class descriptor

Template Parameters

Float – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
Method – Tag-type that specifies an implementation of the algorithm. Can be method::brute_force.
Task – Tag-type that specifies the type of the problem to solve. Can be task::clustering.

Constructors

descriptor(double epsilon, std::int64_t min_observations)

Creates a new instance of the class with the given epsilon and min_observations.

Properties

double epsilon

The distance epsilon for neighbor search.

Getter & Setter: double get_epsilon() const
auto & set_epsilon(double value)
Invariants: epsilon >= 0.0

bool mem_save_mode

The flag for memory saving mode.

Getter & Setter: bool get_mem_save_mode() const
auto & set_mem_save_mode(bool value)

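std::int64_t min_observations

The minimum number of observations in the epsilon-neighborhood of an observation for it to be considered a core observation.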

Getter & Setter: std::int64_t get_min_observations() const
auto & set_min_observations(std::int64_t value)

result_option_id result_options

Choose which results should be computed and returned.

Getter & Setter: result_option_id get_result_options() const
auto & set_result_options(const result_option_id &value)
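
Every setter above returns a reference to the descriptor, which allows calls to be chained. The following sketch illustrates that pattern together with the documented invariant `epsilon >= 0.0`. The class `descriptor_sketch` is a hypothetical stand-in, not the oneDAL implementation, and throwing `std::invalid_argument` on a violated invariant is an assumption of this sketch rather than documented library behavior.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for dbscan::descriptor; not the oneDAL implementation.
class descriptor_sketch {
public:
    descriptor_sketch(double epsilon, std::int64_t min_observations) {
        set_epsilon(epsilon);
        set_min_observations(min_observations);
    }

    double get_epsilon() const { return epsilon_; }
    std::int64_t get_min_observations() const { return min_observations_; }
    bool get_mem_save_mode() const { return mem_save_mode_; }

    // Each setter returns *this so that calls can be chained.
    descriptor_sketch& set_epsilon(double value) {
        // Documented invariant: epsilon >= 0.0. Throwing is this sketch's choice.
        if (!(value >= 0.0)) {
            throw std::invalid_argument("epsilon must be >= 0.0");
        }
        epsilon_ = value;
        return *this;
    }

    descriptor_sketch& set_min_observations(std::int64_t value) {
        min_observations_ = value;
        return *this;
    }

    descriptor_sketch& set_mem_save_mode(bool value) {
        mem_save_mode_ = value;
        return *this;
    }

private:
    double epsilon_ = 0.0;
    std::int64_t min_observations_ = 0;
    bool mem_save_mode_ = false;
};
```

With this shape, a configuration reads as a single expression, for example `descriptor_sketch{1.0, 5}.set_epsilon(0.5).set_mem_save_mode(true)`.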

Method tags

struct brute_force

using by_default = brute_force

Task tags

struct clustering

Tag-type that parameterizes entities used for solving the clustering problem.

using by_default = clustering

Alias tag-type for the clustering task.

Computation compute(...)

Input

template<typename Task = task::by_default> class compute_input

Template Parameters: Task – Tag-type that specifies the type of the problem to solve. Can be task::clustering.

Constructors

compute_input(const table &data = {}, const table &weights = {})

Creates a new instance of the class with the given data and weights.

Properties

const table &data

A table with the data to be clustered, where each row stores one feature vector.

Getter & Setter: const table & get_data() const
auto & set_data(const table &data)

const table &weights

A single column table with the weights, where each row stores one weight per observation.

Getter & Setter: const table & get_weights() const
auto & set_weights(const table &weights)

Result

template<typename Task = task::by_default> class compute_result

Template Parameters: Task – Tag-type that specifies the type of the problem to solve. Can be task::clustering.

Constructors

compute_result()

Creates a new instance of the class with the default property values.

Properties

const table &responses

A table with the responses assigned to the samples in the input data. Default value: table{}.

Getter & Setter: const table & get_responses() const
auto & set_responses(const table &value)

const table &core_flags

A table with the core flags assigned to the samples in the input data.

Getter & Setter: const table & get_core_flags() const
auto & set_core_flags(const table &value)

const result_option_id &result_options

Result options that indicate the availability of the properties. Default value: default_result_options<Task>.

Getter & Setter: const result_option_id & get_result_options() const
auto & set_result_options(const result_option_id &value)

const table &core_observations

A table with the core observations in the input data. Its row count equals the number of core observations.

Getter & Setter: const table & get_core_observations() const
auto & set_core_observations(const table &value)

const table &core_observation_indices

A table with the indices of the core observations in the input data. Its row count equals the number of core observations.

Getter & Setter: const table & get_core_observation_indices() const
auto & set_core_observation_indices(const table &value)

std::int64_t cluster_count

The number of clusters found by the algorithm.

Getter & Setter: std::int64_t get_cluster_count() const
auto & set_cluster_count(std::int64_t value)
Invariants: cluster_count >= 0
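
The result properties above are related: the core observation indices list the positions flagged in core_flags, and cluster_count is the number of distinct clusters in responses. The sketch below shows these relationships on plain vectors. It makes two assumptions that are not stated in this reference: a nonzero flag marks a core observation, and a negative response marks noise. The function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Assumption of this sketch: a nonzero flag marks a core observation.
std::vector<std::int64_t> core_indices_from_flags(
    const std::vector<std::int32_t>& core_flags) {
    std::vector<std::int64_t> indices;
    for (std::size_t i = 0; i < core_flags.size(); ++i) {
        if (core_flags[i] != 0) {
            indices.push_back(static_cast<std::int64_t>(i));
        }
    }
    return indices;
}

// Assumption of this sketch: negative responses mark noise and are skipped,
// so the count covers only distinct non-negative cluster ids.
std::int64_t count_clusters(const std::vector<std::int32_t>& responses) {
    std::set<std::int32_t> ids;
    for (auto r : responses) {
        if (r >= 0) ids.insert(r);
    }
    return static_cast<std::int64_t>(ids.size());
}
```

In practice the values would first be read out of the result tables; how the library encodes flags and noise should be checked against the Developer Guide before relying on these conventions.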

Operation

template<typename Descriptor> dbscan::compute_result compute(const Descriptor &desc, const dbscan::compute_input &input)

Parameters

desc – DBSCAN algorithm descriptor dbscan::descriptor
input – Input data for the compute operation

Preconditions: input.data.has_data == true
!input.weights.has_data || (input.weights.row_count == input.data.row_count && input.weights.column_count == 1)
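
The preconditions say that data must be present and that weights are optional; when weights are present, they must form a single column with one row per observation. A minimal sketch of that check follows. The struct `table_shape` and the function `satisfies_compute_preconditions` are hypothetical; with a real oneDAL table the same values would come from `has_data()`, `get_row_count()`, and `get_column_count()`.

```cpp
#include <cstdint>

// Hypothetical stand-in for the table properties used by the preconditions.
struct table_shape {
    bool has_data = false;
    std::int64_t row_count = 0;
    std::int64_t column_count = 0;
};

// Returns true when both documented preconditions of compute hold.
bool satisfies_compute_preconditions(const table_shape& data,
                                     const table_shape& weights) {
    // Precondition 1: the data table must not be empty.
    if (!data.has_data) return false;
    // Precondition 2: weights are optional...
    if (!weights.has_data) return true;
    // ...but if present, one weight per observation in a single column.
    return weights.row_count == data.row_count && weights.column_count == 1;
}
```

Running such a check before calling compute gives an early, readable failure instead of relying on the library to reject the input.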

Usage example

Compute

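#include "oneapi/dal/algo/dbscan.hpp"

namespace dal = oneapi::dal;

void run_compute(const dal::table& data,
                 const dal::table& weights) {
   // Neighborhood radius and minimum neighbor count for a core observation.
   double epsilon = 1.0;
   std::int64_t min_observations = 5;
   const auto dbscan_desc = dal::dbscan::descriptor<float>{epsilon, min_observations}
      .set_result_options(dal::dbscan::result_options::responses);

   // data and weights are forwarded to the compute_input constructor.
   const auto result = dal::compute(dbscan_desc, data, weights);

   // print_table is assumed to be a user-supplied printing helper.
   print_table("responses", result.get_responses());
}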

Examples

oneAPI DPC++

Batch Processing:

dpc_dbscan_brute_force_batch.cpp

oneAPI C++

Batch Processing:

cpp_dbscan_brute_force_batch.cpp

Python* with DPC++ support

Batch Processing:

dbscan_batch.py
