Developer Guide and Reference

  • 2021.6
  • 04/11/2022
  • Public Content
Contents

DBSCAN

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed in [Ester96]. It is a density-based clustering non-parametric algorithm: given a set of observations in some space, it groups together observations that are closely packed together (observations with many nearby neighbors), marking as outliers observations that lie alone in low-density regions (whose nearest neighbors are too far away).
Operation
Computational methods
Programming Interface
Default method

Mathematical formulation

Programming Interface

All types and functions in this section are declared in the
oneapi::dal::dbscan
namespace and are available via inclusion of the
oneapi/dal/algo/dbscan.hpp
header file.
Descriptor
template<typename
Float
= float, typename
Method
= method::by_default, typename
Task
= task::by_default>
class
descriptor
Template Parameters
  • Float
    – The floating-point type that the algorithm uses for intermediate computations. Can be
    float
    or
    double
    .
  • Method
    – Tag-type that specifies an implementation of algorithm. Can be
    method::brute_force
    .
  • Task
    – Tag-type that specifies the type of the problem to solve. Can be .
Constructors
descriptor
(double
epsilon
, std::int64_t
min_observations
)
Creates a new instance of the class with the given
epsilon
,
min_observations
.
Properties
std::int64_t
min_observations
The number of neighbors.
Getter & Setter


std::int64_t get_min_observations() const
auto & set_min_observations(int64_t value)

bool
mem_save_mode
The flag for memory saving mode.
Getter & Setter


bool get_mem_save_mode() const
auto & set_mem_save_mode(bool value)

double
epsilon
The distance
epsilon
for neighbor search.
Getter & Setter


double get_epsilon() const
auto & set_epsilon(double value)

Invariants


epsilon >= 0.0

result_option_id
result_options
Choose which results should be computed and returned.
Getter & Setter


result_option_id get_result_options() const
auto & set_result_options(const result_option_id &value)

Method tags
struct
brute_force
using
by_default
= brute_force
Task tags
struct
clustering
Tag-type that parameterizes entities used for solving clustering problem.
using
by_default
= clustering
Alias tag-type for the clustering task.
Computation
compute(...)
Input
template<typename
Task
= task::by_default>
class
compute_input
Template Parameters
Task
– Tag-type that specifies type of the problem to solve. Can be .
Constructors
compute_input
(
const
table &
data
= {},
const
table &
weights
= {})
Creates a new instance of the class with the given
data
and
weights
.
Properties
const
table &
weights
A single column table with the weights, where each row stores one weight per observation.
Getter & Setter


const table & get_weights() const
auto & set_weights(const table &weights)

const
table &
data
An LaTex Math image. table with the data to be clustered, where each row stores one feature vector.
Getter & Setter


const table & get_data() const
auto & set_data(const table &data)

Result
template<typename
Task
= task::by_default>
class
compute_result
Template Parameters
Task
– Tag-type that specifies type of the problem to solve. Can be .
Constructors
compute_result
()
Creates a new instance of the class with the default property values.
Properties
const
table &
core_observations
An LaTex Math image. table with the core observations in the input data. LaTex Math image. is a number of core observations.
Getter & Setter


const table & get_core_observations() const
auto & set_core_observations(const table &value)

const
result_option_id &
result_options
Result options that indicates availability of the properties.
Default value
: default_result_options<Task>.
Getter & Setter


const result_option_id & get_result_options() const
auto & set_result_options(const result_option_id &value)

std::int64_t
cluster_count
The number of clusters found by the algorithm.
Getter & Setter


std::int64_t get_cluster_count() const
auto & set_cluster_count(std::int64_t value)

Invariants


cluster_count >= 0

const
table &
core_observation_indices
An LaTex Math image. table with the indices of core observations in the input data. LaTex Math image. is a number of core observations.
Getter & Setter


const table & get_core_observation_indices() const
auto & set_core_observation_indices(const table &value)

const
table &
responses
An LaTex Math image. table with the responses LaTex Math image. assigned to the samples LaTex Math image. in the input data.
Default value
: table{}.
Getter & Setter


const table & get_responses() const
auto & set_responses(const table &value)

const
table &
core_flags
An LaTex Math image. table with the core flags LaTex Math image. assigned to the samples LaTex Math image. in the input data.
Getter & Setter


const table & get_core_flags() const
auto & set_core_flags(const table &value)

Operation
template<typename
Descriptor
> dbscan::compute_result
compute
(
const
Descriptor &
desc
,
const
dbscan::compute_input &
input
)
Parameters
  • desc
    – DBSCAN algorithm descriptor
    dbscan::descriptor
  • input
    – Input data for the compute operation
Preconditions


input.data.has_data == true
!input.weights.has_data || input.weights.row_count == input.data.row_count && input.weights.column_count == 1

Usage example

Compute
void run_compute(const table& data, const table& weights) { double epsilon = 1.0; std::int64_t max_observations = 5; const auto dbscan_desc = kmeans::descriptor<float>{epsilon, max_observations} .set_result_options(dal::dbscan::result_options::responses); const auto result = compute(dbscan_desc, data, weights); print_table("responses", result.get_responses()); }

Examples

oneAPI DPC++
Batch Processing:
oneAPI C++
Batch Processing:
Python* with DPC++ support
Batch Processing:

Product and Performance Information

1

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.