Get Started with the Intel® oneAPI Data Analytics Library
Intel® oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks
for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making)
in batch, online, and distributed processing modes of computation.
For general information about oneDAL,
visit the official oneDAL page.
Before You Begin
oneDAL is located in the
<install_dir>/dal
directory, where <install_dir>
is the directory in which the Intel® oneAPI Base Toolkit was installed.
The current version of oneDAL with SYCL support is available for Linux* and Windows* 64-bit operating systems.
The prebuilt oneDAL libraries can be found in the
<install_dir>/dal/<version>/redist
directory.
To learn about the system requirements and the dependencies needed to build the examples,
refer to the System Requirements page.
End-to-end Example
Below is a typical usage workflow for a oneDAL algorithm on a GPU. The
example uses the Principal Component Analysis (PCA) algorithm.
The following steps depict how to:
- Read the data from a CSV file
- Run the training and inference operations for PCA
- Access intermediate results obtained at the training stage
- Include the following header, which makes all oneDAL declarations available:
    #include "oneapi/dal.hpp"
    /* Standard library headers required by this example */
    #include <cassert>
    #include <cstdint>
    #include <iostream>
- Create a SYCL* queue with the desired device selector. In this case, the GPU selector is used:
    const auto queue = sycl::queue{ sycl::gpu_selector{} };
- Since all oneDAL declarations are in the oneapi::dal namespace, import all declarations from the oneapi namespace to use dal instead of oneapi::dal for brevity:
    using namespace oneapi;
- Use the CSV data source to read the data from the CSV file into a table:
    const auto data = dal::read<dal::table>(queue, dal::csv::data_source{"data.csv"});
- Create a PCA descriptor, configure its parameters, and run the training algorithm on the data loaded from CSV:
    const auto pca_desc = dal::pca::descriptor<float>{}
        .set_component_count(3)
        .set_deterministic(true);
    const dal::pca::train_result train_res = dal::train(queue, pca_desc, data);
- Print the learned eigenvectors:
    const dal::table eigenvectors = train_res.get_eigenvectors();
    const auto acc = dal::row_accessor<const float>{eigenvectors};
    for (std::int64_t i = 0; i < eigenvectors.get_row_count(); i++) {
        /* Get i-th row from the table, the eigenvector stores pointer to USM */
        const dal::array<float> eigenvector = acc.pull(queue, {i, i + 1});
        assert(eigenvector.get_count() == eigenvectors.get_column_count());
        std::cout << i << "-th eigenvector: ";
        for (std::int64_t j = 0; j < eigenvector.get_count(); j++) {
            std::cout << eigenvector[j] << " ";
        }
        std::cout << std::endl;
    }
- Use the trained model for inference to reduce the dimensionality of the data:
    const dal::pca::model model = train_res.get_model();
    const dal::table data_transformed =
        dal::infer(queue, pca_desc, model, data).get_transformed_data();
    assert(data_transformed.get_column_count() == 3);
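To make the terms used in the workflow above concrete, here is a minimal, self-contained sketch of what a PCA step computes: a covariance matrix, its leading eigenvector, and the projection of the data onto that eigenvector. It uses only the C++ standard library and plain power iteration. It is not how oneDAL implements PCA, and the names column_means, covariance, leading_eigenvector, and project are hypothetical helpers introduced for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major data: each inner vector is one observation, each entry one feature.
// All rows are assumed to have the same length.
using matrix = std::vector<std::vector<double>>;

// Per-column means of the data.
std::vector<double> column_means(const matrix& data) {
    assert(!data.empty() && !data.front().empty());
    std::vector<double> mean(data.front().size(), 0.0);
    for (const auto& row : data) {
        for (std::size_t j = 0; j < mean.size(); ++j) {
            mean[j] += row[j] / static_cast<double>(data.size());
        }
    }
    return mean;
}

// Sample covariance matrix (divides by n - 1), so at least two rows are required.
matrix covariance(const matrix& data) {
    assert(data.size() >= 2);
    const std::vector<double> mean = column_means(data);
    const std::size_t p = mean.size();
    const double denom = static_cast<double>(data.size() - 1);
    matrix cov(p, std::vector<double>(p, 0.0));
    for (const auto& row : data) {
        for (std::size_t a = 0; a < p; ++a) {
            for (std::size_t b = 0; b < p; ++b) {
                cov[a][b] += (row[a] - mean[a]) * (row[b] - mean[b]) / denom;
            }
        }
    }
    return cov;
}

// Leading eigenvector of a symmetric matrix via power iteration.
// Assumes the all-ones start vector is not orthogonal to that eigenvector.
std::vector<double> leading_eigenvector(const matrix& m, int iterations = 200) {
    std::vector<double> v(m.size(), 1.0);
    for (int it = 0; it < iterations; ++it) {
        std::vector<double> next(m.size(), 0.0);
        for (std::size_t a = 0; a < m.size(); ++a) {
            for (std::size_t b = 0; b < m.size(); ++b) {
                next[a] += m[a][b] * v[b];
            }
        }
        double norm = 0.0;
        for (double x : next) norm += x * x;
        norm = std::sqrt(norm);
        if (norm == 0.0) break;  // zero matrix: keep the current vector
        for (double& x : next) x /= norm;
        v = next;
    }
    return v;
}

// One-dimensional transformed data: centered rows projected onto a direction.
std::vector<double> project(const matrix& data, const std::vector<double>& direction) {
    const std::vector<double> mean = column_means(data);
    assert(direction.size() == mean.size());
    std::vector<double> scores;
    scores.reserve(data.size());
    for (const auto& row : data) {
        double s = 0.0;
        for (std::size_t j = 0; j < mean.size(); ++j) {
            s += (row[j] - mean[j]) * direction[j];
        }
        scores.push_back(s);
    }
    return scores;
}
```

For observations that all lie on the line y = 2x, leading_eigenvector(covariance(data)) points along (1, 2) scaled to unit length, and project returns one coordinate per row instead of two, which is the dimensionality reduction that the inference step performs at a larger scale.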
Build and Run Examples
Perform the following steps to build and run examples demonstrating the
basic usage scenarios of oneDAL with SYCL support. Go to
<install_dir>/dal/<version>
and then set up the environment as shown in the steps below.
Any content below that starts with # is a comment and should not be run with the code.
- Set up the required environment for oneDAL (variables such as CPATH, LIBRARY_PATH, and LD_LIBRARY_PATH):
  - On Linux, there are two ways to set up the required environment: via the vars.sh script or via modulefiles.
    - Setting up the oneDAL environment via the vars.sh script. Run the following command:
        source ./env/vars.sh
    - Setting up the oneDAL environment via modulefiles:
      - Initialize modules:
          source $MODULESHOME/init/bash
        Refer to the Environment Modules documentation for details.
      - Provide modules with a path to the modulefiles directory:
          module use ./modulefiles
      - Run the module:
          module load dal
  - On Windows, run the following command:
      /env/vars.bat
- Copy ./examples/oneapi/dpc to a writable directory if necessary (building the examples creates temporary files):
    cp -r ./examples/oneapi/dpc ${WRITABLE_DIR}
- Set up the compiler environment for Intel® oneAPI DPC++/C++ Compiler. See Get Started with Intel® oneAPI DPC++/C++ Compiler for details.
- Build and run the examples that show how to use oneDAL with SYCL support.
  You need write permissions to the examples folder to build the examples and execute permissions to run them. Otherwise, copy the examples/oneapi/dpc and examples/oneapi/data folders to a directory where you have these permissions. The two folders must stay at the same directory level relative to each other.
  - On Linux:
      # Navigate to the directory containing examples and then build them:
      cd /examples/oneapi/dpc
      # Compile and run the svm_two_class_thunder_dense_batch example using Intel(R) oneAPI DPC++/C++ Compiler:
      make so example=svm_two_class_thunder_dense_batch
      # Compile all examples in the current directory:
      make so mode=build
  - On Windows:
      # Navigate to the directory containing examples and then build them:
      cd /examples/oneapi/dpc
      # Compile and run the svm_two_class_thunder_dense_batch example using Intel(R) oneAPI DPC++/C++ Compiler:
      nmake dll example=svm_two_class_thunder_dense_batch
      # Compile all examples in the current directory:
      nmake dll mode=build
  To see all available parameters of the build procedure, type make on Linux* or nmake on Windows*.
- The resulting example binaries and log files are written into the _results directory.
  Run the examples from the examples/oneapi/dpc folder, not from the _results folder. Most examples require the data to be stored in the examples/oneapi/data folder and refer to it through a relative path that starts from the examples/oneapi/dpc folder.
  You can build the traditional C++ examples located in the examples/oneapi/cpp folder in a similar way.
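The requirement that the dpc and data folders share the same parent directory can be checked before building from a copied location. The following is a minimal sketch using std::filesystem (C++17). The function name has_sibling_data_dir is hypothetical and not part of oneDAL or its build scripts, and the check assumes that the dpc folder is a real directory rather than a symbolic link.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Returns true when `dpc_dir` is an existing directory and a directory named
// `data` exists next to it, that is, both share the same parent directory.
bool has_sibling_data_dir(const fs::path& dpc_dir) {
    assert(!dpc_dir.empty() && "expected a path to the dpc examples directory");
    std::error_code ec;  // use the non-throwing overloads
    if (!fs::is_directory(dpc_dir, ec)) {
        return false;
    }
    // "<dpc_dir>/../data" resolves to the sibling folder of dpc_dir.
    return fs::is_directory(dpc_dir / ".." / "data", ec);
}
```

A call such as has_sibling_data_dir("examples/oneapi/dpc") returns false when only the dpc folder was copied, which is the situation in which examples would fail to locate their input data.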
Compile and build applications with pkg-config
The pkg-config tool is a widely used tool for building software with dependencies.
Intel® oneAPI Data Analytics Library provides files with pkg-config metadata for compiling and linking an application to the library.
Set up the environment
To use pkg-config, build the library and then set up the environment using the
vars.sh
or vars.bat
script:
- On Linux:
    source ./env/vars.sh
- On Windows:
    /env/vars.bat
Choose a metadata file
The metadata files provided by oneDAL cover only the host device configuration on 64-bit Linux, macOS, or Windows operating systems for C++.
Choose the metadata file based on the oneDAL threading mode and the linking method you use:
Linking method | Single-threaded (non-threaded) | Multi-threaded (internally threaded) |
---|---|---|
Static linking | dal-static-sequential-host | dal-static-threading-host |
Dynamic linking | dal-dynamic-sequential-host | dal-dynamic-threading-host |
Compile a program using pkg-config
To compile a
test.cpp
program with oneDAL and pkg-config,
provide the name of the oneDAL pkg-config metadata file as an input parameter. For example:
- On Linux or macOS:
    icc test.cpp $(pkg-config --cflags --libs dal-dynamic-threading-host)
- On Windows:
    for /F "delims=," %i in ('pkg-config --cflags --libs dal-dynamic-threading-host') do icl test.cpp %i
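The naming pattern of the metadata files and the Linux or macOS compile line can be captured in a small helper, for example inside a tool that generates build scripts. The sketch below is illustrative only: the enum and function names are hypothetical, the pattern dal-<linking>-<threading>-host is inferred from the four file names listed in the table, and the command relies on shell command substitution, $(...), to pass the pkg-config output to the compiler.

```cpp
#include <cassert>
#include <string>

// Hypothetical enums mirroring the rows and columns of the metadata table.
enum class linking { static_link, dynamic_link };
enum class threading { sequential, threaded };

// Builds the metadata file name, for example "dal-dynamic-threading-host".
std::string metadata_file_name(linking link_mode, threading thread_mode) {
    std::string name = "dal-";
    name += (link_mode == linking::static_link) ? "static" : "dynamic";
    name += "-";
    name += (thread_mode == threading::sequential) ? "sequential" : "threading";
    name += "-host";
    return name;
}

// Composes the Linux or macOS compile line for a single source file.
std::string compile_command(const std::string& source_file,
                            linking link_mode,
                            threading thread_mode) {
    assert(!source_file.empty() && "a source file name is required");
    return "icc " + source_file + " $(pkg-config --cflags --libs " +
           metadata_file_name(link_mode, thread_mode) + ")";
}
```

Keeping the choice of linking method and threading mode as typed values, rather than editing strings by hand, makes it harder to end up with a name that matches none of the four provided metadata files.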
The following commands compile the
svm_two_class_thunder_dense_batch
example.
Run them from the examples/oneapi/cpp
directory:
- On Linux or macOS:
    icc -I source/ source/svm/svm_two_class_thunder_dense_batch.cpp $(pkg-config --cflags --libs dal-dynamic-threading-host)
- On Windows:
    for /F "delims=," %i in ('pkg-config --cflags --libs dal-dynamic-threading-host') do icl -I source/ source/svm/svm_two_class_thunder_dense_batch.cpp %i
Find More
- Refer to the oneDAL Developer Guide and Reference for detailed information about implemented algorithms.
- Check the system requirements before you install Intel® oneAPI Data Analytics Library.
- Refer to the release notes for Intel® oneAPI Data Analytics Library to learn about new updates in the latest release.
- Learn how to use oneDAL with daal4py, a Python* API.
- Learn about requirements for implementations of oneAPI Data Analytics Library.
Notices and Disclaimers
Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.