Use Nested Algorithms to Increase Scalability

Intel® oneAPI Threading Building Blocks Developer Guide and API Reference

Download PDF

ID 772616

Date 3/31/2025

Version

Public

A newer version of this document is available. Customers should click here to go to the newest version.

Use Nested Algorithms to Increase Scalability

One powerful way to increase the scalability of a flow graph is to nest other parallel algorithms inside of node bodies. Doing so, you can use a flow graph as a coordination language, expressing the most coarse-grained parallelism at the level of the graph, with finer grained parallelism nested within.

In the example below, five nodes are created: an input_node, matrix_source, that reads a sequence of matrices from a file, two function_nodes, n1 and n2, that receive these matrices and generate two new matrices by applying a function to each element, and two final function_nodes, n1_sink and n2_sink, that process these resulting matrices. The matrix_source is connected to both n1 and n2. The node n1 is connected to n1_sink, and n2 is connected to n2_sink. In the lambda expressions for n1 and n2, a parallel_for is used to apply the functions to the elements of the matrix in parallel. The functions read_next_matrix, f1, f2, consume_f1 and consume_f2 are not provided below.

graph g;
input_node< double * > matrix_source( g, [&]( oneapi::tbb::flow_control &fc ) -> double* {
  double *a = read_next_matrix();
  if ( a ) {
    return a;
  } else {
    fc.stop();
    return nullptr;
  }
} );
function_node< double *, double * > n1( g, unlimited, [&]( double *a ) -> double * {
  double *b = new double[N];
  parallel_for( 0, N, [&](int i) {
    b[i] = f1(a[i]);
  } );
  return b;
} );
function_node< double *, double * > n2( g, unlimited, [&]( double *a ) -> double * {
  double *b = new double[N];
  parallel_for( 0, N, [&](int i) {
    b[i] = f2(a[i]);
  } );
  return b;
} );
function_node< double *, double * > n1_sink( g, unlimited,
  []( double *b ) -> double * {
    return consume_f1(b);
} );
function_node< double *, double * > n2_sink( g, unlimited,
  []( double *b ) -> double * {
    return consume_f2(b);
} );
make_edge( matrix_source, n1 );
make_edge( matrix_source, n2 );
make_edge( n1, n1_sink );
make_edge( n2, n2_sink );
matrix_source.activate();
g.wait_for_all();

Select Your Language

Using Intel.com Search

Quick Links

Recent Searches

Advanced Search

Only search in

Intel® oneAPI Threading Building Blocks Developer Guide and API Reference

Use Nested Algorithms to Increase Scalability