The performance or scalability of a flow graph is not easy to predict.
However there are a few key points that can guide you in estimating the
limits on performance and speedup of some graphs.
The Critical Path Limits the Scalability in a Dependence
A critical path is the most time consuming path from a node with no
predecessors to a node with no successors. In a dependence graph, the
execution of the nodes along a path cannot be overlapped since they
have a strict ordering. Therefore, for a dependence graph, the
critical path limits scalability.
More formally, let T be the total time consumed by all of the nodes
in your graph if executed sequentially. Then let C be the time
consumed along the path that takes the most time. The nodes along
this path cannot be overlapped even in a parallel execution.
Therefore, even if all other paths are executed in parallel with C,
the wall clock time for the parallel execution is at least C, and the
maximum possible speedup (ignoring microarchitectural and memory
effects) is T/C.
There is Overhead in Spawning a Node’s Body as a Task
The bodies of
execute within spawned tasks by default. This
means that you need to take into account the overhead of task
scheduling when estimating the time it takes for a node to execute
its body. All of the rules of thumb for determining the appropriate
granularity of tasks therefore also apply to node bodies as well. If
you have many fine-grained nodes in your flow graph, the impact of
these overheads can noticeably impact your performance. However,
depending on the graph structure, you can reduce such overheads by
using lightweight policy with these nodes.