Developer Guide


Task Parallelism

The compiler achieves concurrency by scheduling independent operations to execute simultaneously, but it does not achieve concurrency at coarser granularities (for example, across loops).
For larger code structures to execute in parallel, you must write them as separate kernels that launch simultaneously. These kernels then run asynchronously with respect to each other, and you can synchronize them and pass data between them using pipes, as illustrated in the following figure:
Figure: Multiple Kernels Running Asynchronously
This is similar to how a CPU program can leverage threads running on separate cores to achieve simultaneous asynchronous behavior.
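To make the CPU analogy concrete, the following is a minimal sketch in standard C++, not FPGA kernel code. It assumes two tasks (a producer and a consumer) running on separate threads and a hypothetical `Channel` class that stands in for a pipe: a bounded FIFO whose `write` blocks when full and whose `read` blocks when empty. The names `Channel` and `run_two_tasks` are illustrative only and are not part of any library.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical bounded blocking FIFO that plays the role of a pipe
// between two concurrently running tasks.
template <typename T>
class Channel {
 public:
  explicit Channel(std::size_t capacity) : capacity_(capacity) {}

  // Blocks while the channel is full, then enqueues the value.
  void write(T value) {
    std::unique_lock<std::mutex> lock(mutex_);
    not_full_.wait(lock, [this] { return queue_.size() < capacity_; });
    queue_.push(std::move(value));
    not_empty_.notify_one();
  }

  // Blocks while the channel is empty, then dequeues the oldest value.
  T read() {
    std::unique_lock<std::mutex> lock(mutex_);
    not_empty_.wait(lock, [this] { return !queue_.empty(); });
    T value = std::move(queue_.front());
    queue_.pop();
    not_full_.notify_one();
    return value;
  }

 private:
  std::size_t capacity_;
  std::queue<T> queue_;
  std::mutex mutex_;
  std::condition_variable not_full_;
  std::condition_variable not_empty_;
};

// Launches a producer task and a consumer task that run asynchronously
// and communicate only through the channel. Returns the sum of i*i for
// i in 1..n, as accumulated by the consumer.
long long run_two_tasks(int n) {
  Channel<long long> channel(4);  // small capacity forces back-pressure
  long long sum = 0;

  std::thread producer([&channel, n] {
    for (int i = 1; i <= n; ++i) {
      channel.write(static_cast<long long>(i) * i);
    }
  });

  std::thread consumer([&channel, &sum, n] {
    for (int i = 0; i < n; ++i) {
      sum += channel.read();
    }
  });

  producer.join();
  consumer.join();
  return sum;
}
```

Calling `run_two_tasks(10)` runs both loops at the same time and returns 385. Neither loop waits for the other to finish; the only synchronization is the blocking behavior of the channel, which is the role pipes play between kernels.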
