Part 9: Distributed-Memory Parallelism and MPI

In the previous episodes of this chapter, we learned how to use vectorization to parallelize calculations across the vector lanes of each core. Then we discussed how to use OpenMP* to scale applications across the cores of a processor or coprocessor. Now, in this final episode of the chapter, we will study the next level of parallelism: scaling across multiple compute devices and multiple compute nodes in a cluster environment.
