The Parallel Universe Turns 10: Reflections on Where We're Going...and Where We've Been

Published: 04/07/2020  

Last Updated: 04/07/2020

By Henry A. Gabb

This issue marks the 10th anniversary of The Parallel Universe—and my third year as editor. As the name implies, the magazine was originally conceived as a venue for articles related to parallel computing. Our core topic is still largely the same because parallelism is even more prevalent today than it was 10 years ago. However, the mix of topics has changed with the times. New vector instruction sets provide better data-level parallelism. With AVX-512, Intel CPUs are practically accelerators. We’ve also witnessed the rise of data analytics, so our editorial policy was expanded to include parallel frameworks (e.g., Apache Spark*) and achieving high performance with productivity languages (e.g., Python*, Julia*). The universe of parallel computing has also expanded to include heterogeneous parallelism.

As we discussed in our last issue, oneAPI is the next big thing in heterogeneous parallel computing. That's why the first three articles in this issue cover topics related to oneAPI. Our feature article, GPU-Quicksort, walks you through a step-by-step translation of OpenCL™ code to Data Parallel C++. This is followed by Optimizing Performance of oneAPI Applications and Speeding Up Monte Carlo Simulation with Intel® oneMKL. The former describes Intel’s programming tools and provides an optimization case study for oneAPI applications. The latter shows how to offload Intel® Math Kernel Library functions to accelerators using Intel® oneMKL.

Next, Venkat Krishnamurthy and Kathryn Vandiver from OmniSci discuss the unification of data science and traditional analytics in Bringing Accelerated Analytics at Scale to Intel® Architecture. This is followed by A New Approach to Parallel Computing Using Automatic Differentiation, in which Dmitri Goloubentsev (Matlogica) and Evgeny Lakshtanov (University of Aveiro) describe a tool to convert object-oriented, single-threaded scalar code into vectorized, multithreaded lambda functions.

In the first issue of The Parallel Universe, our founding editor, James Reinders, published 8 Rules for Parallel Programming for Multicore. These rules are still relevant and correct today, so we’re republishing them in honor of our 10th anniversary.

Finally, we close this issue with something we’ve never done before—a book review. Ruud van der Pas from Oracle Corporation was kind enough to review The OpenMP Common Core: Making OpenMP Simple Again by Timothy G. Mattson, Yun (Helen) He, and Alice E. Koniges. The original OpenMP specification (published in 1997) was a marvel of technical writing. It was concise (only 63 pages) and full of code examples to help you get started. The OpenMP specification has since grown to 666 pages because important new capabilities have been added to give programmers fine control of vectorization and thread placement, plus accelerator offload. However, the “common core” of OpenMP referred to in the book title is largely the same as it was 23 years ago.

As always, don’t forget to check out Tech.Decoded for more information on Intel's solutions for code modernization, visual computing, data center and cloud computing, data science, systems and IoT development, and heterogeneous parallel programming with oneAPI.

