I am pleased to announce the latest Intel® oneAPI Toolkits update (2021.4).
This is the fourth release of oneAPI tools, marking nearly a year of oneAPI implementations.
Intel® oneAPI tools build on decades of Intel software products that have been widely used to build applications around the world, and on multiple open source projects to which Intel contributes heavily.
oneAPI bold vision
oneAPI is a bold vision for CPUs and XPUs (accelerators) that aims to help ensure an open, multivendor, multiarchitecture future for software developers. That future has been under attack for some time now, and it is a battle for openness like none we’ve ever seen.
oneAPI spec and initiative
oneAPI is an open specification, with an active community of advisors and implementors, all committed to this open, multivendor, and multiarchitecture approach. This community means that oneAPI is also an initiative.
The oneAPI Technical Advisory Board (TAB) meeting notes are posted on GitHub. The Advisory Board is an invitation-only forum of industry experts who help guide the oneAPI parallel programming ecosystem. You are encouraged to join the conversation by reviewing the oneAPI Specification and the advisory notes and slides, and by posting comments or questions as GitHub issues.
oneAPI products – these tools
Of course, specifications are nothing without implementations. Intel oneAPI tools are prebuilt and validated products to support oneAPI for Intel architectures.
The 2021.4 update contains some stunning graphics for both Hollywood folks (OSPRay) and performance tuning experts (Flame Graphs in Intel® VTune™ Profiler), expands implementations of standards (SYCL, OpenMP, Fortran), and delivers performance enhancements through Python, libraries, and compilers. The tools are true to the oneAPI vision of supporting XPUs; we see constantly expanding support for CPUs, GPUs, FPGAs, and more.
Flames help us see how to VTune
VTune Profiler introduces support for flame graphs as a hotspot analysis type. We can now switch to a Flame Graph window to quickly identify the hottest code paths in an application. We can use flame graphs to analyze the time spent on each function and its callee functions. Learn more by visiting the VTune Profiler Cookbook where there are many great recipes for tuning including a new one using flame graphs, or consult the documentation for the flame graph feature.
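To make the idea behind a flame graph concrete, here is a minimal Python sketch of how call-stack samples can be folded into per-path totals, and how the hottest code path falls out of them. This is not how VTune Profiler is implemented; the function names (`fold_stacks`, `hottest_path`) and the sample data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fold_stacks(samples):
    """Aggregate (stack, seconds) samples into inclusive time per call path.

    Each key in the result is a call path such as ("main", "solve"); its
    value is the time of every sample that passed through that path, which
    corresponds to the width of that frame in a flame graph.
    """
    totals = defaultdict(float)
    for stack, seconds in samples:
        # Every prefix of the sampled stack is a frame that was active.
        for depth in range(1, len(stack) + 1):
            totals[tuple(stack[:depth])] += seconds
    return dict(totals)


def hottest_path(totals):
    """Descend from the root, always following the widest child frame."""
    path = ()
    while True:
        children = {
            p: t
            for p, t in totals.items()
            if len(p) == len(path) + 1 and p[: len(path)] == path
        }
        if not children:
            return path
        path = max(children, key=children.get)


# Hypothetical samples: (call stack from root to leaf, seconds observed).
samples = [
    (("main", "solve", "matmul"), 6.0),
    (("main", "solve", "norm"), 1.0),
    (("main", "load_input"), 2.0),
    (("main",), 0.5),
]

totals = fold_stacks(samples)
print(" -> ".join(hottest_path(totals)))
```

In this toy data set, `solve` is wide because of the time spent in its callee `matmul`, which is the kind of relationship a flame graph makes visible at a glance.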
Another nice VTune Profiler addition is that the GPU Offload analysis now presents a richer set of information about execution on the GPU by adding more context from the CPU, such as execution and data transfers.
Intel Advisor Cookbook
The team behind Intel® Advisor has an online cookbook filled with step-by-step recipes for using the tool in a number of important performance tuning scenarios. A new capability in this 2021.4 update models GPU application performance when moving to a different GPU. This can be very helpful for identifying potential performance changes that may need to be addressed for future GPU designs. I highly recommend looking at this recipe, and others, in the Intel Advisor Cookbook.
OSPRay: XPUs, Python, it’s a Blur
The benefits of thinking “whole platform” are on display with the oneAPI Rendering Toolkit. By embracing XPUs (acceleration capabilities in general), instead of thinking only in terms of CPUs or GPUs, Jim Jeffers was able to show stunning results in his SIGGRAPH talk.
The latest updates include support for OSPRay's camera Motion Blur and Transformation Motion Blur for animated glTF scenes, and Python 3.7 bindings for OSPRay Studio.
Recently, the developers of Intel® OSPRay Studio, a flexible design offering high-quality physically-based rendering and scientific visualization workflows, dove into their motivation, design philosophy, features, targeted use-cases and real-world applications along with future opportunities for OSPRay at the VisGap workshop, co-located with the EuroVis 2021 conference. You can read their paper online, and watch the video of their presentation.
Diagnostics Utility gives system status
You asked – and we listened: we are introducing a new Diagnostics Utility for Intel® oneAPI. Previously, this was a tool for our support engineers; now we include it. It helps diagnose system status by reporting installation issues (permissions, versions, driver mismatches, etc.) that could interfere with proper operation of your system and oneAPI tools. This Preview release has been tested on Linux: Ubuntu 20.04 LTS, SLES 15 SP2, and RHEL 8.2. Feedback from this preview will guide future releases, which will expand both functionality and the platforms that are supported.
I recently shared updates on our LLVM adoption for our compilers, showing that our LLVM-based C/C++ compiler had surpassed our non-LLVM classic compiler.
In this release, our Fortran move to LLVM continues. In addition to more optimization tuning, great progress on Fortran 2008 and OpenMP features has been made. As always, we carefully document feature-by-feature status in our Fortran and OpenMP feature status table so you can evaluate where our support stands relative to your needs. Fortran compiler release notes can be found together for both the classic and beta (LLVM-based) compilers. The compiler outputs are binary compatible, so the new LLVM-based compiler can be evaluated on portions of an application without needing to move from the classic Fortran product all at once.
I continue to be delighted by the value of being tightly integrated into the LLVM compiler world. We will continue to see significant dividends for users as we all benefit from this investment in fully embracing LLVM.
Extensions for Visual Studio Code (VS Code)
Some new extensions help with common developer tasks for better productivity. Visit the marketplace for Visual Studio Code and install the useful Extension Pack for the Intel oneAPI Toolkits, which includes a code sample browser, GPU debugging, and connections with Intel DevCloud. We rolled these extensions out this week, and they work best with the 2021.4 update. ISPC users can find an extension by typing in "ispc"; the online catalog is another way to see the list of Intel extensions for VS Code.
Performance and XPUs everywhere
There are hundreds of performance optimizations found across the tools, and the release notes summarize them. Working on tools at Intel has always started with a commitment to delivering reliable optimizations that matter to real applications.
Here is a sampling (see the release notes for more!) of performance and XPU related features to whet your appetite for this update:
- FPGA Simulation Flow allows oneAPI designs to run on industry-standard RTL simulators.
- A number of cool additions, visible in the open source repositories that feed into our tools, to bring FP16 (AVX-512) support into the LLVM compilers (C, C++, SYCL/DPC++) and GCC, and a number of other optimizations for upcoming XPUs from Intel.
- Intel® oneAPI Math Kernel Library (oneMKL) adds GPU support through DPC++ and OpenMP offload APIs for the Random Number Generator (RNG) Multinomial, PoissonV, Hypergeometric, Negative Binomial, and Binomial distributions.
- Intel® Inspector improves visualization of C++ stack frames and increases the accuracy of reporting for libc and OpenCL libraries.
- Fine-tune the performance of Natural Language algorithms through the latest sparsity and pruning features introduced in the AI Analytics Toolkit.
- Simple offloading capabilities let Python-based scikit-learn algorithms exploit the cutting-edge features of underlying heterogeneous hardware spanning Intel CPUs and XPUs.
- Includes the Intel® Open VKL version 1.0 release, which adds support for native use of Intel Open VKL on ARM CPUs and new additions to the API such as interval and hit iterator contexts, iteration on any volume attribute, configurable background values, tricubic filtering, and more.
- The Intel® oneAPI DPC++/C++ Compiler and Intel® oneAPI DPC++ Library expand their SYCL 2020 feature set and conformance to improve programming productivity on various hardware accelerators.
- Intel Advisor's GPU Roofline analysis offers actionable recommendations for maximizing GPU utilization in user code.
- Intel Advisor adds modeling to estimate performance benefits of moving an existing GPU code to a future GPU.
Get the Tools Now
The tools are also available in the Intel® DevCloud featuring very useful training resources. This in-the-cloud environment is a fantastic place to develop, test and optimize code across an array of Intel CPUs and accelerators for free and without having to set up your own systems.
Update Now for All the Best in oneAPI
This release (2021.4) is the final quarterly update for the 2021 product. It is a worthwhile update across the board for the tools! I walked through many of the significant changes with this update; a more complete list is available in the release notes.
I encourage you to update and get the best that Intel oneAPI tools have to offer. When you care about top performance, it is always worth keeping up-to-date.
Visit my blog for more encouragement on learning about the heterogeneous programming that is possible with oneAPI.
oneAPI is an attitude, a joy, and a journey - to help all software developers.