Training for Intel® Parallel Computing Centers

Published: 02/14/2020  

Last Updated: 02/14/2020

Universities, institutions, and labs that work to optimize open-source applications.



As the world of high-performance computing (HPC) evolves and becomes more accessible, use this page as a starting point for optimizing code and gaining better compute performance. While many applications already use features of modern hardware, many more do not extract the parallelism in their algorithms, nor do they leverage other new capabilities, including larger caches, Single Instruction Multiple Data (SIMD), threading, fabric technology, new file architecture, and nonvolatile memory technology.

Get Started 

This collection of self-paced training and reference materials provides an overview of parallel programming on Intel® architecture.

Intel® Xeon® Processors & Intel® Xeon Phi™ Product Family

Learn how to modernize code for Intel® Xeon Phi™ processors. Gain insight into OpenMP*, Intel® MPI Library, and Intel® software so that you can write code that makes better use of vectorization and parallelism on the hardware.

Why Use Code Modernization?
The Purpose of Intel® Many Integrated Core Architecture
Think Parallel: Modern Applications for Modern Hardware
Parallel Programming Models - Tips and Tricks
Deep Dive with Code Modernization Experts


Code for Speed with High-Bandwidth Memory on Intel Xeon Phi Processors
Optimize for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) with or without Intel AVX-512 Hardware


A Crash Course on Multithreading with OpenMP
An Overview of Programming Options
Vectorization: The "Other" Parallelism You Need
Beyond Traditional Shared Memory Parallel Programming


Arm Forge, Development Tools, and Software
Intel® HPC Orchestrator


Leverage Open-Source Software Defined Visualization

Solutions for Lustre* Training

Use these materials to further your knowledge of the Lustre* file system, gain deeper insight into solutions from Intel, and explore fundamental concepts and advanced implementation and configuration details.

Colfax International
High-Performance Parallel Storage for the Enterprise

Cornelis* Omni-Path Architecture (Cornelis OPA) Training

The next generation of HPC switch technology, Cornelis* Omni-Path Fabric (Cornelis OP Fabric), is designed to improve system-level packaging and network efficiency. It enables a broad class of computations that require scalable, tightly coupled processor, memory, and storage resources. These training materials help you become familiar with Cornelis OPA.

Webinar Series
Design Fabrics
Next-Generation Fabric: Details on the Cornelis OPA
Advanced Features of the Cornelis OPA Network Layers
Democratize Best-in-Class Interconnect Performance
The Cornelis OPA Launch
Maximize HPC Storage Performance

Intermediate & Advanced

Access hands-on workshops, code samples, case studies, and domain-focused training to get the most out of your code on Intel architecture. We also encourage you to check out the Intel® Software Innovator and Intel® Black Belt Software Developer Program.

Intel Xeon Processors & Intel Xeon Phi Product Family

Continue your training in OpenMP*, Intel MPI Library, Intel® Parallel Studio, the Intel Xeon Phi processor and coprocessor, ways to express parallelism, and performance optimization methods.


Program and Optimize with Parallel Architectures from Intel


Multi-Channel DRAM (MCDRAM) on Intel Xeon Phi Products – Analysis Methods and Tools
How to Detect Intel AVX-512 Support (Intel Xeon Phi Processor)
Scale your Application Across Shared and Distributed Memory
Squashing Races, Deadlocks, and Memory Bugs


Software Defined Visualization: Data Analysis for Current and Future Cyber Infrastructure
Benefits of Leveraging Software Defined Visualization (Intel® OSPRay)
From Correct to Correct and Efficient with Molecular Dynamics Benchmarks
From Correct to Correct and Efficient with Hydro2D


Optimization of Vector Arithmetics in Intel Architecture
Optimization of Multithreading in Intel Architecture
Gain Performance through Vectorization Using Fortran
Exploit Multilevel Parallelism in HPC Applications
Roofline Analysis: Visualize Impact of Compute Versus Memory Optimizations

Data Layout

SIMD Parallelism and Intrinsics


Analyze Python* App Performance with Intel® VTune™ Amplifier
How Non-Uniform Memory Access (NUMA) Affects Your Workloads for Intel VTune Amplifier

Solutions for Lustre Training

This advanced training is for anyone who wishes to further their knowledge of the Lustre file system and gain deeper insight into software solutions from Intel. The training exposes you to many implementation concepts and configuration details.

Analyze Whole Human Genomes for as Little as $22
