The Parallel Universe, Issue 53

Quarterly Magazine
July 2023


By Henry A. Gabb

LETTER FROM THE EDITOR

Openness, Sustainability, and the Cambrian Explosion of Generative AI Models

Since our last issue, two news items have caused a stir in the AI community. The first was a fireside chat between Intel and Hugging Face luminaries about Taking on the Compute and Sustainability Challenges of Generative AI, which, among other things, discusses how the democratization of generative AI has led to an explosion of pretrained models on the Hugging Face Model Hub. The second was the leaked internal Google memo, We Have No Moat, And Neither Does OpenAI, which laments that smaller, faster, cheaper, and more customizable open-source AI will outcompete closed models.

We have three articles along these lines in this issue. The first is a guest editorial from Huma Abidi (General Manager and Senior Director of AI Software Products) and Haihao Shen (AI Software Architect) that directly addresses the figurative “moat” around AI: The Moat Is Trust, Or Maybe Just Responsible AI. The second, Create Your Own Custom Chatbot, demonstrates how to use open models and readily available hardware to build customized, high-performing chatbots. Finally, Fine-Tuning the Falcon 7-Billion Parameter Model with Hugging Face and oneAPI shows how to optimize another open-source large language model on Intel® Xeon® processors with Intel® Advanced Matrix Extensions (Intel® AMX).

Our feature article, Beyond Direct Memory Access: Reducing the Data Center Tax with Intel® Data Streaming Accelerator, provides code examples and advice to take advantage of on-chip acceleration for data transformation. The Intel Data Streaming Accelerator is new in 4th Gen Intel Xeon Scalable processors.

The Case for SYCL* (The Parallel Universe, Issue 51) discussed the limitations of ISO C++ with respect to heterogeneous computing. In the second guest editorial of this issue, Cultivating Parallel Standards, John Pennycook (Software Enabling and Optimization Architect) describes the process of aligning SYCL with future C++ language concepts.

During my first 15 years in computational science, if I needed to TRANslate a FORmula into code, I did it almost exclusively in FORTRAN. So, I read the recent report from Los Alamos National Laboratory with great interest: An Evaluation of Risks Associated with Relying on Fortran for Mission Critical Codes for the Next 15 Years. Support for heterogeneous parallelism is one of the risk factors, so Ron Green (Compiler Engineering Manager and fellow Fortran enthusiast) and I take a look at Using Fortran DO CONCURRENT for Accelerator Offload. This article focuses on language features, but we’re working on a follow-up article that will analyze DO CONCURRENT performance on CPUs and GPUs. Stay tuned.

Finally, we close this issue with a deep dive into performance tuning on the Intel Xeon CPU Max Series: Performance Optimization on Intel® Processors with High Bandwidth Memory.

As always, don’t forget to check out Tech.Decoded for more information on Intel solutions for code modernization, visual computing, data center and cloud computing, data science, systems and IoT development, and heterogeneous parallel programming with oneAPI.

Henry A. Gabb

July 2023

Henry A. Gabb, Senior Principal Engineer at Intel Corporation, is a longtime high-performance and parallel computing practitioner who has published numerous articles on parallel programming. He was editor/coauthor of “Developing Multithreaded Applications: A Platform Consistent Approach” and program manager of the Intel/Microsoft Universal Parallel Computing Research Centers.