Removing Load Imbalance in Burrows-Wheeler DNA Sequence Alignment


One of the initial goals of the ExaScience Life Lab is to examine how supercomputers can accelerate the processing of whole-genome sequences. Currently, the processing time of a single whole genome is measured in days rather than hours. Sequencing costs have decreased dramatically over the last few years, and with the new generation of machines the mythical $1000 human genome became a reality in 2014. As this will have an immediate effect on the sample sizes used in sequencing studies, it is crucial to improve the efficiency of the computing process.

This article focuses on recent advances by the ExaScience Life Lab in optimizing the alignment phase of whole-genome processing. We show how Intel tools such as Pintools, VTune™ tools, and the Intel® Cilk™ language make it possible to analyze and optimize the performance of the widely used BWA aln alignment program. With minimal programming effort, we achieve a speedup of up to a factor of two over the original code, making the improvements ready to use in the software pipeline of Janssen Pharmaceutica today.

