Advancements in genomics are opening new doors for understanding human diseases and informing precision treatment plans. New discoveries depend on processing, storing, and analyzing a growing amount of genomic sequencing data. In 2015, worldwide sequencing storage capacity approached a petabyte per year, and it continues to double every seven months. At this rate, genomics sequencing will generate hundreds of petabytes per year in the next five years, and could require nearly a zettabyte of storage per year by 2025.
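The compounding behind these growth figures can be sketched as a simple exponential projection. This is a rough illustration only: the function name, the 1 PB per year baseline for 2015, and the assumption of a perfectly constant seven-month doubling period are simplifications introduced here, not figures or methods taken from the underlying data.

```python
def projected_petabytes_per_year(months_since_2015, baseline_pb=1.0, doubling_months=7.0):
    """Project yearly sequencing data volume in petabytes.

    Assumes (hypothetically) a fixed baseline in 2015 and a constant
    doubling period, i.e. volume = baseline * 2^(elapsed / doubling period).
    """
    return baseline_pb * 2 ** (months_since_2015 / doubling_months)


if __name__ == "__main__":
    # Print the projected volume at the baseline year and at 5 and 10 years out.
    for years in (0, 5, 10):
        pb = projected_petabytes_per_year(12 * years)
        print(f"{2015 + years}: ~{pb:,.0f} PB/year")
```

Under these assumptions the five-year projection lands in the hundreds of petabytes per year. Longer horizons are very sensitive to the baseline and the doubling period, so a small change in either shifts the 2025 value by a large factor; treat the output as an order-of-magnitude estimate.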
The Broad Institute of MIT and Harvard (broadinstitute.org) is one of the world’s largest producers of human genomic data, creating about 24 TB of new data per day. Currently, the Broad Institute manages more than 50 PB of data. Researchers require tools to analyze these enormous volumes of data in a timely manner to gain insights into disease and possible treatments. They need tools like the Genome Analysis Toolkit* (GATK*), a set of leading software methods created by the Broad Institute and trusted by the majority of genomics centers worldwide.
In 2017, Intel and the Broad Institute launched a new effort: the Intel-Broad Center for Genomic Data Engineering, a five-year collaboration between the two organizations to simplify and accelerate genomics workflow execution using GATK, Burrows-Wheeler Aligner (BWA), Cromwell, Intel® Genomics Kernel Library (Intel® GKL), GenomicsDB*, and other tools and techniques.