Featured content from our partner, Lenovo.
Background
CSIR Institute of Genomics and Integrative Biology (CSIR-IGIB) is the top-ranked and most prestigious genome sequencing research institute in India. Based in Delhi, it is part of the country’s Council of Scientific and Industrial Research (CSIR)—the largest and most respected research and development organization in India. CSIR-IGIB engages in research of national importance in the areas of genomics, molecular medicine, bioinformatics, proteomics, and environmental biotechnology.
A central part of the institute’s work revolves around human genetics research, which plays a critical role in identifying genetic disorders, characterizing the mutations that drive cancer progression, and tracking disease outbreaks. CSIR-IGIB has partnered with Lenovo to support this vital work, helping researchers in their efforts to solve some of humanity’s greatest challenges.
Challenge
To support its genomics research activities, CSIR-IGIB undertakes two main kinds of analysis: whole genome sequencing (WGS), which analyzes an organism’s entire DNA sequence, and whole exome sequencing (WES), which studies the genome’s protein-coding regions (roughly 2% of the whole genome).
Even with advances in technology, this sequencing work remains computationally intensive and time-consuming. A single WGS analysis typically takes days to complete, yet researchers need to study more samples in shorter time frames to advance important scientific work.
Meeting these needs takes serious computing power, which is why CSIR-IGIB was keen to give its high-performance computing (HPC) infrastructure a boost. The institute wanted to add more compute throughput and capacity to support ever-growing workload volumes and cut the time taken to sequence genomes. In this way, it could help researchers tackle more complex questions and gain insights faster.
“Computing speed and scale are both critical to genetic sequencing. Our goal is to help researchers analyze more samples faster, and we need very high-performing technology to achieve this.” —Dr. Anurag Agrawal, Director, CSIR-IGIB
Why Lenovo? Breakthrough Performance for Genomic Analytics
For CSIR-IGIB, one solution stood out above the rest: Lenovo’s Genomics Optimization and Scalability Tool (GOAST), an HPC architecture engineered specifically for demanding genomics workloads.
“We find Lenovo GOAST works exactly as advertised. The system has met all benchmarks that we expected.” —Dr. Anurag Agrawal, Director, CSIR-IGIB
GOAST includes pre-configured hardware and pre-installed bioinformatics software, calibrated for high performance. For its deployment, CSIR-IGIB selected the 2-socket Lenovo GOAST Base system, built on Lenovo ThinkSystem SR630 servers featuring 2nd Gen Intel® Xeon® Scalable processors.
Working closely with Lenovo HPC Services, CSIR-IGIB deployed a 28-node system—the largest GOAST installation in India to date. By leveraging an optimized architecture and efficient open-source software, GOAST offers GPU-level performance at CPU-level costs, making it an attractive proposition for a public sector body such as CSIR-IGIB.
Answering Some of Today’s Most Pressing Research Questions
Today, CSIR-IGIB is using the Lenovo GOAST architecture to support a wide array of research initiatives, including several projects focused on exploring the potential genetic roots of cancer.
For example, researchers can compare cancer genomes against a standard reference genome to identify potential germline mutations (passed directly from a parent to a child) that can trigger or advance cancer development in humans. Beyond improving our understanding of the genetic changes that can contribute to cancer, such analysis can offer valuable insights into how an individual’s cancer might progress and how it is likely to respond to treatment.
“We were previously using open-code software for genome sequencing with good results, but it’s honestly no match for what Lenovo GOAST delivers. Lenovo has truly optimized both hardware and software for genomics analysis, and it pays dividends in terms of performance and efficiency.” —Dr. Anurag Agrawal, Director, CSIR-IGIB
CSIR-IGIB also has plans to leverage the Lenovo GOAST architecture to support a major upcoming cancer research project: the Indian Cancer Genome Atlas (ICGA). The initial focus will be on breast cancer, looking at both germline mutations and somatic mutations (occurring from damage to genes in an individual cell during a person’s life).
“For the upcoming ICGA project, we want to reimagine how GOAST is used optimally,” says Dr. Agrawal. “We’ll be tackling more complex questions and we have been working closely with Lenovo to advance the system’s capabilities to help us go further. Lenovo has been a very willing partner and we’re excited to push the boundaries of what we can do together.”
“Analysis of genome sequencing data is much faster and more reliable with Lenovo GOAST. We are able to leverage world-class architecture and software to accelerate the analysis of whole genomes and exomes, helping researchers get results faster.” —Dr. Anurag Agrawal, Director, CSIR-IGIB
Results
CSIR-IGIB has seen significant performance gains from the Lenovo GOAST system for both WGS and WES workflows. On latency runs, where all resources of one node are assigned to executing a single job, the institute can complete typical WGS and WES workflows 6.5 times faster.¹
“The higher computing throughput and capacity delivered by Lenovo GOAST is helping accelerate the pace of research and increase our output, helping us drive scientific progress that makes a real impact on people’s health and lives.” —Dr. Anurag Agrawal, Director, CSIR-IGIB
The Lenovo architecture delivers similarly strong performance on throughput runs, where multiple jobs run concurrently on one node. Here, performance for WGS and WES workflows is around 6.5 times better.¹
Accelerated execution speeds mean researchers can process more samples and answer more complex questions in less time. By sustaining a more rapid pace of research, CSIR-IGIB can push vital scientific work further, driving the breakthroughs needed to improve our understanding of diseases like cancer and find better treatments that improve patient outcomes and even save lives.
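As a rough illustration of what a 6.5-fold speedup means for sample capacity, the Python sketch below converts a per-sample run time into samples completed per day. The 6.5x factor and the 28-node count come from this case study. The 48-hour baseline is a purely hypothetical figure chosen for the arithmetic (the text says only that a WGS analysis typically takes days), and the function names are illustrative. The sketch also assumes the latency-style setup, in which each node works on one job at a time.

```python
def hours_after_speedup(baseline_hours: float, speedup: float) -> float:
    """Wall-clock hours for one workflow once it runs `speedup` times faster."""
    if baseline_hours <= 0 or speedup <= 0:
        raise ValueError("baseline_hours and speedup must be positive")
    return baseline_hours / speedup


def samples_per_day(hours_per_sample: float, nodes: int = 1) -> float:
    """Samples finished in 24 hours when every node runs one job at a time."""
    if hours_per_sample <= 0 or nodes <= 0:
        raise ValueError("hours_per_sample and nodes must be positive")
    return nodes * 24.0 / hours_per_sample


SPEEDUP = 6.5           # reported in the case study
NODES = 28              # size of the CSIR-IGIB deployment
BASELINE_HOURS = 48.0   # ASSUMPTION: hypothetical baseline, not a measured value

if __name__ == "__main__":
    accelerated_hours = hours_after_speedup(BASELINE_HOURS, SPEEDUP)
    before = samples_per_day(BASELINE_HOURS, NODES)
    after = samples_per_day(accelerated_hours, NODES)
    print(f"Hours per sample: {BASELINE_HOURS:.1f} -> {accelerated_hours:.1f}")
    print(f"Samples per day on {NODES} nodes: {before:.0f} -> {after:.0f}")
```

Whatever baseline is plugged in, daily capacity scales by the same 6.5x factor; only the absolute numbers depend on the assumed starting point.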
- 6.5 times faster WGS and WES workflows in both latency and throughput runs¹
- Supports efficient analysis of complex sequencing workloads
- Speeds time-to-answer for vital research into diseases like cancer