QIAGEN Bioinformatics solutions deliver faster time to insight by combining powerful analytics that are able to interpret complex biological processes.
The QIAGEN Biomedical Genomics Server* solution can minimize the total cost of ownership (TCO) over a four-year period for the largest NGS instrument on today’s market. TCO calculations show that, with the given specifications, the cost can be as low as USD 22 per whole human genome analyzed.¹,² With the very high throughput enabled by a HiSeq X Ten*, the savings can be sizable.
An Illumina HiSeq X Ten* system can sequence 18,000 whole human genomes per year, which averages out to roughly one genome every 30 minutes. By optimizing and fine-tuning the speed of the QIAGEN Biomedical Genomics Server* solution, we demonstrated that it can analyze the NGS data at the pace the instrument produces them when running at maximum throughput, and with fewer compute nodes than others recommend.³ The testing scenario showed that QIAGEN Biomedical Genomics Server requires a compute cluster of only 32 nodes, in contrast to the 85 nodes recommended by Illumina for variant calling based on the BWA+GATK Best Practices pipeline (HiSeq X System Lab Setup and Site Prep Guide, Document 15050093 v03, January 2016). The comparison benchmark was carried out by installing the QIAGEN Biomedical Genomics Server solution on a compute cluster of 32 nodes,¹,² each equipped with 28 cores (E5-2697 v3 @ 2.60 GHz) and 128 GB RAM, on a shared Lustre file system. We used the standard CLC variant calling workflow that comes with the Biomedical Genomics Server solution. Furthermore, the benchmarks show that the same system is capable of analyzing up to 1,440 whole human exomes per 24 hours.
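The throughput figures above can be checked with simple arithmetic. The following is a minimal sketch in Python, not part of the benchmark itself. The function names are hypothetical, and it assumes a 365-day year with the instrument running continuously.

```python
# Figures quoted in the text above.
GENOMES_PER_YEAR = 18_000   # HiSeq X Ten maximum throughput
EXOMES_PER_DAY = 1_440      # benchmarked exome throughput per 24 hours
NODES_TESTED = 32           # cluster size used in the benchmark
NODES_RECOMMENDED = 85      # cluster size recommended by Illumina

MINUTES_PER_DAY = 24 * 60
# Assumption: 365-day year, no instrument downtime.
MINUTES_PER_YEAR = 365 * MINUTES_PER_DAY


def minutes_per_genome(genomes_per_year: int) -> float:
    """Average time budget per genome to keep pace with the sequencer."""
    return MINUTES_PER_YEAR / genomes_per_year


def minutes_per_exome(exomes_per_day: int) -> float:
    """Average analysis time per exome at the benchmarked rate."""
    return MINUTES_PER_DAY / exomes_per_day


def node_reduction(tested: int, recommended: int) -> float:
    """Fraction of compute nodes saved relative to the recommendation."""
    return 1 - tested / recommended


if __name__ == "__main__":
    print(f"Minutes per genome: {minutes_per_genome(GENOMES_PER_YEAR):.1f}")
    print(f"Minutes per exome:  {minutes_per_exome(EXOMES_PER_DAY):.1f}")
    print(f"Node reduction:     {node_reduction(NODES_TESTED, NODES_RECOMMENDED):.1%}")
```

Under these assumptions the budget works out to a little under 30 minutes per genome and one minute per exome, with roughly 62 percent fewer nodes than the 85-node recommendation.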
Reducing the hardware requirements from 85 nodes to just 32 nodes dramatically lowers the total cost of ownership of the solution over a four-year period, which covers everything from software licenses and hardware to power, cooling, networking, and floor space. TCO calculations show that the cost can be brought down to as low as USD 22 per whole human genome analysis. Given the throughput of 18,000 whole human genomes per year enabled by a HiSeq X Ten, the savings can be as high as USD 1.3 million over four years. Additional benchmarks also show that the reference architecture is capable of analyzing up to 1,440 whole exome sequences per 24 hours.
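To put the cost figures in proportion, the sketch below derives a few quantities that the text does not state directly. These are illustrative back-of-the-envelope values, not results from the TCO analysis. The sketch assumes the instrument runs at maximum throughput for all four years and that the USD 22 and USD 1.3 million figures refer to the same four-year period. The function names are hypothetical.

```python
# Figures quoted in the text above.
GENOMES_PER_YEAR = 18_000
YEARS = 4
COST_PER_GENOME_USD = 22          # "as low as" figure from the text
MAX_SAVINGS_USD = 1_300_000       # "as high as" figure from the text


def total_genomes(genomes_per_year: int, years: int) -> int:
    """Genomes analyzed over the TCO period at maximum throughput."""
    return genomes_per_year * years


def implied_tco(cost_per_genome: float, genomes: int) -> float:
    """Total cost implied by the per-genome figure (derived, not quoted)."""
    return cost_per_genome * genomes


def saving_per_genome(savings: float, genomes: int) -> float:
    """Savings spread evenly across all genomes (derived, not quoted)."""
    return savings / genomes


if __name__ == "__main__":
    n = total_genomes(GENOMES_PER_YEAR, YEARS)
    print(f"Genomes over {YEARS} years: {n:,}")
    print(f"Implied four-year cost:  USD {implied_tco(COST_PER_GENOME_USD, n):,.0f}")
    print(f"Implied saving/genome:   USD {saving_per_genome(MAX_SAVINGS_USD, n):.2f}")
```

Under these assumptions, 72,000 genomes at USD 22 each imply a four-year cost of about USD 1.58 million, and the quoted savings correspond to roughly USD 18 per genome.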
A popular software package for mapping low-divergent sequences against a large-reference genome, such as the human genome.
An open-source implementation of the HMMER* protein sequence analysis suite.
An algorithm for comparing primary biological sequence information.
A software package developed at the Broad Institute to analyze next-generation sequencing data.
QIAGEN Bioinformatics* solutions deliver faster time to insight by combining powerful analytics that are able to interpret complex biological processes.
Halvade* is a MapReduce implementation of the best-practice DNA sequencing pipeline as recommended by Broad Institute.
ABySS* is an open-source de novo genome assembler for short paired-end reads.
DIDA* performs large-scale alignment tasks by distributing the indexing and alignment stages into smaller subtasks over a cluster of compute nodes.
elPrep* is a high-performance tool for preparing SAM/BAM/CRAM files for variant calling in genomic sequencing pipelines.
Based on internal performance tests and a total cost of ownership analysis performed by QIAGEN Bioinformatics and Intel. Performance tests were conducted on a 16-node high performance computing (HPC) cluster. Each node was configured with 2 x Intel® Xeon® processor E5-2697 v3 (2.6 GHz, 14 core), 128 GB memory, and a 500 GB storage drive. All nodes shared a 165 TB storage system based on Intel® Enterprise Edition for Lustre* software, 256 TB of 1000 RPM disk storage and 4 x 800 GB Intel® Solid State Drive Data Center S3700 Series. The interconnect fabric featured 2x Intel® True Scale single-port HCAs (2x QLE7340, QDR-80) per node and a 36-port Intel True Scale switch 12300 (40 Gbps). The TCO analysis was performed using an internal Intel® tool and publicly available product pricing and availability as of October 9, 2015. The TCO for the test cluster was estimated over a 4-year period and compared with the estimated TCO of an 85-node cluster, as described in the Illumina HiSeq X System Lab Setup and Site Prep Guide, Document # 15050093 v01, September 2015. To quantify the TCO comparison, specific products were chosen that would fulfill the general specifications defined within the Illumina guide. Support costs for both systems were estimated as 60 percent of TCO. The performance and TCO results should only be used as a general guide for evaluating the cost/benefit or feasibility of a future purchase of systems. Actual performance results and economic benefits will vary, and there may be additional unaccounted costs related to the use and deployment of the solution that are not or cannot be accounted for. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. 
You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance.
The WGS dataset used in the performance tests is NA12878D, which was generated with Illumina's TruSeq Nano* kit using 350 bp inserts and sequenced on a single lane of an Illumina HiSeq X*, with more than 87% of bases at quality above Q30. Average coverage: 35.57x; read lengths: 2 x 151; total yield: 120 Gb. This dataset can be accessed at https://dnanexus-rnd.s3.amazonaws.com/NA12878-xten.html.
Benchmark results were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown". Implementation of these updates may make these results inapplicable to your device or system.