Publication
Biomedizinische Technik
Paper
Genome assembly framework on massively parallel, distributed memory supercomputers
Abstract
Genome assembly describes the process of assembling a long deoxyribonucleic acid (DNA) sequence out of next-generation sequencing (NGS) data. Computational resources can become a bottleneck for large-scale routine use. We propose a genome assembly framework for massively parallel, distributed memory supercomputers. Our framework builds on the simple idea of distributing the reads equally among the processors. Each processor holds the whole reference genome, aligns its short reads independently, and sends the reads back to the root processor together with their corresponding positions, so that the whole genome can be assembled. We run our alignment framework on up to 8,196 processors of the IBM Blue Gene/Q "Avoca" at the Victorian Life Sciences Computation Initiative. The results show that more than 6 million reads, comprising over 324 million nucleotides, can be assembled in under 20 minutes, a task that previously required hours. Thus, our framework allows fast assembly of NGS data.
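The distribution scheme described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the processors are simulated by a plain loop rather than real message passing, the aligner is a stand-in that uses exact substring matching, and all function names (`partition_reads`, `align_chunk`, `assemble`, `run`) are hypothetical.

```python
from typing import List, Tuple


def partition_reads(reads: List[str], n_procs: int) -> List[List[str]]:
    """Split the reads into n_procs chunks whose sizes differ by at most one."""
    base, extra = divmod(len(reads), n_procs)
    chunks, start = [], 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        chunks.append(reads[start:start + size])
        start += size
    return chunks


def align_chunk(reference: str, chunk: List[str]) -> List[Tuple[str, int]]:
    """Stand-in aligner (assumption): exact-match position, or -1 if absent."""
    return [(read, reference.find(read)) for read in chunk]


def assemble(ref_len: int, hits: List[Tuple[str, int]]) -> str:
    """Root step: place each aligned read at its position; '-' marks gaps."""
    out = ["-"] * ref_len
    for read, pos in hits:
        if pos < 0:
            continue  # unaligned reads are skipped in this sketch
        out[pos:pos + len(read)] = list(read)
    return "".join(out)


def run(reference: str, reads: List[str], n_procs: int) -> str:
    """Simulate the scheme: distribute, align per processor, gather at root."""
    chunks = partition_reads(reads, n_procs)
    # Every simulated processor holds the full reference and works alone.
    per_rank = [align_chunk(reference, chunk) for chunk in chunks]
    # Gather step: the root collects (read, position) pairs from all ranks.
    gathered = [hit for rank_hits in per_rank for hit in rank_hits]
    return assemble(len(reference), gathered)
```

Because each chunk is aligned without reference to any other, the per-processor step needs no communication until the final gather, which is what makes the scheme suitable for distributed memory machines. A real deployment would replace the loop with message passing and the substring search with a proper short-read aligner.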