Exploiting different levels of parallelism in the biological sequence comparison problem
Document typeConference report
Rights accessOpen Access
In the last years the fast growth of bioinformatics field has atracted the attention of computer scientists. At the same time, de exponential growth of databases that contains biological information (such as protein and DNA data) demands great efforts to improve the performance of computational platforms. In this work, we investigate how bioinformatics applications benefit from parallel architectures that combine different alternatives to exploit coarse- and fine-grain parallelism. As a case of analysis, we study the performance behavior of the Ssearch application that implements the Smith-Waterman algorithm (SW), which is a dynamic programing approach that explores the similarity between a pair of sequences. The inherent large parallelism of the application makes it ideal for architectures supporting multiple dimensions of parallelism (thread-level parallelism, TLP; data-level parallelism, DLP; instruction-level parallelism, ILP). We study how this algorithm can take advantage of different parallel machines like the SGI Altix, IBM Power6, IBM Cell BE and MareNostrum machines. Our study includes a qualitative analysis of the parallelization opportunities and also the quantification of the performance in terms of speedup and execution time. These measures are collected taking into account the specific characteristics of each architecture. As an example, our results show that a share memory multiprocessor architecture (SMP) like the PowerPC 970MP of Marenostrum machine can surpasses a heterogeneous multi- processor machine like the current IBM Cell BE.
CitationSánchez, F.; Ramírez, A.; Valero, M. Exploiting different levels of parallelism in the biological sequence comparison problem. A: 2009 Colombian Computing Conference. "Cuarto Congreso Colombiano de Computación, 4CCC: abril 23-25, 2009, Bucaramanga, Colombia". Bucaramanga: 2009, p. 140-149.