Formalization of block pruning: reducing the number of cells computed in exact biological sequence comparison algorithms
Rights access: Open Access
Biological sequence comparison algorithms that compute optimal local and global alignments calculate a dynamic programming (DP) matrix with quadratic time complexity. The DP matrix H is calculated with a recurrence relation in which the value of each cell H_{i,j} is the maximum over the values of cells H_{i-1,j-1}, H_{i-1,j} and H_{i,j-1}, each with a constant value added or subtracted. Therefore, the difference between the value of the cell H_{i,j} being calculated and the values of its previously computed direct neighbor cells respects well-defined upper and lower bounds. Using these bounds, we show that it is possible to determine the maximum and the minimum value of every cell in H, for a given reference cell. We use this result to define a generic pruning method that determines which cells can be pruned (i.e. need not be computed, since they will not contribute to the final solution), accelerating the computation while keeping the guarantee that the optimal result will be produced. The goal of this paper is thus to investigate and formalize properties of the DP matrix in order to estimate and increase the efficiency of the pruning method. We also show that the pruning efficiency depends mainly on three characteristics: (a) the order in which the cells of H are calculated, (b) the values of the parameters used in the recurrence relation and (c) the contents of the sequences compared.
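The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the paper prunes whole blocks, while this example prunes single cells, and everything below is an assumption made for illustration (local alignment with a linear gap penalty, the function name `sw_score`, the scoring parameters, and the specific bound). The bound relies only on the fact that a score can grow by at most `match` per remaining diagonal step, so a cell whose best possible continuation cannot exceed the best score found so far is skipped.

```python
def sw_score(a, b, match=1, mismatch=-1, gap=-2, prune=False):
    """Local alignment score with a linear gap penalty (illustrative only).

    Returns (best_score, number_of_cells_computed). Assumes match > 0 and
    mismatch, gap <= 0. With prune=True, a cell is skipped when no alignment
    passing through it could beat the best score seen so far.
    """
    m, n = len(a), len(b)
    prev = [0] * (n + 1)  # row i-1 of H
    best = 0
    computed = 0
    for i in range(1, m + 1):
        cur = [0] * (n + 1)  # row i of H; skipped cells keep the value 0
        for j in range(1, n + 1):
            if prune:
                # Diagonal steps still available, counting this cell.
                remaining = min(m - i + 1, n - j + 1)
                # Upper bound on any score reachable through cell (i, j).
                bound = max(prev[j - 1], prev[j], cur[j - 1]) + match * remaining
                if bound <= best:
                    continue  # pruned: cannot improve on the best score
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
            computed += 1
        prev = cur
    return best, computed


if __name__ == "__main__":
    x = "ACGTACGTAC"
    full = sw_score(x, x)
    pruned = sw_score(x, x, prune=True)
    print("full:", full, "pruned:", pruned)
```

Both calls return the same score, and the pruned run computes fewer cells. How many cells are skipped depends on the processing order, the scoring parameters and the sequences themselves, which matches the three characteristics listed in the abstract. In this row-by-row sketch the test costs about as much as computing the cell; savings appear when the same test discards a whole block at once.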
This is a pre-copyedited, author-produced version of an article accepted for publication in The Computer Journal following peer review. The version of record Edans F O Sandes, George L M Teodoro, Maria Emilia M T Walter, Xavier Martorell, Eduard Ayguade, Alba C M A Melo; Formalization of Block Pruning: Reducing the Number of Cells Computed in Exact Biological Sequence Comparison Algorithms, The Computer Journal, Volume 61, Issue 5, 1 May 2018, Pages 687–713 is available online at: The Computer Journal https://academic.oup.com/comjnl/article-abstract/61/5/687/4539903 and https://doi.org/10.1093/comjnl/bxx090.
Citation: De Sandes, E., Teodoro, G., Walter, M. E., Martorell, X., Ayguade, E., Melo, A. Formalization of block pruning: reducing the number of cells computed in exact biological sequence comparison algorithms. "Computer journal", 1 May 2018, vol. 61, no. 5, p. 687-713.
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder.