|
BMC Bioinformatics 2006
Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap costAbstract: We propose a novel group-to-group sequence alignment algorithm that uses a piecewise linear gap cost. We developed a program called PRIME, which employs our proposed algorithm to optimize the well-defined sum-of-pairs score. PRIME stands for Profile-based Randomized Iteration MEthod. We evaluated PRIME and some recent MSA programs using BAliBASE version 3.0 and PREFAB version 4.0 benchmarks. The results of benchmark tests showed that PRIME can construct accurate alignments comparable to the most accurate programs currently available, including L-INS-i of MAFFT, ProbCons, and T-Coffee.PRIME enables users to construct accurate alignments without having to employ pairwise alignment information. PRIME is available at http://prime.cbrc.jp/ webcite.Multiple sequence alignment (MSA) is a useful tool for elucidating the relationships among function, evolution, sequence, and structure of biological macromolecules such as genes and proteins [1-3]. Although we can calculate the optimal alignment of a set of sequences by n-dimensional dynamic programming (DP), the DP method is applicable to only a small number of sequences. In fact, even when a sum-of-pairs (SP) score with the simplest gap cost is used as an objective function, computation of optimal MSA is an NP-hard problem [4]. Hence, many heuristic methods have been developed. Almost all practical methods presently available adopt either a progressive [5-7] or an iterative [8-10] heuristic strategy.The group-to-group sequence alignment algorithm is a straightforward extension of the pairwise sequence alignment algorithm, and is the core of progressive and iterative methods. The essential difference of group-to-group sequence alignment from pairwise sequence alignment is the existence of gaps within each group of prealigned sequences. The gap opening penalty used in an affine gap cost disrupts the independence between adjacent columns, and hence calculating the optimal alignment between two groups with respect to the SP scor
|