|
BMC Bioinformatics 2007
Lower bounds on multiple sequence alignment using exact 3-way alignmentAbstract: Two standard and two new methods are considered for using exact 2-way and 3-way alignments to compute lower bounds on total SP alignment cost; one new method fares well with respect to accuracy, while the other reduces the computation time. The first employs exhaustive computation of exact 3-way alignments, while the second employs an efficient heuristic to compute a much smaller number of exact 3-way alignments. Calculating all 3-way alignments exactly and computing their average improves lower bounds on sum of SP cost in v-way alignments. However judicious selection of a subset of all 3-way alignments can yield a further improvement with minimal additional effort. On the other hand, a simple heuristic to select a random subset of 3-way alignments (a random packing) yields accuracy comparable to averaging all 3-way alignments with substantially less computational effort.Calculation of lower bounds on SP cost (and thus the quality of an alignment) can be improved by employing a mixture of 3-way and 2-way alignments.Let Σ be a finite alphabet, and consider v words (sequences) w1,..., wv over Σ. For 1 ≤ i ≤ v, let ?i denote the length of word wi, and let σij denote the jth character of wi. A string xi over Σ ∪ {-} is an extension of wi of length M if there exist indices 1 ≤ d1
|