|
Polynomial algorithms for the Maximal Pairing Problem: efficient phylogenetic targeting on arbitrary treesAbstract: We describe a relatively simple dynamic programming algorithm for the special case of binary trees. We then show that the general case of multifurcating trees can be treated by interleaving solutions to certain auxiliary Maximum Weighted Matching problems with an extension of this dynamic programming approach, resulting in an overall polynomial-time solution of complexity (n4 log n) w.r.t. the number n of leaves. The source code of a C implementation can be obtained under the GNU Public License from http://www.bioinf.uni-leipzig.de/Software/Targeting webcite. For binary trees, we furthermore discuss several constrained variants of the MPP as well as a partition function approach to the probabilistic version of the MPP.The algorithms introduced here make it possible to solve the MPP also for large trees with high-degree vertices. This has practical relevance in the field of comparative phylogenetics and, for example, in the context of phylogenetic targeting, i.e., data collection with resource limitations.Comparisons among species are fundamental to elucidate evolutionary history. In evolutionary biology, for example, they can be used to detect character associations [1-3]. In this context, it is important to use statistically independent comparisons, i.e., any two comparisons must have disjoint evolutionary histories (phylogenetic independence). The Maximal Pairing Problem (MPP) is the prototype of a class of combinatorial optimization problems that models this situation: Given an arbitrary phylogenetic tree T and weights ωxy for the paths between any two pairs of leaves (x, y) (representing a particular comparison), what is the collection of pairs of leaves with maximum total weight so that the connecting paths do not intersect in edges?Algorithms for special cases of the MPP that are restricted to binary trees and equal weights (which thus simply maximizes the number of pairs) have been described, but not implemented [2]. Since different pairs of taxa may contribu
|