|
BMC Bioinformatics 2006
Recrafting the neighbor-joining methodAbstract: In this paper we present techniques for speeding up the canonical neighbor-joining method. Our algorithms construct the same phylogenetic trees as the canonical neighbor-joining method. The best-case running time of our algorithms are O(n2) but the worst-case remains O(n3). We empirically evaluate the performance of our algoritms on distance matrices obtained from the Pfam collection of alignments. The experiments indicate that the running time of our algorithms evolve as Θ(n2) on the examined instance collection. We also compare the running time with that of the QuickTree tool, a widely used efficient implementation of the canonical neighbor-joining method.The experiments show that our algorithms also yield a significant speed-up, already for medium sized instances.The neighbor-joining method is a distance based method for constructing evolutionary trees. It was introduced by Saitou and Nei [1], and the running time was later improved by Studier and Keppler [2]. It has become a mainstay of phylogeny reconstruction, and is probably the most widely used distance based algorithm in practice. With a running time of O(n3) on n taxa [2], it is fast for small input, and empirical work shows it to be reasonable accurate, at least for cases where the rate of evolution is not extremely high or low. St. John et al. [3] even suggest it as a standard against which new phylogenetic methods should be evaluated. The aim of this paper is to improve on the running time of neighbor-joining tree reconstruction to make it applicable for larger datasets, e.g. [4]. Whether the accuracy supplied by the neighbor-joining method is useful for a particular data set in a particular situation is an independent issue outside of the scope of this paper.The neighbor-joining method is a greedy algorithm which attempts to minimize the sum of all branch-lengths on the constructed phylogenetic tree. Conceptually, it starts out with a star-formed tree where each leaf corresponds to a species, and itera
|