%0 Journal Article
%T Mega-phylogeny approach for comparative biology: an alternative to supertree and supermatrix approaches
%A Stephen A Smith
%A Jeremy M Beaulieu
%A Michael J Donoghue
%J BMC Evolutionary Biology
%D 2009
%I BioMed Central
%R 10.1186/1471-2148-9-37
%X Here we describe and demonstrate a modified supermatrix method termed mega-phylogeny that uses databased sequences as well as taxonomic hierarchies to make extremely large trees with denser matrices than supermatrices. The two major challenges facing large-scale supermatrix phylogenetics are assembling large data matrices from databases and reconstructing trees from those datasets. The mega-phylogeny approach addresses the former as the latter is accomplished by employing recently developed methods that have greatly reduced the run time of large phylogeny construction. We present an algorithm that requires relatively little human intervention. The implemented algorithm is demonstrated with a dataset and phylogeny for Asterales (within Campanulidae) containing 4,954 species and 12,033 sites and an rbcL matrix for green plants (Viridiplantae) with 13,533 species and 1,401 sites. By examining much larger phylogenies, patterns emerge that were otherwise unseen. The phylogeny of Viridiplantae successfully reconstructs major relationships of vascular plants that previously required many more genes. These demonstrations underscore the importance of using large phylogenies to uncover important evolutionary patterns and we present a fast and simple method for constructing these phylogenies. All species on Earth (current estimates exceed 1.8 million) are related through common ancestors in the evolutionary Tree of Life. The construction of this phylogeny is a major endeavor for biology and largely now depends on the unprecedented growth of molecular sequence data available in public databases. Efforts focused on single clades, whole genome sequencing, genomic library construction (ESTs, BACs), and large collaborative efforts, such as NSF's Assembling the Tree of Life project, are contributing to the fast-paced growth of public databases, with more than 92 million sequences stored in the current release of GenBank (release 167). 
Current efforts to infer really large phylogeneti
%U http://www.biomedcentral.com/1471-2148/9/37