|
BMC Bioinformatics 2008
Automated simultaneous analysis phylogenetics (ASAP): an enabling tool for phylogenomics

Abstract: To keep pace with the exponentially growing volume of molecular data in the genomic era, we have developed an automated technique, ASAP (Automated Simultaneous Analysis Phylogenetics), to assemble multigene/multispecies matrices and to evaluate the significance of individual genes within the context of a given phylogenetic hypothesis. Applications of ASAP may enable scientists to re-evaluate species relationships and to develop new phylogenomic hypotheses based on genome-scale data.

In the post-genomic era, the necessity of developing automated methods for the construction and updating of matrices for complete genome-level phylogenetic analyses of the tree of life has been acknowledged [1]. However, to date a solution for automating gene-partition-based approaches has been lacking. The main impetus for automated phylogenetic matrix construction is that contemporary phylogenetic matrices used to approach the tree of life experience "growing pains" in two dimensions as a result of modern genomics: first, the number of taxa with sequence information grows, and second, the number of data partitions (kinds of genome sequence information) grows as partial or full genome sequences of species become available.

Several bottlenecks exist in the data acquisition pipeline that can prevent easy construction and updating of phylogenetic matrices. Matrix assembly at the genome scale involves acquiring (through sequencing or download from databases) hundreds to thousands of gene regions for the taxa of interest, formatting these sequences for use in an alignment program, aligning them, and finally exporting the data partitions into formats used by phylogenetic analysis packages. A phylogenetic analysis package (such as PAUP* [2]) can then be used to infer the phylogenetic tree from the combined matrix.
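The final assembly step described above, combining already-aligned gene partitions into one matrix and exporting it with partition boundaries, can be sketched as follows. This is a minimal illustration and not ASAP's actual implementation: the function names `concatenate_partitions` and `to_nexus` are hypothetical, the toy sequences are invented, and the sketch assumes DNA data, taxon names without spaces, and that a taxon missing from a partition is filled with `?`.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]          # taxon name -> aligned sequence
Charset = Tuple[str, int, int]      # (partition name, start, end), 1-based inclusive


def concatenate_partitions(
    partitions: Dict[str, Alignment],
) -> Tuple[Dict[str, str], List[Charset]]:
    """Concatenate per-gene alignments into one combined matrix.

    Assumption: every partition is already aligned (equal-length rows).
    Taxa absent from a partition are padded with '?' (missing data).
    """
    taxa = sorted({t for aln in partitions.values() for t in aln})
    pieces: Dict[str, List[str]] = {t: [] for t in taxa}
    charsets: List[Charset] = []
    offset = 0
    for gene, aln in partitions.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"partition {gene!r} is empty or not aligned")
        length = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, "?" * length))
        charsets.append((gene, offset + 1, offset + length))
        offset += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, charsets


def to_nexus(matrix: Dict[str, str], charsets: List[Charset]) -> str:
    """Render the combined matrix as NEXUS text with one CHARSET per partition."""
    if not matrix:
        raise ValueError("matrix has no taxa")
    nchar = len(next(iter(matrix.values())))
    width = max(len(taxon) for taxon in matrix)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(matrix)} NCHAR={nchar};",
        "  FORMAT DATATYPE=DNA MISSING=? GAP=-;",
        "  MATRIX",
    ]
    for taxon, seq in matrix.items():
        lines.append(f"    {taxon.ljust(width)}  {seq}")
    lines += ["  ;", "END;", "BEGIN SETS;"]
    for gene, start, end in charsets:
        lines.append(f"  CHARSET {gene} = {start}-{end};")
    lines.append("END;")
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy data: two gene partitions with partially overlapping taxon sets.
    toy = {
        "geneA": {"sp1": "ACGT", "sp2": "AC-T"},
        "geneB": {"sp1": "GG", "sp3": "GA"},
    }
    combined, sets = concatenate_partitions(toy)
    print(to_nexus(combined, sets))
```

Recording each partition as a character set keeps the gene boundaries recoverable after concatenation, which is what per-partition assessments of a tree depend on. In the toy data, both dimensions of growth appear: `sp3` adds a taxon and `geneB` adds a partition, and the padding with `?` lets either be added without touching the existing alignments.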
To automate matrix assembly and facilitate the calculation of character-based assessments of tree reliability relative to each of the
|