The Interaction between Base Compositional Heterogeneity and Among-Site Rate Variation in Models of Molecular Evolution

DOI: 10.5402/2013/391561

Abstract:

Many commonly used models of molecular evolution assume homogeneous nucleotide frequencies. A deviation from this assumption has been shown to cause problems for phylogenetic inference. However, some claim that only extreme heterogeneity affects phylogenetic accuracy and suggest that violations of other model assumptions, such as variable rates among sites, are more problematic. In order to explore the interaction between compositional heterogeneity and variable rates among sites, I reanalyzed three real heterogeneous datasets using several models. My Bayesian inference recovers accurate topologies under variable rates-among-sites models, but fails under some models that account for compositional heterogeneity. I also ran simulations and found that accounting for rates among sites improves topology accuracy in compositionally heterogeneous data. This indicates that in some cases, models accounting for among-site rate variation can improve outcomes for data that violate the assumption of compositional homogeneity.

1. Introduction

Recent phylogenetic studies have explored the effect of compositional heterogeneity on phylogenetic methods. Compositional heterogeneity can arise in a dataset as a result of nonstationary evolution (when the substitution pattern is not uniform across an evolutionary tree). If two nonsister subtrees have similar substitution bias, this can lead to a convergence in nucleotide composition (CNC). The taxa may then look similar due to convergent evolution rather than common ancestry, which can mislead phylogenetic analysis. There are several methods to detect and quantify the level of compositional heterogeneity in a dataset, including chi-squared tests (e.g., [1]), Disparity Index [2], and relative-rates tests [3]. When found, the presence of compositional heterogeneity is often assumed to cause problems for both parametric and nonparametric phylogenetic methods [4].
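
As an illustration of the chi-squared approach mentioned above, the following is a minimal sketch and not the procedure used in [1]. It assumes aligned nucleotide sequences held in a plain dictionary, builds a taxon-by-base contingency table, and returns the Pearson chi-squared statistic with its degrees of freedom. The function name `composition_chi2` and the input layout are hypothetical choices made for this example. A test of this kind treats every site as an independent observation and ignores the correlation among taxa that shared ancestry induces, so its result is best read as a rough screen.

```python
from collections import Counter

BASES = "ACGT"


def composition_chi2(sequences):
    """Pearson chi-squared statistic for homogeneity of base composition.

    `sequences` maps a taxon name to its nucleotide sequence (hypothetical
    input layout). Gaps and ambiguity codes are ignored.
    Returns (statistic, degrees_of_freedom).
    """
    # One row of A, C, G, T counts per taxon.
    counts = []
    for seq in sequences.values():
        tally = Counter(seq.upper())
        counts.append([tally[base] for base in BASES])

    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("no A, C, G, or T characters found")

    # Sum (observed - expected)^2 / expected over all cells, where the
    # expected count assumes every taxon shares the pooled composition.
    statistic = 0.0
    for row, row_total in zip(counts, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected

    # Bases that never occur contribute no information, so they are
    # left out of the degrees of freedom.
    used_bases = sum(1 for total in col_totals if total > 0)
    dof = (len(counts) - 1) * (used_bases - 1)
    return statistic, dof
```

The statistic would then be compared against a chi-squared distribution with the returned degrees of freedom; a large value suggests that base frequencies differ among taxa.
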
The assumption that compositional heterogeneity misleads phylogenetic methods has been challenged, however: Conant and Lewis [5] claimed that “extreme amounts of heterogeneity must be present before it can mislead phylogenetics” and Rosenberg and Kumar [6] “did not find a significant interaction between phylogenetic accuracy and substitution pattern heterogeneity among lineages.”

Another commonly studied modeling question is the variation of substitution rates among sites. It has been established that accounting for among-site rate variation is important in phylogenetics [7]. This is most commonly done by assuming that the substitution rates among sites vary according to a discrete gamma distribution with a fixed number of categories.
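
To make the discrete gamma idea concrete, the sketch below approximates the per-category rates for a given shape parameter. It is an illustration under stated assumptions, not the calculation used in the analyses described here: phylogenetic software normally derives category rates from exact gamma quantiles, whereas this example uses a Monte Carlo approximation so that it needs only the Python standard library. The function name `discrete_gamma_rates`, the default of four categories, and the sample size are all hypothetical choices.

```python
import random


def discrete_gamma_rates(alpha, n_categories=4, n_samples=200_000, seed=0):
    """Approximate mean rates of equal-probability gamma categories.

    Monte Carlo stand-in for the exact quantile-based calculation:
    draw rates from a gamma distribution with mean 1, sort them, split
    them into equally sized bins, and average each bin.
    """
    rng = random.Random(seed)
    # Shape alpha with scale 1/alpha gives a gamma distribution of mean 1.
    samples = sorted(
        rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_samples)
    )

    # Any samples left over after equal division are dropped.
    bin_size = n_samples // n_categories
    rates = []
    for i in range(n_categories):
        chunk = samples[i * bin_size:(i + 1) * bin_size]
        rates.append(sum(chunk) / len(chunk))

    # Rescale so the category rates average exactly 1.
    mean_rate = sum(rates) / n_categories
    return [rate / mean_rate for rate in rates]


if __name__ == "__main__":
    # A small alpha gives strongly unequal rates; a large alpha gives
    # rates close to 1 in every category.
    for shape in (0.5, 5.0):
        print(shape, [round(r, 3) for r in discrete_gamma_rates(shape)])
```

Each site is then treated as belonging to each category with equal probability, and the site likelihood is averaged over the categories. Smaller values of the shape parameter correspond to stronger rate variation among sites.
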

References

[1]  J. Mallatt and J. Sullivan, “28S and 18S rDNA sequences support the monophyly of lampreys and hagfishes,” Molecular Biology and Evolution, vol. 15, no. 12, pp. 1706–1718, 1998.
[2]  S. Kumar and S. R. Gadagkar, “Disparity index: a simple statistic to measure and test the homogeneity of substitution patterns between molecular sequences,” Genetics, vol. 158, no. 3, pp. 1321–1327, 2001.
[3]  S. V. Muse and B. S. Weir, “Testing for equality of evolutionary rates,” Genetics, vol. 132, no. 1, pp. 269–276, 1992.
[4]  S. Y. W. Ho and L. S. Jermiin, “Tracing the decay of the historical signal in biological sequence data,” Systematic Biology, vol. 53, no. 4, pp. 623–637, 2004.
[5]  G. C. Conant and P. O. Lewis, “Effects of nucleotide composition bias on the success of the parsimony criterion in phylogenetic inference,” Molecular Biology and Evolution, vol. 18, no. 6, pp. 1024–1033, 2001.
[6]  M. S. Rosenberg and S. Kumar, “Heterogeneity of nucleotide frequencies among evolutionary lineages and phylogenetic inference,” Molecular Biology and Evolution, vol. 20, no. 4, pp. 610–621, 2003.
[7]  Z. Yang, “Among-site rate variation and its impact on phylogenetic analyses,” Trends in Ecology and Evolution, vol. 11, no. 9, pp. 367–372, 1996.
[8]  R. A. van den Bussche, R. J. Baker, J. P. Huelsenbeck, and D. M. Hillis, “Base compositional bias and phylogenetic analyses: a test of the ‘flying DNA’ hypothesis,” Molecular Phylogenetics and Evolution, vol. 10, no. 3, pp. 408–416, 1998.
[9]  M. Steel, D. Huson, and P. J. Lockhart, “Invariable sites models and their use in phylogeny reconstruction,” Systematic Biology, vol. 49, no. 2, pp. 225–232, 2000.
[10]  N. Rodríguez-Ezpeleta, H. Brinkmann, B. Roure, N. Lartillot, B. F. Lang, and H. Philippe, “Detecting and overcoming systematic errors in genome-scale phylogenies,” Systematic Biology, vol. 56, no. 3, pp. 389–399, 2007.
[11]  G. E. P. Box and N. R. Draper, Empirical Model-Building and Response Surfaces, John Wiley & Sons, New York, NY, USA, 1987.
[12]  K. F. Gruber, R. S. Voss, and S. A. Jansa, “Base-compositional heterogeneity in the RAG1 locus among didelphid marsupials: implications for phylogenetic inference and the evolution of GC content,” Systematic Biology, vol. 56, no. 1, pp. 83–96, 2007.
[13]  P. J. Lockhart, M. A. Steel, M. D. Hendy, and D. Penny, “Recovering evolutionary trees under a more realistic model of sequence evolution,” Molecular Biology and Evolution, vol. 11, no. 4, pp. 605–612, 1994.
[14]  N. Galtier and M. Gouy, “Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis,” Molecular Biology and Evolution, vol. 15, no. 7, pp. 871–879, 1998.
[15]  W. H. Li, C. I. Wu, and C. C. Luo, “A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes,” Molecular Biology and Evolution, vol. 2, no. 2, pp. 150–174, 1985.
[16]  J. M. Mallatt, J. R. Garey, and J. W. Shultz, “Ecdysozoan phylogeny and Bayesian inference: first use of nearly complete 28S and 18S rRNA gene sequences to classify the arthropods and their kin,” Molecular Phylogenetics and Evolution, vol. 31, no. 1, pp. 178–191, 2004.
[17]  F. Ronquist and J. P. Huelsenbeck, “MrBayes 3: Bayesian phylogenetic inference under mixed models,” Bioinformatics, vol. 19, no. 12, pp. 1572–1574, 2003.
[18]  J. A. A. Nylander, J. C. Wilgenbusch, D. L. Warren, and D. L. Swofford, “AWTY (are we there yet?): a system for graphical exploration of MCMC convergence in Bayesian phylogenetics,” Bioinformatics, vol. 24, no. 4, pp. 581–583, 2008.
[19]  D. L. Swofford, PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods), Version 4, Sinauer Associates, Sunderland, Mass, USA, 2003.
[20]  P. G. Foster, “Modeling compositional heterogeneity,” Systematic Biology, vol. 53, no. 3, pp. 485–495, 2004.
[21]  S. Guindon, J. F. Dufayard, V. Lefort, M. Anisimova, W. Hordijk, and O. Gascuel, “New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0,” Systematic Biology, vol. 59, no. 3, pp. 307–321, 2010.
[22]  D. F. Robinson and L. R. Foulds, “Comparison of phylogenetic trees,” Mathematical Biosciences, vol. 53, no. 1-2, pp. 131–147, 1981.
[23]  J. Adachi and M. Hasegawa, “Improved dating of the human/chimpanzee separation in the mitochondrial DNA tree: heterogeneity among amino acid sites,” Journal of Molecular Evolution, vol. 40, no. 6, pp. 622–628, 1995.
[24]  C. J. Winchell, J. Sullivan, C. B. Cameron, B. J. Swalla, and J. Mallatt, “Evaluating hypotheses of deuterostome phylogeny and chordate evolution with new LSU and SSU ribosomal DNA data,” Molecular Biology and Evolution, vol. 19, no. 5, pp. 762–776, 2002.
[25]  B. Boussau and M. Gouy, “Efficient likelihood computations with nonreversible models of evolution,” Systematic Biology, vol. 55, no. 5, pp. 756–768, 2006.
[26]  M. Thollesson, “LDDist: a Perl module for calculating LogDet pair-wise distances for protein and nucleotide sequences,” Bioinformatics, vol. 20, no. 3, pp. 416–418, 2004.
[27]  S. Blanquart and N. Lartillot, “A Bayesian compound stochastic process for modeling nonstationary and nonhomogeneous sequence evolution,” Molecular Biology and Evolution, vol. 23, no. 11, pp. 2058–2071, 2006.
[28]  V. Gowri-Shankar and M. Rattray, “A reversible jump method for Bayesian phylogenetic inference with a nonhomogeneous substitution model,” Molecular Biology and Evolution, vol. 24, no. 6, pp. 1286–1299, 2007.
[29]  N. C. Sheffield, H. Song, S. L. Cameron, and M. F. Whiting, “Nonstationary evolution and compositional heterogeneity in beetle mitochondrial phylogenomics,” Systematic Biology, vol. 58, no. 4, pp. 381–394, 2009.
