Phylogenetic assessment of alignments reveals neglected tree signal in gaps

DOI: 10.1186/gb-2010-11-4-r37

Abstract:

Here, we introduce tree-based tests of alignment accuracy, which not only use large and representative samples of real biological data, but also enable the evaluation of the effect of gap placement on phylogenetic inference. We show that (i) the current belief that consistency-based alignments outperform scoring matrix-based alignments is misguided; (ii) gaps carry substantial phylogenetic signal, but are poorly exploited by most alignment and tree building programs; (iii) even so, excluding gaps and variable regions is detrimental; (iv) disagreement among alignment programs says little about the accuracy of resulting trees.

This study provides the broad community relying on sequence alignment with important practical recommendations, sets superior standards for assessing alignment accuracy, and paves the way for the development of phylogenetic inference methods of significantly higher resolution.

The study of biological sequences almost inevitably begins with the process of alignment. The goal of this process is usually to match homologous characters, that is, characters that have a common ancestry [1]. In turn, these sets of homologs, the columns of the alignment, can be used for a variety of applications, such as identifying residues with analogous structural or functional role, or inferring the phylogenetic tree of the underlying sequences.

The accuracy of multiple sequence alignment programs has been the object of numerous comparative studies [2-4], which evaluate alignments either by using trusted reference alignments obtained from structural data, or by using simulation. Unfortunately, both approaches have flaws. Trusted benchmark alignments such as Balibase, Prefab, Homstrad, or Sabmark [5-8] are all derived from protein structure information, exploiting the tendency of structure to evolve more slowly than sequence [9]. However, proteins with resolved structure remain a small and highly biased sample of all proteins [10,11]. In addition, homology inferred from
