Predicting functional sites with an automated algorithm suitable for heterogeneous datasets

DOI: 10.1186/1471-2105-6-116

Abstract:

In this report, we present an algorithmic approach that determines thresholds without human subjectivity. The approach relies on significant raw data preprocessing to improve signal detection. Subsequently, Partition Around Medoids Clustering (PAMC) of the similarity scores assesses sequence fragments where functional annotation remains in question. The accuracy of the approach is confirmed through comparisons to our previous (manual) results and structural analyses. Triosephosphate isomerase and arginyl-tRNA synthetase are discussed as exemplar cases. A quantitative functional site prediction assessment algorithm indicates that the phylogenetic motif predictions, which require sequence information only, are nearly as good as those from evolutionary trace methods that do incorporate structure.

The automated threshold detection algorithm has been incorporated into MINER, our web-based phylogenetic motif identification server. MINER is freely available on the web at http://www.pmap.csupomona.edu/MINER/. Pre-calculated functional site predictions of the COG database and an implementation of the threshold detection algorithm, in the R statistical language, can also be accessed at the website.

Due to the exponential growth of genomic and protein sequence data, development of automated strategies for large scale functional site identification is an important post-genomic challenge. Many recent efforts predict functional sites from sequence alone. Strong candidates for functional sites include individual highly conserved positions within a sequence alignment and highly conserved sequence motifs [1-5]. Although attractive due to their relative simplicity, conservation-based approaches frequently result in too many false positives to be satisfactory [3]. Further, sequence regions with significant variability can also be functionally important [6], especially when their composition may define sub-family functional specificity. The Evolutionary Trace (ET) procedure [7],
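The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the MINER implementation (the authors distribute theirs in R). The example scores, the choice of two clusters, the midpoint rule for the threshold, and the function names `pam_1d` and `split_threshold` are all assumptions made for illustration. The sketch runs a simplified Partition Around Medoids on one-dimensional scores and reads a cut-off from the gap between the two resulting clusters.

```python
from itertools import product


def total_cost(scores, medoids):
    """Sum of distances from every score to its nearest medoid."""
    return sum(min(abs(s - scores[m]) for m in medoids) for s in scores)


def pam_1d(scores, k=2):
    """Simplified PAM on 1-D data (illustrative, not the authors' code).

    Medoids are stored as indices into `scores`. The start configuration
    spreads them over the sorted scores, and the swap phase accepts any
    medoid/non-medoid exchange that lowers the total cost.
    """
    n = len(scores)
    if n < k:
        raise ValueError("need at least k scores")
    order = sorted(range(n), key=lambda i: scores[i])
    medoids = [order[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    best = total_cost(scores, medoids)

    improved = True
    while improved:
        improved = False
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(scores, trial)
            if cost < best:
                medoids, best, improved = trial, cost, True

    # Assign each score to the cluster of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: abs(s - scores[medoids[j]]))
        for s in scores
    ]
    return medoids, labels


def split_threshold(scores, labels):
    """Midpoint of the gap between two clusters (assumed rule, k = 2 only)."""
    groups = {}
    for s, lab in zip(scores, labels):
        groups.setdefault(lab, []).append(s)
    if len(groups) != 2:
        raise ValueError("expected exactly two non-empty clusters")
    low, high = sorted(groups.values(), key=min)
    return (max(low) + min(high)) / 2


# Hypothetical similarity scores for eight sequence windows.
example_scores = [0.10, 0.20, 0.15, 0.90, 1.00, 0.95, 0.12, 0.98]
example_medoids, example_labels = pam_1d(example_scores, k=2)
print(split_threshold(example_scores, example_labels))
```

Which of the two clusters counts as signal depends on how the similarity score is oriented, so that decision is left to the caller here.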
