FACT: Functional annotation transfer between proteins with similar feature architectures

DOI: 10.1186/1471-2105-11-417


Abstract:

We present the Feature Architecture Comparison Tool (FACT, http://www.cibiv.at/FACT) to search for functionally equivalent proteins. FACT uses the similarity between the feature architectures of two proteins, i.e., the arrangements of functional domains, secondary structure elements and compositional properties, as a proxy for their functional equivalence. A scoring function measures feature architecture similarities, which enables searching for functional equivalents in entire proteomes.

Our evaluation of 9,570 EC-classified enzymes revealed that FACT, using the full feature set, outperformed the existing architecture-based approaches by identifying significantly more functional equivalents as highest scoring proteins. We show that FACT can identify functional equivalents that share no significant sequence similarity. However, when the highest scoring protein of FACT is also the protein with the highest local sequence similarity, it is functionally equivalent to the query in 99% of the cases. We demonstrate the versatility of FACT by identifying a missing link in the yeast glutathione metabolism and also by searching for the human GolgA5 equivalent in Trypanosoma brucei.

FACT facilitates a quick and sensitive search for functionally equivalent proteins in entire proteomes. FACT is complementary to approaches using sequence similarity to identify proteins with the same function. Thus, FACT is particularly useful when functional equivalents need to be identified in evolutionarily distant species, or when functional equivalents are not homologous. The most reliable annotation transfers, however, are achieved when feature architecture similarity and sequence similarity are jointly taken into account.

The sequencing of entire genomes has become a routine task in molecular biology. To date, about 240 fully sequenced eukaryotic genomes comprising more than 3.7 million protein-coding sequences are available in the public domain [1]. Only a small fraction of these species are mod
