
Subcellular location prediction of proteins using support vector machines with alignment of block sequences utilizing amino acid composition

DOI: 10.1186/1471-2105-8-466


Abstract:

In this paper, we propose a novel and general prediction method that combines sequence alignment techniques with feature vectors based on amino acid composition. We implemented this method with support vector machines on plant data sets extracted from the TargetP database. In fivefold cross-validation tests, the overall accuracy and average MCC were 0.9096 and 0.8655, respectively. We also applied our method to other data sets, including that of WoLF PSORT.

Although there is a predictor that uses gene ontology information and yields higher accuracy than ours, our accuracies are higher than those of existing predictors that use only sequence information. Since information such as gene ontology can be obtained only for known proteins, our predictor is considered useful for subcellular location prediction of newly discovered proteins. Furthermore, the idea of combining alignment and amino acid frequency is novel and general, so it may be applied to other problems in bioinformatics. Our method for plant proteins is also implemented as a web system and is available at http://sunflower.kuicr.kyoto-u.ac.jp/~tamura/slpfa.html.

Predicting the subcellular location of proteins is one of the major problems in bioinformatics. The task is to predict which part of a cell (e.g., mitochondria or chloroplast) a given protein is transported to, where the amino acid sequence (i.e., string data) of the protein is given as input, as shown in Fig. 1. This problem is becoming more important because information on subcellular location is helpful for the annotation of proteins and genes, and the number of complete genomes is rapidly increasing. Many methods have been proposed using various computational techniques, and many web-based prediction systems have been developed based on these methods.

PSORT [1,2] is historically the first subcellular location predictor. PSORT and its major extensions, such as WoLF PSORT [3,4], use various sequence-derived
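
Two quantities named in the abstract, amino acid composition and MCC (Matthews correlation coefficient), can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `aa_composition` and `mcc` are hypothetical, the composition is computed over a whole sequence rather than over the block sequences the title refers to, and the alignment step and support vector machine training are not reproduced here.

```python
from math import sqrt

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq):
    """Return the relative frequency of each standard amino acid in seq.

    Hypothetical helper: residues outside the 20 standard letters are ignored.
    """
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for one class (one-vs-rest counts).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    vec = aa_composition("MKTAYIAKQR")  # toy sequence, not from any data set
    print(len(vec), round(sum(vec), 6))
    print(mcc(tp=40, tn=45, fp=5, fn=10))
```

An "average MCC" as reported in the abstract would then plausibly be the mean of `mcc` over the location classes, each computed one-vs-rest; that reading is an assumption, since the abstract does not define the averaging.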
