
PLOS ONE  2012 

SVMTriP: A Method to Predict Antigenic Epitopes Using Support Vector Machine to Integrate Tri-Peptide Similarity and Propensity

DOI: 10.1371/journal.pone.0045152



Identifying protein surface regions that are preferentially recognized by antibodies (antigenic epitopes) is at the heart of new immuno-diagnostic reagent discovery and vaccine design, and computational methods for antigenic epitope prediction provide a crucial means to this end. Many linear B-cell epitope prediction methods, such as BepiPred, ABCPred, AAP, BCPred, BayesB, BEOracle/BROracle, and BEST, have been developed towards this goal. However, effective immunological research demands more robust prediction performance than current algorithms can provide. In this work, we develop a new method to predict linear antigenic epitopes, SVMTriP, which uses a Support Vector Machine to combine tri-peptide similarity and propensity scores. Applied to non-redundant B-cell linear epitopes extracted from IEDB, SVMTriP achieves a sensitivity of 80.1% and a precision of 55.2% in five-fold cross-validation, with an AUC value of 0.702. These results indicate that combining the similarity and propensity of tri-peptide subsequences can improve prediction performance for linear B-cell epitopes. Moreover, SVMTriP is capable of recognizing viral peptides against a human protein sequence background. A web server based on our method has been constructed for public use. The server and all datasets used in the current study are available at
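To make the terms used in the abstract concrete, the following minimal Python sketch shows how a sequence can be split into overlapping tri-peptide subsequences and how sensitivity and precision are computed from prediction counts. It is an illustration only, not the SVMTriP implementation: the function names and the example counts are hypothetical, and the abstract does not specify how the similarity and propensity scores themselves are calculated.

```python
def tripeptides(sequence):
    """Return all overlapping tri-peptides (length-3 windows) of a sequence."""
    # A sequence shorter than 3 residues yields an empty list.
    return [sequence[i:i + 3] for i in range(len(sequence) - 2)]


def sensitivity(tp, fn):
    """Fraction of true epitopes that were predicted as epitopes: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of predicted epitopes that are true epitopes: TP / (TP + FP)."""
    return tp / (tp + fp)


# Hypothetical peptide, used only to show the windowing.
print(tripeptides("ACDEF"))

# Made-up counts for illustration (not data from the study).
print(sensitivity(tp=8, fn=2), precision(tp=8, fp=8))
```

A high sensitivity paired with a lower precision, as reported above, corresponds to a predictor that recovers most true epitopes while also labeling a sizable number of non-epitopes as positive.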



