%0 Journal Article
%T Glycosylation site prediction using ensembles of Support Vector Machine classifiers
%A Cornelia Caragea
%A Jivko Sinapov
%A Adrian Silvescu
%A Drena Dobbs
%A Vasant Honavar
%J BMC Bioinformatics
%D 2007
%I BioMed Central
%R 10.1186/1471-2105-8-438
%X We explore machine learning methods for training classifiers to predict the amino acid residues that are likely to be glycosylated using information derived from the target amino acid residue and its sequence neighbors. We compare the performance of Support Vector Machine classifiers and ensembles of Support Vector Machine classifiers trained on a dataset of experimentally determined N-linked, O-linked, and C-linked glycosylation sites extracted from O-GlycBase version 6.00, a database of 242 proteins from several different species. The results of our experiments show that the ensembles of Support Vector Machine classifiers outperform single Support Vector Machine classifiers on the problem of predicting glycosylation sites in terms of a range of standard measures for comparing the performance of classifiers. The resulting methods have been implemented in EnsembleGly, a web server for glycosylation site prediction. Ensembles of Support Vector Machine classifiers offer an accurate and reliable approach to automated identification of putative glycosylation sites in glycoprotein sequences. Glycosylation is one of the most complex and ubiquitous post-translational modifications (PTMs) of proteins in eukaryotic cells. It is a dynamic enzymatic process in which saccharides are attached to proteins or lipoproteins, usually on serine (S), threonine (T), asparagine (N), and tryptophan (W) residues.
Glycosylation, like phosphorylation, is clinically important because of its role in a wide variety of cellular, developmental and immunological processes, including protein folding, protein trafficking and localization, cell-cell interactions, and epitope recognition [1-8]. Glycosylation can be classified into four types based on the nature of chemical linkage between specific acceptor residues in the protein and sugar: N-linked and O-linked glycosylation, C-mannosylation, and GPI (glycosylphosphatidylinositol) anchors. The acceptor residues represent the glycosylation sites. In N-lin
%U http://www.biomedcentral.com/1471-2105/8/438