Modification by the SUMO protein has many key roles in eukaryotic systems, which makes the identification of its target proteins and sites of considerable importance. Information about the SUMOylation of a protein may tell us about its subcellular localization, function, and spatial orientation. The modification occurs at particular lysine residues in a given protein, not at all of them. In competition with biochemical means of modified-site recognition, computational methods are strong contenders for predicting SUMOylation sites on proteins. In this research, physicochemical properties of amino acids retrieved from AAIndex, especially those involved in the docking of modifier and target proteins and in the optimal presentation of the target lysine, were combined with sequence information and the random forest classifier available in WEKA to develop a prediction model, SUMOhunt, whose statistics are significantly better than those of all previous predictors. The model achieves 97.56% accuracy, 100% sensitivity, 94% specificity, and an MCC of 0.95, which shows that the proposed amino acid properties have a significant role in SUMO attachment. SUMOhunt will hence bring great reliability and efficiency to SUMOylation prediction.
1. Introduction
Posttranslational modifications (PTMs) of proteins offer spectacular diversity and functional variety to an organism’s otherwise constrained proteome. SUMOylation is one such PTM whose wide range of biological implications has drawn attention, yet many of its functional outcomes remain unknown.
SUMOylation is involved in, among other processes, transcriptional regulation [1–3], mRNA metabolism [4], apoptosis [5, 6], nuclear and subcellular transport [7, 8], protein trafficking [9], signal transduction [10], regulation of DNA damage and replication, cell-cycle progression, competition with other members of the ubiquitin family [2, 11, 12], prevention or promotion of deacetylation [13], chromosome segregation [14], structural integrity of chromatin and many proteins, and mitosis [15]. It has also been reported to be involved in the perception of sound [16], and it is known to participate in early developmental processes such as cell differentiation, specification, division, and lineage commitment [17]. SUMOylation of a target protein can change its localization in a cell by altering its intermolecular and intramolecular interactions [18]. Hence, determining whether a protein is SUMOylated can provide vital evidence regarding its function and spatial association [19]. SUMO, a
References
[1]
D. W. H. Girdwood, M. H. Tatham, and R. T. Hay, “SUMO and transcriptional regulation,” Seminars in Cell and Developmental Biology, vol. 15, no. 2, pp. 201–210, 2004.
[2]
R. T. Hay, “SUMO: a history of modification,” Molecular Cell, vol. 18, no. 1, pp. 1–12, 2005.
[3]
P. J. Hamard, M. Boyer-Guittaut, B. Camuzeaux et al., “Sumoylation delays the ATF7 transcription factor subcellular localization and inhibits its transcriptional activity,” Nucleic Acids Research, vol. 35, no. 4, pp. 1134–1144, 2007.
[4]
T. Li, E. Evdokimov, R. F. Shen et al., “Sumoylation of heterogeneous nuclear ribonucleoproteins, zinc finger proteins, and nuclear pore complex proteins: a proteomic analysis,” Proceedings of the National Academy of Sciences of the United States of America, vol. 101, no. 23, pp. 8551–8556, 2004.
[5]
M. S. Y. Huen and J. Chen, “The DNA damage response pathways: at the crossroad of protein modifications,” Cell Research, vol. 18, no. 1, pp. 8–16, 2008.
[6]
T. Li, R. Santockyte, R. F. Shen et al., “Expression of SUMO-2/3 induced senescence through p53- and pRB-mediated pathways,” Journal of Biological Chemistry, vol. 281, no. 47, pp. 36221–36227, 2006.
[7]
C. Fu, K. Ahmed, H. Ding et al., “Stabilization of PML nuclear localization by conjugation and oligomerization of SUMO-3,” Oncogene, vol. 24, no. 35, pp. 5401–5413, 2005.
[8]
F. Melchior, M. Schergaut, and A. Pichler, “SUMO: ligases, isopeptidases and nuclear pores,” Trends in Biochemical Sciences, vol. 28, no. 11, pp. 612–618, 2003.
[9]
V. Dorval and P. E. Fraser, “Small ubiquitin-like modifier (SUMO) modification of natively unfolded proteins tau and α-synuclein,” Journal of Biological Chemistry, vol. 281, no. 15, pp. 9919–9924, 2006.
[10]
M. Liang, F. Melchior, X. H. Feng, and X. Lin, “Regulation of Smad4 sumoylation and transforming growth factor-β signaling by protein inhibitor of activated STAT1,” Journal of Biological Chemistry, vol. 279, no. 22, pp. 22857–22865, 2004.
[11]
M. B. Kroetz, “SUMO: a ubiquitin-like protein modifier,” The Yale Journal of Biology and Medicine, vol. 78, no. 4, pp. 197–201, 2005.
[12]
J. S. Seeler and A. Dejean, “Nuclear and unclear functions of SUMO,” Nature Reviews Molecular Cell Biology, vol. 4, no. 9, pp. 690–699, 2003.
[13]
J. Zhao, “Sumoylation regulates diverse biological processes,” Cellular and Molecular Life Sciences, vol. 64, no. 23, pp. 3017–3033, 2007.
[14]
F. Z. Watts, “The role of SUMO in chromosome segregation,” Chromosoma, vol. 116, no. 1, pp. 15–20, 2007.
[15]
M. Dasso, “Emerging roles of the SUMO pathway in mitosis,” Cell Division, vol. 3, article 5, 2008.
[16]
F. Zhou, Y. Xue, H. Lu, G. Chen, and X. Yao, “A genome-wide analysis of sumoylation-related biological processes and functions in human nucleus,” FEBS Letters, vol. 579, no. 16, pp. 3369–3375, 2005.
[17]
H. Lomelí and M. Vázquez, “Emerging roles of the SUMO pathway in development,” Cellular and Molecular Life Sciences, vol. 68, no. 24, pp. 4045–4064, 2011.
[18]
R. Geiss-Friedlander and F. Melchior, “Concepts in sumoylation: a decade on,” Nature Reviews Molecular Cell Biology, vol. 8, no. 12, pp. 947–956, 2007.
[19]
D. C. Bauer, F. A. Buske, and M. Bodén, “Predicting SUMOylation sites,” in Pattern Recognition in Bioinformatics, Lecture Notes in Computer Science Series, pp. 28–40, Springer, Berlin, Germany, 2008.
[20]
J. S. Seeler and A. Dejean, “SUMO: of branched proteins and nuclear bodies,” Oncogene, vol. 20, no. 49, pp. 7243–7249, 2001.
[21]
J. Xu, Y. He, B. Qiang, J. Yuan, X. Peng, and X. M. Pan, “A novel method for high accuracy sumoylation site prediction from protein sequences,” BMC Bioinformatics, vol. 9, article 8, 2008.
[22]
D. C. Schwartz and M. Hochstrasser, “A superfamily of protein tags: ubiquitin, SUMO and related modifiers,” Trends in Biochemical Sciences, vol. 28, no. 6, pp. 321–328, 2003.
[23]
A. S. Yavuz and U. Sezerman, “SUMOtr: SUMOylation site prediction based on 3D structure and hydrophobicity,” in Proceedings of the 5th International Symposium on Health Informatics and Bioinformatics (HIBIT '10), pp. 93–97, IEEE, Antalya, Turkey, April 2010.
[24]
K. M. Bohren, V. Nadkarni, J. H. Song, K. H. Gabbay, and D. Owerbach, “A M55V polymorphism in a novel SUMO gene (SUMO-4) differentially activates heat shock transcription factors and is associated with susceptibility to type I diabetes mellitus,” Journal of Biological Chemistry, vol. 279, no. 26, pp. 27233–27238, 2004.
[25]
M. J. Matunis, E. Coutavas, and G. Blobel, “A novel ubiquitin-like modification modulates the partitioning of the Ran-GTPase-activating protein RanGAP1 between the cytosol and the nuclear pore complex,” Journal of Cell Biology, vol. 135, no. 6, pp. 1457–1470, 1996.
[26]
H. Zhang, H. Saitoh, and M. J. Matunis, “Enzymes of the SUMO modification pathway localize to filaments of the nuclear pore complex,” Molecular and Cellular Biology, vol. 22, no. 18, pp. 6498–6508, 2002.
[27]
A. Pichler and F. Melchior, “Ubiquitin-related modifier SUMO1 and nucleocytoplasmic transport,” Traffic, vol. 3, no. 6, pp. 381–387, 2002.
[28]
L. D. Wood, B. J. Irvin, G. Nucifora, K. S. Luce, and S. W. Hiebert, “Small ubiquitin-like modifier conjugation regulates nuclear export of TEL, a putative tumor suppressor,” Proceedings of the National Academy of Sciences of the United States of America, vol. 100, no. 6, pp. 3257–3262, 2003.
[29]
Y. Shinbo, T. Niki, T. Taira et al., “Proper SUMO-1 conjugation is essential to DJ-1 to exert its full activities,” Cell Death and Differentiation, vol. 13, no. 1, pp. 96–108, 2006.
[30]
M. Li, D. Guo, C. M. Isales et al., “SUMO wrestling with type 1 diabetes,” Journal of Molecular Medicine, vol. 83, no. 7, pp. 504–513, 2005.
[31]
J. M. P. Desterro, J. Thomson, and R. T. Hay, “Ubch9 conjugates SUMO but not ubiquitin,” FEBS Letters, vol. 417, no. 3, pp. 297–300, 1997.
[32]
A. C. O. Vertegaal, “Small ubiquitin-related modifiers in chains,” Biochemical Society Transactions, vol. 35, no. 6, pp. 1422–1423, 2007.
[33]
E. van Damme, K. Laukens, T. H. Dang, and X. van Ostade, “A manually curated network of the PML nuclear body interactome reveals an important role for PML-NBs in SUMOylation dynamics,” International Journal of Biological Sciences, vol. 6, no. 1, pp. 51–67, 2010.
[34]
R. S. Hilgarth, L. A. Murphy, H. S. Skaggs, D. C. Wilkerson, H. Xing, and K. D. Sarge, “Regulation and function of SUMO modification,” Journal of Biological Chemistry, vol. 279, no. 52, pp. 53899–53902, 2004.
[35]
E. S. Johnson, I. Schwienhorst, R. J. Dohmen, and G. Blobel, “The ubiquitin-like protein Smt3p is activated for conjugation to other proteins by an Aos1p/Uba2p heterodimer,” EMBO Journal, vol. 16, no. 18, pp. 5509–5519, 1997.
[36]
T. Okuma, R. Honda, G. Ichikawa, N. Tsumagari, and H. Yasuda, “In vitro SUMO-1 modification requires two enzymatic steps, E1 and E2,” Biochemical and Biophysical Research Communications, vol. 254, no. 3, pp. 693–698, 1999.
[37]
L. Gong, T. Kamitani, K. Fujise, L. S. Caskey, and E. T. H. Yeh, “Preferential interaction of sentrin with a ubiquitin-conjugating enzyme, Ubc9,” Journal of Biological Chemistry, vol. 272, no. 45, pp. 28198–28201, 1997.
[38]
M. S. Rodriguez, C. Dargemont, and R. T. Hay, “SUMO-1 conjugation in vivo requires both a consensus modification motif and nuclear targeting,” Journal of Biological Chemistry, vol. 276, no. 16, pp. 12654–12659, 2001.
[39]
A. Pichler, A. Gast, J. S. Seeler, A. Dejean, and F. Melchior, “The nucleoporin RanBP2 has SUMO1 E3 ligase activity,” Cell, vol. 108, no. 1, pp. 109–120, 2002.
[40]
S. Weger, E. Hammer, and R. Heilbronn, “Topors acts as a SUMO-1 E3 ligase for p53 in vitro and in vivo,” FEBS Letters, vol. 579, no. 22, pp. 5007–5012, 2005.
[41]
P. Pungaliya, D. Kulkarni, H. J. Park et al., “TOPORS functions as a SUMO-1 E3 ligase for chromatin-modifying proteins,” Journal of Proteome Research, vol. 6, no. 10, pp. 3918–3923, 2007.
[42]
H. Yamamoto, M. Ihara, Y. Matsuura, and A. Kikuchi, “Sumoylation is involved in β-catenin-dependent activation of Tcf-4,” EMBO Journal, vol. 22, no. 9, pp. 2047–2059, 2003.
[43]
M. H. Kagey, T. A. Melhuish, and D. Wotton, “The polycomb protein Pc2 is a SUMO E3,” Cell, vol. 113, no. 1, pp. 127–137, 2003.
[44]
M. H. Tatham, M. C. Geoffroy, L. Shen et al., “RNF4 is a poly-SUMO-specific E3 ubiquitin ligase required for arsenic-induced PML degradation,” Nature Cell Biology, vol. 10, no. 5, pp. 538–546, 2008.
[45]
Y. Xue, F. Zhou, C. Fu, Y. Xu, and X. Yao, “SUMOsp: a web server for sumoylation site prediction,” Nucleic Acids Research, vol. 34, pp. W254–W257, 2006.
[46]
D. Nathan, K. Ingvarsdottir, D. E. Sterner et al., “Histone sumoylation is a negative regulator in Saccharomyces cerevisiae and shows dynamic interplay with positive-acting histone modifications,” Genes and Development, vol. 20, no. 8, pp. 966–976, 2006.
[47]
S. H. Yang, A. Galanis, J. Witty, and A. D. Sharrocks, “An extended consensus motif enhances the specificity of substrate modification by SUMO,” EMBO Journal, vol. 25, no. 21, pp. 5083–5093, 2006.
[48]
X. J. Yang and S. Grégoire, “A recurrent phospho-sumoyl switch in transcriptional repression and beyond,” Molecular Cell, vol. 23, no. 4, pp. 779–786, 2006.
[49]
M. J. Matunis and C. M. Pickart, “Beginning at the end with SUMO,” Nature Structural and Molecular Biology, vol. 12, no. 7, pp. 565–566, 2005.
[50]
M. Mann and O. N. Jensen, “Proteomic analysis of post-translational modifications,” Nature Biotechnology, vol. 21, no. 3, pp. 255–261, 2003.
[51]
L. Lu, X. H. Shi, S. J. Li et al., “Protein sumoylation sites prediction based on two-stage feature selection,” Molecular Diversity, vol. 14, no. 1, pp. 81–86, 2010.
[52]
E. Frank, M. Hall, L. Trigg, G. Holmes, and I. H. Witten, “Data mining in bioinformatics using Weka,” Bioinformatics, vol. 20, no. 15, pp. 2479–2481, 2004.
[53]
T. Y. Lee, H. D. Huang, J. H. Hung, H. Y. Huang, Y. S. Yang, and T. H. Wang, “dbPTM: an information repository of protein post-translational modification,” Nucleic Acids Research, vol. 34, pp. D622–D627, 2006.
[54]
Y. Xue, F. Zhou, H. Lu, G. Chen, and X. Yao, “SUMO substrates and site prediction: combining pattern recognition and phylogenetic conservation,” http://arxiv.org/ftp/q-bio/papers/0409/0409011.pdf.
[55]
E. Boutet, D. Lieberherr, M. Tognolli, M. Schneider, and A. Bairoch, “UniProtKB/Swiss-Prot: the manually annotated section of the UniProt KnowledgeBase,” Methods in Molecular Biology, vol. 406, pp. 89–112, 2007.
[56]
I. Ahmad, D. C. Hoessli, W. M. Qazi et al., “MAPRes: an efficient method to analyze protein sequence around post-translational modification sites,” Journal of Cellular Biochemistry, vol. 104, no. 4, pp. 1220–1231, 2008.
[57]
S. Kawashima, H. Ogata, and M. Kanehisa, “AAindex: amino acid index database,” Nucleic Acids Research, vol. 27, no. 1, pp. 368–369, 1999.
[58]
L. Breiman, “Random forests,” Statistics Department, University of California, 2001, http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm.
[59]
S. E. Hamby and J. D. Hirst, “Prediction of glycosylation sites using random forests,” BMC Bioinformatics, vol. 9, article 500, 2008.
[60]
J. A. Wohlschlegel, E. S. Johnson, S. I. Reed, and J. R. Yates III, “Global analysis of protein sumoylation in Saccharomyces cerevisiae,” Journal of Biological Chemistry, vol. 279, no. 44, pp. 45662–45668, 2004.
[61]
C. W. Tung and S. Y. Ho, “Computational identification of ubiquitylation sites from protein sequences,” BMC Bioinformatics, vol. 9, article 310, 2008.
B. Liu, S. Li, Y. Wang, L. Lu, Y. Li, and Y. Cai, “Predicting the protein SUMO modification sites based on Properties Sequential Forward Selection (PSFS),” Biochemical and Biophysical Research Communications, vol. 358, no. 1, pp. 136–139, 2007.
[64]
S. Teng, H. Luo, and L. Wang, “Random forest-based prediction of protein sumoylation sites from sequence features,” in Proceedings of the ACM International Conference on Bioinformatics and Computational Biology (BCB '10), pp. 120–126, New York, NY, USA, August 2010.
[65]
C. Friedline, X. Zhang, Z. Zehner, and Z. Zhao, “FindSUMO: a PSSM-based method for sumoylation site prediction,” in Advanced Intelligent Computing Theories and Applications with Aspects of Artificial Intelligence, Lecture Notes in Computer Science Series, pp. 1004–1011, Springer, Berlin, Germany, 2008.