Thousands of pseudogenes exist in the human genome and many are transcribed, but their functional potential remains elusive and understudied. To explore this issue systematically, we first developed a computational pipeline to identify transcribed pseudogenes from RNA-Seq data. Applying the pipeline to datasets from 16 distinct normal human tissues identified ~3,000 pseudogenes that could produce non-coding RNAs at low abundance but with high tissue specificity under normal physiological conditions. Cross-tissue comparison revealed that the transcriptional profiles of pseudogenes and their parent genes were mostly positively correlated, suggesting that pseudogene transcription could have a positive effect on the expression of the parent genes, perhaps with the pseudogene transcripts functioning as competing endogenous RNAs (ceRNAs), as previously suggested and demonstrated for the PTEN pseudogene, PTENP1. Our analysis of the ENCODE project data also found many transcriptionally active pseudogenes in the GM12878 and K562 cell lines; moreover, it showed that many human pseudogenes produced small RNAs (sRNAs) and that some pseudogene-derived sRNAs, especially those from antisense strands, exhibited evidence of interfering with gene expression. Further integrated analysis of transcriptomics and epigenomics data, however, demonstrated that trimethylation of histone H3 at lysine 9 (H3K9me3), a posttranslational modification typically associated with gene repression and heterochromatin, was enriched at many transcribed pseudogenes in a transcription level-dependent manner in the two cell lines. The H3K9me3 enrichment was more prominent in pseudogenes that produced sRNAs at pseudogene loci and their adjacent regions, an observation further supported by the co-enrichment of SETDB1 (an H3K9 methyltransferase), suggesting that pseudogene sRNAs may have a role in regional chromatin repression.
Taken together, our comprehensive and systematic characterization of pseudogene transcription uncovers a complex picture of how pseudogene ncRNAs could influence gene and pseudogene expression, at both epigenetic and post-transcriptional levels.
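As an illustration of the cross-tissue comparison described above, the following minimal Python sketch computes a correlation between the expression profile of a pseudogene and that of its parent gene across tissues, and then the fraction of pairs that are positively correlated. It is not the pipeline used in this study: the choice of Pearson correlation, the function names `pearson` and `fraction_positive`, and the toy expression values are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile has no defined correlation.
        return float("nan")
    return cov / sqrt(var_x * var_y)


def fraction_positive(pairs):
    """Fraction of (pseudogene, parent) profile pairs with r > 0."""
    rs = [pearson(pseudo, parent) for pseudo, parent in pairs]
    valid = [r for r in rs if r == r]  # drop NaN values
    if not valid:
        return float("nan")
    return sum(r > 0 for r in valid) / len(valid)


# Hypothetical expression values (arbitrary units) across four tissues;
# a real analysis would use one value per tissue for all 16 tissues.
toy_pairs = [
    ([0.1, 0.4, 0.9, 0.2], [5.0, 12.0, 30.0, 8.0]),   # rises and falls together
    ([0.8, 0.3, 0.1, 0.5], [4.0, 10.0, 25.0, 6.0]),   # opposite trend
]

if __name__ == "__main__":
    print(fraction_positive(toy_pairs))
```

A rank-based measure such as Spearman correlation could be substituted if expression values are strongly skewed; the sketch above is only meant to make the comparison concrete.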