Approximately 80 long interspersed element (LINE-1 or L1) copies are able to retrotranspose actively in the human genome; these are termed retrotransposition-competent L1s. The 5′ untranslated region (UTR) of the human-specific L1 contains an internal promoter and several transcription factor binding sites. To better understand the effect of the L1 5′ UTR on the evolution of human-specific L1s, we examined this population of elements, focusing on the sequence diversity and accumulated substitutions within their 5′ UTRs. Using network analysis, we estimated the age of each L1 component (the 5′ UTR, ORF1, ORF2, and 3′ UTR). Comparing the components by their estimated ages, we found that the 5′ UTR of human-specific L1s accumulates mutations at a faster rate than the other components. To investigate the L1 5′ UTR further, we examined the substitution frequency at each nucleotide position across the 5′ UTRs. The results showed that the L1 5′ UTRs share relatively conserved transcription factor binding sites despite their high sequence diversity. Thus, we suggest that the high level of sequence diversity in the 5′ UTRs could be one of the factors controlling the number of retrotransposition-competent L1s in the human genome during the evolutionary battle between L1s and their host genomes.

1. Introduction

Transposable elements are a considerable component of the human genome, accounting for approximately 45% of its sequence [1]. These elements are associated with genomic instability via de novo insertions, insertion-mediated deletions, and recombination events [2–8] and are responsible for a number of genetic disorders [9]. Almost all transposable elements belong to one of four types: long interspersed elements (LINEs), short interspersed elements (SINEs), long terminal repeat (LTR) retrotransposons, and DNA transposons [1, 10–12].
Among them, LINE-1s or L1s are one of the most successful retrotransposon families in the human genome, with 516,000 copies comprising 17% of the human genomic sequence [1]. A full-length functional L1 element is about 6 kb in length and contains a 5′ untranslated region (UTR) bearing an internal RNA polymerase II promoter, two open reading frames (ORF1 and ORF2), and a 3′ UTR terminating in a poly(A) tail [13]; ORF1 encodes an RNA-binding protein that has demonstrated nucleic acid chaperone activity in vitro, and ORF2 encodes a protein with both endonuclease (EN) and reverse transcriptase (RT) activities, which are required for L1 retrotransposition [14–16]. The generally accepted model for
References
[1]
E. S. Lander, L. M. Linton, B. Birren et al., “Initial sequencing and analysis of the human genome,” Nature, vol. 409, no. 6822, pp. 860–921, 2001.
[2]
P. L. Deininger and M. A. Batzer, “Alu repeats and human disease,” Molecular Genetics and Metabolism, vol. 67, no. 3, pp. 183–193, 1999.
[3]
P. A. Callinan, J. Wang, S. W. Herke, R. K. Garber, P. Liang, and M. A. Batzer, “Alu retrotransposition-mediated deletion,” Journal of Molecular Biology, vol. 348, no. 4, pp. 791–800, 2005.
[4]
D. E. Symer, C. Connelly, S. T. Szak et al., “Human L1 retrotransposition is associated with genetic instability in vivo,” Cell, vol. 110, no. 3, pp. 327–338, 2002.
[5]
N. Gilbert, S. Lutz-Prigge, and J. V. Moran, “Genomic deletions created upon LINE-1 retrotransposition,” Cell, vol. 110, no. 3, pp. 315–325, 2002.
[6]
K. Han, S. K. Sen, J. Wang et al., “Genomic rearrangements by LINE-1 insertion-mediated deletion in the human and chimpanzee lineages,” Nucleic Acids Research, vol. 33, no. 13, pp. 4040–4052, 2005.
[7]
S. K. Sen, K. Han, J. Wang et al., “Human genomic deletions mediated by recombination between Alu elements,” The American Journal of Human Genetics, vol. 79, no. 1, pp. 41–53, 2006.
[8]
J. Xing, H. Wang, V. P. Belancio, R. Cordaux, P. L. Deininger, and M. A. Batzer, “Emergence of primate genes by retrotransposon-mediated sequence transduction,” Proceedings of the National Academy of Sciences of the United States of America, vol. 103, no. 47, pp. 17608–17613, 2006.
[9]
J. M. Chen, P. D. Stenson, D. N. Cooper, and C. Férec, “A systematic analysis of LINE-1 endonuclease-dependent retrotranspositional events causing human genetic disease,” Human Genetics, vol. 117, no. 5, pp. 411–427, 2005.
[10]
P. L. Deininger and M. A. Batzer, “Mammalian retroelements,” Genome Research, vol. 12, no. 10, pp. 1455–1465, 2002.
[11]
H. H. Kazazian Jr., “Mobile elements: drivers of genome evolution,” Science, vol. 303, no. 5664, pp. 1626–1632, 2004.
[12]
A. F. Smit, “The origin of interspersed repeats in the human genome,” Current Opinion in Genetics and Development, vol. 6, no. 6, pp. 743–748, 1996.
[13]
H. H. Kazazian Jr. and J. V. Moran, “The impact of L1 retrotransposons on the human genome,” Nature Genetics, vol. 19, no. 1, pp. 19–24, 1998.
[14]
Q. Feng, J. V. Moran, H. H. Kazazian Jr., and J. D. Boeke, “Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition,” Cell, vol. 87, no. 5, pp. 905–916, 1996.
[15]
V. O. Kolosha and S. L. Martin, “In vitro properties of the first ORF protein from mouse LINE-1 support its role in ribonucleoprotein particle formation during retrotransposition,” Proceedings of the National Academy of Sciences of the United States of America, vol. 94, no. 19, pp. 10155–10160, 1997.
[16]
S. L. Mathias, A. F. Scott, H. H. Kazazian Jr., J. D. Boeke, and A. Gabriel, “Reverse transcriptase encoded by a human transposable element,” Science, vol. 254, no. 5039, pp. 1808–1810, 1991.
[17]
T. G. Fanning and M. F. Singer, “LINE-1: a mammalian transposable element,” Biochimica et Biophysica Acta, vol. 910, no. 3, pp. 203–212, 1987.
[18]
D. D. Luan, M. H. Korman, J. L. Jakubczak, and T. H. Eickbush, “Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition,” Cell, vol. 72, no. 4, pp. 595–605, 1993.
[19]
A. F. Smit, G. Toth, A. D. Riggs, and J. Jurka, “Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences,” Journal of Molecular Biology, vol. 246, no. 3, pp. 401–417, 1995.
[20]
S. T. Szak, O. K. Pickeral, W. Makalowski, M. S. Boguski, D. Landsman, and J. D. Boeke, “Molecular archeology of L1 insertions in the human genome,” Genome Biology, vol. 3, no. 10, research 0052, 2002.
[21]
J. N. Athanikar, R. M. Badge, and J. V. Moran, “A YY1-binding site is required for accurate human LINE-1 transcription initiation,” Nucleic Acids Research, vol. 32, no. 13, pp. 3846–3855, 2004.
[22]
H. Khan, A. Smit, and S. Boissinot, “Molecular evolution and tempo of amplification of human LINE-1 retrotransposons since the origin of primates,” Genome Research, vol. 16, no. 1, pp. 78–87, 2006.
[23]
J. Lee, R. Cordaux, K. Han et al., “Different evolutionary fates of recently integrated human and chimpanzee LINE-1 retrotransposons,” Gene, vol. 390, no. 1-2, pp. 18–27, 2007.
[24]
R. E. Mills, E. A. Bennett, R. C. Iskow et al., “Recently mobilized transposons in the human and chimpanzee genomes,” The American Journal of Human Genetics, vol. 78, no. 4, pp. 671–679, 2006.
[25]
B. Brouha, J. Schustak, R. M. Badge et al., “Hot L1s account for the bulk of retrotransposition in the human population,” Proceedings of the National Academy of Sciences of the United States of America, vol. 100, no. 9, pp. 5280–5285, 2003.
[26]
W. J. Kent, “BLAT—the BLAST-like alignment tool,” Genome Research, vol. 12, no. 4, pp. 656–664, 2002.
[27]
T. A. Hall, “BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT,” Nucleic Acids Symposium Series, vol. 41, pp. 95–98, 1999.
[28]
T. Penzkofer, T. Dandekar, and T. Zemojtel, “L1Base: from functional annotation to prediction of active LINE-1 elements,” Nucleic Acids Research, vol. 33, pp. D498–D500, 2005.
[29]
H. J. Bandelt, P. Forster, and A. Röhl, “Median-joining networks for inferring intraspecific phylogenies,” Molecular Biology and Evolution, vol. 16, no. 1, pp. 37–48, 1999.
[30]
R. Cordaux, D. J. Hedges, and M. A. Batzer, “Retrotransposition of Alu elements: how many sources?” Trends in Genetics, vol. 20, no. 10, pp. 464–467, 2004.
[31]
M. M. Miyamoto, J. L. Slightom, and M. Goodman, “Phylogenetic relations of humans and African apes from DNA sequences in the psi eta-globin region,” Science, vol. 238, no. 4825, pp. 369–373, 1987.
[32]
L. Lavie, E. Maldener, B. Brouha, E. U. Meese, and J. Mayer, “The human L1 promoter: variable transcription initiation sites and a major impact of upstream flanking sequence on promoter activity,” Genome Research, vol. 14, no. 11, pp. 2253–2260, 2004.
[33]
E. Pascale, C. Liu, E. Valle, K. Usdin, and A. V. Furano, “The evolution of long interspersed repeated DNA (L1, LINE 1) as revealed by the analysis of an ancient rodent L1 DNA family,” Journal of Molecular Evolution, vol. 36, no. 1, pp. 9–20, 1993.
[34]
C. F. Voliva, S. L. Martin, C. A. Hutchison III, and M. H. Edgell, “Dispersal process associated with the L1 family of interspersed repetitive DNA sequences,” Journal of Molecular Biology, vol. 178, no. 4, pp. 795–813, 1984.
[35]
B. L. Welch, “The generalisation of ‘Student's’ problem when several different population variances are involved,” Biometrika, vol. 34, pp. 28–35, 1947.
[36]
J. Sved and A. Bird, “The expected equilibrium of the CpG dinucleotide in vertebrate genomes under a mutation model,” Proceedings of the National Academy of Sciences of the United States of America, vol. 87, no. 12, pp. 4692–4696, 1990.
[37]
R. Anbazhagan, J. G. Herman, K. Enika, and E. Gabrielson, “Spreadsheet-based program for the analysis of DNA methylation,” BioTechniques, vol. 30, no. 1, pp. 110–114, 2001.
[38]
K. J. Fryxell and W. J. Moon, “CpG mutation rates in the human genome are highly dependent on local GC content,” Molecular Biology and Evolution, vol. 22, no. 3, pp. 650–658, 2005.
[39]
M. Li and S. S. Chen, “The tendency to recreate ancestral CG dinucleotides in the human genome,” BMC Evolutionary Biology, vol. 11, article 3, 2011.
[40]
M. Gardiner-Garden and M. Frommer, “CpG islands in vertebrate genomes,” Journal of Molecular Biology, vol. 196, no. 2, pp. 261–282, 1987.
[41]
D. Takai and P. A. Jones, “Comprehensive analysis of CpG islands in human chromosomes 21 and 22,” Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 6, pp. 3740–3745, 2002.