Differences between individual human genomes, or between human and cancer genomes, range in scale from single nucleotide variants (SNVs) to intermediate and large-scale duplications, deletions, and rearrangements of genomic segments. The latter class, called structural variants (SVs), has received considerable attention in the past several years as a previously underappreciated source of variation in human genomes. Much of this attention results from the availability of higher-resolution technologies for measuring these variants, including microarray-based techniques and, more recently, high-throughput DNA sequencing. We describe the genomic technologies and computational techniques currently used to measure SVs, focusing on applications in human and cancer genomics.