Sugarcane is an important crop and a major source of sugar and alcohol. In this study, we performed de novo transcriptome assembly and annotation for six sugarcane genotypes used in bi-parental crosses. The transcriptome was assembled from short reads generated on the Illumina RNA-Seq platform. We produced more than 400 million reads, which were assembled into 72,269 unigenes. In a similarity search, the unigenes showed significant similarity to more than 28,788 sorghum proteins. They included a set of 5,272 unigenes that are absent from the public sugarcane EST databases, many of which are likely undescribed sugarcane genes. From this collection of unigenes, we identified a large number of molecular markers, including 5,106 simple sequence repeats (SSRs) and 708,125 single-nucleotide polymorphisms (SNPs). This new dataset will be a useful resource for future genetic and genomic studies in this species.