oalib
Search Results: 1 - 10 of 100 matches for " "
All listed articles are free for downloading (OA Articles)
Genome wide SNP discovery in flax through next generation sequencing of reduced representation libraries
Kumar Santosh, You Frank M, Cloutier Sylvie
BMC Genomics, 2012, DOI: 10.1186/1471-2164-13-684
Abstract: Background: Flax (Linum usitatissimum L.) is a significant fibre and oilseed crop. Current flax molecular markers, including isozymes, RAPDs, AFLPs and SSRs, are of limited use in the construction of high density linkage maps and for association mapping applications due to factors such as low reproducibility, intense labour requirements and/or limited numbers. We report here on the use of a reduced representation library strategy combined with next generation Illumina sequencing for rapid and large scale discovery of SNPs in eight flax genotypes. SNP discovery was performed through in silico analysis of the sequencing data against the whole genome shotgun sequence assembly of flax genotype CDC Bethune. Genotyping-by-sequencing of an F6-derived recombinant inbred line population provided validation of the SNPs. Results: Reduced representation libraries of eight flax genotypes were sequenced on the Illumina sequencing platform, resulting in sequence coverage ranging from 4.33 to 15.64X (genome equivalents). Depending on the relatedness of the genotypes and the number and length of the reads, between 78% and 93% of the reads mapped onto the CDC Bethune whole genome shotgun sequence assembly. A total of 55,465 SNPs were discovered, with the largest number of SNPs belonging to the genotypes with the highest mapping coverage percentage. Approximately 84% of the SNPs discovered were identified in a single genotype, 13% were shared between any two genotypes and the remaining 3% in three or more. Nearly a quarter of the SNPs were found in genic regions. A total of 4,706 out of 4,863 SNPs discovered in Macbeth were validated using genotyping-by-sequencing of 96 F6 individuals from a recombinant inbred line population derived from a cross between CDC Bethune and Macbeth, corresponding to a validation rate of 96.8%. Conclusions: Next generation sequencing of reduced representation libraries was successfully implemented for genome-wide SNP discovery from flax. The genotyping-by-sequencing approach proved to be efficient for validation. The SNP resources generated in this work will assist in generating high density maps of flax and facilitate QTL discovery, marker-assisted selection, phylogenetic analyses, association mapping and anchoring of the whole genome shotgun sequence.
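The percentages quoted in the flax abstract above can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the abstract; the function and variable names are illustrative and not taken from the paper.

```python
def validation_rate(validated, discovered):
    """Percentage of discovered SNPs that were confirmed by genotyping."""
    return 100.0 * validated / discovered

# Figures quoted in the abstract for the genotype Macbeth.
macbeth_rate = validation_rate(4706, 4863)
print(f"Macbeth validation rate: {macbeth_rate:.1f}%")

# Reported sharing of SNPs among genotypes (percent of all SNPs).
sharing = {"single genotype": 84, "two genotypes": 13, "three or more": 3}
total_share = sum(sharing.values())
print(f"Sharing categories sum to {total_share}%")
```

The rounded rate agrees with the 96.8% stated in the abstract, and the three sharing categories account for all discovered SNPs.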
Double restriction-enzyme digestion improves the coverage and accuracy of genome-wide CpG methylation profiling by reduced representation bisulfite sequencing
Wang Junwen, Xia Yudong, Li Lili, Gong Desheng
BMC Genomics, 2013, DOI: 10.1186/1471-2164-14-11
Abstract: Background: Reduced representation bisulfite sequencing (RRBS) was developed to measure DNA methylation of high-CG regions at single base-pair resolution, and has been widely used because of its minimal DNA requirements and cost-effectiveness; however, its CpG coverage of genomic regions is restricted, and important low-CG regions are missed in DNA methylation profiling. The method could be improved to generate a more comprehensive representation. Results: Based on in silico simulation of enzyme digestion of the human and mouse genomes, we have optimized the current single-enzyme RRBS by applying double-enzyme digestion in library construction to interrogate more representative regions. CpG coverage of genomic regions was considerably increased in both high-CG and low-CG regions using the double-enzyme RRBS method, leading to more accurate detection of their average methylation levels and identification of differentially methylated regions between samples. We also applied this double-enzyme RRBS method to comprehensively analyze the CpG methylation profiles of two colorectal cancer cell lines. Conclusion: Double-enzyme RRBS increases the CpG coverage of genomic regions considerably over the previous single-enzyme RRBS method, leading to more accurate detection of their average methylation levels. It will facilitate genome-wide DNA methylation studies in multiple and complex clinical samples.
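The in silico digestion that the abstract above describes can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not name the enzymes, so MspI (C^CGG) and ApeKI (G^CWGC) are assumptions here, and the function names, the toy sequence and the size window are hypothetical. It shows only how adding a second enzyme can bring more CpG sites into size-selected fragments.

```python
import re

# Recognition sites and cut offsets. The enzyme choice is an assumption for
# illustration: MspI cuts C^CGG, ApeKI cuts G^CWGC (W = A or T).
ENZYMES = {
    "MspI": ("CCGG", 1),
    "ApeKI": ("GC[AT]GC", 1),
}

def cut_positions(seq, enzyme_names):
    """Return sorted cut coordinates on one strand for the chosen enzymes."""
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        # A lookahead finds overlapping recognition sites as well.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)

def digest(seq, enzyme_names):
    """Split the sequence into fragments at every cut position."""
    bounds = [0] + cut_positions(seq, enzyme_names) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]

def size_selected_cpgs(seq, enzyme_names, min_len, max_len):
    """Count CpG dinucleotides in fragments inside the size window."""
    kept = [f for f in digest(seq, enzyme_names) if min_len <= len(f) <= max_len]
    return sum(f.count("CG") for f in kept)

# Toy sequence with two MspI sites and one ApeKI site (hypothetical data).
toy = "AAAACCGGTTTTGCAGCAAAACCGGTT"
single = size_selected_cpgs(toy, ["MspI"], 5, 10)
double = size_selected_cpgs(toy, ["MspI", "ApeKI"], 5, 10)
print(single, double)
```

On the toy sequence, the second enzyme splits a fragment that was too long for the size window, so one more CpG falls inside a retained fragment. Real RRBS size windows are far wider than the 5 to 10 bp used for this toy example.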
Structural variation in the chicken genome identified by paired-end next-generation DNA sequencing of reduced representation libraries
Hindrik HD Kerstens, Richard PMA Crooijmans, Bert W Dibbits, Addie Vereijken, Ron Okimoto, Martien AM Groenen
BMC Genomics, 2011, DOI: 10.1186/1471-2164-12-94
Abstract: We identified hundreds of shared and divergent SVs in four commercial chicken lines relative to the reference chicken genome. The majority of SVs were found in intronic and intergenic regions, and we also found SVs in the coding regions. To identify the SVs, we combined high-throughput short read paired-end sequencing of genomic reduced representation libraries (RRLs) of pooled samples from 25 individuals and computational mapping of DNA sequences from a reference genome. We provide a first glimpse of the high abundance of small structural genomic variations in the chicken. Extrapolating our results, we estimate that there are thousands of rearrangements in the chicken genome, the majority of which are located in non-coding regions. We observed that structural variation contributes to genetic differentiation among current domesticated chicken breeds and the Red Jungle Fowl. We expect that, because of their high abundance, SVs might explain phenotypic differences and play a role in the evolution of the chicken genome. Finally, our study exemplifies an efficient and cost-effective approach for identifying structural variation in sequenced genomes. Structural variation within the genome, including insertions, duplications, deletions, and inversions of up to multiple kilobase pairs, has recently been described in a variety of species, including humans [1-3], mice [4], rats [5], silkworms [6], drosophila [7], and dogs [8]. These genomic variations were recently found to be widespread, encompassing 5% of the human genome [9], and are thought to be involved in (co)determining complex phenotypes [10,11]. The contribution of structural variants (SVs) to complex phenotypes has been measured by association analyses of variance in gene expression levels (traits) and the presence of SVs. SNPs and SVs have been shown to account for 83.6% and 17.7%, respectively, of the total detected genetic variation in gene expression, with only a limited overlap [12].
The effect that SVs have on
Technical Considerations for Reduced Representation Bisulfite Sequencing with Multiplexed Libraries
Aniruddha Chatterjee, Euan J. Rodger, Peter A. Stockwell, Robert J. Weeks, Ian M. Morison
Journal of Biomedicine and Biotechnology, 2012, DOI: 10.1155/2012/741542
Abstract: Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow-cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of the modifications, strategies, and troubleshooting approaches we implemented to optimize sequencing of multiplexed libraries on an RRBS background.
Gel-free multiplexed reduced representation bisulfite sequencing for large-scale DNA methylation profiling
Patrick Boyle, Kendell Clement, Hongcang Gu, Zachary D Smith, Michael Ziller, Jennifer L Fostel, Laurie Holmes, Jim Meldrim, Fontina Kelley, Andreas Gnirke, Alexander Meissner
Genome Biology, 2012, DOI: 10.1186/gb-2012-13-10-r92
Abstract: DNA methylation plays an important role in mammalian development [1,2] and is frequently altered in diseases, including cancer [3]. It is generally thought that methylation has a repressive function within regulatory contexts [4,5]. DNA methylation in mammalian genomes occurs mostly within the context of the CpG dinucleotide [6] and is generally seen in CpG-poor regions. In contrast, CpG-rich regions naturally exhibit low methylation states [7-10]. Many techniques have been developed to investigate global DNA methylation patterns [11]. Comparison of next-generation sequencing-based technologies showed that most methods produce similar results [12,13], but that the optimal sequencing strategy may depend on sample DNA amount, as well as the desired genome coverage and sequencing depth [14,15]. Whole-genome bisulfite sequencing of randomly sheared genomic DNA is the most comprehensive, but also most costly, method, while more focused approaches such as reduced representation bisulfite sequencing (RRBS) allow larger numbers of samples to be analyzed at reduced costs [8,15-17]. RRBS utilizes the cutting pattern of MspI (C^CGG) to systematically digest DNA to enrich for CpG dinucleotides. As opposed to whole-genome bisulfite sequencing, every fragment produced by MspI digestion will contain DNA methylation information for at least one CpG dinucleotide [6]. Another benefit of RRBS is that promoters, CpG islands, and other genomic features are disproportionally enriched because of the frequency of MspI cut sites in these regions [8,16]. RRBS reduces the complexity of the genome - and thus the sequencing cost - by selecting a subset of MspI fragments based on their size for sequencing. In the standard RRBS protocol, this size selection is done by preparative gel electrophoresis, which is laborious and difficult to automate, thereby limiting the throughput of the method.
For example, using our more recently published protocol [15], which includes a manual 40
Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library
Cecilia Sánchez, Timothy PL Smith, Ralph T Wiedmann, Roger L Vallejo, Mohamed Salem, Jianbo Yao, Caird E Rexroad
BMC Genomics, 2009, DOI: 10.1186/1471-2164-10-559
Abstract: The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6-fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses, were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In addition, 2% of the sequences from the validated markers were associated with rainbow trout transcripts. The use of reduced representation libraries and pyrosequencing technology proved to be an effective strategy for the discovery of a high number of putative SNPs in rainbow trout; however, modifications to the technique to decrease the false discovery rate resulting from the evolutionarily recent genome duplication would be desirable. Single Nucleotide Polymorphisms (SNPs) are highly abundant markers which are evenly distributed throughout the genome and can be functionally relevant [1]. They are suitable markers for fine mapping of genes and candidate gene association studies aimed at identifying alleles potentially affecting important traits. Technologies that enable simultaneous analysis of thousands of SNPs have permitted genome-wide association studies for complex traits in humans [2], chicken [3], cattle [4-6] and sheep [7].
Additionally, reduced representation libraries and pyrosequencing technologies have facilitated the high throughput discovery of
Genome Wide Allele Frequency Fingerprints (GWAFFs) of Populations via Genotyping by Sequencing
Stephen Byrne, Adrian Czaban, Bruno Studer, Frank Panitz, Christian Bendixen, Torben Asp
PLOS ONE, 2013, DOI: 10.1371/journal.pone.0057438
Abstract: Genotyping-by-Sequencing (GBS) is an excellent tool for characterising genetic variation between plant genomes. To date, its use has been reported only for genotyping of single individuals. However, there are many applications where resolving allele frequencies within populations on a genome-wide scale would be very powerful. Examples include the breeding of outbreeding species, varietal protection in outbreeding species, and monitoring changes in population allele frequencies. This motivated us to test the potential to use GBS to evaluate allele frequencies within populations. Perennial ryegrass is an outbreeding species, and breeding programs are based upon selection on populations. We tested two restriction enzymes for their efficiency in complexity reduction of the perennial ryegrass genome. The resulting profiles have been termed Genome Wide Allele Frequency Fingerprints (GWAFFs), and we have shown how these fingerprints can be used to distinguish between plant populations. Even at current costs and throughput, using sequencing to directly evaluate populations on a genome-wide scale is viable. GWAFFs should find many applications, from varietal development in outbreeding species right through to playing a role in protecting plant breeders’ rights.
Reduced Harmonic Representation of Partitions
Michalis Psimopoulos
Physics, 2011
Abstract: In the present article the reduced integral representation of partitions in terms of harmonic products has been derived first by using hypergeometry and the new concept of fractional sum and secondly by studying the Fourier series of the kernel function appearing in the integral representation. Using the method of induction, a generalization of the theory has also been obtained.
Detection and mapping of mtDNA SNPs in Atlantic salmon using high throughput DNA sequencing
Olafur Fridjonsson, Kristinn Olafsson, Scott Tompsett, Snaedis Bjornsdottir, Sonia Consuegra, David Knox, Carlos de Leaniz, Steinunn Magnusdottir, Gudbjorg Olafsdottir, Eric Verspoor, Sigridur Hjorleifsdottir
BMC Genomics, 2011, DOI: 10.1186/1471-2164-12-179
Abstract: Pyrosequencing generated a total of 179,826,884 bp of data, and 10,765 of the total 10,920 S. salar sequences (98.6%) were assigned back to their original samples. The approach taken resulted in a total of 216 SNPs and 2 indels, which were validated and mapped onto the S. salar mitochondrial genome, including 107 SNPs and one indel not previously reported. An average of 27.3 sequence reads with a standard deviation of 11.7 supported each SNP per individual. The study generated a mitochondrial SNP panel from a large sample group across a broad geographical area, reducing the potential for ascertainment bias, which has hampered previous studies. The SNPs identified here validate those identified in previous studies, and also contribute additional potentially informative loci for the future study of phylogeography and evolution in the Atlantic salmon. The overall success experienced with this novel application of HT sequencing of targeted regions suggests that the same approach could be successfully applied for SNP mining in other species. Single nucleotide polymorphisms (SNPs), representing single base differences between individuals, are a common form of genome variation [1]. Once identified, SNPs have the potential to be used as genotyping markers for population assignment or in phylogeographic analysis, and are rapidly becoming the marker of choice within this field of study [2]. The emergence of high-throughput (HT) sequencing technologies provides an unparalleled opportunity for the cost-effective sequencing of targeted genomic regions for SNP identification. HT sequencing has been applied for SNP discovery in humans [3,4], animals [5], plants [6,7] and bacteria [8] - in species where reference genomes exist. In organisms lacking a sequenced reference genome, SNPs have also been mined from the random sequencing of either expressed sequence tags (ESTs) [9,10] or reduced representation libraries [11-13].
However, with an available reference sequence, specific genetic
Screening the human exome: a comparison of whole genome and whole transcriptome sequencing
Elizabeth T Cirulli, Abanish Singh, Kevin V Shianna, Dongliang Ge, Jason P Smith, Jessica M Maia, Erin L Heinzen, James J Goedert, David B Goldstein, the Center for HIV/AIDS Vaccine Immunology (CHAVI)
Genome Biology, 2010, DOI: 10.1186/gb-2010-11-5-r57
Abstract: Here we provide a systematic exploration of how well RNA-Seq can identify human coding variants by comparing variants identified through high coverage whole-genome sequencing to those identified by high coverage RNA-Seq in the same individual. This comparison allowed us to directly evaluate the sensitivity and specificity of RNA-Seq in identifying coding variants, and to evaluate how key parameters such as the degree of coverage and the expression levels of genes interact to influence performance. We find that although only 40% of exonic variants identified by whole genome sequencing were captured using RNA-Seq, this number rose to 81% when concentrating on genes known to be well-expressed in the source tissue. We also find that a high false positive rate can be problematic when working with RNA-Seq data, especially at higher levels of coverage. We conclude that as long as a tissue relevant to the trait under study is available and suitable quality control screens are implemented, RNA-Seq is a fast and inexpensive alternative approach for finding coding variants in genes with sufficiently high expression levels. The study of common human diseases is rapidly moving away from an exclusive focus on common variants using genome-wide association studies and toward sequencing approaches that represent most variants, including those that are rare in the general population. Although rapidly falling, the per base costs of next generation sequencing platforms still preclude the generation of large sample sizes of entirely sequenced genomes at high coverage. In addition to this economic constraint, it is widely appreciated that the very large number of variants identified in such studies will make it difficult to use association evidence alone to identify causal sites. For these reasons, there has been considerable interest in focusing attention on coding variants as a first step toward complete representation of human variation.
Part of the motivation for this approach stems from th
Copyright © 2008-2017 Open Access Library. All rights reserved.