BMC Bioinformatics 2010

Geoseq: a tool for dissecting deep-sequencing datasets

DOI: 10.1186/1471-2105-11-506

James Gurtowski, Anthony Cancio, Hardik Shah, Chaya Levovitz, Ajish George, Robert Homann, Ravi Sachidanandam

Abstract:

Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment.

Geoseq simplifies the analysis of small sets of sequences against deep-sequencing datasets, as well as the identification of public datasets of interest. We applied Geoseq to (a) identify differential isoform expression in mRNA-seq datasets, (b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and (c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.

Deep sequencing platforms, such as Illumina's Solexa Genome Analyzer and ABI's SOLiD, have simplified the generation of large short-read datasets [1]. Many of these datasets are now deposited in publicly accessible repositories such as the Sequence Read Archive (SRA) at the NCBI [2].

However, a researcher interested in exploring the public datasets is faced with two problems:

- Identifying the right libraries. The short-read datasets are neither uniformly annotated, nor are they organized to make searches easy.
- Analyzing the libraries for features of interest. The sheer magnitude of the data in these datasets poses computational challenges.

Each experiment can result in tens of millions of reads and requires specialized software to conduct prop
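
The tile-and-count idea described in the abstract (cutting the query sequence into tiles and measuring each tile's coverage in a read library with a suffix array) can be sketched roughly as follows. This is a minimal Python illustration and not Geoseq's actual implementation: the tile length `k`, the step size, the `$` read separator, the naive suffix-array construction, and all function names are assumptions made for the example.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in sorted order.

    Naive construction that sorts the suffixes directly; adequate for a toy
    example, far too slow for a real library with millions of reads.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, sa, pattern):
    """Count occurrences of `pattern` in `text` using the suffix array `sa`.

    All suffixes that start with `pattern` form one contiguous block in the
    sorted suffix array, so two binary searches find the block boundaries.
    """
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo - start


def tile_coverage(reference, reads, k, step=1):
    """Return (tile_start, count) pairs for each length-k tile of `reference`.

    Reads are joined with '$' (assumed absent from the sequences) so that a
    tile can never match across the boundary between two reads. The count is
    the number of exact occurrences, not the number of distinct reads.
    """
    text = "$".join(reads) + "$"
    sa = build_suffix_array(text)
    return [
        (pos, count_occurrences(text, sa, reference[pos:pos + k]))
        for pos in range(0, len(reference) - k + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical toy library and query sequence.
    library = ["ACGTAC", "GTACGT", "TTTT"]
    for start, count in tile_coverage("ACGTACGT", library, k=4):
        print(start, count)
```

In this sketch the index is built once per library and then queried once per tile, which mirrors the reason pre-computed indexes make it cheap to run small sets of query sequences against large read collections. Exact matching only is assumed here; mismatches and reverse-complement matches are not handled.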
