%0 Journal Article
%T Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies
%A Shanrong Zhao
%A Kurt Prenger
%A Lance Smith
%J ISRN Bioinformatics
%D 2013
%R 10.1155/2013/481545
%X RNA-Seq is becoming a promising replacement for microarrays in transcriptome profiling and differential gene expression studies. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow was tested by applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost-effective, and open-source-based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of the box to process Illumina RNA-Seq datasets.
1. Introduction
RNA-Seq is the direct sequencing of transcripts by high-throughput sequencing technology and can profile an entire transcriptome at single-base resolution whilst concurrently quantifying gene expression levels on a genome-wide scale [1–3]. RNA-Seq not only has considerable advantages for examining transcriptome fine structure (for example, in the detection of novel transcripts, allele-specific expression, and alternative splicing) but also provides a far more precise measurement of transcript levels than other methods [4, 5].
With no probes or primers to design, RNA-Seq delivers unbiased and unparalleled information about the transcriptome and gene expression. Early studies have demonstrated that RNA-Seq is very reliable in terms of technical reproducibility [6, 7]. Compared to microarray-based profiling, RNA-Seq can detect the expression of low-abundance transcripts and subtle changes under different conditions; has a wider dynamic range; and avoids technical issues in microarrays related to probe performance, such as cross-hybridization, limited detection range of individual probes, and nonspecific hybridization [8, 9]. Currently, RNA-Seq is becoming an attractive approach for profiling gene expression and evaluating differential expression [10–13]. Until recently, sequencing has primarily been carried out in large genome centers which have invested heavily in computational infrastructure
%U http://www.hindawi.com/journals/isrn.bioinformatics/2013/481545/