
Scientia Sinica Vitae (中国科学: 生命科学), 2013

Application of Cloud Computing in Biomedicine

DOI: 10.1360/052013-10, PP. 569-578

杨帅, 胡宗倩, 伯晓晨, 王升启, 李非, 王东根

Keywords: cloud computing, biomedicine, massive data, private cloud


Abstract:

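Massive biomedical data, typified by the output of next-generation sequencing, offer unprecedented opportunities for modern life science research, but the downstream big-data analysis has become a major challenge. This review surveys recent progress in applying cloud computing to biomedicine. It first describes the cloud computing service models and their advantages, then lists cloud-based tools for big-data analysis, and uses the metagenomic analysis application PathSeq as an example to walk through the steps of using cloud computing. It closes with recommendations on building a private cloud and on applying cloud computing, in the hope of offering new methods and ideas for processing massive data in genomics, transcriptomics, proteomics, and other biomedical fields.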
