This paper focuses on the latest research and critical reviews on modern computing architectures, software and hardware accelerated algorithms for bioinformatics data analysis with an emphasis on one of the most important sequence analysis applications—hidden Markov models (HMM). We show the detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics society. The characteristics of the sequence analysis, such as data and compute-intensive natures, make it very attractive to optimize and parallelize by using both traditional software approach and innovated hardware acceleration technologies. 1. Introduction At the beginning of the 21st century, an explosion of information was discovered from the living organisms, especially in areas of molecular biology and genetics. The focus of bioinformatics deals with this flood of information, which comes from academy, industry, and government labs, and turning it into useful knowledge. Bioinformatics is important to a virtually unlimited number of fields. As the genetic information being organized into computerized databases and their sizes steadily grow, molecular biologists need effective and efficient computational tools to store and retrieve the cognate information such as biological information from the databases, to analyze the sequence patterns they contain, and to extract the biological knowledge the sequences contain. The field of bioinformatics computing is advancing at an unprecedented rate. For people working with genomics and high-throughput sequencing data analysis, it is a serious challenge to analyze the vast amounts of data coming from the next generation sequencing (NGS) instruments. For example, there were approximately 126,?551,?501, and 141 bases in 135,?440, and 924 sequence records in the traditional GenBank divisions as of April 2011 [1]. The tendency is likely only to be reinforced by new generation sequencers, for example, Illumina HiSeq 2500 generating up to 120?Gb of data in 17 hours per run [2]. Data in itself is almost useless until it is analyzed and correctly interpreted. The draft of the human genome has given us a genetic list of what is necessary for building a human: approximately 35,000 genes. For a genome as large as the human genome, it may take many days of CPU time on large-memory, multiprocessor computers to analyze. To handle this much data, computational strategies are important to tackle this vital bottleneck, which can aid scientists in the extraction of useful and important biological data.
T. F. Smith and M. S. Waterman, “Identification of common molecular subsequences,” Journal of Molecular Biology, vol. 147, no. 1, pp. 195–197, 1981.
[4]
S. B. Needleman and C. D. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of Molecular Biology, vol. 48, no. 3, pp. 443–453, 1970.
[5]
A. Krogh, M. Brown, I. S. Mian, K. Sjolander, and D. Haussler, “Hidden Markov Models in computational biology applications to protein modeling,” Journal of Molecular Biology, vol. 235, no. 5, pp. 1501–1531, 1994.
[6]
S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, vol. 215, no. 3, pp. 403–410, 1990.
[7]
W. R. Pearson, “Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms,” Genomics, vol. 11, no. 3, pp. 635–650, 1991.
[8]
D.-F. Feng and R. F. Doolittle, “Progressive sequence alignment as a prerequisitetto correct phylogenetic trees,” Journal of Molecular Evolution, vol. 25, no. 4, pp. 351–360, 1987.
[9]
J. P. Walters, B. Qudah, and V. Chaudhary, “Accelerating the HMMER sequence analysis suite using conventional processors,” in Proceedings of the 20th International Conference on Advanced Information Networking and Applications, pp. 289–294, April 2006.
[10]
U. Srinivasan, P.-S. Chen, Q. Diao et al., “Characterization and analysis of HMMER and SVM-RFE parallel bioinformatics applications,” in Proceedings of the IEEE International Symposium on Workload Characterization (IISWC '05), pp. 87–98, October 2005.
[11]
W. Zhu, Y. Niu, J. Lu, and G. R. Gao, “Implementing parallel hmm-pfam on the EARTH multithreaded architecture,” in Proceedings of the IEEE Bioinformatics Conference (CSB '03), pp. 549–550, 2003.
[12]
J. P. Walters, R. Darole, and V. Chaudhary, “Improving MPI-HMMER's scalabilitywith parallel I/O,” in Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS '09), pp. 1–11, May 2009.
[13]
J. P. Walters, X. Meng, V. Chaudhary et al., “MPI-HMMER-boost: distributed FPGA acceleration,” Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, vol. 48, no. 3, pp. 223–238, 2007.
[14]
B. Wun, J. Buhler, and P. Crowley, “Exploiting coarse-grained parallelism to accelerate protein motif finding with a network processor,” in Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT '05), pp. 173–184, September 2005.
[15]
J. P. Walters, V. Balu, S. Kompalli, and V. Chaudhary, “Evaluating the use of GPUs in liver image segmentation and HMMER database searches,” in Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS '09), pp. 1–2.
[16]
N. Ganesan, R. D. Chamberlain, J. Buhler, and M. Taufer, “Accelerating HMMER on GPUs by implementing hybrid data and task parallelism,” in Proceedings of the ACM International Conference on Bioinformatics and Computational Biology (ACM-BCB '10), pp. 418–421, August 2010.
[17]
V. Sachdeva, M. Kistler, E. Speight, and T.-H. K. Tzeng, “Exploring the viability of the Cell Broadband Engine for bioinformatics applications,” Parallel Computing, vol. 34, no. 11, pp. 616–626, 2008.
[18]
T. Oliver, L. Y. Yeow, and B. Schmidt, “Integrating FPGA acceleration into HMMer,” Parallel Computing, vol. 34, no. 11, pp. 681–691, 2008.
[19]
Y. Sun, P. Li, G. Gu, Y. Wen, Y. Liu, and D. Liu, “Accelerating HMMer on FPGAs using systolic array based architecture,” in Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS '09), pp. 1–8, May 2009.
[20]
S. R. Eddy, “Hidden Markov models,” Current Opinion in Structural Biology, vol. 6, no. 3, pp. 361–365, 1996.
[21]
S. R. Eddy, “Profile hidden Markov models,” Bioinformatics, vol. 14, no. 9, pp. 755–763, 1998.
[22]
R. Durbin, S. Eddy, A. Krogh, and A. Mitchson, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998.
[23]
A. Grama, G. Karypis, V. Kumar, and A. Gupta, Introduction to Parallel Computing, Addison Wesley, 2nd edition, 2003.
[24]
D. Butenhof, Programming With POSIX Threads, Addison Wesley Professional, 1997.
[25]
R. Chandra, R. Menon, L. Dagum, D. Kohr, D. Maydan, and J. McDonald, Parallel Programming in OpenMP, Morgan Kaufmann, 2000.
[26]
M. P. I. Forum, “MPI: Message-Passing Interface Standard,” 1995, http://www.mpi-forum.org/.
[27]
T. Takagi and T. Maruyama, “Accelerating hmmer search using FPGA,” in Proceedings of the 19th International Conference on Field Programmable Logic and Applications (FPL '09), pp. 332–337, September 2009.
[28]
T. Oliver, L. Y. Yeow, and B. Schmidt, “High performance database searching with HMMer on FPGAs,” in Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS '07), pp. 1–7, March 2007.
[29]
S. A. Manavski and G. Valle, “CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment,” BMC Bioinformatics, vol. 9, supplement 2, article S10, 2008.
[30]
NVIDIA CUDA ZONE, 2012, http://www.nvidia.com/object/cuda_home.html.
[31]
P. Cohen, “OpenCL: what you need to know,” Macworld, 2008, http://www.macworld.com/article/134858/2008/08/snowleopard_opencl.html.
[32]
G. Amdahl, “The validity of the single processor approach to achieving large scale computing capabilities,” in Proceedings of the AFIPS Computing Conference, vol. 30, pp. 483–485, 1967.
Pfam, “The PFAM HMM library: a large collection of multiple sequence alignments and hidden markov models covering many common protein families,” 2013, http://pfam.sanger.ac.uk/.
[35]
NCBI, “The NR (non-redundant) database,” 2006, ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz.
[36]
K. B. Theobald, EARTH: An Efficient Architecture for Running Threads, School of Computer Science, McGill University, Montreal, Canada, 1999.
[37]
A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Mancheck, and Y. Sunderam, PVM: Parallel Virtual Machine. A USer’s Guide and Tutorial for Networked Parallel Computing, Edited by J. Kowalik, MIT Press, 1994.
[38]
J. J. Dongarra, H. W. Meuer, and E. Strohmaier, “TOP500 supercomputer sites,” Supercomputer, vol. 12, no. 1, pp. 91–120, 1996.
K. Jiang, O. Thorsen, A. Peters, B. Smith, and C. P. Sosa, “An efficient parallel implementation of the hidden Markov methods for genomic sequence search on a massively parallel system,” IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 1, pp. 15–23, 2008.
[42]
O. Lascu, N. Allsopp, P. Vezolle, et al., “Unfolding the IBM Server Blue Gene Solution,” IBM Corp., Int’l Technical Support Organization, 2005, http://www.redbooks.ibm.com/abstracts/sg246686.html?Open.
[43]
The Portland Group, “Accelerator Compilers with OpenACC Directives,” 2012, http://www.pgroup.com/resources/accel.htm.