The growing number of sequences stored in genomic databases has made
sequential analysis infeasible. Parallel computing has therefore been brought
to bear on Bioinformatics through parallel algorithms for aligning and
analyzing sequences, chiefly improving the running time of these algorithms.
In many cases, a parallel strategy also helps reduce the computational cost
of large problems. This work presents results obtained with an implementation
of a parallel score estimation technique for the score matrix calculation
stage, which is the first stage of progressive multiple sequence alignment.
The performance and quality of the parallel score estimation are compared
with those of a dynamic programming approach that is also implemented in
parallel. The comparison shows a significant reduction in running time.
Moreover, the quality of the final alignment produced with the new strategy
is analyzed and compared with that of the dynamic programming approach.
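
To make the score matrix stage concrete, the following minimal sketch shows what a dynamic programming baseline of this kind might look like. It is an illustration under stated assumptions, not the implementation evaluated in this work. The function names (`nw_score`, `score_matrix`), the scoring parameters (match = 1, mismatch = -1, linear gap = -2), and the use of a thread pool are all hypothetical choices made for the example. The score estimation technique itself is not described here, so the sketch does not attempt to reproduce it.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of a and b (Needleman-Wunsch, linear gap cost).

    The scoring values are illustrative assumptions, not taken from the paper.
    Only two rows of the DP table are kept, since only the score is needed.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def score_matrix(seqs, workers=2):
    """Symmetric matrix of pairwise alignment scores for a list of sequences.

    Each pair is independent of the others, which is what makes this stage
    easy to distribute. A thread pool is used here only to show that
    structure; in CPython a process pool or a cluster would be needed for a
    real speedup.
    """
    n = len(seqs)
    pairs = list(combinations(range(n), 2))
    matrix = [[0] * n for _ in range(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(lambda p: nw_score(seqs[p[0]], seqs[p[1]]), pairs)
        for (i, j), s in zip(pairs, scores):
            matrix[i][j] = matrix[j][i] = s
    for i in range(n):
        # Diagonal: each sequence aligned against itself.
        matrix[i][i] = nw_score(seqs[i], seqs[i])
    return matrix
```

For N sequences the stage requires N(N-1)/2 pairwise alignments, each quadratic in sequence length under dynamic programming, which is why a cheaper estimate of the pairwise score can reduce running time noticeably.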