%0 Journal Article
%T Implementation of a Parallel Protein Structure Alignment Service on Cloud
%A Che-Lun Hung
%A Yaw-Ling Lin
%J International Journal of Genomics
%D 2013
%I Hindawi Publishing Corporation
%R 10.1155/2013/439681
%X Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.
1. Introduction
Protein structure alignment is a useful strategy for structural biology. Most of the alignment methods rely on structure comparison to identify structural, evolutionary, and functional relationships between proteins [1]. In general, these methods align proteins based on structural similarity. A structural alignment can identify the evolutionary equivalent residues when the aligned proteins share a common ancestor. Unlike sequence alignment tools, which focus on equivalent residues, structural alignment methods focus on conserved protein structure. Therefore, structural alignments of remote homologous proteins are more reliable than sequence alignments.
Structural alignment identifies functional mechanisms by comparing functionally related proteins and can also annotate the function of proteins whose structures have been detected. Several protein structural alignment methods [2–8] compare protein structures by structural similarity based on secondary structure elements, as well as intra- and intermolecular atomic distances. The basic idea of structure alignment is to identify the secondary structural elements, cluster these elements into groups, and score the best substructure alignment. The Vector Alignment Search Tool (VAST) [2] compares protein structures according to the continuous distribution of domains in the fold space. VAST has been used to compare all known Protein Data Bank (PDB) domains to each other. The alignment results are presented in NCBI's Molecular Modeling Database [9].
%U http://www.hindawi.com/journals/ijg/2013/439681/