%0 Journal Article %T GATA: a graphic alignment tool for comparative sequence analysis %A David A Nix %A Michael B Eisen %J BMC Bioinformatics %D 2005 %I BioMed Central %R 10.1186/1471-2105-6-9 %X To address some of these issues, we created a stand alone, platform independent, graphic alignment tool for comparative sequence analysis (GATA http://gata.sourceforge.net/ webcite). GATA uses the NCBI-BLASTN program and extensive post-processing to identify all small sub-alignments above a low cut-off score. These are graphed as two shaded boxes, one for each sequence, connected by a line using the coordinate system of their parent sequence. Shading and colour are used to indicate score and orientation. A variety of options exist for querying, modifying and retrieving conserved sequence elements. Extensive gene annotation can be added to both sequences using a standardized General Feature Format (GFF) file.GATA uses the NCBI-BLASTN program in conjunction with post-processing to exhaustively align two DNA sequences. It provides researchers with a fine-grained alignment and visualization tool aptly suited for non-coding, 0¨C200 kb, pairwise, sequence analysis. It functions independent of sequence feature ordering or orientation, and readily visualizes both large and small sequence inversions, duplications, and segment shuffling. Since the alignment is visual and does not contain gaps, gene annotation can be added to both sequences to create a thoroughly descriptive picture of DNA conservation that is well suited for comparative sequence analysis.The most widely used methods for aligning DNA sequences rely on dynamic programming algorithms initially developed by Smith-Waterman and Needleman-Wunsch [1,2]. These algorithms create the mathematically best possible alignment of two sequences by inserting gaps in either sequence to maximize the score of base pair matches and minimize penalties for base pair mismatches and sequence gaps. Although these methods have proven invaluable in understanding sequence conservation and gene relatedness, they make several assumptions. One of their assumptions in generating the "best" alignment is that sequence features are collinear. For %U http://www.biomedcentral.com/1471-2105/6/9