%0 Journal Article %T TAPDANCE: An automated tool to identify and annotate transposon insertion CISs and associations between CISs from next generation sequence data %A Aaron L Sarver %A Jesse Erdman %A Tim Starr %A David A Largaespada %A Kevin AT Silverstein %J BMC Bioinformatics %D 2012 %I BioMed Central %R 10.1186/1471-2105-13-154 %X We describe Transposon Annotation Poisson Distribution Association Network Connectivity Environment (TAPDANCE) software, which automates the identification of CISs within transposon junction fragment insertion data. Starting with barcoded sequence data, the software identifies and trims sequences and maps putative genomic sequence to a reference genome using the bowtie short read mapper. Poisson distribution statistics are then applied to assess and rank genomic regions showing significant enrichment for transposon insertion. Novel methods of counting insertions are used to ensure that the results presented have the expected characteristics of informative CISs. A persistent mySQL database is generated and utilized to keep track of sequences, mappings and common insertion sites. Additionally, associations between phenotypes and CISs are also identified using Fisher's exact test with multiple testing correction. In a case study using previously published data we show that the TAPDANCE software identifies CISs as previously described, prioritizes them based on p-value, allows holistic visualization of the data within genome browser software and identifies relationships present in the structure of the data. The TAPDANCE process is fully automated, performs similarly to previous labor intensive approaches, provides consistent results at a wide range of sequence sampling depth, has the capability of handling extremely large datasets, enables meaningful comparison across datasets and enables large scale meta-analyses of junction fragment data.
The TAPDANCE software will greatly enhance our ability to analyze these datasets in order to increase our understanding of the genetic basis of cancers. %U http://www.biomedcentral.com/1471-2105/13/154/abstract