Francisella tularensis is the causative agent of tularemia, a highly lethal disease that occurs naturally and could also result from use of the organism as a biological weapon. The species contains four recognized subspecies, including the North American endemic F. tularensis subsp. tularensis (type A), whose genetic diversity is correlated with its geographic distribution and includes a major population subdivision into two groups referred to as A.I and A.II. The biological significance of the A.I–A.II genetic differentiation is unknown, though there are suggestive ecological and epidemiological correlations. To understand this differentiation at the genomic level, we determined the complete sequence of an A.II strain (WY96-3418) and compared it to the genome of Schu S4 from the A.I population. The A.II genome is 1,898,476 bp in size and contains 1,820 genes, 1,303 of which code for proteins. While extensive genomic variation exists between WY96-3418 (hereafter WY96) and Schu S4, there is only one whole-gene difference, which involves a hypothetical protein of unknown function. In contrast, there are numerous SNPs (3,367), small indels (1,015), IS element differences (7), and large chromosomal rearrangements (31), including both inversions and translocations. The rearrangement borders are frequently associated with IS elements, which would facilitate intragenomic recombination events. The pathogenicity island duplicated regions (DR1 and DR2) are essentially identical in WY96 but vary relative to Schu S4 at 60 nucleotide positions. Other potential virulence-associated genes (231) vary at 559 nucleotide positions, including 357 non-synonymous changes. Molecular clock estimates of the divergence time between the A.I and A.II genomes, calculated for different chromosomal regions, range from 866 to 2,131 years before present. This paper is the first complete genomic characterization of a member of the A.II clade of Francisella tularensis subsp. tularensis.