Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders, respectively. Accurate characterization of metagenomic samples depends on accurate measurement of genetic variation between isolates, with resolution down to the strain level. The sensitivity of a single biomarker is often augmented by using panels of multiple biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome, which would itself serve as the biomarker for genome identification. However, direct sequencing is complicated by genome variation and population diversity, as well as by differences introduced during sample preparation, including sequencing artifacts and errors. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness, and we sought to determine the strengths and limitations of WGS analysis in a forensics context. Herein, we demonstrate the use of sequencing to provide genetic characterization of populations: direct sequencing of populations.

1. Introduction

Genome sequencing data from mixtures can serve as biomarkers for identifying the genetic content of samples. Such data can be used to establish a sample’s genome profile, including its major and minor genome components; to identify SNPs and mutation events; to compare the genetic relatedness of samples, profile-to-profile; and to provide a probabilistic or statistical confidence score for sample attribution. Although high-throughput, automated sequencing has been used for years, analysis of sequencing information has focused on consensus sequencing [1–5].
In addition, sequencing has been used to infer microbial relationships [6–8]. Because large volumes of sequence data are now easy to generate, there has been pressure to develop computational tools to analyze them [9]. Novel approaches, based on probabilistic analysis of sequencing information for mixtures and metagenomic samples, enable a broad capture of sequence data from a single run to characterize multiple genomes in a sample, even in isolates that are considered pure [10, 11]. When identifying genomes and determining the distribution of related organisms, knowing the populations of genomes in a sample is critical to accurate biomarker detection.
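To make the idea of a genome profile with major and minor components more concrete, the following is a minimal Python sketch under stated assumptions. It is not the method used in this work: the function names, the toy read data, the 5% minor-component cutoff, and the use of cosine similarity for profile-to-profile comparison are all hypothetical choices made for illustration. A real analysis would model sequencing error and coverage probabilistically.

```python
from collections import Counter
from math import sqrt

BASES = "ACGT"


def site_profile(observed_bases):
    """Base frequencies at one genome position, from the bases seen in reads."""
    counts = Counter(b for b in observed_bases.upper() if b in BASES)
    total = sum(counts.values())
    if total == 0:
        return None  # no usable coverage at this position
    return [counts[b] / total for b in BASES]


def major_minor(freqs, min_minor=0.05):
    """Split a site into its major base and any minor bases above a cutoff.

    min_minor is a hypothetical threshold, not a value taken from the text.
    """
    ranked = sorted(zip(BASES, freqs), key=lambda bf: bf[1], reverse=True)
    major = ranked[0][0]
    minors = [b for b, f in ranked[1:] if f >= min_minor]
    return major, minors


def profile_similarity(profile_a, profile_b):
    """Mean per-site cosine similarity over positions covered in both samples."""
    scores = []
    for fa, fb in zip(profile_a, profile_b):
        if fa is None or fb is None:
            continue  # skip positions lacking coverage in either sample
        dot = sum(x * y for x, y in zip(fa, fb))
        norm = sqrt(sum(x * x for x in fa)) * sqrt(sum(y * y for y in fb))
        scores.append(dot / norm)
    return sum(scores) / len(scores) if scores else float("nan")


# Toy data: one string of read bases per genome position (illustrative only).
sample_a = ["AAAAAAAAAG", "CCCCCCCCCC", "TTTTTTTTTT"]
sample_b = ["AAAAAAAAAA", "CCCCCCCCCC", "TTTTTGGGGG"]

profile_a = [site_profile(col) for col in sample_a]
profile_b = [site_profile(col) for col in sample_b]

print(major_minor(profile_a[0]))
print(round(profile_similarity(profile_a, profile_b), 3))
```

In this sketch a consensus call would report only the major base at each position, whereas the per-site frequency vectors retain the minor components that distinguish the two toy samples.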
References
[1]
C. Trapnell and S. L. Salzberg, “How to map billions of short reads onto genomes,” Nature Biotechnology, vol. 27, no. 5, pp. 455–457, 2009.
[2]
H. Nagasaki, T. Mochizuki, Y. Kodama et al., “DDBJ read annotation pipeline: a cloud computing-based pipeline for high-throughput analysis of next-generation sequencing data,” DNA Research, vol. 20, no. 4, pp. 383–390, 2013.
[3]
T. Camerlengo, H. G. Ozer, R. Onti-Srinivasan et al., “From sequencer to supercomputer: an automatic pipeline for managing and processing next generation sequencing data,” AMIA Summits on Translational Science Proceedings, vol. 12, pp. 1–10, 2012.
[4]
G. A. Pavlopoulos, A. Oulas, E. Iacucci et al., “Unraveling genomic variation from next generation sequencing data,” BioData Mining, vol. 6, no. 1, article 13, 2013.
[5]
J. P. Jakupciak and R. R. Colwell, “Biological agent detection technologies,” Molecular Ecology Resources, vol. 9, supplement 1, pp. 51–57, 2009.
[6]
K. A. Crandall, C. R. Kelsey, H. Imamichi, H. C. Lane, and N. P. Salzman, “Parallel evolution of drug resistance in HIV: failure of nonsynonymous/synonymous substitution rate ratio to detect selection,” Molecular Biology and Evolution, vol. 16, no. 3, pp. 372–382, 1999.
[7]
J. E. Cooper and E. J. Feil, “Multilocus sequence typing—what is resolved?” Trends in Microbiology, vol. 12, no. 8, pp. 373–377, 2004.
[8]
O. E. Francis, M. Bendall, S. Manimaran et al., “Pathoscope: species identification and strain attribution with unassembled sequencing data,” Genome Research, vol. 23, no. 10, pp. 1721–1729, 2013.
[9]
M. Wang, Y. Ye, and H. Tang, “A de Bruijn graph approach to the quantification of closely-related genomes in a microbial community,” Journal of Computational Biology, vol. 19, pp. 814–825, 2012.
[10]
J. P. Jakupciak, “Population-sequencing as a biomarker for sample characterization,” Journal of Biomarkers. In press.
[11]
J. P. Jakupciak, J. M. Wells, J. S. Lin, and A. B. Feldman, “Population analysis of bacterial samples for individual identification in forensics application,” Journal of Datamining in Genomics & Proteomics, vol. 4, article 138, 2013.
[12]
M. T. G. Holden, R. W. Titball, S. J. Peacock et al., “Genomic plasticity of the causative agent of melioidosis, Burkholderia pseudomallei,” Proceedings of the National Academy of Sciences of the United States of America, vol. 101, no. 39, pp. 14240–14245, 2004.
[13]
K. T. Konstantinidis, A. Ramette, and J. M. Tiedje, “The bacterial species definition in the genomic era,” Philosophical Transactions of the Royal Society B, vol. 361, no. 1475, pp. 1929–1940, 2006.
[14]
B. J. Currie, A. Haslem, T. Pearson et al., “Identification of melioidosis outbreak by multilocus variable number tandem repeat analysis,” Emerging Infectious Diseases, vol. 15, no. 2, pp. 169–174, 2009.
[15]
K. H. Chua, K. H. See, K. L. Thong, and S. D. Puthucheary, “DNA fingerprinting of human isolates of Burkholderia pseudomallei from different geographical regions of Malaysia,” Tropical Biomedicine, vol. 27, no. 3, pp. 517–524, 2010.
[16]
C.-S. Chiou, “Editorial: multilocus variable-number tandem repeat analysis as a molecular tool for subtyping and phylogenetic analysis of bacterial pathogens,” Expert Review of Molecular Diagnostics, vol. 10, no. 1, pp. 5–7, 2010.
[17]
E. P. Price, H. M. Hornstra, D. Limmathurotsakul et al., “Within-host evolution of Burkholderia pseudomallei in four cases of acute melioidosis,” PLoS Pathogens, vol. 6, no. 1, Article ID e1000725, 2010.
[18]
C. M. Ronning, L. Losada, L. Brinkac et al., “Genetic and phenotypic diversity in Burkholderia: contributions by prophage and phage-like elements,” BMC Microbiology, vol. 10, pp. 202–208, 2010.
[19]
M. A. Schell, R. L. Ulrich, W. J. Ribot et al., “Type VI secretion is a major virulence determinant in Burkholderia mallei,” Molecular Microbiology, vol. 64, no. 6, pp. 1466–1485, 2007.
[20]
A. Deeraksa, O. Qazi, G. C. Whitlock et al., “Development of a non-living vaccine against Burkholderia mallei,” The Journal of Immunology, vol. 182, article 129.2, 2009.
[21]
D. A. B. Dance, “Melioidosis: the tip of the iceberg?” Clinical Microbiology Reviews, vol. 4, no. 1, pp. 52–60, 1991.
[22]
E. Yabuuchi and M. Arakawa, “Burkholderia pseudomallei and melioidosis: be aware in temperate area,” Microbiology and Immunology, vol. 37, no. 11, pp. 823–836, 1993.
[23]
L. D. Rotz, A. S. Khan, S. R. Lillibridge, S. M. Ostroff, and J. M. Hughes, “Public health assessment of potential biological terrorism agents,” Emerging Infectious Diseases, vol. 8, no. 2, pp. 225–230, 2002.
[24]
D. A. Rasko, P. L. Worsham, T. G. Abshire et al., “Bacillus anthracis comparative genome analysis in support of the Amerithrax investigation,” Proceedings of the National Academy of Sciences of the United States of America, vol. 108, no. 12, pp. 5027–5032, 2011.
[25]
M. J. Struelens, Y. De Gheldre, and A. Deplano, “Comparative and library epidemiological typing systems: outbreak investigations versus surveillance systems,” Infection Control and Hospital Epidemiology, vol. 19, no. 8, pp. 565–569, 1998.
[26]
S. N. Gardner, C. J. Jaing, K. S. McLoughlin, and T. R. Slezak, “A microbial detection array (MDA) for viral and bacterial detection,” BMC Genomics, vol. 11, no. 1, article 668, 2010.
[27]
P. H. M. Savelkoul, H. J. M. Aarts, J. De Haas et al., “Amplified-fragment length polymorphism analysis: the state of an art,” Journal of Clinical Microbiology, vol. 37, no. 10, pp. 3083–3091, 1999.
[28]
P. Janssen, R. Coopman, G. Huys et al., “Evaluation of the DNA fingerprinting method AFLP as a new tool in bacterial taxonomy,” Microbiology, vol. 142, part 7, pp. 1881–1893, 1996.
[29]
B.-A. Lindstedt, E. Heir, T. Vardund, and G. Kapperud, “A variation of the amplified-fragment length polymorphism (AFLP) technique using three restriction endonucleases, and assessment of the enzyme combination BglII-MfeI for AFLP analysis of Salmonella enterica subsp. enterica isolates,” FEMS Microbiology Letters, vol. 189, no. 1, pp. 19–24, 2000.
[30]
Y. Graser, I. Klare, E. Halle et al., “Epidemiological study of an Acinetobacter baumannii outbreak by using polymerase chain reaction fingerprinting,” Journal of Clinical Microbiology, vol. 31, no. 9, pp. 2417–2420, 1993.
[31]
B. Budowle, M. D. Johnson, C. M. Fraser, T. J. Leighton, R. S. Murch, and R. Chakraborty, “Genetic analysis and attribution of microbial forensics evidence,” Critical Reviews in Microbiology, vol. 31, no. 4, pp. 233–254, 2005.
[32]
M. A. Poritz, A. J. Blaschke, C. L. Byington et al., “FilmArray, an automated nested multiplex PCR system for multi-pathogen detection: development and application to respiratory tract infection,” PLoS ONE, vol. 6, no. 10, Article ID e26047, 2011.
[33]
J. K. Stone, M. Mayo, S. A. Grasso et al., “Detection of Burkholderia pseudomallei O-antigen serotypes in near-neighbor species,” BMC Microbiology, vol. 12, article 250, 2012.
[34]
I. Vandenbroucke, H. van Marck, P. Verhasselt et al., “Minor variant detection in amplicons using 454 massive parallel pyrosequencing: experiences and considerations for successful applications,” BioTechniques, vol. 51, no. 3, pp. 167–177, 2011.
[35]
J. W. Davey, P. A. Hohenlohe, P. D. Etter, J. Q. Boone, J. M. Catchen, and M. L. Blaxter, “Genome-wide genetic marker discovery and genotyping using next-generation sequencing,” Nature Reviews Genetics, vol. 12, no. 7, pp. 499–510, 2011.
[36]
C.-C. Ho, C. C. Y. Lau, P. Martelli et al., “Novel pan-genomic analysis approach in target selection for multiplex PCR identification and detection of Burkholderia pseudomallei, Burkholderia thailandensis, and Burkholderia cepacia complex species: a proof-of-concept study,” Journal of Clinical Microbiology, vol. 49, no. 3, pp. 814–821, 2011.
[37]
S. A. Boers, W. A. van der Reijden, and R. Jansen, “High-throughput multilocus sequence typing: bringing molecular typing to the next level,” PLoS ONE, vol. 7, Article ID e39630, 2012.
[38]
M. V. Larsen, S. Cosentino, S. Rasmussen et al., “Multilocus sequence typing of total-genome-sequenced bacteria,” Journal of Clinical Microbiology, vol. 50, no. 4, pp. 1355–1361, 2012.
[39]
M. Inouye, T. C. Conway, J. Zobel, and K. E. Holt, “Short Read Sequence Typing (SRST): multi-locus sequence types from short reads,” BMC Genomics, vol. 13, article 338, 2012.
[40]
S. J. Shallom, H. Tae, L. Sarmento et al., “Comparison of genome diversity of Brucella spp. field isolates using Universal Bio-signature Detection Array and whole genome sequencing reveals limitations of current diagnostic methods,” Gene, vol. 509, no. 1, pp. 142–148, 2012.
[41]
K. Lagesen, D. W. Ussery, and T. M. Wassenaar, “Genome update: the 1000th genome—a cautionary tale,” Microbiology, vol. 156, no. 3, pp. 603–608, 2010.
[42]
T. C. Glenn, “Field guide to next-generation DNA sequencers,” Molecular Ecology Resources, vol. 11, no. 5, pp. 759–769, 2011.
[43]
M. L. Metzker, “Sequencing technologies - the next generation,” Nature Reviews Genetics, vol. 11, no. 1, pp. 31–46, 2010.
[44]
S. Pabinger, A. Dander, M. Fischer et al., “A survey of tools for variant analysis of next generation genome sequencing data,” Briefings in Bioinformatics, 2013.
[45]
M. Eppinger, P. L. Worsham, M. P. Nikolich et al., “Genome sequence of the deep-rooted Yersinia pestis strain Angola reveals new insights into the evolution and pangenome of the plague bacterium,” Journal of Bacteriology, vol. 192, no. 6, pp. 1685–1699, 2010.