Improving Indel Detection Specificity of the Ion Torrent PGM Benchtop Sequencer
Zhen Xuan Yeo, Maurice Chan, Yoon Sim Yap, Peter Ang, Steve Rozen, Ann Siew Gek Lee
PLOS ONE , 2012, DOI: 10.1371/journal.pone.0045798
Abstract: The emergence of benchtop sequencers has made clinical genetic testing using next-generation sequencing more feasible. Ion Torrent's PGM™ is one such benchtop sequencer that shows clinical promise in detecting single nucleotide variations (SNVs) and microindel variations (indels). However, the large number of false positive indels caused by the high frequency of homopolymer sequencing errors has impeded PGM™'s usage for clinical genetic testing. An extensive analysis of PGM™ data from the sequencing reads of the well-characterized genome of the Escherichia coli DH10B strain and sequences of the BRCA1 and BRCA2 genes from six germline samples was done. Three commonly used variant detection tools, SAMtools, Dindel, and GATK's Unified Genotyper, all had substantial false positive rates for indels. By incorporating filters on two major measures we could dramatically improve false positive rates without sacrificing sensitivity. The two measures were: B-Allele Frequency (BAF) and VARiation of the Width of gaps and inserts (VARW) per indel position. A BAF threshold applied to indels detected by UnifiedGenotyper removed ~99% of the indel errors detected in both the DH10B and BRCA sequences. The optimum BAF threshold for BRCA sequences was determined by requiring 100% detection sensitivity and minimum false discovery rate, using variants detected from Sanger sequencing as reference. This resulted in 15 indel errors remaining, of which 7 indel errors were removed by selecting a VARW threshold of zero. VARW specific errors increased in frequency with higher read depth in the BRCA datasets, suggesting that homopolymer-associated indel errors cannot be reduced by increasing the depth of coverage. Thus, using a VARW threshold is likely to be important in reducing indel errors from data with higher coverage. In conclusion, BAF and VARW thresholds provide simple and effective filtering criteria that can improve the specificity of indel detection in PGM™ data without compromising sensitivity.
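The two-threshold filter described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline. The IndelCall record, the function names, and the default BAF cutoff of 0.3 are hypothetical. BAF is taken here as the fraction of reads at a position that support the indel, and VARW is approximated as the population variance of the indel widths seen at that position, since the abstract does not give the exact formula.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import List


@dataclass
class IndelCall:
    """Hypothetical per-position summary of a candidate indel."""
    position: int
    depth: int               # total reads covering the position
    indel_widths: List[int]  # gap/insert width in each indel-supporting read


def baf(call: IndelCall) -> float:
    """Fraction of covering reads that support the indel (assumed BAF definition)."""
    return len(call.indel_widths) / call.depth if call.depth else 0.0


def varw(call: IndelCall) -> float:
    """Spread of indel widths at the position (assumed stand-in for VARW).

    Zero means every indel-supporting read reports the same width.
    """
    return pvariance(call.indel_widths) if call.indel_widths else 0.0


def filter_indels(calls, baf_min=0.3, varw_max=0.0):
    """Keep calls with BAF at or above baf_min and VARW at or below varw_max.

    baf_min=0.3 is an illustrative value only; the abstract reports that the
    threshold was tuned against Sanger-confirmed variants.
    """
    return [c for c in calls if baf(c) >= baf_min and varw(c) <= varw_max]


calls = [
    IndelCall(position=100, depth=50, indel_widths=[1] * 25),         # consistent width
    IndelCall(position=200, depth=100, indel_widths=[1, 2, 1, 3, 2]),  # low BAF
    IndelCall(position=300, depth=40, indel_widths=[1, 2] * 10),       # varying width
]
kept_positions = [c.position for c in filter_indels(calls)]
print(kept_positions)
```

In this toy input the call at position 200 fails the BAF cutoff and the call at position 300 fails the VARW cutoff of zero, which mirrors the abstract's point that width variation catches errors that allele frequency alone does not.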
Assessing Matched Normal and Tumor Pairs in Next-Generation Sequencing Studies
Liang Goh, Geng Bo Chen, Ioana Cutcutache, Benjamin Low, Bin Tean Teh, Steve Rozen, Patrick Tan
PLOS ONE , 2012, DOI: 10.1371/journal.pone.0017810
Abstract: Next generation sequencing technology has revolutionized the study of cancers. Through matched normal-tumor pairs, it is now possible to identify genome-wide germline and somatic mutations. The generation and analysis of the data requires rigorous quality checks and filtering, and the current analytical pipeline is constantly undergoing improvements. We noted however that in analyzing matched pairs, there is an implicit assumption that the sequenced data are matched, without any quality check such as those implemented in association studies. There are serious implications in this assumption as identification of germline and rare somatic variants depends on the normal sample being the matched pair. Using a genetics concept for measuring relatedness between individuals, we demonstrate that the matchedness of tumor pairs can be quantified and should be included as part of a quality protocol in analysis of sequenced data. Despite the mutation changes in cancer samples, matched tumor-normal pairs are still relatively similar in sequence compared to non-matched pairs. We demonstrate that the approach can be used to assess the mutation landscape between individuals.
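One way to picture such a matchedness check is a mean identity-by-state (IBS) score over the sites genotyped in both samples. The abstract does not say which relatedness measure the authors used, so IBS is an assumption here. The function name, the site identifiers, and the 0/1/2 genotype coding (count of alternate alleles) are also hypothetical.

```python
def ibs_similarity(normal, tumor):
    """Mean identity-by-state over shared sites, scaled to the range 0..1.

    Each argument maps a site identifier to a genotype coded as the number
    of alternate alleles (0, 1 or 2). Two genotypes share 2 - |g1 - g2|
    alleles by state.
    """
    shared = normal.keys() & tumor.keys()
    if not shared:
        raise ValueError("samples have no genotyped sites in common")
    shared_alleles = sum(2 - abs(normal[s] - tumor[s]) for s in shared)
    return shared_alleles / (2 * len(shared))


normal = {"s1": 0, "s2": 1, "s3": 2, "s4": 1}
matched_tumor = {"s1": 0, "s2": 1, "s3": 2, "s4": 2}    # one somatic change
unmatched_tumor = {"s1": 2, "s2": 0, "s3": 0, "s4": 1}  # different individual

print(ibs_similarity(normal, matched_tumor))    # 0.875
print(ibs_similarity(normal, unmatched_tumor))  # 0.375
```

A matched pair should score close to 1 even with somatic changes in the tumor, while a mismatched pair scores markedly lower. Any cutoff separating the two would need to be calibrated on real data.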
Primer-BLAST: A tool to design target-specific primers for polymerase chain reaction
Jian Ye, George Coulouris, Irena Zaretskaya, Ioana Cutcutache, Steve Rozen, Thomas Madden
BMC Bioinformatics , 2012, DOI: 10.1186/1471-2105-13-134
Abstract: We present a new software tool called Primer-BLAST to alleviate the difficulty in designing target-specific primers. This tool combines BLAST with a global alignment algorithm to ensure a full primer-target alignment and is sensitive enough to detect targets that have a significant number of mismatches to primers. Primer-BLAST allows users to design new target-specific primers in one step as well as to check the specificity of pre-existing primers. Primer-BLAST also supports placing primers based on exon/intron locations and excluding single nucleotide polymorphism (SNP) sites in primers. We describe a robust and fully implemented general purpose primer design tool that designs target-specific PCR primers. Primer-BLAST offers flexible options to adjust the specificity threshold and other primer properties. This tool is publicly available at http://www.ncbi.nlm.nih.gov/tools/primer-blast.
Descriptions of Mature Larvae of the Bee Tribe Emphorini and Its Subtribes (Hymenoptera, Apidae, Apinae)
Jerome Rozen
ZooKeys , 2011, DOI: 10.3897/zookeys.148.1839
Abstract: A description of the mature larvae of the bee tribe Emphorini based on representatives of six genera is presented herein. The two included subtribes, Ancyloscelidina and Emphorina, are also characterized and distinguished from one another primarily by their mandibular anatomy. The anatomy of abdominal segments 9 and 10 is investigated and appears to have distinctive features that distinguish the larvae of the tribe from those of related apine tribes.
Metabolomics in Early Alzheimer's Disease: Identification of Altered Plasma Sphingolipidome Using Shotgun Lipidomics
Xianlin Han, Steve Rozen, Stephen H. Boyle, Caroline Hellegers, Hua Cheng, James R. Burke, Kathleen A. Welsh-Bohmer, P. Murali Doraiswamy, Rima Kaddurah-Daouk
PLOS ONE , 2011, DOI: 10.1371/journal.pone.0021643
Abstract: Background: The development of plasma biomarkers could facilitate early detection, risk assessment and therapeutic monitoring in Alzheimer's disease (AD). Alterations in ceramides and sphingomyelins have been postulated to play a role in amyloidogenesis and inflammatory stress related neuronal apoptosis; however few studies have conducted a comprehensive analysis of the sphingolipidome in AD plasma using analytical platforms with accuracy, sensitivity and reproducibility. Methods and Findings: We prospectively analyzed plasma from 26 AD patients (mean MMSE 21) and 26 cognitively normal controls in a non-targeted approach using multi-dimensional mass spectrometry-based shotgun lipidomics [1], [2] to determine the levels of over 800 molecular species of lipids. These data were then correlated with diagnosis, apolipoprotein E4 genotype and cognitive performance. Plasma levels of species of sphingolipids were significantly altered in AD. Of the 33 sphingomyelin species tested, 8 molecular species, particularly those containing long aliphatic chains such as 22 and 24 carbon atoms, were significantly lower (p<0.05) in AD compared to controls. Levels of 2 ceramide species (N16:0 and N21:0) were significantly higher in AD (p<0.05) with a similar, but weaker, trend for 5 other species. Ratios of ceramide to sphingomyelin species containing identical fatty acyl chains differed significantly between AD patients and controls. MMSE scores were correlated with altered mass levels of both N20:2 SM and OH-N25:0 ceramides (p<0.004) though lipid abnormalities were observed in mild and moderate AD. Within AD subjects, there were also genotype specific differences. Conclusions: In this prospective study, we used a sensitive multimodality platform to identify and characterize an essentially uniform but opposite pattern of disruption in sphingomyelin and ceramide mass levels in AD plasma. Given the role of brain sphingolipids in neuronal function, our findings provide new insights into the AD sphingolipidome and the potential use of metabolomic signatures as peripheral biomarkers.
Host Cell Transcriptome Profile during Wild-Type and Attenuated Dengue Virus Infection
October M. Sessions (equal contributor), Ying Tan (equal contributor), Kenneth C. Goh, Yujing Liu, Patrick Tan, Steve Rozen, Eng Eong Ooi
PLOS Neglected Tropical Diseases , 2013, DOI: 10.1371/journal.pntd.0002107
Abstract: Dengue viruses 1–4 (DENV1-4) rely heavily on the host cell machinery to complete their life cycle, while at the same time evading the host response that could restrict their replication efficiency. These requirements may account for much of the broad gene-level changes to the host transcriptome upon DENV infection. However, host gene function is also regulated through transcriptional start site (TSS) selection and post-transcriptional modification to the RNA that give rise to multiple gene isoforms. The roles these processes play in the host response to dengue infection have not been explored. In the present study, we utilized RNA sequencing (RNAseq) to identify novel transcript variations in response to infection with both a pathogenic strain of DENV1 and its attenuated derivative. RNAseq provides the information necessary to distinguish the various isoforms produced from a single gene and their splice variants. Our data indicate that there is an extensive amount of previously uncharacterized TSS and post-transcriptional modifications to host RNA over a wide range of pathways and host functions in response to DENV infection. Many of the differentially expressed genes identified in this study have previously been shown to be required for flavivirus propagation and/or interact with DENV gene products. We also show here that the human transcriptome response to an infection by wild-type DENV or its attenuated derivative differs significantly. This differential response to wild-type and attenuated DENV infection suggests that alternative processing events may be part of a previously uncharacterized innate immune response to viral infection that is in large part evaded by wild-type DENV.
Self-assembly of Brownian motor by reduction of its effective temperature
Alexander Feigel, Asaf Rozen
Physics , 2013
Abstract: Emergence, optimization and stability of a motor-like motion in a fluctuating environment are analyzed. The emergence of motion is shown to be a general phenomenon. A motor converges to the state with the minimum of effective temperature and with the corresponding minimum in the rate of conformation changes similarly as some stochastic processes converge to the states with minimum diffusion activity. This mechanism is important to bacterial foraging (chemotaxis). This work, therefore, raises an analogy between chemotaxis and the emergence of living-like systems. The implications include the deviation of stable natural or artificial machines from the minimum entropy production principle, with a novel self-assembly mechanism for the emergence of the first molecular motors and for mass fabrication of the future nanodevices.
Ex-Post Equilibrium and VCG Mechanisms
Rakefet Rozen, Rann Smorodinsky
Computer Science , 2012
Abstract: Consider an abstract social choice setting with incomplete information, where the number of alternatives is large. Albeit natural, implementing VCG mechanisms may not be feasible due to the prohibitive communication constraints. However, if players restrict attention to a subset of the alternatives, feasibility may be recovered. This paper characterizes the class of subsets which induce an ex-post equilibrium in the original game. It turns out that a crucial condition for such subsets to exist is the existence of a type-independent optimal social alternative, for each player. We further analyze the welfare implications of these restrictions. This work follows work by Holzman, Kfir-Dahav, Monderer and Tennenholtz (2004) and Holzman and Monderer (2004) where similar analysis is done for combinatorial auctions.
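For readers unfamiliar with VCG, the sketch below computes the welfare-maximizing alternative and the standard Clarke pivot payments, optionally over a restricted subset of alternatives. It illustrates only the mechanism itself and does not test the paper's ex-post equilibrium condition. The function name vcg, the player and alternative labels, and the valuation numbers are all hypothetical.

```python
def vcg(valuations, alternatives=None):
    """Run VCG with Clarke pivot payments.

    valuations: {player: {alternative: value}}
    alternatives: optional subset of alternatives to restrict attention to;
                  defaults to every alternative the first player values.
    Returns (chosen_alternative, {player: payment}).
    """
    if alternatives is None:
        alternatives = list(next(iter(valuations.values())))

    def welfare(alt, exclude=None):
        # Total reported value of `alt`, optionally leaving one player out.
        return sum(v[alt] for p, v in valuations.items() if p != exclude)

    chosen = max(alternatives, key=welfare)
    payments = {}
    for player in valuations:
        # Each player pays the welfare loss their presence imposes on the others.
        best_without = max(welfare(a, exclude=player) for a in alternatives)
        payments[player] = best_without - welfare(chosen, exclude=player)
    return chosen, payments


valuations = {
    "p1": {"a": 5, "b": 0},
    "p2": {"a": 0, "b": 3},
    "p3": {"a": 0, "b": 1},
}
full_result = vcg(valuations)
restricted_result = vcg(valuations, alternatives=["b"])
print(full_result)
print(restricted_result)
```

With both alternatives available, "a" is chosen and only p1 is pivotal. Restricting attention to the subset containing only "b" changes both the outcome and the payments, which is the kind of welfare effect the abstract says the paper analyzes.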
First somatic mutation of E2F1 in a critical DNA binding residue discovered in well-differentiated papillary mesothelioma of the peritoneum
Willie Yu, Waraporn Chan-On, Melissa Teo, Choon Ong, Ioana Cutcutache, George E Allen, Bernice Wong, Swe Myint, Kiat Lim, P Mathijs Voorhoeve, Steve Rozen, Khee Soo, Patrick Tan, Bin Teh
Genome Biology , 2011, DOI: 10.1186/gb-2011-12-9-r96
Abstract: WDPMP exome sequencing reveals the first somatic mutation of E2F1, R166H, to be identified in human cancer. The location is in the evolutionarily conserved DNA binding domain and computationally predicted to be mutated in the critical contact point between E2F1 and its DNA target. We show that the R166H mutation abrogates E2F1's DNA binding ability and is associated with reduced activation of E2F1 downstream target genes. Mutant E2F1 proteins are also observed in higher quantities when compared with wild-type E2F1 protein levels and the mutant protein's resistance to degradation was found to be the cause of its accumulation within mutant over-expressing cells. Cells over-expressing wild-type E2F1 show decreased proliferation compared to mutant over-expressing cells, but cell proliferation rates of mutant over-expressing cells were comparable to cells over-expressing the empty vector. The R166H mutation in E2F1 is shown to have a deleterious effect on its DNA binding ability as well as increasing its stability and subsequent accumulation in R166H mutant cells. Based on the results, two compatible theories can be formed: R166H mutation appears to allow for protein over-expression while minimizing the apoptotic consequence and the R166H mutation may behave similarly to SV40 large T antigen, inhibiting tumor suppressive functions of retinoblastoma protein 1. Mesothelioma is an uncommon neoplasm that develops from the mesothelium, the protective lining covering a majority of the body's internal organs, and is divided into four subtypes: pleural, peritoneum, pericardium and tunica vaginalis [1]. 
While malignant peritoneal mesothelioma (MPM) is an aggressive tumor mainly afflicting asbestos-exposed males in the age range of 50 to 60 years old [2], well-differentiated papillary mesothelioma of the peritoneum (WDPMP), a rare subtype of epithelioid mesothelioma [1] with fewer than 60 cases described in the literature [3], is generally considered to be a tumor of low malignant p
Massively Parallel Sequencing of Patients with Intellectual Disability, Congenital Anomalies and/or Autism Spectrum Disorders with a Targeted Gene Panel
Maggie Brett, John McPherson, Zhi Jiang Zang, Angeline Lai, Ee-Shien Tan, Ivy Ng, Lai-Choo Ong, Breana Cham, Patrick Tan, Steve Rozen, Ene-Choo Tan
PLOS ONE , 2014, DOI: 10.1371/journal.pone.0093409
Abstract: Developmental delay and/or intellectual disability (DD/ID) affects 1–3% of all children. At least half of these are thought to have a genetic etiology. Recent studies have shown that massively parallel sequencing (MPS) using a targeted gene panel is particularly suited for diagnostic testing for genetically heterogeneous conditions. We report on our experiences with using massively parallel sequencing of a targeted gene panel of 355 genes for investigating the genetic etiology of eight patients with a wide range of phenotypes including DD/ID, congenital anomalies and/or autism spectrum disorder. Targeted sequence enrichment was performed using the Agilent SureSelect Target Enrichment Kit and sequenced on the Illumina HiSeq2000 using paired-end reads. For all eight patients, 81–84% of the targeted regions achieved read depths of at least 20×, with average read depths overlapping targets ranging from 322× to 798×. Causative variants were successfully identified in two of the eight patients: a nonsense mutation in the ATRX gene and a canonical splice site mutation in the L1CAM gene. In a third patient, a canonical splice site variant in the USP9X gene could likely explain all or some of her clinical phenotypes. These results confirm the value of targeted MPS for investigating DD/ID in children for diagnostic purposes. However, targeted gene MPS was less likely to provide a genetic diagnosis for children whose phenotype includes autism.
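The two coverage figures quoted in this abstract (fraction of targets at 20× or more, and average read depth) can be computed along the following lines. This is a rough sketch that assumes a flat list of per-base depths over the targeted regions. The abstract does not say whether the authors summarized coverage per base or per region, and the function name and sample depths are hypothetical.

```python
def coverage_summary(depths, min_depth=20):
    """Return (fraction of positions with depth >= min_depth, mean depth).

    depths: read depth at each targeted position (assumed per-base input).
    """
    if not depths:
        raise ValueError("no target positions supplied")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths), sum(depths) / len(depths)


fraction_20x, mean_depth = coverage_summary([0, 10, 20, 30, 40])
print(fraction_20x, mean_depth)  # 0.6 20.0
```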
