
MAnorm: a robust model for quantitative comparison of ChIP-Seq data sets

DOI: 10.1186/gb-2012-13-3-r16



Chromatin immunoprecipitation followed by massively parallel DNA sequencing (ChIP-Seq) has become the preferred method to determine genome-wide binding patterns of transcription factors and other chromatin-associated proteins [1]. With the rapid accumulation of ChIP-Seq data, comparison of multiple ChIP-Seq data sets is increasingly becoming critical for addressing important biological questions. For example, comparison of biological replicates is commonly used to find robust binding sites, and the identification of sites that are differentially bound by chromatin-associated proteins in different cellular contexts is important for elucidating underlying mechanisms of cell type-specific regulation. Although ChIP-Seq data generally exhibit high signal-to-background noise (S/N) ratios compared to ChIP-on-chip datasets, there are still significant challenges in data analysis due to variation in sample preparation and errors introduced in sequencing [1].

Several methods have been proposed for finding ChIP-enriched regions in a ChIP-Seq sample compared to a suitable negative control (for example, mock or non-specific immunoprecipitation). These involve fitting a model derived from negative control and/or sample low read intensity (background) regions, and then applying this model to identify ChIP-enriched regions (peaks) [2-4]. However, few methods have been proposed for comparison of ChIP-Seq samples. The simplest approach classifies the peaks from each sample as either common or unique, based on whether or not the peak overlaps with peaks in other samples [5-10]. Although this method can identify general relationships between peak sets from different samples, the results are highly dependent on the cutoff used in peak calling, which is difficult to select in a completely objective manner. Moreover, common peaks may show differential binding between the samples being compared, while other peaks may be identified as unique to one sample simply because they fall below an ar
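
The overlap-based classification described above can be illustrated with a minimal Python sketch. It assumes peaks are given as (chromosome, start, end) tuples with half-open coordinates; the names `Peak`, `overlaps`, and `classify_peaks`, as well as the toy coordinates, are hypothetical and are not taken from the paper or from any peak-calling tool.

```python
from typing import List, Tuple

# Hypothetical peak representation: (chromosome, start, end), half-open interval.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_peaks(sample: List[Peak], other: List[Peak]) -> Tuple[List[Peak], List[Peak]]:
    """Split the peaks of `sample` into those overlapping any peak in `other`
    (common) and those overlapping none (unique)."""
    common, unique = [], []
    for peak in sample:
        if any(overlaps(peak, q) for q in other):
            common.append(peak)
        else:
            unique.append(peak)
    return common, unique


# Toy peak lists, invented for illustration only.
peaks_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
peaks_b = [("chr1", 150, 250), ("chr2", 400, 500)]

common_a, unique_a = classify_peaks(peaks_a, peaks_b)
```

Because the classification only asks whether a called peak exists in the other sample, it depends entirely on which peaks pass the calling cutoff, and it carries no information about how strongly a common peak is bound in each sample.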

