|
BMC Bioinformatics 2010
Quantized correlation coefficient for measuring reproducibility of ChIP-chip dataAbstract: We develop the Quantized correlation coefficient (QCC) that is much less dependent on the amount of signal. This involves discretization of data into set of quantiles (quantization), a merging procedure to group the background probes, and recalculation of the Pearson correlation coefficient. This procedure reduces the influence of the background noise on the statistic, which then properly focuses more on the reproducibility of the signal. The performance of this procedure is tested in both simulated and real ChIP-chip data. For replicates with different levels of enrichment over background and coverage, we find that QCC reflects reproducibility more accurately and is more robust than the standard Pearson or Spearman correlation coefficients. The quantization and the merging procedure can also suggest a proper quantile threshold for separating signal from background for further analysis.To measure reproducibility of ChIP-chip data correctly, a correlation coefficient that is robust to the amount of signal present should be used. QCC is one such measure. The QCC statistic can also be applied in a variety of other contexts for measuring reproducibility, including analysis of array CGH data for DNA copy number and gene expression data.Chromatin immunoprecipitation on tiling arrays (ChIP-chip) has been widely used to study genome-wide binding sites of transcription factors [1-5] and other protein complexes [6,7], as well as histone modifications [8]. To ensure data quality, ChIP-chip experiments are generally performed in replicates (two or three), and their reproducibility is used to determine whether the data are of sufficient quality and whether further experiments or validations are necessary. A number of factors impact the reproducibility of ChIP-chip data, including uniformity of cross-linking and sonication, specificity of the antibody, efficiency of chromatin immunoprecipitation, quality of the probes, and biological variability.To assess the reproducibility of C
|