Normalization of Illumina Infinium whole-genome SNP data improves copy number estimates and allelic intensity ratios

DOI: 10.1186/1471-2105-9-409

We demonstrate an asymmetry in the detection of the two alleles for each SNP, which deleteriously influences both allelic proportions and copy number estimates. The asymmetry is caused by a remaining bias between the two dyes used in the Infinium II assay after using the normalization method in Illumina's proprietary software (BeadStudio). We propose a quantile normalization strategy for correction of this dye bias. We tested the normalization strategy using 535 individual hybridizations from 10 data sets from the analysis of cancer genomes and normal blood samples generated on Illumina Infinium II 300 k version 1 and 2, 370 k and 550 k BeadChips. We show that the proposed normalization strategy successfully removes asymmetry in estimates of both allelic proportions and copy numbers. Additionally, the normalization strategy reduces the technical variation for copy number estimates while retaining the response to copy number alterations.

The proposed normalization strategy represents a valuable tool that improves the quality of data obtained from Illumina Infinium arrays, in particular when used for LOH and copy number variation studies.

Genomic copy number alterations (CNA) and allelic imbalances are common events in the development of cancer and certain genetic disorders [1,2]. The introduction of whole genome genotyping (WGG) arrays based on single nucleotide polymorphism (SNP) genotyping [3,4] allows for combined DNA copy number (SNP-CGH) and loss-of-heterozygosity (LOH) analysis at high resolution [5]. Currently, two major SNP array platforms are in use, Affymetrix GeneChip arrays [6] and Illumina BeadChips [7]. The Infinium assay for Illumina BeadChips is based on allele-specific hybridization coupled with primer extension of genomic DNA using primers directly surrounding the SNP on randomly ordered bead arrays [4]. The Infinium assay has been further developed into allele-specific single base extension using two color labeling with the Cy3 and Cy5 fluorescent dy


