%0 Journal Article
%T Correction of scaling mismatches in oligonucleotide microarray data
%A Martino Barenco
%A Jaroslav Stark
%A Daniel Brewer
%A Daniela Tomescu
%A Robin Callard
%A Michael Hubank
%J BMC Bioinformatics
%D 2006
%I BioMed Central
%R 10.1186/1471-2105-7-251
%X We explain how scaling mismatches occur in data summarized by the popular MAS5 (GCOS; Affymetrix) algorithm, and propose a simple recursive algorithm to correct them. Its principle is to identify a set of constant genes and to use this set to rescale the microarray signals. We study the properties of the algorithm using artificially generated data and apply it to experimental data. We show that the set of constant genes it generates can be used to rescale data from other experiments, provided that the underlying system is similar to the original. We also demonstrate, using a simple example, that the method can successfully correct existing imbalances in the data. The set of constant genes obtained for a given experiment can be applied to other experiments, provided the systems studied are sufficiently similar. This type of rescaling is especially relevant in systems biology applications using microarray data. Gene expression profiling using microarrays has become a popular technique in modern biochemical research. One of the commonest microarray platforms in use is the high-density oligonucleotide array introduced by Affymetrix (Santa Clara, CA). In the Affymetrix system, biotinylated cRNA generated from the sample of interest is hybridised to the array and detected using fluorescently-labelled streptavidin. A number of different expression summary algorithms are available to derive the concentration of each transcript from the intensity of fluorescence. These include MAS5 (Affymetrix) [1], RMA [2], MBEI [3,4]. It is important to use the most accurate and precise methods for calculating gene expression levels from microarray data. MAS5, RMA and MBEI offer different solutions to this problem.
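The abstract states only the principle of the correction: find a set of constant genes, then use that set to rescale the signals. The sketch below is one possible reading of that principle, not the authors' published algorithm. The function names, the `keep_fraction` and `max_iter` parameters, the use of mean log-signal offsets, and the choice of keeping the genes with the smallest spread across arrays are all assumptions made for illustration.

```python
import math
from statistics import fmean, pstdev


def array_offsets(log_s, gene_idx):
    """Per-array log offset estimated from the given genes, centred on zero."""
    n_arrays = len(log_s[0])
    col_means = [fmean([log_s[g][j] for g in gene_idx]) for j in range(n_arrays)]
    centre = fmean(col_means)
    return [m - centre for m in col_means]


def rescale_with_constant_genes(signals, keep_fraction=0.5, max_iter=50):
    """Hypothetical iterative rescaling; signals is a genes x arrays table
    of positive values. Returns (rescaled, factors, constant_gene_indices)."""
    log_s = [[math.log(v) for v in row] for row in signals]
    n_genes = len(log_s)
    n_keep = max(1, round(keep_fraction * n_genes))
    constant = list(range(n_genes))  # assumption: start from all genes

    for _ in range(max_iter):
        # Estimate array offsets from the current constant set.
        offsets = array_offsets(log_s, constant)
        # Spread of each gene across arrays after removing the offsets.
        spread = [pstdev([x - o for x, o in zip(row, offsets)]) for row in log_s]
        # Keep the genes that vary least as the next constant set.
        ranked = sorted(range(n_genes), key=spread.__getitem__)
        new_constant = sorted(ranked[:n_keep])
        if new_constant == constant:
            break  # the constant set no longer changes
        constant = new_constant

    # Final offsets come from the final constant set.
    offsets = array_offsets(log_s, constant)
    factors = [math.exp(o) for o in offsets]
    rescaled = [[v / f for v, f in zip(row, factors)] for row in signals]
    return rescaled, factors, constant
```

Under these assumptions, the returned `constant` indices could be reused to compute offsets for arrays from a similar experiment by calling `array_offsets` on that experiment's log signals, which mirrors the reuse of the constant set described in the abstract.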
On Affymetrix arrays, transcripts are represented by multiple (typically 11) pairs of 25-mer oligonucleotides termed a probeset. One of each pair is a perfect match (PM) for the target and the other is a mismatch control (MM). In MAS5, signal values from MM oligonucleotides are subt
%U http://www.biomedcentral.com/1471-2105/7/251