Search Results: 1 - 10 of 100 matches
All listed articles are free for downloading (OA Articles)
Selecting normalization genes for small diagnostic microarrays
Jochen Jaeger, Rainer Spang
BMC Bioinformatics, 2006, DOI: 10.1186/1471-2105-7-388
Abstract: In this paper we point out the differences between normalizing large microarrays and small diagnostic microarrays. We suggest including additional normalization genes on the small diagnostic microarrays and propose two strategies for selecting them from genome-wide microarray studies. The first is a data-driven univariate selection of normalization genes. The second is multivariate and based on finding a balanced diagnostic signature. Finally, we compare both methods to standard normalization protocols known from large microarrays. Not including additional genes for normalization on small microarrays leads to a loss of diagnostic information. Using housekeeping genes from the literature for normalization fails for certain datasets. While a data-driven selection of additional normalization genes works well, the best results were obtained using a balanced signature. Several publications have suggested the use of cDNA microarrays for clinical diagnosis [1-4]. While today's microarrays can cover up to 50,000 genes, only a small percentage of them is needed for diagnosis. Most diagnostic microarray datasets can achieve optimal classification with no more than 5–50 discriminative genes [5-7]. This opens new possibilities for the design of small diagnostic microarrays used for gene-expression-based diagnosis. To design such disease-specific, small custom arrays, differential genes are identified from a large set of candidate genes using genome-wide expression profiling. Then, only these differential genes are put onto a small custom microarray [8]. Throughout this paper, we refer to diagnostic microarrays as small custom microarrays for diagnostic purposes, holding only a few genes, and to large microarrays as genome-wide gene expression microarrays, holding tens of thousands of genes. With the concept of diagnostic microarrays, new problems arise. A first important step in microarray analysis is normalization.
The overall intensity of microarrays can vary in a large dataset.
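The first, univariate strategy mentioned in the abstract can be sketched as a simple stability ranking. This is a hypothetical illustration, not Jaeger and Spang's published procedure: it assumes normalization genes are chosen as the genes with the lowest coefficient of variation across genome-wide training samples; the gene names and values are invented.

```python
# Hypothetical sketch: data-driven selection of normalization genes by
# ranking genes on their coefficient of variation (CV) across samples.
# The CV criterion and the toy data are assumptions for illustration.
from statistics import mean, pstdev

def select_normalization_genes(expression, n_genes=3):
    """expression: dict mapping gene -> list of log-expression values,
    one per sample. Returns the n_genes with the lowest CV."""
    def cv(values):
        m = mean(values)
        return pstdev(values) / abs(m) if m else float("inf")
    return sorted(expression, key=lambda g: cv(expression[g]))[:n_genes]

data = {
    "GAPDH": [10.1, 10.0, 10.2, 10.1],  # stable across samples
    "TP53":  [5.0, 9.0, 4.5, 8.8],      # differential: poor choice
    "ACTB":  [8.0, 8.1, 7.9, 8.0],      # stable across samples
    "MYC":   [3.0, 7.5, 3.2, 7.9],      # differential: poor choice
}
print(select_normalization_genes(data, n_genes=2))  # → ['GAPDH', 'ACTB']
```

The multivariate, balanced-signature strategy the authors favor would instead constrain the whole signature so that selection and normalization are optimized jointly; that is beyond this sketch.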
Normalization of one-channel microarrays for identification of organisms
Astrid Zierer, Achim Reineke, Denja Drutschmann, Dietmar Blohm
GMS Medizinische Informatik, Biometrie und Epidemiologie, 2007
Abstract: Microarrays are widely used in gene expression analysis, but they can also be applied to further areas, such as the identification of organisms. To interpret and compare the results of microarray experiments it is necessary to standardize the data; in this context, standardization is referred to as normalization. We present data derived from a microarray experiment aiming to identify different subtypes of the hepatitis C virus. Most of the methods developed to normalize microarray data are focused on gene expression analysis. Their use for the identification of organisms is restricted and needs adaptation to the special requirements. Based on our data setting, we present several possibilities for modifying the existing methods to deal with the specific conditions.
"Harshlighting" small blemishes on microarrays
Mayte Suárez-Fariñas, Asifa Haider, Knut M Wittkowski
BMC Bioinformatics, 2005, DOI: 10.1186/1471-2105-6-65
Abstract: We present a method that harnesses the statistical power provided by having several HDONAs available, obtained under similar conditions except for the experimental factor. This method "harshlights" blemishes and renders them evident. We find empirically that about 25% of our chips are blemished, and we analyze the impact of masking them on screening for differentially expressed genes. Experiments attempting to assess subtle expression changes should be carefully screened for blemishes on the chips. The proposed method provides investigators with a novel, robust approach to improve the sensitivity of microarray analyses. By utilizing topological information to identify and mask blemishes prior to model-based analyses, the method prevents artefacts from confounding the process of background correction, normalization, and summarization. Analysis of hybridized microarrays starts with scanning the fluorescent image. For high-density oligonucleotide arrays (HDONAs) such as Affymetrix GeneChip® oligonucleotide (Affy) arrays, the focus of this paper, each scanned image is stored pixel-by-pixel in a 'DAT' file. As the first step in measuring the intensity of the hybridization signal, a grid is overlaid, the image is segmented into spots or features, and the pixel intensities within each of these are summarized as a probe intensity estimate (see reviews [1] and [2] for cDNA chips). The probe-level intensity estimates are stored in a 'CEL' file. Each gene is represented by several pairs of probes, each pair comprising a 'perfect match' probe representing a characteristic sequence and a 'mismatch' probe that is identical except for the Watson-Crick complement at the center. Expression of a gene is estimated from such a probe set by applying algorithms for background correction, normalization, and summarization. The quality of data scanned from a microarray is affected by a plethora of potential confounders, which may act during printing/manufacturing, hybridization, washing, and reading.
Each chip contains a number of
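The core idea above, using several similarly-treated chips to make blemishes evident, can be sketched as follows. This is a simplified stand-in for the published method, not the Harshlight algorithm itself: it assumes a blemish appears as a large outlier against a per-probe median reference, with a robust (MAD-based) scale; thresholds and data are invented.

```python
# Illustrative sketch of blemish detection across replicate chips:
# build a per-probe median reference image, then flag probes on each
# chip whose residual from the reference is anomalously large relative
# to that chip's robust residual scale. All numbers are invented.
from statistics import median

def find_blemishes(chips, threshold=2.0):
    """chips: list of equal-length lists of probe intensities (one per
    chip). Returns, per chip, the set of probe indices flagged."""
    n = len(chips[0])
    reference = [median(chip[i] for chip in chips) for i in range(n)]
    flagged = []
    for chip in chips:
        residuals = [chip[i] - reference[i] for i in range(n)]
        # median absolute deviation as a robust scale (fallback 1.0)
        spread = median(abs(r) for r in residuals) or 1.0
        flagged.append({i for i, r in enumerate(residuals)
                        if abs(r) > threshold * spread / 0.6745})
    return flagged

chips = [
    [10, 12, 11, 10, 13],
    [10, 12, 50, 10, 13],   # probe 2 is blemished on this chip
    [11, 11, 11, 10, 12],
]
print(find_blemishes(chips))  # → [set(), {2}, set()]
```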
Improving the scaling normalization for high-density oligonucleotide GeneChip expression microarrays
Chao Lu
BMC Bioinformatics, 2004, DOI: 10.1186/1471-2105-5-103
Abstract: Among the 76 U34A GeneChip experiments, the total signals on each array showed 25.8% variation in terms of the coefficient of variation, although all microarrays were hybridized with the same amount of biotin-labeled cRNA. The 2% of the probe sets with the highest signals, which are normally excluded from SF calculation, accounted for 34% to 54% of the total signal (40.7% ± 4.4%, mean ± sd). In comparison with normalization factors obtained from the median signal or from the mean of the log-transformed signal, SF showed the greatest variation, while the normalization factors obtained from log-transformed signals showed the least. Eliminating 40% of the signal data during SF calculation failed to show any benefit, and normalization factors obtained with log-transformed signals performed best. Thus, it is suggested to use the mean of the logarithm-transformed data, rather than the arithmetic mean of signals, for normalization in GeneChip gene expression microarrays. The high-density oligonucleotide microarray, also known as GeneChip®, made by Affymetrix Inc. (Santa Clara, CA), has been widely used in both academic institutions and industrial companies, and is considered the "standard" gene expression microarray among several platforms. A single GeneChip® can hold more than 50,000 probe sets, covering every gene in the human genome. A probe set is a collection of probe pairs that interrogates the same sequence, or set of sequences, and typically contains 11 probe pairs of 25-mer oligonucleotides [1-3]. Each pair contains the complementary sequence to the gene of interest, the so-called perfect match (PM), and a specificity control, called the mismatch (MM) [3]. Gene expression level is obtained from the calculation of hybridization intensity to the probe pairs and is referred to as the "signal" [4-10].
The normalization method used in GeneChip software is called scaling and is defined as an adjustment of the average signal value of all arrays to a common value, the target signal.
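The contrast the abstract draws between the scaling factor (SF) and a log-mean factor can be sketched numerically. This is a hedged illustration: the trimmed-mean SF formula, the target value of 500, and the signals are assumptions chosen to show why a bright outlier inflates the arithmetic mean but barely moves the geometric mean.

```python
# Sketch comparing two normalization factors: an Affymetrix-style
# scaling factor (trimmed arithmetic mean) versus a factor based on the
# mean of log-transformed signals (the geometric mean). Target value,
# trim fraction and signal values are illustrative assumptions.
import math

def scaling_factor(signals, target=500.0, trim=0.02):
    """Target divided by a trimmed arithmetic mean (top and bottom
    `trim` fraction excluded; with this short toy list the 2% trim
    removes nothing)."""
    s = sorted(signals)
    k = int(len(s) * trim)
    trimmed = s[k:len(s) - k] if k else s
    return target / (sum(trimmed) / len(trimmed))

def log_mean_factor(signals, target=500.0):
    """Target divided by exp(mean of log signals) = geometric mean."""
    log_mean = sum(math.log(x) for x in signals) / len(signals)
    return target / math.exp(log_mean)

# One very bright value dominates the arithmetic mean but not the
# geometric mean, so the two factors diverge sharply.
signals = [100.0, 200.0, 400.0, 800.0, 10000.0]
print(round(scaling_factor(signals), 3))   # → 0.217
print(round(log_mean_factor(signals), 3))  # → 0.866
```

The outlier drags the arithmetic mean to 2300, so the SF over-corrects every array relative to the log-mean factor, which is the instability the abstract reports.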
Variation-preserving normalization unveils blind spots in gene expression profiling
Carlos P. Roca, Susana I. L. Gomes, Mónica J. B. Amorim, Janeck J. Scott-Fordsmand
Quantitative Biology, 2015
Abstract: RNA-Seq and gene expression microarrays provide comprehensive profiles of gene activity, but lack of reproducibility has hindered their application. A key challenge in the data analysis is the normalization of gene expression levels, which is currently performed following an implicit assumption that most genes are not differentially expressed. Here, we present a mathematical approach to normalization that makes no assumption of this sort. We have found that variation in gene expression is much greater than currently believed, and it can be measured with available technologies. Our results also explain, at least partially, the problems encountered in transcriptomics studies. We expect this improvement in detection to help efforts to realize the full potential of gene expression profiling, especially in analyses of cellular processes involving complex modulations of gene expression.
"Per cell" normalization method for mRNA measurement by quantitative PCR and microarrays
Jun Kanno, Ken-ichi Aisaki, Katsuhide Igarashi, Noriyuki Nakatsu, Atsushi Ono, Yukio Kodama, Taku Nagao
BMC Genomics, 2006, DOI: 10.1186/1471-2164-7-64
Abstract: Here we report a method (designated the "Percellome" method) for normalizing the expression of mRNA values in biological samples. It provides a "per cell" readout in mRNA copy number and is applicable to both quantitative PCR (Q-PCR) and DNA microarray studies. The genomic DNA content of each sample homogenate was measured from a small aliquot to derive the number of cells in the sample. A cocktail of five external spike RNAs admixed in a dose-graded manner (dose-graded spike cocktail; GSC) was prepared and added to each homogenate in proportion to its DNA content. In this way, the spike mRNAs represented absolute copy numbers per cell in the sample. The signals from the five spike mRNAs were used as a dose-response standard curve for each sample, enabling us to convert all measured signals to copy numbers per cell in an expression-profile-independent manner. A series of samples was measured by Q-PCR and Affymetrix GeneChip® microarrays using this Percellome method, and the results showed up to 90% concordance. Percellome data can be compared directly among samples, among different studies, and between different platforms, without further normalization. Therefore, Percellome normalization can serve as a standard method for exchanging and comparing data across different platforms and among different laboratories. Normalization of gene expression data between different samples generated in the same laboratory using a single platform, and/or generated in different geographical regions using multiple platforms, is central to the establishment of a reliable reference database for toxicogenomics and pharmacogenomics. Transforming expression data into a "per cell" database is an effective way of normalizing expression data across samples and platforms. However, transcriptome data from the quantitative PCR (Q-PCR) and DNA microarray analyses currently deposited in databases are related to a fixed amount of RNA collected per sample.
Variations in RNA yield across s
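The standard-curve step of the Percellome idea can be sketched as a log-log fit through the graded spike signals, followed by inversion to convert any measured signal into copies per cell. This is a minimal sketch under stated assumptions, not the published implementation: spike doses, signals, and the linear log-log model are invented for illustration.

```python
# Minimal sketch of a Percellome-style conversion: fit a least-squares
# line log(copies) = a*log(signal) + b through five graded spike-ins,
# then map any signal to mRNA copies per cell. Numbers are invented.
import math

def fit_standard_curve(spike_signals, spike_copies):
    """Least-squares fit of log(copies) against log(signal)."""
    xs = [math.log(s) for s in spike_signals]
    ys = [math.log(c) for c in spike_copies]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

def copies_per_cell(signal, a, b):
    return math.exp(a * math.log(signal) + b)

# Five graded spikes whose signal is proportional to copies per cell.
spike_copies  = [1.0, 10.0, 100.0, 1000.0, 10000.0]
spike_signals = [20.0, 200.0, 2000.0, 20000.0, 200000.0]
a, b = fit_standard_curve(spike_signals, spike_copies)
print(round(copies_per_cell(4000.0, a, b), 1))  # → 200.0
```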
Evaluation of normalization methods for two-channel microRNA microarrays
Yingdong Zhao, Ena Wang, Hui Liu, Melissa Rotunno, Jill Koshiol, Francesco M Marincola, Maria Landi, Lisa M McShane
Journal of Translational Medicine, 2010, DOI: 10.1186/1479-5876-8-69
Abstract: We evaluated many different normalization methods for data generated with a custom-made two-channel miR microarray, using two data sets that have technical replicates from several different cell lines. The impact of each normalization method was examined on both the within-miR error variance (between replicate arrays) and the between-miR variance, to determine which normalization methods minimized differences between replicate samples while preserving differences between biologically distinct miRs. Lowess normalization generally did not perform as well as the other methods, and quantile normalization based on an invariant set showed the best performance in many cases, unless restricted to a very small invariant set. Global median and global mean methods performed reasonably well in both data sets and have the advantage of computational simplicity. Researchers need to consider carefully which assumptions underlying the different normalization methods appear most reasonable for their experimental setting, and possibly consider more than one normalization approach to determine the sensitivity of their results to the normalization method used. MicroRNAs (miRs) are a class of short, highly conserved non-coding RNAs known to play important roles in numerous developmental processes. MiRs regulate gene expression through incomplete base-pairing to a complementary sequence in the 3' untranslated region (3' UTR) of a target mRNA, resulting in translational repression and, to a lesser extent, accelerated turnover of the target transcript [1]. Recently, the dysregulation of miRs has been linked to cancer initiation and progression [2], indicating that miRs may play roles as tumor suppressor genes or oncogenes [3]. There is also mounting evidence that miRs are important in developmental timing [4,5], cell differentiation [6], cell cycle control and apoptosis [7].
The involvement of miRs in those biological functions suggests their intrinsic roles in maintaining homeostasis or contributing to patholo
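The global median method that the abstract singles out for its computational simplicity amounts to centering each array's log-ratios at zero. A minimal sketch, with invented data, assuming the two-channel measurements have already been log-transformed:

```python
# Sketch of global median normalization for a two-channel array:
# subtract the array's median log-ratio so replicate arrays share a
# common center. The log-ratio values are invented.
from statistics import median

def global_median_normalize(log_ratios):
    """Center an array's log-ratios at their median."""
    m = median(log_ratios)
    return [x - m for x in log_ratios]

array1 = [0.5, 1.5, 0.7, 2.5, 0.9]
print([round(v, 1) for v in global_median_normalize(array1)])
# → [-0.4, 0.6, -0.2, 1.6, 0.0]
```

Global mean normalization is identical except with `mean` in place of `median`; the median is the more robust choice when a few miRs are strongly differential.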
Novel design and controls for focused DNA microarrays: applications in quality assurance/control and normalization for the Health Canada ToxArray™
Carole L Yauk, Andrew Williams, Sherri Boucher, Lynn M Berndt, Gu Zhou, Jenny L Zheng, Andrea Rowan-Carroll, Hongyan Dong, Iain B Lambert, George R Douglas, Craig L Parfett
BMC Genomics, 2006, DOI: 10.1186/1471-2164-7-266
Abstract: An EC dilution series was developed that involves spiking in a single concentration of the A. thaliana chlorophyll synthase gene to hybridize against spotted dilutions (0.000015 to 100 μM) of a single complementary oligonucleotide representing the gene. The EC series is printed in duplicate within each subgrid of the microarray and covers the full range of signal intensities from background to saturation. The design and placement of the series allow for QA examination of frequently encountered problems in hybridization (e.g., uneven hybridizations) and printing (e.g., cross-spot contamination). Additionally, we demonstrate that the series can be integrated with a LOWESS normalization to improve the detection of differential gene expression (improved sensitivity and predictivity) over LOWESS normalization on its own. The quality of microarray experiments and the normalization methods used affect the ability to measure accurate changes in gene expression. Novel methods are required for normalization of small focused microarrays, and for incorporating measures of performance and quality. We demonstrate that dilution of oligonucleotides on the microarray itself provides an innovative approach, allowing the full dynamic range of the scanner to be covered with a single gene spike-in. The dilution series can be used in a composite normalization to improve detection of differential gene expression and to provide quality control measures. High-density genomic tools, such as DNA microarrays, provide an important opportunity to study the global response of genomes to particular stressors or conditions. Unfortunately, commercially available DNA microarrays exhibit several disadvantages when applied to toxicological investigations. For example, the lack of representation of toxicologically relevant genes on commercial microarrays is an important problem.
Statistical issues arising from existing commercial array designs present additional limitations for toxicogenom
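The way a spotted dilution series can anchor a composite normalization can be sketched as follows. This is a hedged illustration, not the paper's method: because the control spots span the intensity range, an intensity-dependent bias curve fitted through them can be subtracted from every spot; a straight-line fit in log-intensity stands in here for the LOWESS fit used in the paper, and all numbers are invented.

```python
# Sketch of control-based composite normalization: fit the log-ratio
# bias of the spike-in dilution series as a function of log-intensity,
# then subtract that fitted bias from every spot on the array.
import math

def fit_bias(control_intensity, control_log_ratio):
    """Least-squares line: bias = a*log(intensity) + b, fitted on the
    control dilution series only."""
    xs = [math.log(i) for i in control_intensity]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(control_log_ratio) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, control_log_ratio)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def normalize(intensity, log_ratio, a, b):
    """Subtract the fitted intensity-dependent bias from each spot."""
    return [m - (a * math.log(i) + b) for i, m in zip(intensity, log_ratio)]

# Control series: the bias drifts downward with intensity (invented).
ctrl_int = [10.0, 100.0, 1000.0, 10000.0]
ctrl_lr  = [0.4, 0.3, 0.2, 0.1]
a, b = fit_bias(ctrl_int, ctrl_lr)
print([round(v, 2) for v in normalize([100.0, 10000.0], [0.8, 0.1], a, b)])
```

The design advantage claimed in the abstract is exactly that a single spiked gene, spotted at graded dilutions, supplies control points across the scanner's full dynamic range for such a fit.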
G-spots cause incorrect expression measurement in Affymetrix microarrays
Graham JG Upton, William B Langdon, Andrew P Harrison
BMC Genomics, 2008, DOI: 10.1186/1471-2164-9-613
Abstract: We have tested this expectation by examining the correlation coefficients between pairs of probes using the data on thousands of arrays available in the NCBI Gene Expression Omnibus (GEO) repository. We confirm the finding that G-spot probes are poorly correlated with the others in their probesets and reveal that, by contrast, they are highly correlated with one another. We demonstrate that the correlation is most marked when the G-spot is at the 5' end of the probe. Since these G-spot probes generally show little correlation with the other members of their probesets, they are not fit for purpose and their values should be excluded when calculating gene expression values. This has serious implications, since more than 40% of the probesets in the HG-U133A GeneChip contain at least one such probe. Future array designs should avoid these untrustworthy probes. Microarrays are commonly used to measure gene expression. One of the most popular microarray platforms is the Affymetrix GeneChip. In GeneChip arrays, probe sequences with a nominal length of 25 bases are created by photolithography. The probes are arranged in pairs: a so-called perfect match (PM) probe and a mismatch (MM) probe that is identical to the PM probe except that the 13th base is the complement of that in the PM probe. Each pair of probes belongs to a probe set (typically of 11 or 16 probe pairs), with each probe set intended to provide information concerning the expression of a single gene. For some genes there may be more than one dedicated probe set. There are a number of alternative software tools for calculating a single measure of gene expression for a probe set, e.g. MAS5 [1], dChip [2], RMA [3] and GCRMA [4]. To calculate the value of the expression measure, all the probes (or at least all the PM probes) in a probe set are used.
However, if there are probes that are known to be liable to provide misleading information, then these should be excluded from the analysis so as to give more reliable expression measures.
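Screening a probe set for the G-spot probes described above can be sketched with a simple sequence filter. This is an illustrative sketch, not the authors' analysis pipeline: it assumes the problematic case is a run of four or more guanines at the 5' end of the 25-mer, which is the configuration the abstract reports as most strongly cross-correlated; the probe sequences are invented.

```python
# Sketch: exclude probes whose 25-mer begins with a run of guanines
# (a "G-spot" at the 5' end). Run length and sequences are invented.
def is_g_spot(probe_sequence, run_length=4):
    """True if the probe starts with run_length or more guanines."""
    return probe_sequence.upper().startswith("G" * run_length)

def filter_probeset(probes):
    """Keep only probes free of a 5' G-run."""
    return [p for p in probes if not is_g_spot(p)]

probeset = [
    "GGGGATCCGTAACGTTAGCCATGCA",  # 5' G-run: excluded
    "ATGGCCTAACGTGGGGTACCATGCA",  # internal Gs only: kept
    "TTACGGATCCGTAACGTTAGCCATG",  # kept
]
print(len(filter_probeset(probeset)))  # → 2
```

In a real pipeline the surviving probes would then be passed to the summarization algorithm (e.g. RMA) in place of the full probe set.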
Classification of microarrays; synergistic effects between normalization, gene selection and machine learning
Jenny Önskog, Eva Freyhult, Mattias Landfors, Patrik Rydén, Torgeir R Hvidsten
BMC Bioinformatics, 2011, DOI: 10.1186/1471-2105-12-390
Abstract: In this study, we used seven previously published cancer-related microarray data sets to compare the effects on classification performance of five normalization methods, three gene selection methods with 21 different numbers of selected genes, and eight machine learning methods. Performance, in terms of error rate, was rigorously estimated by repeatedly employing a double cross-validation approach. Since performance varies greatly between data sets, we devised an analysis method that first compares methods within individual data sets and then visualizes the comparisons across data sets. We discovered both well-performing individual methods and synergies between different methods. Support Vector Machines with a radial basis kernel, linear kernel or polynomial kernel of degree 2 all performed consistently well across data sets. We show that there is a synergistic relationship between these methods and gene selection based on the t-test together with the selection of a relatively high number of genes. Also, we find that these methods benefit significantly from using normalized data, although it is hard to draw general conclusions about the relative performance of different normalization procedures. Machine learning methods have found many applications in gene expression data analysis, and are commonly used to classify patient samples into classes, corresponding to, for example, cancer sub-types, based on gene expression profiles. Supervised learning is a powerful tool in these studies, since it can be used both to establish whether the classes of interest can be predicted from expression profiles and to provide an explanation as to which genes underlie the differences between classes. The expression data in such studies typically undergo an analysis pipeline in which the most important steps are data normalization, gene selection and machine learning.
Although there are several comparative studies of methods for normalization, gene selection and machine learning, none have studied how all three steps interact.
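The double cross-validation used above to estimate error rates can be sketched as a nested loop: an inner cross-validation, run on the training part only, picks the number of genes, while the outer loop measures error on samples never seen during selection. This is a simplified stand-in under stated assumptions, not the study's pipeline: the classifier (nearest centroid), the gene ranking (absolute class-mean difference) and the synthetic data are all invented for illustration.

```python
# Hedged sketch of double (nested) cross-validation: the inner CV
# chooses how many genes to select; the outer CV reports an unbiased
# error rate. Classifier and ranking are simplified stand-ins.
import random

def rank_genes(X, y):
    """Rank gene indices by absolute difference of class means."""
    idx0 = [i for i, c in enumerate(y) if c == 0]
    idx1 = [i for i, c in enumerate(y) if c == 1]
    def score(g):
        m0 = sum(X[i][g] for i in idx0) / len(idx0)
        m1 = sum(X[i][g] for i in idx1) / len(idx1)
        return abs(m0 - m1)
    return sorted(range(len(X[0])), key=score, reverse=True)

def centroid_predict(Xtr, ytr, Xte, genes):
    """Nearest-centroid classification on the selected genes."""
    def centroid(cls):
        rows = [Xtr[i] for i, c in enumerate(ytr) if c == cls]
        return [sum(r[g] for r in rows) / len(rows) for g in genes]
    c0, c1 = centroid(0), centroid(1)
    preds = []
    for x in Xte:
        d0 = sum((x[g] - v) ** 2 for g, v in zip(genes, c0))
        d1 = sum((x[g] - v) ** 2 for g, v in zip(genes, c1))
        preds.append(0 if d0 <= d1 else 1)
    return preds

def cv_error(X, y, n_genes, folds=3):
    """Plain CV error for a fixed number of selected genes."""
    err = n = 0
    for f in range(folds):
        te = list(range(f, len(X), folds))
        tr = [i for i in range(len(X)) if i not in te]
        genes = rank_genes([X[i] for i in tr], [y[i] for i in tr])[:n_genes]
        preds = centroid_predict([X[i] for i in tr], [y[i] for i in tr],
                                 [X[i] for i in te], genes)
        err += sum(p != y[i] for p, i in zip(preds, te))
        n += len(te)
    return err / n

def double_cv(X, y, candidate_sizes=(1, 2, 4), folds=3):
    """Outer CV for error estimation; inner CV (on training data only)
    picks the number of genes, so selection never sees the test fold."""
    err = n = 0
    for f in range(folds):
        te = list(range(f, len(X), folds))
        tr = [i for i in range(len(X)) if i not in te]
        Xtr, ytr = [X[i] for i in tr], [y[i] for i in tr]
        best = min(candidate_sizes, key=lambda k: cv_error(Xtr, ytr, k))
        genes = rank_genes(Xtr, ytr)[:best]
        preds = centroid_predict(Xtr, ytr, [X[i] for i in te], genes)
        err += sum(p != y[i] for p, i in zip(preds, te))
        n += len(te)
    return err / n

random.seed(0)
# Synthetic data: classes separated on gene 0; genes 1-4 are noise.
X = [[(0 if i % 2 else 3) + random.gauss(0, 0.5)] +
     [random.gauss(0, 1) for _ in range(4)] for i in range(18)]
y = [0 if i % 2 else 1 for i in range(18)]
print(double_cv(X, y))
```

Selecting genes outside the cross-validation loop would let information from the test fold leak into the signature, which is the selection bias double cross-validation is designed to avoid.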
Copyright © 2008-2017 Open Access Library. All rights reserved.