Mining gene expression data by interpreting principal components

DOI: 10.1186/1471-2105-7-194


Abstract:

We present a method for automatically identifying candidate sets of biologically relevant genes using a combination of principal components analysis and information-theoretic metrics. To enable easy use of our methods, we have developed a data analysis package that facilitates visualization and subsequent data mining of the independent sources of significant variation present in gene microarray expression datasets (or in any other similarly structured high-dimensional dataset). We applied these tools to two public datasets and highlight the sets of genes most affected by specific subsets of conditions (e.g. tissues, treatments, samples). Statistically significant associations for the highlighted gene sets were shown via global analysis for Gene Ontology term enrichment. Together with covariate associations, the tool provides a basis for building testable hypotheses about the biological or experimental causes of observed variation.

We provide an unsupervised data mining technique for diverse microarray expression datasets that is distinct from the major methods now in routine use. In test uses, this method, based on publicly available gene annotations, appears to identify numerous sets of biologically relevant genes. It has proven especially valuable where there are many diverse conditions (tens to hundreds of different tissues or cell types), a situation in which many clustering and ordering algorithms become problematic. This approach also shows promise in other topic domains such as multi-spectral imaging datasets.

Bioinformatics has placed much emphasis on using various unsupervised clustering techniques as a means to understand the information present in gene microarray expression datasets. Clustering techniques produce a rich taxonomy of results by defining groups of genes that act more or less similarly across a number of experimental conditions. The diverse approaches to clustering genes by expression levels include k-means [1], self-organizing map
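To make the idea in the abstract concrete, the following is a minimal sketch of how principal components and an information-theoretic score might be combined to highlight genes tied to a subset of conditions. It is not the authors' package or their exact metric: the function names (`pca_components`, `loading_entropy`, `top_genes`), the choice of normalized Shannon entropy over squared condition loadings, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def pca_components(expr):
    """PCA of a genes x conditions matrix via SVD (illustrative helper).

    Returns gene scores (genes x components), condition loadings
    (conditions x components), and the explained-variance ratio.
    """
    centered = expr - expr.mean(axis=0, keepdims=True)  # center each condition
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s                      # projection of each gene on each component
    loadings = vt.T                     # weight of each condition in each component
    explained = s**2 / np.sum(s**2)
    return scores, loadings, explained

def loading_entropy(loadings):
    """Normalized Shannon entropy of squared loadings, per component.

    Values near 0 mean a component is driven by few conditions;
    values near 1 mean it is spread evenly across all conditions.
    This particular metric is an assumption, not taken from the paper.
    """
    p = loadings**2
    p = p / p.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log(p), 0.0)
    return -terms.sum(axis=0) / np.log(loadings.shape[0])

def top_genes(scores, component, n=10):
    """Indices of the n genes with the largest absolute score on a component."""
    order = np.argsort(-np.abs(scores[:, component]))
    return order[:n]

# Synthetic example: 200 genes, 12 conditions, with genes 0-9
# strongly elevated in conditions 0 and 1 only.
rng = np.random.default_rng(0)
expr = rng.normal(0.0, 0.1, size=(200, 12))
expr[:10, :2] += 5.0

scores, loadings, explained = pca_components(expr)
entropy = loading_entropy(loadings)
candidates = top_genes(scores, component=0, n=10)

print("explained variance of PC1:", round(float(explained[0]), 3))
print("normalized loading entropy of PC1:", round(float(entropy[0]), 3))
print("candidate genes on PC1:", sorted(candidates.tolist()))
```

In this sketch, a component with low loading entropy points at a small group of conditions, and the genes with extreme scores on that component form a candidate set that could then be checked for annotation enrichment, as the abstract describes for Gene Ontology terms.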
