全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...

Discovering biclusters in gene expression data based on high-dimensional linear geometries

DOI: 10.1186/1471-2105-9-209

Full-Text   Cite this paper   Add to My Lib

Abstract:

In this paper, we present a novel geometric perspective for the biclustering problem. The biclustering process is interpreted as the detection of linear geometries in a high dimensional data space. Such a new perspective views biclusters with different patterns as hyperplanes in a high dimensional space, and allows us to handle different types of linear patterns simultaneously by matching a specific set of linear geometries. This geometric viewpoint also inspires us to propose a generic bicluster pattern, i.e. the linear coherent model that unifies the seemingly incompatible additive and multiplicative bicluster models. As a particular realization of our framework, we have implemented a Hough transform-based hyperplane detection algorithm. The experimental results on human lymphoma gene expression dataset show that our algorithm can find biologically significant subsets of genes.We have proposed a novel geometric interpretation of the biclustering problem. We have shown that many common types of bicluster are just different spatial arrangements of hyperplanes in a high dimensional data space. An implementation of the geometric framework using the Fast Hough transform for hyperplane detection can be used to discover biologically significant subsets of genes under subsets of conditions for microarray data analysis.In DNA microarray experiments, discovering groups of genes that share similar transcriptional characteristics is instrumental in functional annotation, tissue classification and motif identification [1,2]. In many situations, an interesting cellular process is active only under a subset of conditions, or a single gene may participate in multiple pathways that may or may not be co-active under all conditions [3,4]. In addition, the data to be analyzed often include many heterogeneous conditions from many experiments. In these instances, it is often unrealistic to require that related genes behave similarly across all measured conditions and conventional clust

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133