New resampling method for evaluating stability of clusters

DOI: 10.1186/1471-2105-9-42

Abstract:

We propose a new resampling method based on continuous weights to assess the stability of clusters in hierarchical clustering. While in bootstrapping approximately one third of the original items are lost, continuous weights avoid zero elements and instead allow non-integer diagonal elements. This retains the full dimensionality of the space, i.e. each variable of the original data set is represented in the resampled data.

Comparison of continuous weights and bootstrapping using real datasets and simulation studies reveals the advantage of continuous weights, especially when the dataset has only a few observations, when few genes are differentially expressed, and when the fold change of the differentially expressed genes is low.

We recommend the use of continuous weights in small as well as in large datasets because, according to our results, they perform at least as well as conventional bootstrapping and in some cases surpass it.

Cluster analysis is a widely used tool for the interpretation of gene expression experiments. It allows genes as well as (tissue) samples to be grouped into classes (clusters) with similar characteristic profiles. Class assignment results from applying a similarity measure (e.g. a distance measure or correlation) and a selected method for calculating the distance of an object to a class (e.g. single, complete or average linkage). The algorithms are well defined and reproducible; however, different choices of similarity measure and clustering method lead to different results [1].

Algorithms for hierarchical agglomerative classification have existed for a long time [2-6]. They are suitable for the description of high-dimensional data. Eisen et al. introduced hierarchical cluster analysis for microarray data in 1998 [7].

A problem in cluster analysis is to discriminate between real and random clusters. The latter arise from random variation of gene expression measurements due to technical variation and biological variability. A measure of cluster stability is need
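The contrast between the two resampling schemes in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not state which distribution the continuous weights are drawn from, so the Dirichlet-based `continuous_weights` below is an assumption chosen only because it yields strictly positive, non-integer weights. The function names and the `weighted_correlation` helper are likewise hypothetical. In a bootstrap sample of n items, the expected share of items never drawn is (1 - 1/n)^n, which approaches 1/e (about 0.37) and corresponds to the "approximately one third" mentioned above.

```python
import numpy as np


def bootstrap_weights(n, rng):
    """Bootstrap as weights: how often each of n items is drawn
    when sampling n times with replacement (integers, many zeros)."""
    return rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)


def continuous_weights(n, rng):
    """ASSUMPTION, not taken from the paper: Dirichlet(1, ..., 1)
    weights rescaled to sum to n, so every item keeps a positive,
    non-integer weight."""
    return n * rng.dirichlet(np.ones(n))


def weighted_correlation(x, y, w):
    """Pearson correlation of two profiles x and y, with one
    weight per variable (for example, per gene)."""
    w = w / w.sum()
    mx, my = np.sum(w * x), np.sum(w * y)
    cov = np.sum(w * (x - mx) * (y - my))
    var_x = np.sum(w * (x - mx) ** 2)
    var_y = np.sum(w * (y - my) ** 2)
    return cov / np.sqrt(var_x * var_y)


rng = np.random.default_rng(0)
n = 1000  # number of resampled items (hypothetical size)

boot = bootstrap_weights(n, rng)
cont = continuous_weights(n, rng)

# Share of items that drop out of each resampled data set.
zero_fraction_boot = np.mean(boot == 0)
zero_fraction_cont = np.mean(cont == 0)

# Two correlated toy profiles to show how the weights enter a
# similarity measure that could feed a hierarchical clustering.
x = rng.normal(size=n)
y = x + rng.normal(scale=0.5, size=n)
r_boot = weighted_correlation(x, y, boot)
r_cont = weighted_correlation(x, y, cont)
```

Repeating the resampling many times and re-clustering with the weighted similarity would give, for each cluster, the share of replicates in which it reappears. That loop is omitted here because the abstract does not specify the stability measure.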
