%0 Journal Article %T Cluster analysis for DNA methylation profiles having a detection threshold %A Paul Marjoram %A Jing Chang %A Peter W Laird %A Kimberly D Siegmund %J BMC Bioinformatics %D 2006 %I BioMed Central %R 10.1186/1471-2105-7-361 %X We compare performance of existing methodology (such as k-means) with two novel methods that explicitly allow for the preponderance of values at 0. We also consider how the ability to successfully cluster such data depends upon the number of informative genes for which methylation is measured and the correlation structure of the methylation values for those genes. We show that when data is collected for a sufficient number of genes, our models do improve clustering performance compared to methods, such as k-means, that do not explicitly respect the supposed biological realities of the situation. The performance of analysis methods depends upon how well the assumptions of those methods reflect the properties of the data being analyzed. Differing technologies will lead to data with differing properties, and should therefore be analyzed differently. Consequently, it is prudent to give thought to what the properties of the data are likely to be, and which analysis method might therefore be likely to best capture those properties. With the invention of new high-throughput technologies, researchers are using molecular features to identify novel cancer subtypes. Currently, the most commonly analyzed molecular feature is gene expression. In such experiments, expression values are measured for a large number of genes (1,000's) across a smaller number of samples (10's-100's). More recent studies have used high-throughput arrays to measure protein abundances, single nucleotide polymorphisms (SNPs), or DNA methylation [1-3]. SNPs and DNA methylation are more stable characteristics than gene expression, since they are based on DNA, which has less biological temporal variation and greater analyte stability than RNA.
We investigate the use of DNA methylation for the classification of samples into disease subtypes. Previous studies of colon and lung cancer have shown some success [4,5]. Currently there is no single platform for studying DNA methylation that is amenable to all study d %U http://www.biomedcentral.com/1471-2105/7/361