|
BMC Bioinformatics 2009
Detection of recurrent copy number alterations in the genome: taking among-subject heterogeneity seriouslyAbstract: We have developed two methods to define recurrent CNA regions from aCGH data. Our methods are unique and qualitatively different from existing approaches: they detect regions over both the complete set of arrays and alterations that are common only to some subsets of the samples (i.e., alterations that might characterize previously unknown groups); they use probabilities of alteration as input and return probabilities of being a common region, thus allowing researchers to modify thresholds as needed; the two parameters of the methods have an immediate, straightforward, biological interpretation. Using data from previous studies, we show that we can detect patterns that other methods miss and that researchers can modify, as needed, thresholds of immediate interpretability and develop custom statistics to answer specific research questions.These methods represent a qualitative advance in the location of recurrent CNA regions, highlight the relevance of population heterogeneity for definitions of recurrence, and can facilitate the clustering of samples with respect to patterns of CNA. Ultimately, the methods developed can become important tools in the search for genomic regions harboring disease-critical genes.Genomic DNA copy number is often variable. Some of this variability, commonly referred as copy number variations or CNVs, is naturally present in the germ line and thus heritable [1-3], whereas somatic, large-scale alterations that often characterize tumor cells are called copy number alterations or copy number aberrations (CNAs) [3-6]. These CNAs are often longer than CNVs and have been linked to other diseases in addition to cancer, such as HIV acquisition and progression, autoimmune diseases, and Alzheimer and Parkinson's disease [7-10]. The most popular current approaches for the identification of DNA copy number differences are chip- or array-based. These include SNP arrays [11-13] and array-based Comparative Genomic Hybridization (aCGH). aCGH is a broad ter
|