Algorithms for Molecular Biology 2011

WordCluster: detecting clusters of DNA words and genomic elements

DOI: 10.1186/1748-7188-6-2

Michael Hackenberg, Pedro Carpena, Pedro Bernaola-Galván, Guillermo Barturen, Ángel M Alganza, José L Oliver


Abstract:

We introduce an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method in a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and the outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome.

WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method as a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php, including additional features such as the detection of co-localization with gene regions and an annotation enrichment tool for functional analysis of overlapped genes.

Genome entities as diverse as genes [1], CpG dinucleotides [2], transcription factor binding sites (TFBSs [3]) or ultra-conserved non-coding regions [4] usually form clusters along the chromosome sequence. Such spatial clustering often translates into genome structures with a clear functional and/or evolutionary meaning: gene clusters encoding the same or similar products that originated through gene duplication events, CpG islands, cis-regulatory modules, etc. Thus, the spatial clustering of functional genome elements (in general, words or k-mers) somewhat resembles the situation in literary texts, where keywords show a strong clustering, whereas common words are randomly distributed [5].

Despite its potential importance, no algorithm exists to detect the clustering of DNA words in a rigorous way. Most current methods are based on densities and sliding-window a
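The idea stated in the abstract, grouping copies of a word by the distance between consecutive copies and then attaching a significance value, can be illustrated with a minimal Python sketch. This is not the WordCluster implementation. The function names, the fixed `max_gap` threshold, and the null model (each position starts a copy independently with probability `p_word`, so gaps are geometric) are all assumptions made for illustration. The abstract does not say how WordCluster chooses its distance threshold or computes its statistics.

```python
from typing import List


def word_positions(seq: str, word: str) -> List[int]:
    """Return 0-based start positions of every copy of `word` (overlaps allowed)."""
    positions = []
    start = seq.find(word)
    while start != -1:
        positions.append(start)
        start = seq.find(word, start + 1)
    return positions


def cluster_by_distance(positions: List[int], max_gap: int,
                        min_copies: int = 2) -> List[List[int]]:
    """Chain consecutive copies whose start-to-start distance is <= max_gap.

    `max_gap` and `min_copies` are hypothetical parameters for this sketch.
    """
    clusters: List[List[int]] = []
    if not positions:
        return clusters
    current = [positions[0]]
    for pos in positions[1:]:
        if pos - current[-1] <= max_gap:
            current.append(pos)          # still within the same chain
        else:
            if len(current) >= min_copies:
                clusters.append(current)
            current = [pos]              # start a new chain
    if len(current) >= min_copies:
        clusters.append(current)
    return clusters


def chain_p_value(n_copies: int, max_gap: int, p_word: float) -> float:
    """Probability that n_copies - 1 consecutive gaps are all <= max_gap.

    Assumed null model (not taken from the paper): gaps between copies are
    independent and geometric with success probability `p_word`.
    """
    p_gap = 1.0 - (1.0 - p_word) ** max_gap
    return p_gap ** (n_copies - 1)
```

For example, calling `cluster_by_distance(word_positions(seq, "CAG"), max_gap=5)` on a sequence returns one list of start positions per chain, and `chain_p_value(len(chain), 5, len(positions) / len(seq))` gives a rough score for each chain under the assumed model. Smaller values indicate chains that are less likely to arise by chance.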
