|
Periodic pattern detection in sparse boolean sequences

Abstract: The algorithm is particularly robust to strong signal distortions such as the addition of 1's at arbitrary positions (contaminated data), the deletion of existing 1's in the sequence (missing data), and disorder in the positions of the 1's (noise). This robustness stems from exploiting the remarkable alignment properties of periodic points in solenoidal coordinates. The efficiency of the algorithm is demonstrated in situations where standard Fourier-based spectral methods are poorly adapted. We also show how the proposed framework makes it possible to identify the 1's that participate in the periodic trends, i.e. how it assigns a positional score to genes, in the same spirit as the sequence score. The software is available for public use at http://www.issb.genopole.fr/MEGA/Softwares/iSSB_SolenoidalApplication.zip.

There is increasing evidence that the organization of the genome plays a crucial role in the interplay between genetic regulation and chromosome structure. At the smallest scale, several experimental studies have highlighted the importance of the positions of transcription factor binding sites in the functioning of small transcriptional regulatory networks [1-3]. At a larger, but still local, scale, many transcription units in bacteria are known to be located along the DNA close to the gene that encodes their regulating transcription factor [4-6]. At the global scale of the chromosome, both in Escherichia coli and in Saccharomyces cerevisiae, genes regulated by the same transcription factor have been observed to tend to be periodically spaced along the DNA [7,8]. Recently, the relative positions of phylogenetically conserved gene pairs were also shown to tend to be periodically organized along the DNA in E. coli [9].
Such periodic organization has been proposed to be responsible for the spatial co-localization of co-regulated genes [10]; indee
|