|
- 2017
Best hits of 11110110111: model-free selection and parameter-free sensitivity calculation of spaced seeds
DOI: 10.1186/s13015-017-0092-1
Keywords: Spaced seeds, Dominant seeds, Bernoulli, Hit Integration, Heaviside, Dirac, Counting semi-ring, Polynomial form, DFA
Abstract: Spaced seeds, also named gapped q-grams, gapped k-mers, or spaced q-grams, have been proven to be more sensitive than contiguous seeds (contiguous q-grams, contiguous k-mers) in nucleic and amino-acid sequence analysis. Initially proposed to detect sequence similarities and to anchor sequence alignments, spaced seeds have more recently been applied in several alignment-free methods. Unfortunately, spaced seeds must first be designed. This task is known to be time-consuming because of the number of spaced seed candidates. Moreover, it can be affected by a set of arbitrarily chosen parameters of the probabilistic alignment models used. In this general context, dominant seeds were introduced by Mak and Benson (Bioinformatics 25:302–308, 2009) on the Bernoulli model, in order to reduce the number of spaced seed candidates that are further processed in a parameter-free calculation of the sensitivity
|
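To make the notions of seed hit and sensitivity on the Bernoulli model concrete, the following is a minimal sketch, not the method of the paper. It assumes the usual encoding in which a seed is a string over `1` (position that must match) and `0` (don't-care position), and an alignment is a string over `1` (match) and `0` (mismatch). The function names `seed_hits` and `sensitivity` are hypothetical, and the sensitivity is obtained by brute-force enumeration of all alignments of a given length, which is only practical for short alignments.

```python
from itertools import product


def seed_hits(seed: str, alignment: str) -> bool:
    """Return True if `seed` hits `alignment` at one position or more.

    seed:      '1' = position that must match, '0' = don't-care position.
    alignment: '1' = match, '0' = mismatch.
    """
    span = len(seed)
    required = [i for i, c in enumerate(seed) if c == "1"]
    for start in range(len(alignment) - span + 1):
        if all(alignment[start + i] == "1" for i in required):
            return True
    return False


def sensitivity(seed: str, length: int, p: float) -> float:
    """Probability that `seed` hits a random alignment of `length` columns,
    each column being a match independently with probability p (Bernoulli
    model). Brute force over all 2**length alignments, for illustration only.
    """
    total = 0.0
    for bits in product("01", repeat=length):
        alignment = "".join(bits)
        if seed_hits(seed, alignment):
            matches = alignment.count("1")
            total += p ** matches * (1 - p) ** (length - matches)
    return total


if __name__ == "__main__":
    seed = "11110110111"  # span 11, weight 9
    for p in (0.6, 0.7, 0.8):
        print(p, sensitivity(seed, 14, p))
```

When the alignment length equals the seed span, only one placement is possible, so the result reduces to p raised to the seed weight (p^9 for `11110110111`). The value of p is exactly the kind of model parameter that the abstract describes as arbitrarily chosen.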