|
Filtered Composition and Markers for a Flexible Edit-Distance. Application to the Correction of Out-Of-Vocabulary Words.

Composition filtrée et marqueurs de règles de réécriture pour une distance d'édition flexible. Application à la correction des mots hors vocabulaire.

Keywords: edit distance, filtered composition, spelling correction, rules' markers

Abstract: We present an original and flexible implementation of the edit distance: filtered composition, a special kind of composition of two finite-state machines through a filter that models all valid edit operations. The filter is either a weighted transducer or a cascade of weighted transducers. It is built from weighted rewrite rules that take advantage of a new concept defined in our finite-state framework: the rules' marker, a symbol that does not belong to the alphabet in use but is inserted into a rewrite rule in order to mark a phenomenon and to track its evolution. Markers disambiguate and make it easy to express conditions and constraints. The method is illustrated on the task of correcting out-of-vocabulary words.
|
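The idea of composing two machines through a filter of weighted edit operations can be illustrated with a minimal sketch. It is not the authors' implementation and uses no finite-state library: the filter is reduced to a one-state weighted transducer stored as a Python dictionary, the two machines are plain strings, and rules' markers are not modelled. The names `make_filter`, `filtered_distance`, and `correct`, as well as the unit costs, are assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

EPS = ""  # the empty string stands for epsilon (no symbol read or written)

Filter = Dict[Tuple[str, str], float]


def make_filter(alphabet: str, sub_cost: float = 1.0,
                ins_cost: float = 1.0, del_cost: float = 1.0) -> Filter:
    """Build a one-state weighted filter: (input symbol, output symbol) -> cost.

    Any pair absent from the dictionary is an edit operation the filter forbids.
    """
    filt: Filter = {}
    for a in alphabet:
        filt[(a, a)] = 0.0          # identity: copy the symbol at no cost
        filt[(a, EPS)] = del_cost   # deletion of a
        filt[(EPS, a)] = ins_cost   # insertion of a
        for b in alphabet:
            if a != b:
                filt[(a, b)] = sub_cost  # substitution a -> b
    return filt


def filtered_distance(src: str, tgt: str, filt: Filter) -> float:
    """Cost of the cheapest path in src o filter o tgt (inf if none exists)."""
    inf = float("inf")
    n, m = len(src), len(tgt)
    # best[i][j]: cheapest cost to have consumed src[:i] and produced tgt[:j]
    best = [[inf] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            cur = best[i][j]
            if cur == inf:
                continue
            if i < n and (src[i], EPS) in filt:          # deletion
                best[i + 1][j] = min(best[i + 1][j], cur + filt[(src[i], EPS)])
            if j < m and (EPS, tgt[j]) in filt:          # insertion
                best[i][j + 1] = min(best[i][j + 1], cur + filt[(EPS, tgt[j])])
            if i < n and j < m and (src[i], tgt[j]) in filt:  # copy or substitution
                best[i + 1][j + 1] = min(best[i + 1][j + 1],
                                         cur + filt[(src[i], tgt[j])])
    return best[n][m]


def correct(word: str, lexicon: Iterable[str], filt: Filter) -> Tuple[float, str]:
    """Return (cost, entry) for the lexicon entry closest to word under filt."""
    return min((filtered_distance(word, entry, filt), entry) for entry in lexicon)
```

With unit costs this reduces to the classical Levenshtein distance, so `filtered_distance("kitten", "sitting", make_filter("abcdefghijklmnopqrstuvwxyz"))` evaluates to `3.0`. The flexibility comes from editing the dictionary: lowering the cost of one pair, or deleting a pair to forbid that operation, changes which corrections are preferred without touching the search itself.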