|
Applications of string mining techniques in text analysis
Keywords: clustering, bioinformatics, string mining, text analysis, information retrieval
Abstract: This project focuses on the algorithms and data structures used in string mining and on their applications in bioinformatics, text mining, and information retrieval. More specifically, it studies the use of suffix trees and suffix arrays for biological sequence analysis, together with algorithms for approximate string matching, both general-purpose ones and ones specialized for bioinformatics, such as the BLAST algorithm and PAM substitution matrices. It also attempts to apply these structures and algorithms to text mining and information retrieval.
|
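As a rough illustration of one of the structures named in the abstract, the sketch below builds a suffix array and uses it for exact pattern lookup. It is not taken from the project itself: the function names `build_suffix_array` and `find_occurrences` are hypothetical, and the construction simply sorts all suffixes, which is easy to follow but far slower than the specialized construction algorithms used on real biological sequences.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text`, in sorted order.

    Naive construction (an assumption made for clarity): sorting the
    suffixes directly costs O(n^2 log n) in the worst case.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return every start position of `pattern` in `text`, in ascending order.

    All suffixes that begin with `pattern` form one contiguous block in the
    suffix array, so two binary searches locate the block's boundaries.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


text = "banana"
suffix_array = build_suffix_array(text)
positions = find_occurrences(text, suffix_array, "ana")
```

For the text "banana", the pattern "ana" occurs at positions 1 and 3, and the lookup finds both with two binary searches over the sorted suffixes instead of a scan of the whole text.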