|
Classification of text documents supervised by domain ontologies

Keywords: Text classification, Topic assignment, Supervised learning, Ontology, E-governance

Abstract: The research objective is to establish an approach that supports the classification of text documents belonging to a specified domain. The focus is on the preliminary assignment of topics to the documents used for training the model. The method uses a domain ontology as background knowledge. The idea is to extract the preliminary topics for training the classifier by applying unsupervised machine learning to a text corpus and then aligning the document vectors to concepts of the ontology. New documents were classified under the supervision of an e-governance ontology with several machine learning algorithms, and the results showed a sufficient match between document content and the ontology concepts. The conclusion is that the approach can support the automatic extraction of documents relevant to any domain described by an ontology.
|
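The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes bag-of-words vectors and cosine similarity for the alignment step, and it uses a nearest-centroid classifier as a stand-in for the "several machine learning algorithms" mentioned above. The unsupervised topic extraction step is omitted for brevity. The ontology concepts, their descriptive terms, the stopword list, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"a", "the", "and", "for", "as", "through"}

# Hypothetical e-governance ontology: concept name -> descriptive terms.
ONTOLOGY = {
    "e-service": "online service portal citizen application form",
    "e-participation": "citizen consultation voting petition participation",
    "open-data": "open data dataset publication transparency",
}


def vectorize(text):
    """Bag-of-words term-frequency vector with stopwords removed."""
    return Counter(t for t in text.lower().split() if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(doc_vec, reference_vecs):
    """Return the label whose reference vector is most similar to doc_vec."""
    return max(reference_vecs, key=lambda c: cosine(doc_vec, reference_vecs[c]))


def train_centroids(doc_vecs, labels):
    """Sum the vectors of each class to obtain one centroid per label."""
    centroids = defaultdict(Counter)
    for vec, label in zip(doc_vecs, labels):
        centroids[label].update(vec)
    return dict(centroids)


corpus = [
    "citizens submit the application form through the online portal",
    "the municipality opened a public consultation and petition platform",
    "the ministry released a new dataset for transparency as open data",
]

# Step 1: align each training document to an ontology concept
# (this yields the preliminary topic assignment).
concept_vecs = {c: vectorize(terms) for c, terms in ONTOLOGY.items()}
doc_vecs = [vectorize(d) for d in corpus]
preliminary = [align(v, concept_vecs) for v in doc_vecs]

# Step 2: train a classifier on the preliminary labels.
centroids = train_centroids(doc_vecs, preliminary)

# Step 3: classify a new, unseen document.
prediction = align(vectorize("petition platform for public consultation"), centroids)
print(preliminary, prediction)
```

In a realistic setting the term-frequency vectors would likely be replaced by TF-IDF or embedding vectors, and the concept descriptions would be derived from the labels and relations of the actual ontology rather than written by hand.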