Information Research: an international electronic journal 1996

Processing morphological variants in searches of Latin text

Mark Greengrass, Alexander M. Robertson, Robyn Schinke, Peter Willett

Keywords: natural language, text databases, query words, recall, word variants, morphology, retrieval systems, truncation, information retrieval, IR, stemming algorithms, stemmers, suffixes, humanities, Latin


Abstract:

A characteristic of natural-language text databases is that a user must be able to specify all of the variant forms of each query word if high recall is to be achieved. The most common word variants are those arising from morphology, and thus most retrieval systems provide facilities for user-controlled right-hand (and occasionally left-hand) truncation to allow the retrieval of all words with the same root. A stemming algorithm, or stemmer, is a computational procedure that reduces all words with the same root to a single form by stripping the root of its derivational and inflectional affixes. In most cases, only suffixes are stripped, so that a stemmer provides an automatic equivalent of manual right-hand truncation. Thus far, most work on stemmers has focused on present-day languages, but the increasing use of computers in the humanities has resulted in a need for comparable tools to facilitate searching in historical text databases. This paper summarises some of the initial results of a project here in Sheffield to develop such tools for databases of Latin text.
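
As a rough illustration of the idea of suffix stripping as an automatic counterpart to right-hand truncation, the following minimal Python sketch conflates a few Latin noun forms. It is not the algorithm developed in the Sheffield project: the suffix list, the minimum stem length, and the function names `stem`, `truncation_match` and `search` are all assumptions made for this example.

```python
# Hypothetical, deliberately tiny suffix list for illustration only;
# it is NOT the suffix set used in the project described in the abstract.
SUFFIXES = ("ibus", "orum", "arum", "um", "ae", "am", "as",
            "is", "os", "us", "a", "e", "i", "o")

# Assumed safeguard: never strip a word down to fewer than this many letters.
MIN_STEM_LEN = 2


def stem(word):
    """Strip the longest matching suffix, keeping at least MIN_STEM_LEN letters."""
    w = word.lower()
    # Try longer suffixes first so that "orum" wins over "um".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if w.endswith(suffix) and len(w) - len(suffix) >= MIN_STEM_LEN:
            return w[: -len(suffix)]
    return w


def truncation_match(query_prefix, word):
    """Manual right-hand truncation: the user supplies the prefix (e.g. 'domin*')."""
    return word.lower().startswith(query_prefix.lower())


def search(query, documents):
    """Return the documents containing any word whose stem equals the query's stem."""
    q = stem(query)
    return [d for d in documents if any(stem(t) == q for t in d.split())]


if __name__ == "__main__":
    docs = ["servus domini venit", "puella rosam amat"]
    print(search("dominus", docs))
```

With manual truncation the searcher must choose the prefix `domin` themselves; in this sketch the stemmer derives it from whichever inflected form is typed, so a query for `dominus` also matches `domini`. A real Latin stemmer would need a far richer suffix inventory and rules for handling over- and under-stemming.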
