Azhary: An Arabic Lexical Ontology
Hossam Ishkewy, Hany Harb, Hassan Farahat
Computer Science, 2014
Abstract: Arabic is the most widely spoken language in the Semitic group and one of the most widely spoken languages in the world, with more than 422 million speakers. It is also of paramount importance to Muslims: it is the sacred language of the Islamic holy book (the Quran), and prayer and other acts of worship in Islam require mastering some Arabic words. Arabic is also a major ritual language of a number of Christian churches in the Arab world, and several intellectual and religious Jewish books were written in it in the Middle Ages. Despite this, there is no semantic Arabic lexicon on which researchers can depend. In this paper we introduce Azhary, a lexical ontology for the Arabic language. It groups Arabic words into sets of synonyms called synsets and records a number of relationships between words, such as synonym, antonym, hypernym, hyponym, meronym, holonym, and association relations. The ontology contains 26,195 words organized in 13,328 synsets. It has been developed and compared against AWN, the most commonly available Arabic lexical ontology.
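As a rough illustration of the synset-and-relation organization this abstract describes, the following minimal sketch stores words in synsets and links synsets with named relations. The class names, method names, and English sample words are all hypothetical; none of them comes from Azhary or AWN.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Synset:
    """A set of synonymous words plus named links to other synsets."""
    synset_id: str
    words: Set[str]
    # relation name (e.g. "hypernym") -> ids of the target synsets
    relations: Dict[str, Set[str]] = field(default_factory=dict)


class Lexicon:
    """Hypothetical container for synsets; not the Azhary schema."""

    def __init__(self) -> None:
        self.synsets: Dict[str, Synset] = {}

    def add(self, synset_id, words):
        self.synsets[synset_id] = Synset(synset_id, set(words))

    def relate(self, source, relation, target, inverse=None):
        """Link two synsets; optionally record the inverse relation too."""
        self.synsets[source].relations.setdefault(relation, set()).add(target)
        if inverse is not None:
            self.synsets[target].relations.setdefault(inverse, set()).add(source)

    def synonyms(self, word):
        """All words that share at least one synset with `word`."""
        found = set()
        for synset in self.synsets.values():
            if word in synset.words:
                found |= synset.words - {word}
        return found

    def related(self, synset_id, relation):
        return self.synsets[synset_id].relations.get(relation, set())


# Toy data for illustration only.
lex = Lexicon()
lex.add("animal", ["animal", "creature"])
lex.add("dog", ["dog", "hound"])
lex.relate("dog", "hypernym", "animal", inverse="hyponym")
```

Storing the inverse link at insertion time keeps hypernym/hyponym (and, by the same mechanism, meronym/holonym) lookups symmetric without a second pass over the data.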
Hacia la identificación de relaciones de hiponimia/hiperonimia en Internet / Towards the identification of hyponym/hypernym relations in the Internet
Rosa María Ortega, César Aguilar, Luis Villaseñor, Manuel Montes
Revista Signos, 2011
Abstract: This paper presents an approach to the automatic extraction of hyponym-hypernym pairs. In particular, it proposes a lexical information extraction method, targeted at the hyponymy relation, that uses a set of lexical patterns specific to Spanish, together with a symmetric weighting scheme for pairs and patterns whose goal is to increase the reliability of the extraction method. The effectiveness of the proposed method was evaluated by extracting hyponyms for a given vocabulary of hypernyms. The results confirm the usefulness of the proposed method for extracting hyponyms, as well as the relevance of the pair/pattern weighting scheme.
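Pattern-based extraction of this kind can be sketched in a few lines. The two Spanish patterns below ("X como Y", "Y y otros/otras X") are illustrative assumptions, not the pattern set used in the paper, and the sketch only records which patterns support each pair; it does not reproduce the paper's symmetric weighting scheme.

```python
import re

# Assumed, illustrative patterns; each captures a hypernym and a hyponym.
PATTERNS = [
    # "frutas como la manzana" -> (manzana, frutas)
    re.compile(r"(?P<hyper>\w+) como (?:el |la |los |las )?(?P<hypo>\w+)"),
    # "manzana y otras frutas" -> (manzana, frutas)
    re.compile(r"(?P<hypo>\w+) y otr[oa]s (?P<hyper>\w+)"),
]


def extract_pairs(text):
    """Map each (hyponym, hypernym) pair to the indices of the patterns that matched it."""
    pairs = {}
    lowered = text.lower()
    for index, pattern in enumerate(PATTERNS):
        for match in pattern.finditer(lowered):
            key = (match.group("hypo"), match.group("hyper"))
            pairs.setdefault(key, set()).add(index)
    return pairs


sample = "Comimos frutas como la manzana. Compramos manzana y otras frutas."
pairs = extract_pairs(sample)
```

The number of distinct patterns supporting a pair is one crude confidence signal; a weighting scheme like the one the abstract mentions would additionally score each pattern by the pairs it extracts.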
INRIASAC: Simple Hypernym Extraction Methods
Gregory Grefenstette
Computer Science, 2015
Abstract: Given a set of terms from a given domain, how can we structure them into a taxonomy without manual intervention? This is Task 17 of SemEval 2015. Here we present our simple taxonomy structuring techniques which, despite their simplicity, ranked first in this 2015 benchmark. We use large quantities of text (English Wikipedia) and simple heuristics such as term overlap and document and sentence co-occurrence to produce hypernym lists. We describe these techniques and present an initial evaluation of results.
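One simple reading of the "term overlap" heuristic is that a multiword term is a hyponym of any other domain term that forms its trailing words (for example, "apple juice" under "juice"). The sketch below implements only that assumed variant; the function name is hypothetical, and the co-occurrence heuristics over Wikipedia text are not covered.

```python
def overlap_hypernyms(terms):
    """Propose hypernyms for multiword terms from their trailing word sequences.

    A candidate is accepted only if it is itself one of the input terms.
    """
    vocab = {term.lower() for term in terms}
    proposals = {}
    for term in vocab:
        words = term.split()
        # "green apple juice" -> ["apple juice", "juice"]
        suffixes = [" ".join(words[i:]) for i in range(1, len(words))]
        hypernyms = [s for s in suffixes if s in vocab]
        if hypernyms:
            proposals[term] = hypernyms
    return proposals


proposals = overlap_hypernyms(["juice", "apple juice", "green apple juice", "apple"])
```

Candidates are listed from the most specific suffix to the most general, so the first entry is the closest parent among the given terms.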
Using WordNet for Building WordNets
Xavier Farreres, German Rigau, Horacio Rodriguez
Computer Science, 1998
Abstract: This paper summarises a set of methodologies and techniques for the fast construction of multilingual WordNets. The English WordNet is used in this approach as a backbone for Catalan and Spanish WordNets and as a lexical knowledge resource for several subtasks.
Integrating selectional preferences in WordNet
Eneko Agirre, David Martinez
Computer Science, 2002
Abstract: Selectional preference learning methods have usually focused on word-to-class relations, e.g., a verb selects as its subject a given nominal class. This paper extends previous statistical models to class-to-class preferences, and presents a model that learns selectional preferences for classes of verbs, together with an algorithm to integrate the learned preferences in WordNet. The theoretical motivation is twofold: different senses of a verb may have different preferences, and classes of verbs may share preferences. On the practical side, class-to-class selectional preferences can be learned from untagged corpora (the same as word-to-class), they provide selectional preferences for less frequent word senses via inheritance, and more important, they allow for easy integration in WordNet. The model is trained on subject-verb and object-verb relationships extracted from a small corpus disambiguated with WordNet senses. Examples are provided illustrating that the theoretical motivations are well founded, and showing that the approach is feasible. Experimental results on a word sense disambiguation task are also provided.
Experiences in building the IndoWordNet: a WordNet for Manipuri
Yumnam Bablu Singh, Prof. Bipul Syam Purkyastha
International Journal of Engineering Science and Technology, 2011
Abstract: IndoWordNet makes it possible to overcome the barriers between the major languages of India. IndoWordNet can be kept at the heart of any intelligent information processing system, since practitioners of information retrieval, NLP, and knowledge engineering increasingly regard it as a rich lexical knowledge base. Words are the small units that bind together, in extensive and unique ways, to create an extremely powerful knowledge web. The WordNet for the Manipuri language, known here as IndoWordNet, is an attempt to build a reference system for Manipuri. IndoWordNet is inspired by the well-known English WordNet. For each word entered, we find the synonym set that represents one lexical concept. Semantic relations such as hypernymy, hyponymy, meronymy, and antonymy link each entered synonym set to other synonym sets. In Manipuri, words are formed by three processes: affixation, derivation, and compounding. IndoWordNet also highlights features such as graded antonymy and meronymy. In addition, it addresses phenomena distinctive of Indian languages, such as causative, compound, and conjunctive verbs, at both the conceptual and the implementation level. Built on an efficient database design, a web interface for querying IndoWordNet has been implemented in the PHP4 scripting language. The interface also uses Java/JFC for simplicity and elegance.
Linking Geographic Vocabularies through WordNet
Andrea Ballatore, Michela Bertolotto, David C. Wilson
Computer Science, 2014, DOI: 10.1080/19475683.2014.904440
Abstract: The linked open data (LOD) paradigm has emerged as a promising approach to structuring and sharing geospatial information. One of the major obstacles to this vision lies in the difficulty of automatically integrating the heterogeneous vocabularies and ontologies that provide the semantic backbone of the growing constellation of open geo-knowledge bases. In this article, we show how to utilize WordNet as a semantic hub to increase the integration of LOD. With this purpose in mind, we devise Voc2WordNet, an unsupervised mapping technique between a given vocabulary and WordNet, combining intensional and extensional aspects of the geographic terms. Voc2WordNet is evaluated against a sample of human-generated alignments with the OpenStreetMap (OSM) Semantic Network, a crowdsourced geospatial resource, and the GeoNames ontology, the vocabulary of a large digital gazetteer. These empirical results indicate that the approach can obtain high precision and recall.
Methods and Tools for Building the Catalan WordNet
Laura Benitez, Sergi Cervell, Gerard Escudero, Monica Lopez, German Rigau, Mariona Taule
Computer Science, 1998
Abstract: In this paper we introduce the methodology and the basic phases we followed to develop the Catalan WordNet, and describe which lexical resources were used to build it. This methodology, as well as the tools we used, has been designed in a general way so that it can be applied to any other language.
Improving Query Expansion Using WordNet
Dipasree Pal, Mandar Mitra, Kalyankumar Datta
Computer Science, 2013
Abstract: This study proposes a new way of using WordNet for Query Expansion (QE). We choose candidate expansion terms, as usual, from a set of pseudo relevant documents; however, the usefulness of these terms is measured based on their definitions provided in a hand-crafted lexical resource like WordNet. Experiments with a number of standard TREC collections show that this method outperforms existing WordNet based methods. It also compares favorably with established QE methods such as KLD and RM3. Leveraging earlier work in which a combination of QE methods was found to outperform each individual method (as well as other well-known QE methods), we next propose a combination-based QE method that takes into account three different aspects of a candidate expansion term's usefulness: (i) its distribution in the pseudo relevant documents and in the target corpus, (ii) its statistical association with query terms, and (iii) its semantic relation with the query, as determined by the overlap between the WordNet definitions of the term and query terms. This combination of diverse sources of information appears to work well on a number of test collections, viz., TREC123, TREC5, TREC678, TREC robust new and TREC910 collections, and yields significant improvements over competing methods on most of these collections.
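The third signal in this abstract, overlap between the definition of a candidate term and the definitions of the query terms, can be sketched as a simple word-set intersection. The glosses, the stopword list, and the function names below are invented for illustration; they are not WordNet's actual definitions, and the paper's scoring formula may differ.

```python
import re

# Small assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "for", "on", "with", "in", "to", "and"}


def content_words(text):
    """Lowercased alphabetic tokens of `text`, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def definition_overlap(candidate, query_terms, definitions):
    """Count content words shared by the candidate's gloss and each query term's gloss."""
    candidate_words = content_words(definitions.get(candidate, ""))
    score = 0
    for term in query_terms:
        score += len(candidate_words & content_words(definitions.get(term, "")))
    return score


# Hypothetical toy glosses standing in for a hand-crafted lexical resource.
definitions = {
    "car": "a motor vehicle with four wheels",
    "automobile": "a motor vehicle used for transport on roads",
    "banana": "a long curved yellow fruit",
}

scores = {
    term: definition_overlap(term, ["car"], definitions)
    for term in ("automobile", "banana")
}
```

In a combined method like the one described, a score of this kind would be merged with the distributional and association-based scores rather than used on its own.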
Disambiguating bilingual nominal entries against WordNet
German Rigau, Eneko Agirre
Computer Science, 1995
Abstract: This paper explores the acquisition of conceptual knowledge from bilingual dictionaries (French/English, Spanish/English and English/Spanish) using a pre-existing broad-coverage Lexical Knowledge Base (LKB), WordNet. Bilingual nominal entries are disambiguated against WordNet, thereby linking the bilingual dictionaries to WordNet and yielding a multilingual LKB (MLKB). The resulting MLKB has the same structure as WordNet, but some nodes are additionally attached to disambiguated vocabulary of other languages. Two different, complementary approaches are explored. In one of the approaches, each entry of the dictionary is taken in turn, exploiting the information in the entry itself. The inferential capability for disambiguating the translation is given by Semantic Density over WordNet. In the other approach, the bilingual dictionary was merged with WordNet, exploiting mainly synonymy relations. Each of the approaches was used on a different dictionary. Both approaches attain high levels of precision on their own, showing that disambiguating bilingual nominal entries, and therefore linking bilingual dictionaries to WordNet, is a feasible task.
