Towards the automatic enrichment of a thesaurus with information in dictionaries
Authors
Abstract
Because information in broad-coverage knowledge bases, such as thesauri, is usually incomplete, merging information from different sources is one way to increase coverage. We propose a method for enriching a thesaurus with information acquired automatically from dictionaries. First, synonymy pairs are extracted. Then, these pairs are assigned to the most similar candidate synsets. Finally, the remaining pairs are clustered to identify new synsets. After selecting adequate experimental settings, this method was applied to enrich a Portuguese thesaurus with synonyms extracted from three dictionaries, which resulted in TRIP, a larger and broader thesaurus with new words and concepts. The steps towards the creation of this new thesaurus and its evaluation are described here.
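The assignment and clustering steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the threshold, or the clustering algorithm, so the Jaccard overlap, the `threshold` value, and the connected-components clustering below are all assumptions chosen for brevity, and the function names `neighbours`, `jaccard`, `cluster`, and `enrich` are hypothetical. Synonymy pairs are taken as already extracted.

```python
from collections import defaultdict


def neighbours(pairs):
    """Map each word to itself plus every word it is paired with."""
    adj = defaultdict(set)
    for a, b in pairs:
        adj[a].update((a, b))
        adj[b].update((a, b))
    return adj


def jaccard(x, y):
    """Jaccard overlap between two non-empty sets (assumed similarity measure)."""
    return len(x & y) / len(x | y)


def cluster(pairs):
    """Group words into connected components of the synonymy graph.

    Assumption: a simple stand-in for whatever clustering the paper uses.
    """
    adj = neighbours(pairs)
    seen, clusters = set(), []
    for word in adj:
        if word in seen:
            continue
        stack, component = [word], set()
        while stack:
            w = stack.pop()
            if w not in component:
                component.add(w)
                stack.extend(adj[w] - component)
        seen |= component
        clusters.append(component)
    return clusters


def enrich(synsets, pairs, threshold=0.3):
    """Attach pairs to existing synsets where possible, then cluster the rest."""
    synsets = [set(s) for s in synsets]  # copy so the input is not mutated
    adj = neighbours(pairs)
    remaining = []
    for a, b in pairs:
        # Candidate synsets are those that already contain one of the two words.
        candidates = [s for s in synsets if a in s or b in s]
        context = adj[a] | adj[b]
        best = max(candidates, key=lambda s: jaccard(s, context), default=None)
        if best is not None and jaccard(best, context) >= threshold:
            best.update((a, b))
        else:
            remaining.append((a, b))
    # Pairs that fit no existing synset become the basis for new synsets.
    return synsets + cluster(remaining)


if __name__ == "__main__":
    thesaurus = [{"carro", "auto"}]
    extracted = [("carro", "viatura"), ("feliz", "contente"), ("contente", "alegre")]
    for synset in enrich(thesaurus, extracted):
        print(sorted(synset))
```

In this toy run, the pair containing a word already in the thesaurus extends the existing synset, while the two pairs with no candidate synset are merged into a single new one.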
Keywords
thesaurus, enrichment, synonymy, words, ontologies
Subject
Natural Language Processing
Related Project
Onto.PT
Journal
Expert Systems, Vol. 30, #4, pp. 320-332, Jon G. Hall, May 2013
DOI