CISUC

Automatically enriching a thesaurus with information from dictionaries

Authors

Abstract

Because information in broad-coverage knowledge bases, such as thesauri, is usually incomplete, merging information from different sources is a good way to increase coverage. We propose a method for enriching a thesaurus with information acquired automatically from dictionaries: pairs of synonyms are assigned to candidate synsets, and the pairs whose elements are not in the thesaurus are clustered to identify new synsets. This method was used to enrich a Brazilian Portuguese thesaurus with synonyms from a European Portuguese dictionary, resulting in a larger and broader thesaurus with new words and new concepts. The assignments and the obtained synsets were manually evaluated and yielded correctness scores higher than 71% and 85%, respectively.
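The two steps named in the abstract (assigning synonym pairs to candidate synsets, then clustering the remaining pairs) can be illustrated with a minimal sketch. The abstract does not state the assignment criterion or the clustering algorithm, so the sketch makes two assumptions: a pair is assigned to the candidate synset that shares the most words with it, and leftover pairs are grouped by connected components of the synonymy graph. The function name `enrich` and the toy data are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def enrich(thesaurus, pairs):
    """Enrich synsets with synonym pairs; return (enriched synsets, new synsets).

    thesaurus: list of sets of words (one set per synset).
    pairs: iterable of (word_a, word_b) synonym pairs from a dictionary.
    """
    synsets = [set(s) for s in thesaurus]

    # Index built from the ORIGINAL thesaurus: word -> indices of its synsets.
    index = defaultdict(set)
    for i, synset in enumerate(synsets):
        for word in synset:
            index[word].add(i)

    unassigned = []
    for a, b in pairs:
        candidates = index.get(a, set()) | index.get(b, set())
        if not candidates:
            # Neither word is in the thesaurus: keep the pair for clustering.
            unassigned.append((a, b))
            continue
        # ASSUMPTION: choose the candidate with the largest overlap with the
        # pair; ties go to the lowest synset index.
        best = max(sorted(candidates), key=lambda i: len(thesaurus[i] & {a, b}))
        synsets[best] |= {a, b}

    # ASSUMPTION: cluster leftover pairs as connected components of the
    # graph whose edges are the unassigned synonym pairs.
    graph = defaultdict(set)
    for a, b in unassigned:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    new_synsets = []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word in component:
                continue
            component.add(word)
            stack.extend(graph[word] - component)
        seen |= component
        new_synsets.append(component)

    return synsets, new_synsets


# Toy example with hypothetical data.
thesaurus = [{"carro", "automóvel"}, {"casa", "lar"}]
pairs = [("carro", "viatura"), ("lar", "habitação"),
         ("petiz", "miúdo"), ("miúdo", "garoto")]
enriched, new = enrich(thesaurus, pairs)
```

In this toy run, "viatura" and "habitação" are added to existing synsets, while the two pairs with no word in the thesaurus are merged into one new synset. Connected components are a deliberately simple stand-in: on real dictionary data they tend to merge distinct senses through ambiguous words, so a stricter clustering criterion would likely be needed.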

Keywords

thesaurus, wordnet, enrichment, synonymy

Subject

Natural Language Processing

Related Project

Onto.PT

Conference

15th Portuguese Conference on Artificial Intelligence (EPIA 2011), Lisbon, Portugal, 2011

