Improving Question-Answering for Portuguese using Triples Extracted from Corpora
Authors
Abstract
We present an evolution of a question-answering (QA) system for Portuguese that uses subject-predicate-object triples extracted from sentences in a corpus. The system is supported by indices that store those triples together with the related sentences and documents. It processes questions and retrieves answers based on the triples. For testing and evaluation, we used the CHAVE corpus, which was used in multiple editions of the CLEF multilingual QA tracks. The questions from those editions were used to query and benchmark our system. Currently, the system answers up to 42% of those questions. This document describes the modules that compose the system and how they are combined, provides a brief analysis of them, and reports current results along with some expectations regarding future work.