Distributed Ensemble Learning in Text Classification
Authors
Abstract
In today's society, individuals and organizations are faced with an ever growing load and diversity of textual information and content, and with increasing demands for knowledge and skills. In this work we try to answer part of these challenges by addressing text classification problems, essential to managing knowledge, by combining several different pioneer kernel-learning machines, namely Support Vector Machines and Relevance Vector Machines. To excel complex learning procedures we establish a model of high-performance distributed computing environment to help tackling the tasks involved in the text classification problem.The presented approach is valuable in many practical situations where text classification is used. Reuters-21578 benchmark data set is used to demonstrate the strength of the proposed system while different ensemble based learning machines provide text classification models that are efficiently deployed in the Condor and Alchemi platforms.