CISUC

Rare Class Text Categorization with SVM Ensemble

Authors

Abstract

Text classification assigns a class from a predetermined set to a new document. In real-world applications, most classes have only a few positive examples, while the overall number of examples is huge. In this setting, classifier performance can degrade sharply, especially with respect to false negatives. To handle this problem, we propose a committee of several SVMs whose learning strategy uses the separating margin as the differentiating factor on positive classifications. While enabling robustness, the method improves performance by correcting the errors of one classifier using the accurate output of others. We demonstrate the practicality and effectiveness of the method with simulation results on the Reuters-21578 data set.
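The abstract does not spell out the committee's combination rule, so the following is only a minimal sketch of one way a margin could act as the differentiating factor on positive classifications. Everything in it is an assumption made for illustration: the names `signed_margin` and `committee_predict`, the `min_margin` threshold, the majority-vote fallback, and the three toy linear members, which stand in for trained SVMs.

```python
import math


def signed_margin(weights, bias, x):
    """Signed geometric distance of x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * xi for w, xi in zip(weights, x)) + bias) / norm


def committee_predict(members, x, min_margin=0.5):
    """Combine linear members; returns +1 (in class) or -1 (not in class).

    Hypothetical rule, not taken from the paper: a positive vote whose
    margin reaches min_margin is trusted outright, which targets false
    negatives on rare classes. Otherwise the members vote by sign.
    """
    margins = [signed_margin(w, b, x) for w, b in members]
    # A single confident positive overrides the other members.
    if max(margins) >= min_margin:
        return 1
    # No confident positive: fall back to a simple majority on the sign.
    positives = sum(1 for m in margins if m > 0)
    return 1 if positives * 2 > len(margins) else -1


# Toy members given as (weights, bias); in practice these would be learned.
members = [
    ([1.0, 0.0], -1.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], -3.0),
]

# Two of three members reject this document, but the first accepts it
# with margin 1.0, so the committee labels it positive.
label = committee_predict(members, [2.0, 0.0])
```

In this sketch a plain majority vote would reject the document `[2.0, 0.0]`, while the margin rule accepts it. That is the kind of correction of one classifier by another that the abstract describes, under the stated assumptions.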

Keywords

Text Classification, SVM, Ensembles

Subject

Text Classification, SVM, Ensembles

Related Project

CATCH - Inductive Inference for Large Scale Data Bases Text CATegorization

Journal

Journal of Electrotechnical Review (Przegląd Elektrotechniczny), Vol. 1, pp. 28-31, January 2006
