Boosting RVM Classifiers for Large Data Sets

Authors

Catarina Silva
Bernardete Ribeiro
Andrew H Sung

Abstract

Relevance Vector Machines (RVM) extend Support Vector Machines (SVM) to have probabilistic interpretations, to build sparse training models with fewer basis functions (i.e., relevance vectors or prototypes), and to realize Bayesian learning by placing priors over parameters (i.e., introducing hyperparameters). However, RVM algorithms do not scale up to large data sets. To overcome this problem, in this paper we propose an RVM boosting algorithm and demonstrate its potential with a text mining application. The idea is to build weaker classifiers and then improve overall accuracy by using a boosting technique for document classification. The proposed algorithm is able to incorporate all the available training data; when combined with sampling techniques for choosing the working set, the boosted learning machine is able to attain high accuracy. Experiments on the REUTERS benchmark show that the approach achieves accuracy competitive with state-of-the-art SVM; meanwhile, the sparser solution found allows real-time implementations.
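The idea of boosting weak classifiers trained on sampled working sets can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a discrete AdaBoost-style weighting scheme (the abstract does not name the boosting variant) and uses a one-feature decision stump as a stand-in for the RVM weak learner, since RVM training is not available in the Python standard library. The names `train_stump`, `boost`, `predict`, `rounds`, and `working_set_size` are hypothetical. Labels are assumed to be -1 or +1.

```python
import math
import random


def train_stump(X, y):
    """Fit a one-feature threshold classifier.

    Assumption: this stump is only a stand-in for an RVM weak learner.
    """
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({x[j] for x in X}):
            for sign in (1, -1):
                errors = sum(
                    1 for x, label in zip(X, y)
                    if (sign if x[j] >= threshold else -sign) != label
                )
                if best is None or errors < best[0]:
                    best = (errors, j, threshold, sign)
    _, j, threshold, sign = best
    return lambda x: sign if x[j] >= threshold else -sign


def boost(X, y, rounds=10, working_set_size=20, seed=0):
    """AdaBoost-style loop where each weak learner sees only a sampled working set."""
    rng = random.Random(seed)
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, classifier) pairs
    for _ in range(rounds):
        # Sample a small working set in proportion to the current weights.
        idx = rng.choices(range(n), weights=weights, k=working_set_size)
        clf = train_stump([X[i] for i in idx], [y[i] for i in idx])
        # The weighted error is measured on ALL training data, not just the sample.
        preds = [clf(x) for x in X]
        err = sum(w for w, p, label in zip(weights, preds, y) if p != label)
        if err >= 0.5:
            continue  # no better than chance under current weights: skip this round
        err = max(err, 1e-10)  # guard against division by zero for a perfect learner
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, clf))
        # Raise the weights of misclassified examples, then renormalise.
        weights = [
            w * math.exp(-alpha * label * p)
            for w, p, label in zip(weights, preds, y)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of the weak classifiers; returns -1 or +1."""
    score = sum(alpha * clf(x) for alpha, clf in ensemble)
    return 1 if score >= 0 else -1
```

Calling `boost(X, y)` on a list of feature vectors `X` and labels `y` returns the weighted ensemble, and `predict(ensemble, x)` classifies a new vector. Each weak learner is cheap because it trains on `working_set_size` examples only, while the reweighting step still takes every training example into account.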

Subject

Large Scale Learning; RVM

Related Project

CATCH - Inductive Inference for Large Scale Data Bases Text CATegorization

Conference

ICANNGA 2007, April 2007

Cited by

Year 2009: 1 citation

BMDS: An interpretable string based malware detection system using SVM ensemble with bagging
Ye, Y., Chen, L., Wang, D., Li, T., Jiang, Q., Zhao, M. Journal in Computer Virology 5 (4), pp. 283-293, 2009