Towards Expanding Relevance Vector Machines to Large Scale Datasets
Authors
Abstract
In this paper we develop and analyze methods for expanding automated learning of Relevance Vector Machines (RVM) to large scale text sets. RVM rely on Bayesian inference learning and, while maintaining state-of-the-art performance, offer sparse and probabilistic solutions. However, efforts towards applying RVM to large scale sets have met with limited success in the past, due to computational constraints. We propose a diversified set of divide-and-conquer approaches where decomposition techniques promote the definition of smaller working sets that permit the use of all training examples. The rationale is that by exploring incremental, ensemble and boosting strategies, it is possible to improve classification performance, taking advantage of the large training set available. Results on Reuters-21578 and RCV1 are presented, showing performance gains and maintaining sparse solutions that can be deployed in distributed environments.
Keywords
Large scale learning; text classification; relevance vector machines
Subject
Relevance Vector Machines
Related Project
CATCH - Inductive Inference for Large Scale Data Bases Text CATegorization
Journal
International Journal of Neural Systems, Vol. 18, #1, pp. 45-58, World Scientific Publishing Company, February 2008
Cited by
Year 2011 : 3 citations
E Fokoué, D Sun…, Fully Bayesian analysis of the relevance vector machine with an extended hierarchical prior structure, Statistical Methodology, Elsevier, 2011
Acharya, U.R., Sree, S.V., Suri, J.S., Automatic detection of epileptic EEG signals using higher order cumulant features, International Journal of Neural Systems 21 (5), pp. 403-414, 2011
Fokoué, E., Goel, P., An optimal experimental design perspective on radial basis function regression, Communications in Statistics - Theory and Methods 40 (7), pp. 1184-1195, 2011
Year 2010 : 3 citations
Fokoue, Ernest; Goel, Prem, "An optimal experimental design perspective on radial basis function regression", John D. Hromi Center for Quality and Applied Statistics (KGCOE), 2010
Patel, P.B., Marwala, T., Caller behaviour classification using computational intelligence methods, International Journal of Neural Systems 20 (1), pp. 87-93, 2010
Yang, Y., Lu, B.-L., Protein subcellular multi-localization prediction using a min-max modular support vector machine, International Journal of Neural Systems 20 (1), pp. 13-28, 2010