Evaluation of Oversampling Data Balancing Techniques in the Context of Ordinal Classification

Authors

Francisco Marques
Inês Campos Monteiro Sabino Domingues
José Pedro Pereira Amorim
Hugo Duarte
João Santos
Pedro Manuel Henriques da Cunha Abreu

Abstract

The machine learning field has grown considerably in recent years. There are, however, some problems still to be solved. The characteristics of the training sets, for instance, are known to affect classifier performance. Here, inspired by medical applications, we are interested in studying datasets that are both ordinal and imbalanced. Ordinal datasets have labels for which only the relative ordering between values is significant. Imbalanced datasets have very different numbers of examples per class.
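To make the two dataset properties concrete, here is a minimal sketch on a hypothetical toy label set (not data from the paper). It assumes that MMAE, listed in the keywords, denotes the maximum of the per-class mean absolute errors, as the term is commonly used in the ordinal classification literature; the function names and the toy labels are illustrative assumptions.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def per_class_mae(y_true, y_pred):
    """Mean absolute error computed separately for each true class.

    Labels are assumed to be integers whose order encodes the ordinal scale,
    so |true - predicted| measures how many ranks a prediction is off by.
    """
    errors = {}
    for t, p in zip(y_true, y_pred):
        errors.setdefault(t, []).append(abs(t - p))
    return {c: sum(e) / len(e) for c, e in errors.items()}


def mmae(y_true, y_pred):
    """Maximum of the per-class mean absolute errors (assumed MMAE definition)."""
    return max(per_class_mae(y_true, y_pred).values())


# Hypothetical toy data: three ordered classes, heavily skewed towards class 0.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0, 0]

ratio = imbalance_ratio(y_true)
worst_class_error = mmae(y_true, y_pred)
```

In this toy case the single example of the rarest class is misclassified by two ranks, so the per-class maximum is dominated by that class, whereas an error averaged over all examples would be pulled down by the majority class.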

Building upon our previous work, we make three new contributions: (1) we extend the number of classifiers, (2) we evaluate two techniques to balance intermediate training sets in binary decomposition methods (often used in multi-class contexts, and ordinal ones in particular), and (3) we propose a new, iterative, classifier-based over-sampling algorithm that we name InCuBAtE. Experiments were conducted on 6 private datasets, concerning the assessment of response to treatment in oncologic diseases, and 15 public datasets widely used in the literature. Compared with our previous work, results improved (or remained the same) for 4 of the 6 private datasets and for 11 of the 15 public datasets.
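The idea of balancing the intermediate training sets of a binary decomposition can be illustrated with a short sketch. The abstract does not specify the two balancing techniques evaluated, nor the details of InCuBAtE, so the code below is not the authors' method: it assumes a common ordinal decomposition into K-1 binary problems of the form "is the label greater than class k?" and uses plain random over-sampling as a stand-in balancing step. All names and the toy data are hypothetical.

```python
import random
from collections import Counter


def ordinal_subproblems(y, classes):
    """For ordered classes c_1 < ... < c_K, build K-1 binary targets 'y > c_k'."""
    return {c: [int(label > c) for label in y] for c in classes[:-1]}


def random_oversample(X, y_bin, seed=0):
    """Duplicate randomly chosen minority examples until both classes match.

    This is a stand-in balancing technique (an assumption), not InCuBAtE.
    """
    rng = random.Random(seed)
    counts = Counter(y_bin)
    if len(counts) < 2:
        # Only one class present: nothing to balance.
        return list(X), list(y_bin)
    minority = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority]
    pool = [i for i, label in enumerate(y_bin) if label == minority]
    extra = [rng.choice(pool) for _ in range(deficit)]
    idx = list(range(len(y_bin))) + extra
    return [X[i] for i in idx], [y_bin[i] for i in idx]


# Hypothetical toy data: one feature, three ordered classes, skewed to class 0.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [1.5], [1.7], [2.9]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 2]

# Balance each intermediate binary training set independently.
balanced = {}
for threshold, y_bin in ordinal_subproblems(y, [0, 1, 2]).items():
    balanced[threshold] = random_oversample(X, y_bin)
```

Each entry of `balanced` could then be used to train one binary classifier. Balancing per subproblem, rather than once on the original multi-class set, matters because the binary splits can be far more skewed than the original class distribution: in the toy data, the "greater than class 1" problem has a single positive example against eight negatives.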

Keywords

diseases; learning (artificial intelligence); medical computing; pattern classification; sampling methods; oversampling data balancing techniques; ordinal classification; data imbalance; dataset; classifier; oversampling strategies; multiclass tasks; medical applications; private datasets; data balance techniques; classification results; ordinal imbalanced datasets; public datasets; MMAE; oncologic diseases; Diseases; Toy manufacturing industry; Pipelines; Bibliographies; Automobiles; Encoding; Decoding
