CISUC

Evolutionary Computation for Classifier Assessment and Improvement

Authors

Abstract

Typical Machine Learning (ML) approaches rely on a dataset and a model to solve problems. For most problems, optimisation of ML approaches is crucial to attain competitive performance. Most of the effort goes towards optimising the model, by exploring new algorithms and tuning parameters. Nevertheless, the dataset is also a key factor in ML performance. Gathering, constructing and optimising a representative dataset is a hard and time-consuming endeavour, with no well-established guidelines to follow. In this thesis, we demonstrate the use of Evolutionary Computation (EC) to assess and improve classifiers via the synthesis of new instances. An analysis of the state of the art on dataset construction is performed. The quality of a dataset is tied to the availability of data, which in most cases is hard to control. A thorough analysis of Instance Selection and Instance Generation is also made, shedding light on points relevant to the development of our framework.

The Evolutionary FramEwork for Classifier assessmenT and ImproVemEnt (EFECTIVE) is then introduced and explored. Its key parts are: the Classifier System (CS) module, which holds the ML model to be assessed and improved; the EC module, responsible for generating new instances using the CS module for fitness assignment; and the Supervisor, a module responsible for managing the generated instances. The approach comes together in an iterative process of automatic assessment and improvement of classifiers.

In a first phase, EFECTIVE is tested as a generator, creating instances of a particular class. Without loss of generality, we apply the framework in the domain of image generation. The problem that motivated the approach is presented first: frontal face generation. In this case, the framework combines an EC engine with a CS module, i.e., a frontal face detector, to generate images of frontal faces. The results were revealing in two ways. On the one hand, the approach was able to generate images that, from a subjective standpoint, resemble faces and are classified as such by the classifier. On the other hand, most of the images did not resemble faces, although they were still classified as such by the classifier module. Based on these results, we extended the approach to generate other types of objects, attaining similar results. We also combined several classifiers to study the evolution of ambiguous images, i.e., images that induce multistable perception. Overall, the results suggest that the framework is viable as a generator of instances and that these instances are often misclassified by the CS module.

Building on these results, in a second phase we study EFECTIVE as a means of improving classifier performance. The core idea is to use the instances evolved by the EC engine to augment the training dataset. In this phase, the framework uses the Supervisor module to select and filter the instances to be added to the dataset; retraining the classifier with these instances completes an iteration of the framework. We tested this pipeline on a face detection problem, evolving instances to: (i) expand the negative dataset; (ii) expand the positive dataset; and (iii) expand both datasets in the same iteration.

Overall, the results show that expanding the negative dataset with misclassified instances reduces the number of false alarms; expanding the positive dataset increases the number of hits; and expanding both datasets allows a simultaneous reduction of false alarms and increase of hits. After demonstrating the adequacy of EFECTIVE for face detection, we tested the framework in a Computational Creativity context to create an image generation system that promotes style change, obtaining results that further demonstrate the potential of the framework.
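For illustration only, the sketch below outlines one iteration of the kind of loop the abstract describes: an EC engine evolves instances using the classifier's confidence as fitness, a Supervisor-style filter keeps the instances accepted by the classifier, and the retained instances augment the training data before retraining. All names, the toy evolutionary operators, and the placeholder scoring are hypothetical stand-ins under these assumptions, not the thesis implementation.

```python
import random


class Classifier:
    """Stand-in Classifier System (CS): scores an instance in [0, 1]."""

    def fit(self, positives, negatives):
        # Placeholder training step; a real CS would fit a detector here.
        self.trained_on = (len(positives), len(negatives))

    def score(self, instance):
        # Placeholder confidence; a real CS would return detector confidence.
        return random.random()


def evolve_instances(classifier, generations=50, pop_size=20, length=64):
    """Stand-in EC engine: evolves instances whose fitness is the
    classifier's confidence for the target class."""
    population = [[random.random() for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by classifier confidence (the fitness function).
        population.sort(key=classifier.score, reverse=True)
        parents = population[: pop_size // 2]
        # Refill the population with mutated copies of the best individuals.
        population = parents + [
            [g + random.gauss(0.0, 0.1) for g in random.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
    return population


def supervisor_filter(instances, classifier, threshold=0.5):
    """Stand-in Supervisor: keeps instances the classifier accepts,
    i.e., candidates to be labelled and added to the training data."""
    return [x for x in instances if classifier.score(x) > threshold]


# One iteration of assessment and improvement (toy data, negative-set expansion).
positives, negatives = [], []              # seed training data (omitted here)
cs = Classifier()
cs.fit(positives, negatives)
evolved = evolve_instances(cs)
accepted = supervisor_filter(evolved, cs)  # accepted by the CS module
negatives.extend(accepted)                 # treat them as false alarms
cs.fit(positives, negatives)               # retrain -> next iteration
```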

PhD Thesis

Evolutionary Computation for Classifier Assessment and Improvement, October 2018
