Handling Missing Values via a Neural Selective Input Model

Authors

Abstract

Missing data is a ubiquitous problem with numerous and diverse causes. Handling Missing Values (MVs) properly is a crucial issue, particularly in Machine Learning (ML) and pattern recognition. To date, the only option available for Neural Networks (NNs) to handle this problem has been to rely on pre-processing techniques, such as imputation, to estimate the missing data values, which considerably limits the scope of their application. We propose to lift this limitation with a Neural Selective Input Model (NSIM) that accommodates different transparent and bound models, while providing support for NNs to handle MVs directly. By embedding the mechanisms that support MVs, we can obtain better models that reflect the uncertainty caused by unknown values. Experiments on several UCI datasets with different distributions and proportions of MVs show that the NSIM approach is very robust and yields good to excellent results. Furthermore, the NSIM performs better than state-of-the-art imputation techniques when MVs are prevalent across a large number of features or when the overall proportion of MVs is significant, while delivering competitive performance in the remaining cases. We demonstrate the usefulness and validity of the NSIM, making it a first-class method for dealing with this problem.
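
The contrast drawn above, between imputing MVs before training and letting the network handle them directly, can be illustrated with a minimal sketch. The abstract does not specify the NSIM formulation, so everything here is an assumption made for illustration: the function names `mean_impute` and `selective_input` are hypothetical, missing values are encoded as `None`, and the selective unit is assumed to output `tanh(w * x + b)` for an observed value and exactly 0 for a missing one.

```python
import math

def mean_impute(rows):
    """Pre-processing baseline: replace each missing value (None) with the
    mean of the observed values of that feature.
    Assumes every feature has at least one observed value."""
    n_features = len(rows[0])
    means = []
    for j in range(n_features):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]

def selective_input(row, weights, biases):
    """Hypothetical selective input layer (an assumption, not the paper's
    exact definition): one unit per feature, which is active only when the
    feature is observed and contributes 0 otherwise, so no value is guessed."""
    return [0.0 if x is None else math.tanh(w * x + b)
            for x, w, b in zip(row, weights, biases)]

# Two samples, two features; the second feature of the first sample is missing.
data = [[1.0, None], [3.0, 4.0]]
imputed = mean_impute(data)
activations = selective_input(data[0], weights=[1.0, 1.0], biases=[0.0, 0.0])
```

In this sketch the imputation baseline fills the gap with an estimate that downstream layers cannot tell apart from real data, whereas the selective unit simply stays silent for the unknown feature.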

Keywords

Missing Values, Neural Networks, Back-Propagation, Multiple Back-Propagation

Subject

Missing Values Handling

Journal

Neural Network World, Vol. 22, #4, pp. 357-370, Mirko Novák, January 2012
