CISUC

MNAR Imputation with Distributed Healthcare Data

Authors

Abstract

Missing data is a common problem in real-world datasets and has a considerable impact on the learning process of classifiers. Although extensive work has been done in this field, the Missing Not At Random (MNAR) mechanism remains a challenge for existing imputation methods, mainly because the missingness is not related to any observed information. In healthcare contexts, MNAR arises in multiple scenarios, such as clinical trials in which participants may quit the study for reasons related to the outcome being measured. This work proposes an approach that uses different sources of information from the same healthcare context to improve imputation quality and classification performance for datasets with missing data under MNAR. The experiment was performed with several databases from the medical context, and the results show that using multiple sources of data has a positive impact on imputation error and classification performance.

Keywords

Missing data, Missing Not At Random, Missing mechanisms, Healthcare data, Data context, Data imputation

Subject

Missing Data

Conference

19th EPIA Conference on Artificial Intelligence 2019

DOI

