Multiple imputation of unordered categorical missing data: A comparison of the multivariate normal imputation and multiple imputation by chained equations
Abstract
Missing data are common in survey data sets. Enrolled subjects
often do not have data recorded for all variables of interest. Inappropriate handling of missing values may negatively affect the inferences drawn. Therefore,
special attention is needed when analysing incomplete data. Multivariate
normal imputation (MVNI) and multiple imputation by chained equations
(MICE) have emerged as two of the leading techniques for dealing with missing data. The
former assumes a joint multivariate normal distribution for the variables in the imputation model,
whereas the latter fills in missing values using a conditional model suited to the distributional
form of each variable to be imputed. This study examines the performance
of these methods when data are missing at random on unordered categorical
variables treated as predictors in regression models. First, a complete survey data
set (one with no missing values) is used to generate a data set in which observations on unordered categorical variables are missing at random.
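
As an illustration of how missing-at-random values can be generated on an unordered categorical variable, the following is a minimal sketch and not the procedure used in this study. It assumes a logistic missingness model that depends only on a fully observed covariate; the function name `make_mar`, the variables `age` and `region`, and the parameters `base_rate` and `slope` are all hypothetical.

```python
import numpy as np

def make_mar(categories, covariate, base_rate=0.3, slope=1.0, seed=0):
    """Delete entries of `categories` with a probability that depends only on
    the fully observed `covariate` (a missing-at-random mechanism).

    Assumption: a logistic model for the missingness probability; the study
    itself may have used a different mechanism.
    """
    rng = np.random.default_rng(seed)
    # Standardise the covariate so that `slope` is on a comparable scale.
    z = (covariate - covariate.mean()) / covariate.std()
    # Log-odds of being missing: intercept from base_rate, plus covariate effect.
    logit = np.log(base_rate / (1.0 - base_rate)) + slope * z
    p_missing = 1.0 / (1.0 + np.exp(-logit))
    mask = rng.random(len(categories)) < p_missing
    out = categories.astype(object).copy()
    out[mask] = None  # None marks a missing category
    return out, mask

# Hypothetical complete data: a continuous covariate and a 4-level nominal variable.
rng = np.random.default_rng(42)
n = 1000
age = rng.normal(40, 10, n)
region = rng.choice(["north", "south", "east", "west"], size=n)

region_mar, mask = make_mar(region, age)
print("proportion missing:", mask.mean())
```

Because the deletion probability is a function of `age` alone, and `age` is always observed, the missingness does not depend on the unobserved values of `region`, which is what the missing-at-random assumption requires.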