Objective: When prediction models are developed by complete case analysis (CCA), missing information in either baseline characteristics (predictors) or outcomes may lead to biased results. Multiple imputation (MI) has been shown to be suitable for obtaining unbiased results. This study gives researchers an empirical illustration of the use of MI in a data set on low back pain by comparing MI with the more commonly used CCA. We show the effects of imputing missing information on the composition and performance of prognostic models, distinguishing between imputation of missing baseline characteristics and imputation of missing outcome data.
Methods: Data came from the Beliefs about Backpain cohort, a study of psychologic obstacles to recovery in primary care back pain patients in the United Kingdom. Candidate predictors included demographics, back pain characteristics, and psychologic variables. CCA was compared with MI in patients with complete outcome but missing baseline data (n = 809) and in patients with missing baseline or outcome data (n = 1591). MI was performed using a Multiple Imputation by Chained Equations (MICE) procedure.
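As an illustration of the chained-equations idea named in the Methods, the following is a minimal sketch in Python and not the procedure used in the study. It assumes a toy data set with two numeric variables, uses simple linear regression for each variable, and adds only residual noise to the imputed values; a full MICE implementation would also draw the regression parameters from their posterior distribution. The function names `fit_line`, `chained_imputation`, and `multiple_imputation`, and the toy data, are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / max(len(xs) - 2, 1)) ** 0.5
    return a, b, sd


def chained_imputation(rows, n_iter=10, rng=None):
    """Return one completed copy of rows ([x, y] pairs, None = missing)."""
    rng = rng or random.Random()
    missing = [[v is None for v in row] for row in rows]
    data = [list(row) for row in rows]  # work on a copy

    # Step 1: start by filling every gap with the observed mean.
    for j in range(2):
        mean_j = statistics.fmean(r[j] for r in rows if r[j] is not None)
        for row in data:
            if row[j] is None:
                row[j] = mean_j

    # Step 2: cycle through the variables, regressing each one on the
    # other and replacing its missing entries with prediction + noise.
    for _ in range(n_iter):
        for j in range(2):
            k = 1 - j  # index of the predictor variable
            idx = [i for i in range(len(data)) if not missing[i][j]]
            a, b, sd = fit_line([data[i][k] for i in idx],
                                [data[i][j] for i in idx])
            for i in range(len(data)):
                if missing[i][j]:
                    data[i][j] = a + b * data[i][k] + rng.gauss(0, sd)
    return data


def multiple_imputation(rows, m=5, seed=0):
    """Create m completed data sets from one incomplete data set."""
    rng = random.Random(seed)
    return [chained_imputation(rows, rng=rng) for _ in range(m)]


# Hypothetical toy data: None marks a missing value.
toy = [[1.0, 2.1], [2.0, None], [3.0, 6.2], [None, 8.1],
       [5.0, 9.8], [6.0, None], [7.0, 14.3]]

completed = multiple_imputation(toy, m=5, seed=42)

# Analyse each completed data set, then pool the estimates (here: mean of y).
pooled_mean_y = statistics.fmean(
    statistics.fmean(row[1] for row in d) for d in completed
)

# Complete case analysis, for comparison, drops every row with a gap.
complete_cases = [row for row in toy if None not in row]
cca_mean_y = statistics.fmean(row[1] for row in complete_cases)
```

In this sketch each of the five completed data sets is analysed separately and the estimates are then averaged, whereas the complete case estimate uses only the rows without any missing value.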
Results: Cases with missing outcome data (n = 782, 49.1%) and cases with missing baseline data (n = 116, 8%) both differed from complete cases in the distribution of some predictors and more often had a poor outcome. When CCA was compared with MI, model composition proved to be affected.
Conclusions: Complete case analysis can give biased results, even when only small amounts of data are missing. Now that MI is available in standard statistical software, we recommend that it be used to handle missing data.
This abstract is reproduced with the permission of the publisher.