Reducing bias and properly reflecting uncertainty from missing data
Missing data are inevitable in CTU trials and observational
studies, and may lead to bias. Even modest amounts of missing data
(10-15%) can make study conclusions unreliable.
Simple approaches to dealing with
missing data in trials, such as ‘missing equals failure’ and ‘last
observation carried forward’, are known to give biased and
over-precise estimates of treatment effects. Nevertheless, they are
still used in the analysis of trials and are recommended by
regulatory agencies.
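To make the problem concrete, here is a minimal Python sketch of what ‘last observation carried forward’ does to one participant's repeated measurements. The function name `locf` and the example scores are illustrative assumptions, not taken from any CTU software.

```python
def locf(series):
    """Replace each missing value (None) with the most recent observed value.

    Values missing before the first observation stay missing.
    """
    filled = []
    last = None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# A hypothetical participant whose score was falling before drop-out.
scores = [10, 8, None, None]
print(locf(scores))  # [10, 8, 8, 8]
```

The filled-in values assume no change after drop-out, and a later analysis treats them as if they had been observed, which is one way such methods can end up both biased and over-precise.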
Better approaches include analysis of
all available measurements in a likelihood-based analysis,
inverse-probability weighting-based estimating equations, and
methods based on multiple imputation of the missing data.
These methods are valid when the missing data are ‘missing at
random’.
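Analyses of multiply imputed data are usually combined with Rubin's rules: the estimates from the imputed datasets are averaged, and the total variance adds the between-imputation variance to the average within-imputation variance. The sketch below shows that calculation in Python; the function name `pool_rubin` and its inputs are illustrative assumptions and do not reproduce the ice or mim implementations.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their squared standard errors.

    Returns (pooled estimate, total variance) following Rubin's rules.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical treatment-effect estimates from three imputed datasets.
estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_variance)
```

The between-imputation term is what carries the extra uncertainty due to the missing data; ignoring it would give over-precise results.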
The CTU has contributed to these
approaches, and in particular to the practical implementation of
multiple imputation through the development of the
ice and mim software packages in
Stata.
Several aspects of multiple imputation
methods deserve further exploration and development. We plan to
assess the robustness of the methods that assume ‘missing at
random’ through a sensitivity analysis varying the degree to which
the missing data are related to their unobserved values, i.e.
‘missing not at random’. We are also concerned with definitions and
methods to produce an intention-to-treat analysis when there are
missing data, and plan to further develop Stata software for the
practical use of multiple imputation.
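One common way to set up such a sensitivity analysis is a delta adjustment: values are imputed as under ‘missing at random’ and then shifted by an offset, delta, which is varied over a plausible range. The Python sketch below is a deliberately simplified illustration that assumes a single outcome, no covariates and single (not multiple) imputation, so it shows only how the point estimate moves with delta. The function name and data are hypothetical, and this is not necessarily the approach the Unit plans to develop.

```python
from statistics import mean


def delta_adjusted_mean(values, delta):
    """Mean outcome after imputing each missing value (None) as the
    observed mean plus delta.

    delta = 0 corresponds to a crude 'missing at random' style imputation;
    non-zero delta lets the missing values differ systematically.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values")
    fill = mean(observed) + delta
    return mean(fill if v is None else v for v in values)


# Hypothetical outcomes with two of five values missing.
outcomes = [4.0, 6.0, None, 5.0, None]
for delta in (-2.0, -1.0, 0.0, 1.0):
    print(delta, delta_adjusted_mean(outcomes, delta))
```

In this toy setting the estimate shifts by delta times the proportion missing, so the range of delta that would change the study's conclusion can be read off directly. A full analysis would apply the shift within multiple imputation so that uncertainty is reflected properly.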
The Unit collaborates with leading
missing data methods researchers at the London School of Hygiene
and Tropical Medicine, University of Bristol, University of
Melbourne (Australia), MRC Biostatistics Unit and beyond, and
contributes to workshops on imputation.
Key projects
- Further extensions to the widely used
multiple imputation software in Stata: development of the
underlying theory, practical strategies around model building and
estimation with multiply imputed data
- Methods for sensitivity analysis when data are
‘missing not at random’
Selected publications
- Vergouwe Y, Royston P, Moons KG, Altman
DG. Development and validation of a prediction model with missing
predictor data: a practical approach. Journal of Clinical
Epidemiology 2010; 63:205-214
- White IR, Royston P. Imputing missing
covariate values for the Cox model. Statistics in Medicine 2009;
28:1982-1998
- Wood AM, White IR, Royston P. How
should variable selection be performed with multiply imputed data?
Statistics in Medicine 2008; 27:3227-3246
- Seaman SR, White IR, Copas AJ, Li L.
Combining multiple imputation and inverse-probability weighting.
Biometrics; in press