13.6 Criteria for Choosing an Effective Approach

An appropriate method for handling missing data should satisfy the following criteria:

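  1. Unbiased Parameter Estimates: The technique should yield unbiased estimates of key quantities, such as means, variances, and regression coefficients. This matters most when the data are MAR or MNAR, where naive methods are typically biased. Under MNAR, even the preferred methods below remain biased unless the missingness mechanism itself is modeled.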

  2. Adequate Power: The method should preserve statistical power, enabling robust hypothesis testing and model estimation. This ensures that important effects are not missed due to inflated type II error.

  3. Accurate Standard Errors: Accurate estimation of standard errors is critical for reliable p-values and confidence intervals. Single imputation treats imputed values as if they had been observed, so it often underestimates standard errors and leads to overconfident conclusions.

Preferred Methods: Multiple Imputation and Full Information Maximum Likelihood

Multiple Imputation (MI):

  • MI replaces missing values with multiple plausible values drawn from a predictive distribution. It generates multiple complete datasets, analyzes each dataset, and combines the results.

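  • Pros: Propagates the uncertainty due to missing values into the pooled estimates, provides valid standard errors, and gives valid inference under MAR when the imputation model is adequate.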

  • Cons: Computationally intensive and sensitive to misspecification of the imputation model.
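
The "combine the results" step of MI is commonly done with Rubin's rules: the pooled estimate is the average of the per-dataset estimates, and the pooled variance adds the average within-imputation variance to an inflated between-imputation variance. The following is a minimal sketch under that assumption; the function name `pool_rubin` and the numbers are hypothetical and serve only as illustration.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    A sketch of Rubin's rules for a single scalar parameter.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = statistics.fmean(estimates)       # pooled point estimate
    within = statistics.fmean(variances)      # average within-imputation variance
    between = statistics.variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between    # total variance of the pooled estimate
    return q_bar, math.sqrt(total)


# Hypothetical results from m = 5 imputed datasets (illustrative numbers only).
estimates = [2.0, 2.2, 1.9, 2.1, 2.3]
std_errors = [0.50, 0.52, 0.49, 0.51, 0.50]

pooled_est, pooled_se = pool_rubin(estimates, [se ** 2 for se in std_errors])
print(f"pooled estimate = {pooled_est:.3f}, pooled SE = {pooled_se:.3f}")
```

The pooled standard error is larger than any single-dataset standard error here because the between-imputation term carries the extra uncertainty caused by the missing values. That term is exactly what single imputation leaves out.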

Full Information Maximum Likelihood (FIML):

  • FIML uses all available data to estimate parameters directly, avoiding the need to impute missing values explicitly.

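  • Pros: Statistically efficient and unbiased under MAR when the model is correctly specified, and requires only a single model fit rather than multiple imputed datasets.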

  • Cons: Requires a correctly specified model and yields biased estimates under MNAR unless the missingness mechanism is modeled explicitly.
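
To make "uses all available data" concrete, the observed-data log-likelihood can be sketched for the multivariate normal case, which is a common but not universal assumption in FIML software. The notation is introduced here for illustration: y_i holds the p_i variables actually observed for case i, and \mu_i and \Sigma_i are the matching sub-vector of the mean and sub-matrix of the covariance.

```latex
\ell(\mu, \Sigma) = \sum_{i=1}^{n} \left[ -\frac{p_i}{2}\log(2\pi)
  - \frac{1}{2}\log\lvert\Sigma_i\rvert
  - \frac{1}{2}\,(y_i - \mu_i)^{\top}\Sigma_i^{-1}(y_i - \mu_i) \right]
```

Each case contributes through whichever variables it has, so no row is discarded and no value is filled in. The parameters \mu and \Sigma are then estimated by maximizing this sum.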

Methods to Avoid

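  • Single Imputation (e.g., Mean, Mode):
    • Replaces each missing value with a single fixed value. This leads to biased estimates, understates variability, attenuates correlations, and produces standard errors that are too small.
  • Listwise Deletion:
    • Discards rows with missing data, reducing sample size and power, and potentially introducing bias if the data are not MCAR.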
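
Both problems can be seen in a small simulation. The sketch below uses made-up data and an arbitrary missingness rule (y is missing whenever the observed x exceeds 0.5, which is a MAR mechanism); all variable names and constants are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Simulate a fully observed predictor x and an outcome y that depends on it.
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + random.gauss(0, 0.6) for xi in x]

# Impose MAR missingness: y is unobserved whenever the observed x exceeds 0.5.
y_missing = [yi if xi <= 0.5 else None for xi, yi in zip(x, y)]

# Listwise deletion: keep only the rows where y is observed.
y_complete = [yi for yi in y_missing if yi is not None]

# Mean imputation: fill every gap with the observed mean of y.
fill = statistics.fmean(y_complete)
y_mean_imputed = [fill if yi is None else yi for yi in y_missing]

full_mean = statistics.fmean(y)
listwise_mean = statistics.fmean(y_complete)
full_var = statistics.variance(y)
complete_var = statistics.variance(y_complete)
imputed_var = statistics.variance(y_mean_imputed)

print(f"mean of y, full data:            {full_mean:.3f}")
print(f"mean of y, listwise deletion:    {listwise_mean:.3f}")
print(f"variance of y, full data:        {full_var:.3f}")
print(f"variance of y, complete cases:   {complete_var:.3f}")
print(f"variance of y, mean imputation:  {imputed_var:.3f}")
```

Because high-x rows are the ones removed, the complete-case mean of y is pulled well below the full-data mean. Mean imputation inherits that biased mean and, by placing every imputed value at a single point, shrinks the variance further still.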

Practical Considerations

Beyond statistical validity, the choice of method should also weigh:

  • Computational efficiency and ease of implementation.
  • Compatibility with downstream analysis methods.
  • Alignment with the data’s missingness mechanism (MCAR, MAR, or MNAR).