Chapter 10 Measures of Missing Data Information

This chapter covers three measures of missing data information: the Fraction of Missing Information (FMI), the Relative Increase in Variance due to nonresponse (RIV), and the Relative Efficiency (RE). They are derived from the between-imputation variance, the within-imputation variance, and the total variance. There are two versions of the FMI, which are referred to as lambda and FMI.

10.1 Fraction of Missing Information - Lambda

The proportion of the total variance that is due to missingness, lambda (Van Buuren 2018; Raghunathan 2016), can be derived from the between-imputation variance and the total variance as:

$\begin{equation} Lambda = \frac{V_B + \frac{V_B}{m}}{V_T} \tag{10.1} \end{equation}$

where m is the number of imputed datasets and ${V_B}$ and ${V_T}$ are the between-imputation variance and the total variance, respectively. Lambda can be interpreted as the proportion of the variance of the parameter estimate that is due to the missing data.

When we use the ${V_B}$ and ${V_T}$ values that were calculated in paragraph 5.1.2, lambda will be:

$Lambda = \frac{0.040027 + \frac{0.040027}{3}}{0.849084}=0.06283485$

SPSS does not report this value for lambda, but the mice package in R does. Van Buuren (2018) and Enders (2010) use the same formula for this measure of missing data information, but Van Buuren calls it lambda, whereas Enders calls it FMI.
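As a check on Equation 10.1, the calculation can be sketched in a few lines of Python. The function name `fraction_missing_lambda` is illustrative and not part of SPSS or mice. The inputs are the rounded values printed above, so the result agrees with the reported lambda to about four decimal places rather than exactly.

```python
# Values as printed in the worked example (rounded).
V_B = 0.040027   # between-imputation variance
V_T = 0.849084   # total variance
M = 3            # number of imputed datasets


def fraction_missing_lambda(v_between: float, v_total: float, n_imputations: int) -> float:
    """Proportion of the total variance due to the missing data (Equation 10.1)."""
    return (v_between + v_between / n_imputations) / v_total


lam = fraction_missing_lambda(V_B, V_T, M)
print(f"lambda = {lam:.5f}")
```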

10.2 Relative increase in variance

Another related measure is the relative increase in variance due to nonresponse. This value is calculated as:

$\begin{equation} RIV = \frac{V_B + \frac{V_B}{m}}{V_W} \tag{10.2} \end{equation}$

where ${V_B}$ and ${V_W}$ are the between-imputation and within-imputation variance, respectively. This value can be interpreted as the proportional increase in the sampling variance of the parameter of interest that is due to the missing data.

Filling in this formula with the values for ${V_B}$ and ${V_W}$ from paragraph 5.1.2 results in:

$RIV = \frac{0.040027 + \frac{0.040027}{3}}{0.7957147}=0.06704779$

This value is also presented in Figure 9.1 in the column Relative Increase Variance. RIV and lambda are related as:

$\begin{equation} RIV = \frac{Lambda }{1 - Lambda}. \tag{10.3} \end{equation}$
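Equation 10.3 follows when the total variance is pooled as $V_T = V_W + V_B + \frac{V_B}{m}$, which is assumed here and is consistent with the values used above. In that case $V_W = (1 - Lambda) \cdot V_T$, so dividing $V_B + \frac{V_B}{m} = Lambda \cdot V_T$ by ${V_W}$ gives $\frac{Lambda}{1 - Lambda}$. A minimal Python sketch, with illustrative function names, computes RIV from Equation 10.2 and checks the relation numerically. Because the inputs are the rounded values printed above, the RIV agrees with the reported value to about four decimal places.

```python
# Values as printed in the worked examples (rounded).
V_B = 0.040027    # between-imputation variance
V_W = 0.7957147   # within-imputation variance
V_T = 0.849084    # total variance
M = 3             # number of imputed datasets


def relative_increase_variance(v_between: float, v_within: float, n_imputations: int) -> float:
    """Relative increase in variance due to nonresponse (Equation 10.2)."""
    return (v_between + v_between / n_imputations) / v_within


riv = relative_increase_variance(V_B, V_W, M)

# Lambda from Equation 10.1, used to check Equation 10.3.
lam = (V_B + V_B / M) / V_T
riv_from_lambda = lam / (1.0 - lam)

print(f"RIV from variances: {riv:.5f}")
print(f"RIV from lambda:    {riv_from_lambda:.5f}")
```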

10.3 Fraction of Missing Information - FMI

The FMI is calculated from the RIV and the degrees of freedom of the pooled result as:

$\begin{equation} FMI = \frac{RIV + \frac{2}{df+3}}{1+RIV} \tag{10.4} \end{equation}$

where RIV is the relative increase in variance due to missing data and df is the degrees of freedom for the pooled result. The degrees of freedom for the pooled result can be obtained in two ways: ${df_{Old}}$ or ${df_{Adjusted}}$.

In SPSS, FMI is calculated using ${df_{Old}}$, which results in:

$FMI = \frac{RIV + \frac{2}{df_{Old}+3}}{1+RIV}=\frac{0.06704779 + \frac{2}{506.5576+3}}{1+0.06704779}=0.0665132$

In the R package mice, FMI is calculated using ${df_{Adjusted}}$, which results in:

$FMI = \frac{RIV + \frac{2}{df_{Adjusted}+3}}{1+RIV}=\frac{0.06704779 + \frac{2}{107.7509+3}}{1+0.06704779}=0.0797587$

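The difference between lambda and FMI is that FMI is adjusted for the fact that the number of imputed datasets is finite. Without the term $\frac{2}{df+3}$, Equation 10.4 reduces to $\frac{RIV}{1+RIV}$, which equals lambda by Equation 10.3. This term shrinks as df increases, so the two measures differ noticeably only when df is small.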
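The two FMI calculations can be reproduced with a short Python sketch of Equation 10.4. The function name is illustrative and not taken from SPSS or mice. The last lines use an arbitrarily large df, chosen only for illustration, to show that FMI approaches lambda as df grows.

```python
def fraction_missing_information(riv: float, df: float) -> float:
    """Fraction of missing information (Equation 10.4)."""
    return (riv + 2.0 / (df + 3.0)) / (1.0 + riv)


RIV = 0.06704779        # relative increase in variance from Section 10.2
DF_OLD = 506.5576       # degrees of freedom used by SPSS
DF_ADJUSTED = 107.7509  # degrees of freedom used by mice

fmi_old = fraction_missing_information(RIV, DF_OLD)
fmi_adjusted = fraction_missing_information(RIV, DF_ADJUSTED)

# Equation 10.3 rearranged: lambda = RIV / (1 + RIV).
lam = RIV / (1.0 + RIV)
# Illustrative large df: the adjustment term 2 / (df + 3) becomes negligible.
fmi_large_df = fraction_missing_information(RIV, 1e9)

print(f"FMI with df_Old:      {fmi_old:.7f}")
print(f"FMI with df_Adjusted: {fmi_adjusted:.7f}")
print(f"lambda:               {lam:.7f}")
print(f"FMI with df = 1e9:    {fmi_large_df:.7f}")
```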

10.4 Relative Efficiency

The Relative Efficiency (RE) is defined as:

$\begin{equation} RE = \frac{1}{1+\frac{FMI}{m}} \tag{10.5} \end{equation}$

where FMI is the fraction of missing information and m is the number of imputed datasets.

The RE value is only provided by SPSS. Filling in the values from Figure 9.1 gives:

$RE = \frac{1}{1+\frac{0.0665132}{3}}=0.9783098$

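The RE gives information about the precision of a parameter estimate, such as the standard error of a regression coefficient, that is obtained with m imputed datasets relative to the precision that would be obtained with an infinite number of imputed datasets.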
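A minimal Python sketch of Equation 10.5 reproduces the value above. The function name is illustrative, and the values of m other than 3 are hypothetical and included only to show how the RE moves toward 1 as more datasets are imputed while the FMI is held fixed.

```python
def relative_efficiency(fmi: float, n_imputations: int) -> float:
    """Relative efficiency of using n_imputations imputed datasets (Equation 10.5)."""
    return 1.0 / (1.0 + fmi / n_imputations)


FMI = 0.0665132  # FMI based on df_Old, as reported by SPSS

re_m3 = relative_efficiency(FMI, 3)
print(f"RE with m = 3: {re_m3:.7f}")

# Hypothetical numbers of imputations, holding the FMI fixed.
for m in (5, 10, 20, 50):
    print(f"RE with m = {m}: {relative_efficiency(FMI, m):.7f}")
```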

References

Enders, Craig K. 2010. Applied Missing Data Analysis. Guilford Press.

Raghunathan, T. 2016. Missing Data Analysis in Practice. Boca Raton, FL: CRC Press.

Van Buuren, S. 2018. Flexible Imputation of Missing Data. Second Edition. Boca Raton, FL: Chapman & Hall/CRC.