Chapter 3 Description of the technique: pattern-mixture modeling

3.1 Two approaches to MNAR sensitivity analysis

Two major techniques have been used to assess the bias introduced by MNAR mechanisms (Leurent et al. 2018; Little 1993; National Research Council 2010; Rubin 1987): selection models and pattern-mixture models. Since MNAR methods must, by definition, rely on assumptions and distributions outside of the observed data, both techniques use Bayes’ theorem to test the effects of different plausible magnitudes of the MNAR mechanism. The following are general formulas which describe the two approaches:

Selection models:

\(P(Y,R)=P(Y)P(R|Y)\)

Pattern-mixture models:

\(P(Y,R)=P(Y|R)P(R)\), so that \(P(Y)=P(Y|R=1)P(R=1)+P(Y|R=0)P(R=0)\)

Where

\(Y\) = outcome, and

\(R\) = an indicator variable for being observed in the data

In essence, both techniques decompose the joint distribution of the outcome \(Y\) and the indicator \(R\) of being observed. But because the distribution of the outcome among unobserved cases, \(P(Y|R=0)\), is unknown, both methods require the analyst to select a Bayesian “prior” for how \(R\) relates to \(Y\). This prior can be informed by externally collected data, expert knowledge, or relevant literature (see Section 7).

The selection model approach was originally developed by Heckman (1979). This approach assumes a mechanism that predicts completeness – the “selection model” – and multiplies the resulting response weights \(P(R|Y)\) by the marginal distribution of the outcome \(P(Y)\). The challenge with selection models is that both terms are unknown, so assumptions must be made for both \(P(R|Y)\) and \(P(Y)\).
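To make the selection-model factorization concrete, the following minimal Python sketch simulates an outcome and a response mechanism \(P(R=1|Y)\). The logistic form of the mechanism, the parameter names `alpha` and `beta`, and the standard normal outcome are illustrative assumptions, not part of any particular published model.

```python
import math
import random


def simulate_selection(n, alpha, beta, seed=0):
    """Simulate Y ~ N(0, 1) and a response indicator R with
    P(R=1 | Y) = logistic(alpha + beta * Y) (an assumed mechanism).

    Returns (mean of all Y, mean of Y among observed cases).
    """
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    observed = []
    for yi in y:
        p_observed = 1.0 / (1.0 + math.exp(-(alpha + beta * yi)))
        if rng.random() < p_observed:
            observed.append(yi)
    return sum(y) / n, sum(observed) / len(observed)


# beta = 0: missingness does not depend on Y, so the two means agree closely.
full_mean, obs_mean = simulate_selection(20000, alpha=0.0, beta=0.0)
print(f"beta=0.0  full={full_mean:.3f}  observed={obs_mean:.3f}")

# beta > 0: larger outcomes are more likely to be observed (MNAR),
# so the complete-case mean is biased upward.
full_mean, obs_mean = simulate_selection(20000, alpha=0.0, beta=1.5)
print(f"beta=1.5  full={full_mean:.3f}  observed={obs_mean:.3f}")
```

In real data only the observed values are available, so `beta` cannot be estimated without further assumptions; this is the difficulty with selection models noted above.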

Pattern-mixture model approaches, as described by both Rubin (1987) and Little (1994), accomplish a similar goal, except that they build upon the distribution of \(Y\) among observed cases, \(P(Y|R=1)\), which can be estimated from the data. This means that only one assumption is required: how \(P(Y|R=0)\) differs from \(P(Y|R=1)\). The approach then mixes the distributions of \(Y\) given \(R=1\) and given \(R=0\) by applying the “mixing probabilities” \(P(R=1)\) and \(P(R=0)\).
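The mixing step can be illustrated at the level of means. The sketch below assumes that the unobserved group differs from the observed group by a fixed shift, \(E[Y|R=0]=E[Y|R=1]+\delta\); the shift parameter `delta` and the function name are hypothetical choices made for illustration, and the input values are invented.

```python
def pattern_mixture_mean(mean_observed, p_observed, delta):
    """Marginal mean of Y under a two-pattern mixture.

    Assumes E[Y | R=0] = E[Y | R=1] + delta, where delta is an
    analyst-chosen (hypothetical) sensitivity parameter.
    """
    if not 0.0 <= p_observed <= 1.0:
        raise ValueError("p_observed must be a probability")
    mean_unobserved = mean_observed + delta
    # Mix the two pattern-specific means with P(R=1) and P(R=0).
    return mean_observed * p_observed + mean_unobserved * (1.0 - p_observed)


# Invented example: observed mean 10, 80% of participants observed.
for delta in (0.0, -2.0, -5.0):
    estimate = pattern_mixture_mean(10.0, 0.8, delta)
    print(f"delta={delta:+.1f}  marginal mean={estimate:.2f}")
```

With `delta = 0` the unobserved group is assumed to resemble the observed group, and the marginal mean equals the observed mean; nonzero values of `delta` express progressively stronger MNAR assumptions.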

3.2 Proposed approach

To assess how sensitive results are to non-ignorable LTFU and MNAR outcome data, we propose the use of pattern-mixture modeling. Pattern-mixture modeling has several advantages over selection modeling, including its relatively simple application in statistical software, integrated consideration of MAR and MNAR mechanisms, and explicit statement of the type and magnitude of MNAR mechanisms being tested.

Furthermore, pattern-mixture models can be fit as an extension of the multiple-imputation-based modeling framework for MAR mechanisms. In the real world, many researchers would have already conducted a complete case analysis and considered implementing multiple imputation for the data they believe is at least partially missing at random, conditional on measured covariates.

As will be shown below, applying pattern-mixture modeling within a multiple imputation framework allows for a clear comparison of the effects of MCAR, MAR, and MNAR mechanisms. This technique helps us to answer the question, “How bad does the MNAR mechanism have to be for an analysis that assumes MCAR or MAR to produce biased results?”
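One way to picture this question is a sweep over candidate MNAR shifts. The following sketch is a simplified illustration under stated assumptions: it uses a single set of MAR-imputed values rather than multiple imputed data sets, the shift parameter `delta` and the decision `threshold` are hypothetical, and all numbers are invented. A full analysis would repeat the shift across multiple imputed data sets and pool the estimates.

```python
def delta_adjusted_means(observed, imputed_mar, deltas):
    """For each delta, shift the MAR-imputed values by delta and
    return (delta, overall mean) pairs."""
    n = len(observed) + len(imputed_mar)
    results = []
    for delta in deltas:
        total = sum(observed) + sum(value + delta for value in imputed_mar)
        results.append((delta, total / n))
    return results


def tipping_point(results, threshold):
    """Return the first delta, in the order given, whose estimate
    falls below the threshold, or None if no delta does."""
    for delta, estimate in results:
        if estimate < threshold:
            return delta
    return None


observed = [4.0, 5.0, 6.0, 5.0]   # invented observed outcomes
imputed_mar = [5.0, 5.0]          # invented imputations under MAR
deltas = [0.0, -1.0, -2.0, -3.0]  # candidate MNAR shifts

results = delta_adjusted_means(observed, imputed_mar, deltas)
for delta, estimate in results:
    print(f"delta={delta:+.1f}  mean={estimate:.3f}")
print("tipping point:", tipping_point(results, threshold=4.5))
```

Here `delta = 0` reproduces the MAR-based estimate, and the tipping point is the smallest shift in the grid at which the conclusion (the mean staying at or above the threshold) would change.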

But first, let us introduce four missing data scenarios.