Chapter 8 The meaning of data being MAR

In a MAR missing data situation, missing values can be explained by other (observed) variables, as in the example of the disability and pain variables above. Further, it was stated that within the category of pain scores ≤ 5 the disability scores are MCAR. This means that, within that category, the mean disability score is the same for the observed and the missing data. As a result, the mean difference in disability between persons with low and high pain scores is also the same whether it is computed from the complete data or from the data with missing values. This is illustrated by the mean disability values in the tables below.

The tables present the means of the disability variable for the subgroups of patients with pain scores ≤ 5 and > 5. There are MAR missing data in the disability variable in the subgroup of patients with pain scores ≤ 5. The consequence is that the mean in this subgroup is nearly equal between the complete data and the data with missing values, i.e., 9.26 and 9.23 respectively. Consequently, the same holds for the mean difference in disability between the two pain subgroups, i.e., 5.30 (14.56 - 9.26) in the complete data and 5.33 (14.56 - 9.23) in the data with missing values.

Table 8.1: Mean values of the Disability variable for patients with pain scores lower than or equal to 5 and higher than 5. The first table shows the means and standard deviations of the complete data; the second shows those of the data with missing values in the disability variable for the subgroup of patients with pain scores lower than or equal to 5.

	Pain_di		Mean	SD
Disability	<= 5		9.26	4.09
	> 5		14.56	3.95

	Pain_di		Mean	SD
Disability_MAR	0 (<= 5)		9.23	4.06
	1 (> 5)		14.56	3.95
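The pattern in the tables can be reproduced with a small simulation. The sketch below is only an illustration under assumed values: the sample size, the linear relation between pain and disability, and the 50% missingness rate are hypothetical and are not taken from the dataset used in this chapter. It deletes disability scores at random, but only among patients with pain scores ≤ 5, and then compares the subgroup means before and after deletion.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical data (not the chapter's dataset): pain on a 0-10 scale,
# disability increasing with pain plus random noise.
n = 20000
pain = [random.uniform(0, 10) for _ in range(n)]
disability = [5 + 1.2 * p + random.gauss(0, 4) for p in pain]

# MAR mechanism: a disability score is set to missing with probability 0.5,
# but only for patients with pain <= 5. Missingness depends on the observed
# pain score, not on the disability score itself.
disability_mar = [
    None if (p <= 5 and random.random() < 0.5) else d
    for p, d in zip(pain, disability)
]

def subgroup_mean(values, low_pain):
    """Mean of the non-missing values in the low- or high-pain subgroup."""
    return statistics.mean(
        v for v, p in zip(values, pain)
        if v is not None and (p <= 5) == low_pain
    )

low_complete = subgroup_mean(disability, True)
low_observed = subgroup_mean(disability_mar, True)
high_complete = subgroup_mean(disability, False)
high_observed = subgroup_mean(disability_mar, False)

print(f"pain <= 5: complete {low_complete:.2f}, with missing {low_observed:.2f}")
print(f"pain >  5: complete {high_complete:.2f}, with missing {high_observed:.2f}")
print(f"mean difference: complete {high_complete - low_complete:.2f}, "
      f"with missing {high_observed - low_observed:.2f}")
```

Within the low-pain subgroup the two means should differ only by sampling error, because inside that subgroup the deletion is completely at random. The high-pain subgroup is untouched, so its mean does not change at all.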

However, this assumption cannot be tested, because that would require knowing the missing values, which is not possible in real-life data. In general, excluding cases with MAR missing data leads to biased parameter estimates and invalid results for your statistical tests. A missing data method that works well with MAR data is Multiple Imputation (Chapter 4).
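To see why excluding cases with MAR missing data biases an estimate, the following sketch compares the overall mean disability in simulated complete data with the mean obtained after dropping the cases with missing values. All values (sample size, the relation between pain and disability, the 50% missingness rate) are assumptions chosen for illustration. The sketch also computes a simple subgroup-weighted mean that uses the observed pain variable; this is not Multiple Imputation, only a minimal demonstration that the observed variable carries the information needed to correct the estimate.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical data (not the chapter's dataset).
n = 20000
pain = [random.uniform(0, 10) for _ in range(n)]
disability = [5 + 1.2 * p + random.gauss(0, 4) for p in pain]

# MAR: disability is missing with probability 0.5 when pain <= 5.
observed = [
    None if (p <= 5 and random.random() < 0.5) else d
    for p, d in zip(pain, disability)
]

# Mean of the complete data, known here only because the data are simulated.
true_mean = statistics.mean(disability)

# Complete-case analysis: simply drop patients with a missing disability score.
cc_mean = statistics.mean(d for d in observed if d is not None)

# Subgroup-weighted mean: average the observed scores within each pain
# subgroup, then weight by the subgroup shares in the full sample
# (pain is fully observed, so these shares are known).
low = [d for p, d in zip(pain, observed) if p <= 5 and d is not None]
high = [d for p, d in zip(pain, observed) if p > 5 and d is not None]
share_low = sum(p <= 5 for p in pain) / n
weighted_mean = (share_low * statistics.mean(low)
                 + (1 - share_low) * statistics.mean(high))

print(f"complete data mean:     {true_mean:.2f}")
print(f"complete-case mean:     {cc_mean:.2f}")
print(f"subgroup-weighted mean: {weighted_mean:.2f}")
```

Because the deleted cases all come from the low-pain subgroup, where disability scores are lower, the complete-case mean overestimates the mean of the complete data, whereas the subgroup-weighted mean stays close to it.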