4.22 Assumptions: Independence Assumption (IA)
- E[Yi|Di = 1] \(\stackrel{1}{=}\) E[Yi0 + Di(Yi1 - Yi0)| Di = 1] \(\stackrel{2}{=}\) E[Yi1|Di = 1] \(\stackrel{3}{=}\) E[Yi1] [23]
- Steps \(\stackrel{1}{=}\), \(\stackrel{2}{=}\) and \(\stackrel{3}{=}\): The expectation of the observed outcome, conditional on treatment assignment, equals the expectation of the (partly unobserved) potential outcome [24]
- Same logic for E[Yi|Di = 0] = E[Yi0]
- It follows: ATE = E[Yi1 - Yi0] = E[Yi1] - E[Yi0] = E[Yi|Di = 1] - E[Yi|Di = 0]
The IA allows us to equate the expected value of the whole column Yi0, i.e. E[Yi0] (blue and orange values), with the expected value of the orange values alone, i.e. E[Yi|Di = 0] (the same holds for column Yi1).
- When is the independence assumption justified? (it depends… next slide)
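The identification argument above can be checked with a small simulation. The following is only a sketch under assumed values: the outcome distributions, the effect size of 3, and the two assignment rules (`coin_flip`, `self_select`) are hypothetical and not taken from the slides. It compares the difference in observed group means with the ATE computed from both potential outcomes, once when Di is independent of (Yi0, Yi1) and once when units select into treatment based on their own gain.

```python
import random


def avg(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def simulate(assign, n=100_000, seed=0):
    """Return (ATE, difference in observed means) for one assignment rule."""
    rng = random.Random(seed)
    # Hypothetical potential outcomes: Yi0 and Yi1 = Yi0 + 3 + noise.
    y0 = [rng.gauss(10.0, 2.0) for _ in range(n)]
    y1 = [a + 3.0 + rng.gauss(0.0, 1.0) for a in y0]
    # Treatment indicator Di (1 = treated, 0 = control).
    d = [assign(a, b, rng) for a, b in zip(y0, y1)]
    # Observed outcome: Yi = Yi0 + Di * (Yi1 - Yi0).
    y = [a + t * (b - a) for a, b, t in zip(y0, y1, d)]
    # ATE uses both potential outcomes, which is only possible in a simulation.
    ate = avg(b - a for a, b in zip(y0, y1))
    # Difference in means uses observed data only.
    diff = avg(v for v, t in zip(y, d) if t == 1) - avg(
        v for v, t in zip(y, d) if t == 0
    )
    return ate, diff


def coin_flip(y0, y1, rng):
    """Assignment independent of the potential outcomes (IA holds)."""
    return 1 if rng.random() < 0.5 else 0


def self_select(y0, y1, rng):
    """Units with an above-average gain choose treatment (IA violated)."""
    return 1 if (y1 - y0) > 3.0 else 0


ate_r, diff_r = simulate(coin_flip)
ate_s, diff_s = simulate(self_select)
print(f"random assignment: ATE={ate_r:.3f}, difference in means={diff_r:.3f}")
print(f"self-selection:    ATE={ate_s:.3f}, difference in means={diff_s:.3f}")
```

Under the coin-flip rule the two numbers should agree up to sampling noise, while under self-selection the difference in means overstates the ATE, because E[Yi1|Di = 1] is then no longer equal to E[Yi1].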
[23] Step \(\stackrel{1}{=}\): Replace the observed outcome Yi (column 2) with its expression in terms of the potential outcomes (columns 3 and 4), Yi = Yi0 + Di(Yi1 - Yi0); Step \(\stackrel{2}{=}\): Since Di = 1, Yi0 cancels out and we end up with E[Yi1|Di = 1]; Step \(\stackrel{3}{=}\): Because Yi1 is independent of Di (independence assumption), we can replace E[Yi1|Di = 1] with E[Yi1].
[24] Normally, to estimate the ATE we would calculate the expected value of the differences between column Yi1 and column Yi0. In other words, we would have to observe both treatment and control units in their counterfactual states (e.g. observe what the value of control units would be if they had been treated). However, for the units that were assigned to control (Di = 0) we do not observe Yi1, and for the units that were assigned to treatment (Di = 1) we do not observe Yi0. Starting with the column Yi1, the independence assumption simply means that the expected value of the whole column Yi1 (red and green values) can be equated with the expected value of the first two rows of the column, namely Yi1|Di = 1 (the red values). And that is what we actually observe. Hence, under this assumption there is no need to observe the missing green values any more. The same logic applies to column Yi0. The IA allows us to equate the expected value of the whole column Yi0, i.e. E[Yi0] (blue and yellow values), with the expected value of the yellow values, i.e. E[Yi|Di = 0].