Appendices
Appendix B: Average Deviance
The deviance (or deviation) of an observation from the mean is $x - \bar{x}$. We denote the deviance of the $i$th observation as $x_i - \bar{x}$. So the sum over all $n$ deviances is
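$$
\begin{aligned}
\text{Sum of Deviances} &= \sum_{i=1}^{n}(x_i - \bar{x}) \\
&= (x_1 - \bar{x}) + (x_2 - \bar{x}) + \cdots + (x_{n-1} - \bar{x}) + (x_n - \bar{x}) \\
&= x_1 - \bar{x} + x_2 - \bar{x} + \cdots + x_{n-1} - \bar{x} + x_n - \bar{x} \\
&= x_1 + x_2 + \cdots + x_{n-1} + x_n - \bar{x} - \bar{x} - \cdots - \bar{x} - \bar{x} \\
&= (x_1 + x_2 + \cdots + x_{n-1} + x_n) - (\bar{x} + \bar{x} + \cdots + \bar{x} + \bar{x})
\end{aligned}
$$
where the first half is the sum over all of the $x$ values and the term $\bar{x}$ appears $n$ times. So we can rewrite this as
$$
\text{Sum of Deviances} = \sum_{i=1}^{n} x_i - n\bar{x}
$$
Now notice that, because $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, we can multiply $\sum_{i=1}^{n} x_i$ by $\frac{n}{n}$ to get $\frac{n\sum_{i=1}^{n} x_i}{n} = n\bar{x}$ and rewrite the sum over the deviances as
$$
\text{Sum of Deviances} = n\bar{x} - n\bar{x} = 0
$$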
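The result can also be checked numerically. The following is a minimal sketch in Python; the function name `sum_of_deviances` and the data values are made up for illustration and do not come from the text. With real-valued data, floating-point rounding can leave a tiny nonzero remainder, so the check compares against zero with a tolerance.

```python
from math import fsum, isclose

def sum_of_deviances(values):
    """Return the sum of (x_i - mean) over all values."""
    mean = fsum(values) / len(values)
    return fsum(x - mean for x in values)

# Made-up data for illustration only.
sample = [2.0, 4.0, 4.0, 5.0, 10.0]
total = sum_of_deviances(sample)

# The sum is zero up to floating-point rounding.
print(isclose(total, 0.0, abs_tol=1e-12))
```

Any other list of numbers could be substituted for `sample`; the sum of the deviances stays at zero because the mean is defined as the balance point of the data.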
Appendix C: Deriving a Confidence Interval
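Assume we are taking a sample of size $n$ from a normal distribution with mean $\mu$ and standard deviation $\sigma$. We will assume the value of $\sigma$ is known to us. Then $\bar{X}$ is $\text{Normal}(\mu, \sigma/\sqrt{n})$, where the second parameter is the standard deviation. If we standardize $\bar{X}$, we get $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, which follows the standard normal distribution.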
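A short simulation can make the sampling distribution concrete. This is a sketch under assumed values: the choices $\mu = 50$, $\sigma = 10$, $n = 25$, the number of repetitions, the seed, and the helper name `simulate_sample_means` are all illustrative and not from the text. If the claim holds, the simulated sample means should have a standard deviation near $\sigma/\sqrt{n} = 2$, and the standardized values should have mean near 0 and standard deviation near 1.

```python
import random
from math import sqrt
from statistics import fmean, stdev

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from Normal(mu, sigma); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [fmean(rng.gauss(mu, sigma) for _ in range(n)) for _ in range(reps)]

# Assumed values for illustration only.
mu, sigma, n = 50.0, 10.0, 25
means = simulate_sample_means(mu, sigma, n, reps=10_000)

# Standardize each sample mean: Z = (x_bar - mu) / (sigma / sqrt(n)).
standard_error = sigma / sqrt(n)
z_values = [(m - mu) / standard_error for m in means]

print(f"sd of sample means: {stdev(means):.3f} (theory: {standard_error:.3f})")
print(f"mean of Z: {fmean(z_values):.3f}, sd of Z: {stdev(z_values):.3f}")
```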
We want some interval $(a, b)$. We will start by considering $a < Z < b$, so $a < Z$ and $Z < b$ (or $b > Z$). Then
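$$
\begin{aligned}
Z &< b \\
\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} &< b \\
\bar{X} - \mu &< b\,\frac{\sigma}{\sqrt{n}} \\
\bar{X} - b\,\frac{\sigma}{\sqrt{n}} &< \mu
\end{aligned}
$$
and
$$
\begin{aligned}
a &< Z \\
a &< \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \\
a\,\frac{\sigma}{\sqrt{n}} &< \bar{X} - \mu \\
\mu &< \bar{X} - a\,\frac{\sigma}{\sqrt{n}}
\end{aligned}
$$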
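Putting these together, $\bar{X} - b\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} - a\frac{\sigma}{\sqrt{n}}$. If we want to be 95% confident, then we want $P(a < Z < b) = 0.95$, which is the same as $P\left(\bar{X} - b\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} - a\frac{\sigma}{\sqrt{n}}\right) = 0.95$. To calculate the 95% confidence interval, we need to find $a$ and $b$ such that $P(a < Z < b) = 0.95$.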
We want this interval to be as narrow (small) as possible. Why? Narrower intervals are more informative. If I say I’m 95% confident that tomorrow’s high will be between -100 and 200 degrees Fahrenheit, that’s a useless interval. If I change it to between 70 and 100, that’s a little better. Changing it to between 85 and 90 is even better. This is what we mean by more informative.
It turns out that with a symmetric distribution like the normal distribution, the way to make a confidence interval as narrow as possible is to take advantage of this symmetry. Each of the plots below shows a shaded area of 0.95. The narrowest interval (along the horizontal axis) is the first interval, which is shaded on $-1.96 < Z < 1.96$.
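The same comparison can be made numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `interval_for_lower_tail` and the particular lower-tail probabilities are illustrative choices, not part of the text. Each interval covers an area of 0.95 but leaves a different amount of probability in the lower tail, and the symmetric split (0.025 in each tail) should give the smallest width.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def interval_for_lower_tail(lower_tail, level=0.95):
    """Return (a, b) with P(Z < a) = lower_tail and P(a < Z < b) = level."""
    a = STD_NORMAL.inv_cdf(lower_tail)
    b = STD_NORMAL.inv_cdf(lower_tail + level)
    return a, b

# Illustrative lower-tail probabilities; 0.025 is the symmetric case.
for lower_tail in (0.005, 0.015, 0.025, 0.035, 0.045):
    a, b = interval_for_lower_tail(lower_tail)
    print(f"lower tail {lower_tail:.3f}: a = {a:6.3f}, b = {b:6.3f}, width = {b - a:.3f}")
```

Moving probability from one tail to the other pushes one endpoint far out into a region of low density while pulling the other endpoint in only slightly, which is why the asymmetric intervals come out wider.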
Using the symmetry of the normal distribution, we find that the narrowest interval uses $a = -1.96$ and $b = 1.96$, which results in the 95% confidence interval $\left(\bar{x} - z^{*}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z^{*}\frac{\sigma}{\sqrt{n}}\right)$ where $z^{*} = 1.96$.
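As a worked illustration, the interval can be computed directly from a sample when $\sigma$ is known. This is a minimal sketch: the function name `z_confidence_interval`, the data values, and the value $\sigma = 2$ are assumptions made up for the example. The critical value $z^{*}$ is obtained from the standard normal quantile function rather than hard-coded, so for a 95% level it is approximately 1.96.

```python
from math import sqrt
from statistics import NormalDist, fmean

def z_confidence_interval(sample, sigma, level=0.95):
    """Return (lower, upper) for the mean, assuming a known sigma."""
    # Symmetric split: (1 - level) / 2 in each tail.
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    x_bar = fmean(sample)
    margin = z_star * sigma / sqrt(len(sample))
    return x_bar - margin, x_bar + margin

# Made-up measurements and an assumed known sigma, for illustration only.
sample = [98.2, 101.5, 99.7, 100.9, 97.8, 102.3, 100.1, 99.5, 101.0]
sigma = 2.0
low, high = z_confidence_interval(sample, sigma)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```

The interval is centered at the sample mean, and its half-width is $z^{*}\sigma/\sqrt{n}$, so a larger sample or a smaller $\sigma$ gives a narrower interval.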