4.1 One Sample Inference

\(Y_i \sim i.i.d. N(\mu, \sigma^2)\)

i.i.d. stands for “independent and identically distributed”

Hence, we have the following model:

\(Y_i=\mu +\epsilon_i\) where

  • \(\epsilon_i \sim^{iid} N(0,\sigma^2)\)
  • \(E(Y_i)=\mu\)
  • \(Var(Y_i)=\sigma^2\)
  • \(\bar{y} \sim N(\mu,\sigma^2/n)\)

4.1.1 The Mean

When \(\sigma^2\) is estimated by \(s^2\), then

\[ \frac{\bar{y}-\mu}{s/\sqrt{n}} \sim t_{n-1} \]

Then, a \(100(1-\alpha) \%\) confidence interval for \(\mu\) is obtained from:

\[ \begin{aligned} 1 - \alpha &= P\left(-t_{\alpha/2;n-1} \le \frac{\bar{y}-\mu}{s/\sqrt{n}} \le t_{\alpha/2;n-1}\right) \\ &= P\left(\bar{y} - (t_{\alpha/2;n-1})s/\sqrt{n} \le \mu \le \bar{y} + (t_{\alpha/2;n-1})s/\sqrt{n}\right) \end{aligned} \]

And the interval is

\[ \bar{y} \pm (t_{\alpha/2;n-1})s/\sqrt{n} \]

and \(s/\sqrt{n}\) is the standard error of \(\bar{y}\)

If the experiment were repeated many times, \(100(1-\alpha) \%\) of these intervals would contain \(\mu\)
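As a rough illustration of the interval above, the sketch below computes \(\bar{y} \pm t_{\alpha/2;n-1}\, s/\sqrt{n}\) by hand and compares it with `t.test()`. The data vector `y` is hypothetical and chosen only for the example.

```r
# Hypothetical sample (not from the text); any numeric vector would do
y <- c(4.8, 5.1, 5.6, 4.9, 5.3, 5.0, 5.4, 4.7)
n <- length(y)
alpha <- 0.05

se <- sd(y) / sqrt(n)                     # standard error of the sample mean
t_crit <- qt(1 - alpha / 2, df = n - 1)   # upper alpha/2 critical point of t_{n-1}
ci_manual <- mean(y) + c(-1, 1) * t_crit * se

# The built-in one-sample t-test should report the same interval
ci_builtin <- as.numeric(t.test(y, conf.level = 1 - alpha)$conf.int)

ci_manual
ci_builtin
```

Note that `qt(1 - alpha / 2, ...)` is used because \(t_{\alpha/2;n-1}\) in the text denotes an upper-tail critical point.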

The \(100(1-\alpha)\%\) confidence interval, the sample size (for confidence level \(1-\alpha\) and error \(d\)), and the hypothesis testing test statistic are:

When \(\sigma^2\) is known, X is normal (or \(n \ge 25\)):

  • Confidence interval: \(\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\)
  • Sample size: \(n \approx \frac{z_{\alpha/2}^2 \sigma^2}{d^2}\)
  • Test statistic: \(z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}\)

When \(\sigma^2\) is unknown, X is normal (or \(n \ge 25\)):

  • Confidence interval: \(\bar{X} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}\)
  • Sample size: \(n \approx \frac{z_{\alpha/2}^2 s^2}{d^2}\)
  • Test statistic: \(t = \frac{\bar{X}-\mu_0}{s/\sqrt{n}}\)

4.1.1.1 For Difference of Means (\(\mu_1-\mu_2\)), Independent Samples

The \(100(1-\alpha)\%\) confidence interval and the hypothesis testing test statistic for each case are:

When \(\sigma^2\) is known:

  • Confidence interval: \(\bar{X}_1 - \bar{X}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma^2_1}{n_1}+\frac{\sigma^2_2}{n_2}}\)
  • Test statistic: \(z= \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)_0}{\sqrt{\frac{\sigma^2_1}{n_1}+\frac{\sigma^2_2}{n_2}}}\)

When \(\sigma^2\) is unknown, variances assumed EQUAL:

  • Confidence interval: \(\bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2}\sqrt{s^2_p(\frac{1}{n_1}+\frac{1}{n_2})}\)
  • Test statistic: \(t = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)_0}{\sqrt{s^2_p(\frac{1}{n_1}+\frac{1}{n_2})}}\)
  • Pooled variance: \(s_p^2 = \frac{(n_1 -1)s^2_1 + (n_2-1)s^2_2}{n_1 + n_2 -2}\)
  • Degrees of freedom: \(\gamma = n_1 + n_2 -2\)

When \(\sigma^2\) is unknown, variances assumed UNEQUAL:

  • Confidence interval: \(\bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2}\sqrt{(\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2})}\)
  • Test statistic: \(t = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)_0}{\sqrt{(\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2})}}\)
  • Degrees of freedom: \(\gamma = \frac{(\frac{s_1^2}{n_1}+\frac{s^2_2}{n_2})^2}{\frac{(\frac{s_1^2}{n_1})^2}{n_1-1}+\frac{(\frac{s_2^2}{n_2})^2}{n_2-1}}\)
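The equal-variance and unequal-variance statistics can be checked numerically. The following is a minimal sketch with two hypothetical samples `x1` and `x2` (not from the text); it computes the pooled variance, which is a weighted sum of the two sample variances, and the approximate degrees of freedom \(\gamma\), then compares them with `t.test()`.

```r
# Hypothetical independent samples, for illustration only
x1 <- c(20.1, 22.3, 19.8, 21.5, 23.0, 20.7)
x2 <- c(18.2, 19.9, 17.5, 20.3, 18.8)
n1 <- length(x1); n2 <- length(x2)
v1 <- var(x1);    v2 <- var(x2)

# Variances assumed equal: pooled variance and t statistic (null difference 0)
sp2 <- ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_pooled <- (mean(x1) - mean(x2)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Variances assumed unequal: t statistic and approximate degrees of freedom
se2 <- v1 / n1 + v2 / n2
t_unequal <- (mean(x1) - mean(x2)) / sqrt(se2)
df_unequal <- se2^2 / ((v1 / n1)^2 / (n1 - 1) + (v2 / n2)^2 / (n2 - 1))

# Built-in versions for comparison
fit_equal <- t.test(x1, x2, var.equal = TRUE)
fit_unequal <- t.test(x1, x2, var.equal = FALSE)

c(manual = t_pooled, builtin = unname(fit_equal$statistic))
c(manual = df_unequal, builtin = unname(fit_unequal$parameter))
```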

4.1.1.2 For Difference of Means (\(\mu_1 - \mu_2\)), Paired Samples (D = X-Y)

\(100(1-\alpha)\%\) Confidence Interval

\[ \bar{D} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} \]

Hypothesis Testing Test Statistic

\[ t = \frac{\bar{D}-D_0}{s_d / \sqrt{n}} \]
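A paired analysis is simply a one-sample analysis of the differences \(D = X - Y\). The sketch below uses hypothetical before/after measurements (`x`, `y`) with \(D_0 = 0\) and checks the hand computation against `t.test(..., paired = TRUE)`.

```r
# Hypothetical paired measurements on the same 6 units
x <- c(12.1, 11.4, 13.0, 12.7, 11.9, 12.5)
y <- c(11.6, 11.5, 12.2, 12.0, 11.3, 12.4)

d <- x - y                     # paired differences
n <- length(d)
D0 <- 0                        # null value of the mean difference
t_manual <- (mean(d) - D0) / (sd(d) / sqrt(n))
ci_manual <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * sd(d) / sqrt(n)

fit_paired <- t.test(x, y, paired = TRUE)
c(manual = t_manual, builtin = unname(fit_paired$statistic))
```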

4.1.1.3 Difference of Two Proportions

Mean

\[ \hat{p_1}-\hat{p_2} \]

Variance

\[ \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2} \]

\(100(1-\alpha)\%\) Confidence Interval (with \(p_1, p_2\) replaced by their estimates)

\[ \hat{p_1}-\hat{p_2} \pm z_{\alpha/2}\sqrt{\frac{\hat{p_1}(1-\hat{p_1})}{n_1} + \frac{\hat{p_2}(1-\hat{p_2})}{n_2}} \]

Sample Sizes, Confidence \(\alpha\), Error d
(Prior Estimates for \(\hat{p_1},\hat{p_2}\))

\[ n \approx \frac{z_{\alpha/2}^2[p_1(1-p_1)+p_2(1-p_2)]}{d^2} \]

(No Prior Estimates for \(\hat{p}\))

\[ n \approx \frac{z_{\alpha/2}^2}{2d^2} \]

Hypothesis Testing - Test Statistics

Null Value \((p_1 - p_2)_0 \neq 0\)

\[ z = \frac{(\hat{p_1} - \hat{p_2})-(p_1 - p_2)_0}{\sqrt{\frac{\hat{p_1}(1-\hat{p_1})}{n_1} + \frac{\hat{p_2}(1-\hat{p_2})}{n_2}}} \]

Null Value \((p_1 - p_2)_0 = 0\)

\[ z = \frac{\hat{p_1} - \hat{p_2}}{\sqrt{\hat{p}(1-\hat{p})(\frac{1}{n_1}+\frac{1}{n_2})}} \]

where

\[ \hat{p}= \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1 \hat{p_1} + n_2 \hat{p_2}}{n_1 + n_2} \]
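The pooled-proportion test statistic can be sketched in a few lines. The counts below are hypothetical. As a sanity check, the squared z statistic should equal the chi-squared statistic reported by `prop.test()` when the continuity correction is switched off.

```r
# Hypothetical counts: x successes out of n trials in each group
x1 <- 45; n1 <- 100
x2 <- 30; n2 <- 90

p1_hat <- x1 / n1
p2_hat <- x2 / n2
p_pool <- (x1 + x2) / (n1 + n2)           # pooled estimate under H0: p1 - p2 = 0

z <- (p1_hat - p2_hat) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_value <- 2 * pnorm(-abs(z))             # two-sided p-value

# 95% interval using the unpooled (plug-in) standard error
se_unpooled <- sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
ci <- (p1_hat - p2_hat) + c(-1, 1) * qnorm(0.975) * se_unpooled

check <- prop.test(c(x1, x2), c(n1, n2), correct = FALSE)
c(z_squared = z^2, chi_squared = unname(check$statistic))
```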

4.1.2 Single Variance

\[ \begin{aligned} 1 - \alpha &= P\left(\chi_{1-\alpha/2;n-1}^2 \le \frac{(n-1)s^2}{\sigma^2} \le \chi_{\alpha/2;n-1}^2\right) \\ &= P\left(\frac{(n-1)s^2}{\chi_{\alpha/2;n-1}^2} \le \sigma^2 \le \frac{(n-1)s^2}{\chi_{1-\alpha/2;n-1}^2}\right) \end{aligned} \]

and a \(100(1-\alpha) \%\) confidence interval for \(\sigma^2\) is:

\[ \left(\frac{(n-1)s^2}{\chi_{\alpha/2;n-1}^2},\frac{(n-1)s^2}{\chi_{1-\alpha/2;n-1}^2}\right) \]

Confidence limits for \(\sigma\) are obtained by computing the positive square roots of these limits.

Equivalently,

\(100(1-\alpha)\%\) Confidence Interval

\[ L_1 = \frac{(n-1)s^2}{\chi^2_{\alpha/2}} \\ L_2 = \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} \]

Hypothesis Testing Test Statistic

\[ \chi^2 = \frac{(n-1)s^2}{\sigma^2_0} \]
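Base R has no dedicated one-sample variance test, so the interval and test statistic are usually computed directly from `qchisq()` and `pchisq()`. The sketch below uses a hypothetical sample and a hypothetical null value `sigma0_sq`. Because \(\chi^2_{\alpha/2;n-1}\) in the text is an upper-tail critical point, it corresponds to `qchisq(1 - alpha / 2, ...)`.

```r
# Hypothetical sample, for illustration only
y <- c(4.8, 5.1, 5.6, 4.9, 5.3, 5.0, 5.4, 4.7, 5.2, 5.5)
n <- length(y)
s2 <- var(y)
alpha <- 0.05

chi_upper <- qchisq(1 - alpha / 2, df = n - 1)  # chi^2_{alpha/2; n-1}
chi_lower <- qchisq(alpha / 2, df = n - 1)      # chi^2_{1-alpha/2; n-1}

ci_var <- c((n - 1) * s2 / chi_upper, (n - 1) * s2 / chi_lower)  # CI for sigma^2
ci_sd <- sqrt(ci_var)                                            # CI for sigma

# Test statistic for H0: sigma^2 = sigma0_sq (hypothetical null value)
sigma0_sq <- 0.05
chi_stat <- (n - 1) * s2 / sigma0_sq
p_upper <- pchisq(chi_stat, df = n - 1, lower.tail = FALSE)      # H_a: sigma^2 > sigma0_sq

ci_var
ci_sd
```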

4.1.3 Single Proportion (p)

  • \(100(1-\alpha)\%\) confidence interval: \(\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\)
  • Sample size for confidence \(\alpha\) and error d, with a prior estimate for \(\hat{p}\): \(n \approx \frac{z_{\alpha/2}^2 \hat{p}(1-\hat{p})}{d^2}\)
  • Sample size for confidence \(\alpha\) and error d, with no prior estimate for \(\hat{p}\): \(n \approx \frac{z_{\alpha/2}^2}{4d^2}\)
  • Hypothesis testing test statistic: \(z = \frac{\hat{p}-p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}\)
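As a quick numeric illustration of the two sample-size formulas for a single proportion, the sketch below assumes a 95% interval with error \(d = 0.03\). The prior guess `p_prior = 0.2` is hypothetical.

```r
alpha <- 0.05
d <- 0.03                        # desired margin of error
z <- qnorm(1 - alpha / 2)        # z_{alpha/2}

# No prior estimate: p(1 - p) is at most 1/4, attained at p = 0.5
n_no_prior <- ceiling(z^2 / (4 * d^2))

# With a (hypothetical) prior estimate of p
p_prior <- 0.2
n_with_prior <- ceiling(z^2 * p_prior * (1 - p_prior) / d^2)

n_no_prior     # 1068
n_with_prior   # never larger than n_no_prior
```

The no-prior formula is the conservative choice, since any other value of \(p\) gives a smaller required n.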

4.1.4 Power

Formally, power (for the test of the mean) is given by:

\[ \pi(\mu) = 1 - \beta = P(\text{test rejects } H_0|\mu) \]

To evaluate the power, one needs to know the distribution of the test statistic if the null hypothesis is false.

For a 1-sided z-test where \(H_0: \mu \le \mu_0 \\ H_A: \mu > \mu_0\)

The power is:

\[ \begin{aligned} \pi(\mu) &= P(\bar{y} > \mu_0 + z_{\alpha} \sigma/\sqrt{n}|\mu) \\ &= P(Z = \frac{\bar{y} - \mu}{\sigma / \sqrt{n}} > z_{\alpha} + \frac{\mu_0 - \mu}{\sigma/ \sqrt{n}}|\mu) \\ &= 1 - \Phi(z_{\alpha} + \frac{(\mu_0 - \mu)\sqrt{n}}{\sigma}) \\ &= \Phi(-z_{\alpha}+\frac{(\mu -\mu_0)\sqrt{n}}{\sigma}) \end{aligned} \]

where \(1-\Phi(x) = \Phi(-x)\) since the normal pdf is symmetric

Power depends on the difference \(\mu - \mu_0\), the sample size n, the variance \(\sigma^2\), and the \(\alpha\)-level of the test (through \(z_{\alpha}\)).
Equivalently, power can be increased by making \(\alpha\) larger, \(\sigma^2\) smaller, or n larger.

For the 2-sided z-test, the power is:

\[ \pi(\mu) = \Phi(-z_{\alpha/2} + \frac{(\mu_0 - \mu)\sqrt{n}}{\sigma}) + \Phi(-z_{\alpha/2}+\frac{(\mu - \mu_0)\sqrt{n}}{\sigma}) \]
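The two power functions translate directly into R. This is a minimal sketch; the function names and the example values (\(\mu_0 = 0\), \(\sigma = 1\), \(n = 25\)) are illustrative assumptions, not part of the text.

```r
# Power of the 1-sided z-test of H0: mu <= mu0 vs HA: mu > mu0
power_z_one_sided <- function(mu, mu0, sigma, n, alpha = 0.05) {
  pnorm(-qnorm(1 - alpha) + (mu - mu0) * sqrt(n) / sigma)
}

# Power of the 2-sided z-test of H0: mu = mu0
power_z_two_sided <- function(mu, mu0, sigma, n, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  shift <- (mu - mu0) * sqrt(n) / sigma
  pnorm(-z - shift) + pnorm(-z + shift)
}

# At mu = mu0 the power equals the significance level alpha
power_z_one_sided(mu = 0, mu0 = 0, sigma = 1, n = 25)
power_z_two_sided(mu = 0, mu0 = 0, sigma = 1, n = 25)

# Power grows with the effect size and with n
power_z_one_sided(mu = 0.5, mu0 = 0, sigma = 1, n = c(10, 25, 50))
```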

4.1.5 Sample Size

4.1.5.1 1-sided Z-test

Example: to show that the mean response \(\mu\) under the treatment is higher than the mean response \(\mu_0\) without treatment (show that the treatment effect \(\delta = \mu -\mu_0\) is large)

Because power is an increasing function of \(\mu - \mu_0\), it is only necessary to find n that makes the power equal to \(1- \beta\) at \(\mu = \mu_0 + \delta\)

Hence, we have

\[ \pi(\mu_0 + \delta) = \Phi(-z_{\alpha} + \frac{\delta \sqrt{n}}{\sigma}) = 1 - \beta \]

Since \(\Phi (z_{\beta})= 1-\beta\), we have

\[ -z_{\alpha} + \frac{\delta \sqrt{n}}{\sigma} = z_{\beta} \]

Then n is

\[ n = (\frac{(z_{\alpha}+z_{\beta})\sigma}{\delta})^2 \]

Then, we need larger samples, when

  • the sample variability is large (\(\sigma\) is large)
  • \(\alpha\) is small (\(z_{\alpha}\) is large)
  • power \(1-\beta\) is large (\(z_{\beta}\) is large)
  • The magnitude of the effect is smaller (\(\delta\) is small)
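The 1-sided sample-size formula can be sketched as a small helper. The function name and the example inputs (\(\delta = 0.5\), \(\sigma = 1\), \(\alpha = 0.05\), \(\beta = 0.20\)) are assumptions made for illustration.

```r
# Smallest n giving power 1 - beta at mu = mu0 + delta for the 1-sided z-test
n_z_one_sided <- function(delta, sigma, alpha = 0.05, beta = 0.20) {
  z_alpha <- qnorm(1 - alpha)   # upper alpha critical point
  z_beta <- qnorm(1 - beta)     # upper beta critical point
  ceiling(((z_alpha + z_beta) * sigma / delta)^2)
}

n <- n_z_one_sided(delta = 0.5, sigma = 1)
n   # 25

# Check: the achieved power at this n is at least 1 - beta
achieved_power <- pnorm(-qnorm(0.95) + 0.5 * sqrt(n) / 1)
achieved_power

# Halving the effect size roughly quadruples the required n
n_z_one_sided(delta = 0.25, sigma = 1)
```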

In practice, \(\delta\) and \(\sigma\) are unknown. We can base \(\sigma\) on previous studies or pilot studies. Alternatively, we can obtain an estimate of \(\sigma\) by anticipating the range of the observations (without outliers), dividing this range by 4, and using the resulting number as an approximate estimate of \(\sigma\). For normally distributed data, this is reasonable.

4.1.5.2 2-sided Z-test

We want to know the minimum n required to guarantee \(1-\beta\) power when the treatment effect \(|\mu - \mu_0|\) is at least \(\delta > 0\). Since the power function for the 2-sided test is increasing and symmetric in \(|\mu - \mu_0|\), we only need to find the n that makes the power equal to \(1-\beta\) when \(\mu = \mu_0 + \delta\). Ignoring the negligible contribution of the opposite tail, this gives

\[ n = (\frac{(z_{\alpha/2} + z_{\beta}) \sigma}{\delta})^2 \]

We could also use the confidence interval approach. If we require that a \(100(1-\alpha)\%\) two-sided CI for \(\mu\) be

\[ \bar{y} \pm D \]

where \(D = z_{\alpha/2}\sigma/\sqrt{n}\), then solving for n gives

\[ n = (\frac{z_{\alpha/2}\sigma}{D})^2 \]

(round up to the nearest integer)
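Both 2-sided calculations can be sketched as follows. The helper names and inputs (\(\delta = D = 0.5\), \(\sigma = 1\)) are illustrative assumptions.

```r
# Power-based n for the 2-sided z-test
n_z_two_sided <- function(delta, sigma, alpha = 0.05, beta = 0.20) {
  ceiling(((qnorm(1 - alpha / 2) + qnorm(1 - beta)) * sigma / delta)^2)
}

# CI-based n: the interval half-width should be at most D
n_ci_half_width <- function(D, sigma, alpha = 0.05) {
  ceiling((qnorm(1 - alpha / 2) * sigma / D)^2)
}

n_z_two_sided(delta = 0.5, sigma = 1)    # 32
n_ci_half_width(D = 0.5, sigma = 1)      # 16
```

The CI-based n is smaller here because it only controls the interval width and makes no requirement about power.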

data = rnorm(100)
t.test(data, conf.level=0.95)
#> 
#>  One Sample t-test
#> 
#> data:  data
#> t = 0.42865, df = 99, p-value = 0.6691
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#>  -0.1728394  0.2680940
#> sample estimates:
#>  mean of x 
#> 0.04762729

\[ H_0: \mu \ge 30 \\ H_a: \mu < 30 \]

t.test(data, mu=30,alternative="less")
#> 
#>  One Sample t-test
#> 
#> data:  data
#> t = -269.57, df = 99, p-value < 2.2e-16
#> alternative hypothesis: true mean is less than 30
#> 95 percent confidence interval:
#>       -Inf 0.2321136
#> sample estimates:
#>  mean of x 
#> 0.04762729

4.1.6 Note

For t-tests, the sample size and power calculations are not as easy as for the z-test.

\[ \pi(\mu) = P(\frac{\bar{y}-\mu_0}{s/\sqrt{n}}> t_{n-1;\alpha}|\mu) \]

When \(\mu > \mu_0\) (i.e., \(\mu - \mu_0 = \delta\)), the random variable \((\bar{y} - \mu_0)/(s/\sqrt{n})\) does not have a Student’s t-distribution, but rather is distributed as a non-central t-distribution with non-centrality parameter \(\delta \sqrt{n}/\sigma\) and d.f. of \(n-1\)

  • The power is an increasing function of this non-centrality parameter (note, when \(\delta = 0\) the distribution is usual Student’s t-distribution).
  • To evaluate power, one must use a numerical procedure or special charts
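One such numerical procedure is available in base R: `pt()` accepts a non-centrality parameter `ncp`. The sketch below evaluates the 1-sided t-test power for assumed values \(\delta = 0.5\), \(\sigma = 1\), \(n = 25\) and compares it with `power.t.test()`. The function name is hypothetical.

```r
# Power of the 1-sided one-sample t-test via the non-central t distribution
power_t_one_sided <- function(delta, sigma, n, alpha = 0.05) {
  df <- n - 1
  ncp <- delta * sqrt(n) / sigma          # non-centrality parameter
  t_crit <- qt(1 - alpha, df)             # t_{n-1; alpha}
  pt(t_crit, df, ncp = ncp, lower.tail = FALSE)
}

pw_t <- power_t_one_sided(delta = 0.5, sigma = 1, n = 25)

# Reference value from the built-in power calculator
pw_ref <- power.t.test(n = 25, delta = 0.5, sd = 1, sig.level = 0.05,
                       type = "one.sample", alternative = "one.sided")$power

# The z-test power at the same n is slightly higher, since sigma is treated as known
pw_z <- pnorm(-qnorm(0.95) + 0.5 * sqrt(25) / 1)

c(t_test = pw_t, power.t.test = pw_ref, z_test = pw_z)
```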

Approximate Sample Size Adjustment for t-test. We use an adjustment to the z-test determination for sample size.

Let \(v = n-1\), where n is the sample size derived from the z-test power calculation. Then the approximate 2-sided t-test sample size is given by:

\[ n^* = \frac{(t_{v;\alpha/2}+t_{v;\beta})^2 \sigma^2}{\delta^2} \]
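The adjustment can be sketched numerically as below, under assumed inputs \(\alpha = 0.05\), \(\beta = 0.20\), \(\delta = 0.5\), \(\sigma = 1\). The result is compared with `power.t.test()`, which solves for n using the non-central t-distribution. The two approaches need not agree exactly.

```r
alpha <- 0.05; beta <- 0.20
delta <- 0.5;  sigma <- 1

# Step 1: z-test sample size (2-sided)
n_z <- ceiling(((qnorm(1 - alpha / 2) + qnorm(1 - beta)) * sigma / delta)^2)

# Step 2: replace the z critical points with t critical points on v = n_z - 1 d.f.
v <- n_z - 1
n_star <- (qt(1 - alpha / 2, v) + qt(1 - beta, v))^2 * sigma^2 / delta^2

# Reference: exact calculation based on the non-central t distribution
n_ref <- power.t.test(delta = delta, sd = sigma, sig.level = alpha,
                      power = 1 - beta, type = "one.sample",
                      alternative = "two.sided")$n

c(z_test = n_z, t_adjusted = ceiling(n_star), power.t.test = ceiling(n_ref))
```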

4.1.7 One-sample Non-parametric Methods

lecture.data = c(0.76, 0.82, 0.80, 0.79, 1.06, 0.83,-0.43,-0.34, 3.34, 2.33)

4.1.7.1 Sign Test

Suppose we want to test \(H_0: \mu_{(0.5)} = 0\) against \(H_a: \mu_{(0.5)} >0\), where \(\mu_{(0.5)}\) is the population median. We can

  1. Count the number of observations (\(y_i\)’s) that exceed 0. Denote this number by \(s_+\), called the number of plus signs. Let \(s_- = n - s_+\), which is the number of minus signs.
  2. Reject \(H_0\) if \(s_+\) is large or equivalently, if \(s_-\) is small.

To determine how large \(s_+\) must be to reject \(H_0\) at a given significance level, we need to know the distribution of the corresponding random variable \(S_+\) under the null hypothesis, which is binomial with p = 1/2 when the null is true.

Using the binomial null distribution, an \(\alpha\)-level test rejects \(H_0\) if \(s_+ \ge b_{n,\alpha}\), where \(b_{n,\alpha}\) is the upper \(\alpha\) critical point of the \(Bin(n,1/2)\) distribution. Under \(H_0\), both \(S_+\) and \(S_-\) have this same distribution; let S denote a random variable with this common distribution.

\[ \text{p-value} = P(S \ge s_+) = \sum_{i = s_+}^{n} {{n}\choose{i}} (\frac{1}{2})^n \]

or, equivalently,

\[ P(S \le s_-) = \sum_{i=0}^{s_-}{{n}\choose{i}} (\frac{1}{2})^n \]

For large sample sizes, we could use the normal approximation for the binomial, in which case reject \(H_0\) if

\[ s_+ \ge n/2 + 1/2 + z_{\alpha}\sqrt{n/4} \]

For the 2-sided test, we use the test statistic \(s_{max} = max(s_+,s_-)\) or \(s_{min} = min(s_+, s_-)\). An \(\alpha\)-level test rejects \(H_0\) if the p-value is \(\le \alpha\), where the p-value is computed from:

\[ \text{p-value} = 2 \sum_{i=s_{max}}^{n} {{n}\choose{i}} (\frac{1}{2})^n = 2 \sum_{i=0}^{s_{min}} {{n}\choose{i}} (\frac{1}{2})^n \]

Equivalently, reject \(H_0\) if \(s_{max} \ge b_{n,\alpha/2}\)

A large sample normal approximation can be used, where

\[ z = \frac{s_{max}- n/2 -1/2}{\sqrt{n/4}} \]

and reject \(H_0\) at \(\alpha\) if \(z \ge z_{\alpha/2}\)

However, treatment of 0 is problematic for this test.

  • Solution 1: randomly assign 0 to the positive or negative (2 researchers might get different results).
  • Solution 2: count each 0 as a contribution 1/2 toward \(s_+\) and \(s_-\) (but then could not apply the binomial distribution)
  • Solution 3: ignore 0 (reduces the power of test due to decreased sample size).
binom.test(sum(lecture.data > 0), length(lecture.data)) 
#> 
#>  Exact binomial test
#> 
#> data:  sum(lecture.data > 0) and length(lecture.data)
#> number of successes = 8, number of trials = 10, p-value = 0.1094
#> alternative hypothesis: true probability of success is not equal to 0.5
#> 95 percent confidence interval:
#>  0.4439045 0.9747893
#> sample estimates:
#> probability of success 
#>                    0.8
# alternative = "greater" or alternative = "less"
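The p-value formulas above can be evaluated directly with `dbinom()` and `pbinom()`. The sketch below redefines `lecture.data` so that it is self-contained, and checks the hand computation against `binom.test()`.

```r
lecture.data <- c(0.76, 0.82, 0.80, 0.79, 1.06, 0.83, -0.43, -0.34, 3.34, 2.33)

n <- length(lecture.data)
s_plus <- sum(lecture.data > 0)    # number of plus signs
s_minus <- n - s_plus              # number of minus signs

# 1-sided p-value: P(S >= s_plus), equivalently P(S <= s_minus), with S ~ Bin(n, 1/2)
p_upper <- sum(dbinom(s_plus:n, size = n, prob = 0.5))
p_lower <- pbinom(s_minus, size = n, prob = 0.5)

# 2-sided p-value based on s_max
s_max <- max(s_plus, s_minus)
p_two_sided <- 2 * sum(dbinom(s_max:n, size = n, prob = 0.5))

c(upper = p_upper, lower = p_lower, two_sided = p_two_sided)

# Built-in equivalents
binom.test(s_plus, n, alternative = "greater")$p.value
binom.test(s_plus, n)$p.value
```

Here \(s_+ = 8\) and \(n = 10\), so the 1-sided p-value is \(56/1024 \approx 0.0547\), half of the 2-sided value reported by `binom.test()` above.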

4.1.7.2 Wilcoxon Signed Rank Test

The Sign Test ignores the magnitude of each observation's distance from 0. The Wilcoxon Signed Rank Test improves on it by taking into account the ordered magnitudes of the observations, but it imposes the additional requirement that the distribution be symmetric (which the Sign Test does not).

\[ H_0: \mu_{0.5} = 0 \\ H_a: \mu_{0.5} > 0 \]

(assume there are no ties, i.e., no observations of equal magnitude)

The signed rank test procedure:

  1. Rank the observations \(y_i\) in terms of their absolute values. Let \(r_i\) be the rank of \(y_i\) in this ordering. Since we assume no ties, the ranks \(r_i\) are uniquely determined and are a permutation of the integers \(1,2,\dots,n\).
  2. Calculate \(w_+\), which is the sum of the ranks of the positive values, and \(w_-\), which is the sum of the ranks of the negative values. Note that \(w_+ + w_- = r_1 + r_2 + ... = 1 + 2 + ... + n = n(n+1)/2\)
  3. Reject \(H_0\) if \(w_+\) is large (or if \(w_-\) is small)

To know what is large or small with regard to \(w_+\) and \(w_-\), we need the distribution of \(W_+\) and \(W_-\) when the null is true.

Since these null distributions are identical and symmetric, the p-value is \(P(W \ge w_+) = P(W \le w_-)\)

An \(\alpha\)-level test rejects the null if the p-value is \(\le \alpha\), or if \(w_+ \ge w_{n,\alpha}\), where \(w_{n,\alpha}\) is the upper \(\alpha\) critical point of the null distribution of W.

Critical values of the null distribution of W are available in special tables. For large n, the distribution of W is approximately normal.

\[ z = \frac{w_+ - n(n+1) /4 -1/2}{\sqrt{n(n+1)(2n+1)/24}} \]

The test rejects \(H_0\) at level \(\alpha\) if

\[ w_+ \ge n(n+1)/4 +1/2 + z_{\alpha}\sqrt{n(n+1)(2n+1)/24} \approx w_{n,\alpha} \]

For the 2-sided test, we use \(w_{max}=max(w_+,w_-)\) or \(w_{min}=min(w_+,w_-)\), with p-value given by:

\[ \text{p-value} = 2P(W \ge w_{max}) = 2P(W \le w_{min}) \]

As with the Sign Test, we ignore observations equal to 0. In cases where some of the \(|y_i|\)’s are tied for the same rank, we simply assign each of the tied observations the average rank (or “midrank”).

For example, if \(y_1 = -1\), \(y_2 = 3\), \(y_3 = -3\), and \(y_4 =5\), then \(r_1 = 1\), \(r_2 = r_3=(2+3)/2 = 2.5\), \(r_4 = 4\)
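The signed rank procedure can be traced by hand in R. This sketch redefines `lecture.data` so that it is self-contained. It computes \(w_+\) and \(w_-\), the exact 2-sided p-value via `psignrank()`, and the normal approximation with the \(1/2\) continuity term.

```r
lecture.data <- c(0.76, 0.82, 0.80, 0.79, 1.06, 0.83, -0.43, -0.34, 3.34, 2.33)
n <- length(lecture.data)

r <- rank(abs(lecture.data))            # ranks of the absolute values
w_plus <- sum(r[lecture.data > 0])      # sum of ranks of positive observations
w_minus <- sum(r[lecture.data < 0])     # sum of ranks of negative observations
c(w_plus = w_plus, w_minus = w_minus, total = n * (n + 1) / 2)

# Exact 2-sided p-value: 2 * P(W <= w_min) under the null distribution
p_exact <- 2 * psignrank(min(w_plus, w_minus), n)

# Normal approximation with continuity correction
z <- (w_plus - n * (n + 1) / 4 - 1 / 2) / sqrt(n * (n + 1) * (2 * n + 1) / 24)
p_approx <- 2 * pnorm(z, lower.tail = FALSE)
c(exact = p_exact, approx = p_approx)

# Midranks for tied absolute values, as in the example above
rank(abs(c(-1, 3, -3, 5)))              # 1.0 2.5 2.5 4.0
```

`rank()` assigns average ranks to ties by default, which matches the midrank convention.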

wilcox.test(lecture.data) 
#> 
#>  Wilcoxon signed rank exact test
#> 
#> data:  lecture.data
#> V = 52, p-value = 0.009766
#> alternative hypothesis: true location is not equal to 0
# does not use normal approximation
# (using the underlying W distribution)

wilcox.test(lecture.data, exact = FALSE)
#> 
#>  Wilcoxon signed rank test with continuity correction
#> 
#> data:  lecture.data
#> V = 52, p-value = 0.01443
#> alternative hypothesis: true location is not equal to 0
# uses normal approximation