4.3 One-Sample Inference

4.3.1 For Single Mean

Consider a scenario where

\[ Y_i \sim \text{i.i.d. } N(\mu, \sigma^2), \]

where i.i.d. stands for “independent and identically distributed.” This model can be expressed as:

\[ Y_i = \mu + \epsilon_i, \]

where:

  • \(\epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)\),
  • \(E(Y_i) = \mu\),
  • \(\text{Var}(Y_i) = \sigma^2\),
  • \(\bar{y} \sim N(\mu, \sigma^2 / n)\).

When \(\sigma^2\) is estimated by \(s^2\), the standardized test statistic follows a \(t\)-distribution:

\[ \frac{\bar{y} - \mu}{s / \sqrt{n}} \sim t_{n-1}. \]

A \(100(1-\alpha)\%\) confidence interval for \(\mu\) is obtained as:

\[ 1 - \alpha = P\left(-t_{\alpha/2;n-1} \leq \frac{\bar{y} - \mu}{s / \sqrt{n}} \leq t_{\alpha/2;n-1}\right), \]

or equivalently,

\[ P\left(\bar{y} - t_{\alpha/2;n-1}\frac{s}{\sqrt{n}} \leq \mu \leq \bar{y} + t_{\alpha/2;n-1}\frac{s}{\sqrt{n}}\right). \]

The confidence interval is expressed as:

\[ \bar{y} \pm t_{\alpha/2;n-1}\frac{s}{\sqrt{n}}, \]

where \(s / \sqrt{n}\) is the standard error of \(\bar{y}\).

If the experiment were repeated many times, \(100(1-\alpha)\%\) of these intervals would contain \(\mu\).
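
As a quick check, this interval can be computed directly from the formula in base R. A minimal sketch, with simulated data that are purely illustrative:

# t-interval computed directly from the formula
set.seed(1)
y <- rnorm(30, mean = 5, sd = 2)   # simulated data for illustration
n <- length(y)
se <- sd(y) / sqrt(n)              # standard error of the sample mean
t_crit <- qt(1 - 0.05 / 2, df = n - 1)
c(mean(y) - t_crit * se, mean(y) + t_crit * se)  # agrees with t.test(y)$conf.int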

Case 1: \(\sigma^2\) known, \(X\) normal (or \(n \geq 25\))

  • \(100(1-\alpha)\%\) confidence interval: \(\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\)
  • Sample size for confidence level \(1-\alpha\) and margin of error \(d\): \(n \approx \frac{z_{\alpha/2}^2 \sigma^2}{d^2}\)
  • Hypothesis test statistic: \(z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}\)

Case 2: \(\sigma^2\) unknown, \(X\) normal (or \(n \geq 25\))

  • \(100(1-\alpha)\%\) confidence interval: \(\bar{X} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}\)
  • Sample size for confidence level \(1-\alpha\) and margin of error \(d\): \(n \approx \frac{z_{\alpha/2}^2 s^2}{d^2}\)
  • Hypothesis test statistic: \(t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}\)

4.3.1.1 Power in Hypothesis Testing

Power (\(\pi(\mu)\)) of a hypothesis test represents the probability of correctly rejecting the null hypothesis (\(H_0\)) when it is false (i.e., when alternative hypothesis \(H_A\) is true). Formally, it is expressed as:

\[ \begin{aligned} \text{Power} &= \pi(\mu) = 1 - \beta \\ &= P(\text{test rejects } H_0|\mu) \\ &= P(\text{test rejects } H_0| H_A \text{ is true}), \end{aligned} \]

where \(\beta\) is the probability of a Type II error (failing to reject \(H_0\) when it is false).

To calculate this probability:

  1. Under \(H_0\): The distribution of the test statistic is centered around the null parameter (e.g., \(\mu_0\)).

  2. Under \(H_A\): The test statistic is distributed differently, shifted according to the true value under \(H_A\) (e.g., \(\mu_1\)).

Hence, to evaluate the power, it is crucial to determine the distribution of the test statistic under the alternative hypothesis, \(H_A\).

Below, we derive the power for both one-sided and two-sided z-tests.


4.3.1.1.1 One-Sided z-Test

Consider the hypotheses:

\[ H_0: \mu \leq \mu_0 \quad \text{vs.} \quad H_A: \mu > \mu_0 \]

The power for a one-sided z-test is derived as follows:

  1. The test rejects \(H_0\) if \(\bar{y} > \mu_0 + z_{\alpha} \frac{\sigma}{\sqrt{n}}\), where \(z_{\alpha}\) is the critical value for the test at the significance level \(\alpha\).
  2. Under the alternative hypothesis, the distribution of \(\bar{y}\) is centered at \(\mu\), with standard deviation \(\frac{\sigma}{\sqrt{n}}\).
  3. The power is then:

\[ \begin{aligned} \pi(\mu) &= P\left(\bar{y} > \mu_0 + z_{\alpha} \frac{\sigma}{\sqrt{n}} \middle| \mu \right) \\ &= P\left(Z > z_{\alpha} + \frac{\mu_0 - \mu}{\sigma / \sqrt{n}} \middle| \mu \right), \quad \text{where } Z = \frac{\bar{y} - \mu}{\sigma / \sqrt{n}} \\ &= 1 - \Phi\left(z_{\alpha} + \frac{(\mu_0 - \mu)\sqrt{n}}{\sigma}\right) \\ &= \Phi\left(-z_{\alpha} + \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right). \end{aligned} \]

Here, we use the symmetry of the standard normal distribution: \(1 - \Phi(x) = \Phi(-x)\).
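
This power function translates directly into R; the parameter values in the call below are hypothetical:

# Power of the one-sided z-test, following the derivation above
power_z_one_sided <- function(mu, mu0, sigma, n, alpha = 0.05) {
    pnorm(-qnorm(1 - alpha) + (mu - mu0) * sqrt(n) / sigma)
}
power_z_one_sided(mu = 10.5, mu0 = 10, sigma = 2, n = 50)  # hypothetical values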

Suppose we wish to show that the mean response \(\mu\) under the treatment is higher than the mean response \(\mu_0\) without treatment (i.e., the treatment effect \(\delta = \mu - \mu_0\) is large).

Since power is an increasing function of \(\mu - \mu_0\), it suffices to find the sample size \(n\) that achieves the desired power \(1 - \beta\) at \(\mu = \mu_0 + \delta\). The power at \(\mu = \mu_0 + \delta\) is:

\[ \pi(\mu_0 + \delta) = \Phi\left(-z_{\alpha} + \frac{\delta \sqrt{n}}{\sigma}\right) = 1 - \beta \]

Given \(\Phi(z_{\beta}) = 1 - \beta\), we have:

\[ -z_{\alpha} + \frac{\delta \sqrt{n}}{\sigma} = z_{\beta} \]

Solving for \(n\), we obtain:

\[ n = \left(\frac{(z_{\alpha} + z_{\beta})\sigma}{\delta}\right)^2 \]
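
In R, this sample-size formula is a short helper; the effect size and standard deviation below are assumed purely for illustration:

# Sample size for a one-sided z-test with significance alpha and power 1 - beta
n_z_one_sided <- function(delta, sigma, alpha = 0.05, beta = 0.2) {
    ceiling(((qnorm(1 - alpha) + qnorm(1 - beta)) * sigma / delta)^2)
}
n_z_one_sided(delta = 0.5, sigma = 1)  # alpha = 0.05, power = 0.8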

Larger sample sizes are required when:

  • The sample variability is large (\(\sigma\) is large).
  • The significance level \(\alpha\) is small (\(z_{\alpha}\) is large).
  • The desired power \(1 - \beta\) is large (\(z_{\beta}\) is large).
  • The magnitude of the effect is small (\(\delta\) is small).

In practice, \(\delta\) and \(\sigma\) are often unknown. To estimate \(\sigma\), you can:

  1. Use prior studies or pilot studies.
  2. Approximate \(\sigma\) based on the anticipated range of the observations (excluding outliers). For normally distributed data, dividing the range by 4 provides a reasonable estimate of \(\sigma\).

These considerations ensure the test is adequately powered to detect meaningful effects while balancing practical constraints such as sample size.

4.3.1.1.2 Two-Sided z-Test

For a two-sided test, the hypotheses are:

\[ H_0: \mu = \mu_0 \quad \text{vs.} \quad H_A: \mu \neq \mu_0 \]

The test rejects \(H_0\) if \(\bar{y}\) lies outside the interval \(\mu_0 \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\). The power of the test is:

\[ \begin{aligned} \pi(\mu) &= P\left(\bar{y} < \mu_0 - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \middle| \mu \right) + P\left(\bar{y} > \mu_0 + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \middle| \mu \right) \\ &= \Phi\left(-z_{\alpha/2} + \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right) + \Phi\left(-z_{\alpha/2} - \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right). \end{aligned} \]

To ensure a power of \(1-\beta\) when the treatment effect \(\delta = |\mu - \mu_0|\) is at least a certain value, we solve for \(n\). Since the power function for a two-sided test is increasing in \(|\mu - \mu_0|\) and symmetric about \(\mu_0\), it suffices to find \(n\) such that the power equals \(1-\beta\) when \(\mu = \mu_0 + \delta\). Neglecting the smaller of the two tail probabilities, this gives the approximation:

\[ n = \left(\frac{(z_{\alpha/2} + z_{\beta}) \sigma}{\delta}\right)^2 \]
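
A sketch of the two-sided power function and the corresponding sample-size helper, written directly from the expressions above:

# Two-sided z-test: power function and approximate sample size
power_z_two_sided <- function(mu, mu0, sigma, n, alpha = 0.05) {
    shift <- (mu - mu0) * sqrt(n) / sigma
    z_half <- qnorm(1 - alpha / 2)
    pnorm(-z_half + shift) + pnorm(-z_half - shift)
}
n_z_two_sided <- function(delta, sigma, alpha = 0.05, beta = 0.2) {
    ceiling(((qnorm(1 - alpha / 2) + qnorm(1 - beta)) * sigma / delta)^2)
}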

Alternatively, the required sample size can be determined from the desired precision of a confidence interval. For a two-sided \(100(1-\alpha)\%\) confidence interval of the form:

\[ \bar{y} \pm D \]

where \(D = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\), solving for \(n\) gives:

\[ n = \left(\frac{z_{\alpha/2} \sigma}{D}\right)^2 \]

This value should be rounded up to the nearest integer to ensure the required precision.

# Generate random data and compute a 95% confidence interval
data <- rnorm(100) # Generate 100 random values
t.test(data, conf.level = 0.95) # Perform t-test with 95% confidence interval
#> 
#>  One Sample t-test
#> 
#> data:  data
#> t = -1.3809, df = 99, p-value = 0.1704
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#>  -0.33722662  0.06046365
#> sample estimates:
#>  mean of x 
#> -0.1383815

For a one-sided hypothesis test, such as testing \(H_0: \mu \geq 30\) versus \(H_a: \mu < 30\):

# Perform one-sided t-test
t.test(data, mu = 30, alternative = "less")
#> 
#>  One Sample t-test
#> 
#> data:  data
#> t = -300.74, df = 99, p-value < 2.2e-16
#> alternative hypothesis: true mean is less than 30
#> 95 percent confidence interval:
#>        -Inf 0.02801196
#> sample estimates:
#>  mean of x 
#> -0.1383815

4.3.1.1.3 z-Test Summary
  • For one-sided tests:

\[ \pi(\mu) = \Phi\left(-z_{\alpha} + \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right) \]

  • For two-sided tests:

\[ \pi(\mu) = \Phi\left(-z_{\alpha/2} + \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right) + \Phi\left(-z_{\alpha/2} - \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right) \]

Factors Affecting Power

  • Effect Size (\(\mu - \mu_0\)): Larger differences between \(\mu\) and \(\mu_0\) increase power.
  • Sample Size (\(n\)): Larger \(n\) reduces the standard error, increasing power.
  • Variance (\(\sigma^2\)): Smaller variance increases power.
  • Significance Level (\(\alpha\)): Increasing \(\alpha\) (making the test more liberal) decreases \(z_{\alpha}\), which increases power.

4.3.1.1.4 One-Sample t-test

In hypothesis testing, calculating the power and determining the required sample size for t-tests are more complex than for z-tests. This complexity arises from the involvement of the Student’s t-distribution and its generalized form, the non-central t-distribution.

The power function for a one-sample t-test can be expressed as:

\[ \pi(\mu) = P\left(\frac{\bar{y} - \mu_0}{s / \sqrt{n}} > t_{n-1; \alpha} \mid \mu \right) \]

Here:

  • \(\mu_0\) is the hypothesized population mean under the null hypothesis,

  • \(\bar{y}\) is the sample mean,

  • \(s\) is the sample standard deviation,

  • \(n\) is the sample size,

  • \(t_{n-1; \alpha}\) is the critical t-value from the Student’s t-distribution with \(n-1\) degrees of freedom at significance level \(\alpha\).

When \(\mu > \mu_0\) (i.e., \(\delta = \mu - \mu_0 > 0\)), the random variable

\[ T = \frac{\bar{y} - \mu_0}{s / \sqrt{n}} \]

does not follow the Student’s t-distribution. Instead, it follows a non-central t-distribution with:

  • a non-centrality parameter \(\lambda = \delta \sqrt{n} / \sigma\), where \(\sigma\) is the population standard deviation,

  • degrees of freedom \(n-1\).

Key Properties of the Power Function

  • The power \(\pi(\mu)\) is an increasing function of the non-centrality parameter \(\lambda\).
  • For \(\delta = 0\) (i.e., when the null hypothesis is true), the non-central t-distribution simplifies to the regular Student’s t-distribution.

To calculate the power in practice, numerical procedures (see below) or precomputed charts are typically required.
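
In R, this exact power is available through pt() with the ncp argument. A minimal sketch for the one-sided test, assuming \(\delta\) and \(\sigma\) are supplied:

# Exact power of a one-sided, one-sample t-test via the noncentral t-distribution
t_power <- function(n, delta, sigma, alpha = 0.05) {
    lambda <- delta * sqrt(n) / sigma          # noncentrality parameter
    t_crit <- qt(1 - alpha, df = n - 1)        # critical value under H0
    1 - pt(t_crit, df = n - 1, ncp = lambda)   # P(T > t_crit) under H_A
}
t_power(n = 20, delta = 0.5, sigma = 1)        # hypothetical inputs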

Approximate Sample Size Adjustment for t-tests

When planning a study, researchers often start with an approximation based on z-tests and then adjust for the specifics of the t-test. Here’s the process:

1. Start with the Sample Size for a z-test

For a two-sided test: \[ n_z = \frac{\left(z_{\alpha/2} + z_\beta\right)^2 \sigma^2}{\delta^2} \] where:

  • \(z_{\alpha/2}\) is the critical value from the standard normal distribution for a two-tailed test,

  • \(z_\beta\) corresponds to the desired power \(1 - \beta\),

  • \(\delta\) is the effect size \(\mu - \mu_0\),

  • \(\sigma\) is the population standard deviation.

2. Adjust for the t-distribution

Let \(v = n - 1\), where \(n\) is the sample size derived from the z-test. For a two-sided t-test, the approximate sample size is:

\[ n^* = \frac{\left(t_{v; \alpha/2} + t_{v; \beta}\right)^2 \sigma^2}{\delta^2} \]

Here:

  • \(t_{v; \alpha/2}\) and \(t_{v; \beta}\) are the critical values from the Student’s t-distribution for the significance level \(\alpha\) and desired power, respectively.

  • Since \(v\) depends on \(n^*\), this process may require iterative refinement.
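
A minimal sketch of this iteration, starting from the z-based value and updating the degrees of freedom until the sample size stabilizes:

# Iterative sample-size refinement for a two-sided one-sample t-test
t_sample_size <- function(delta, sigma, alpha = 0.05, beta = 0.2, tol = 1e-6) {
    n <- ((qnorm(1 - alpha / 2) + qnorm(1 - beta)) * sigma / delta)^2  # z-based start
    for (i in 1:50) {
        v <- ceiling(n) - 1  # degrees of freedom implied by the current n
        n_new <- ((qt(1 - alpha / 2, v) + qt(1 - beta, v)) * sigma / delta)^2
        if (abs(n_new - n) < tol) break
        n <- n_new
    }
    ceiling(n_new)
}
t_sample_size(delta = 0.5, sigma = 1)  # compare with the pwr.t.test example below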

Notes:

  1. Approximations: The above formulas provide an intuitive starting point but may require adjustments based on exact numerical solutions.
  2. Insights: Power is an increasing function of:
    • the effect size \(\delta\),
    • the sample size \(n\),
    • and a decreasing function of the population variability \(\sigma\).

# Example: Power calculation for a one-sample t-test
library(pwr)

# Parameters
effect_size <- 0.5  # Cohen's d
alpha <- 0.05       # Significance level
power <- 0.8        # Desired power

# Compute sample size
sample_size <-
    pwr.t.test(
        d = effect_size,
        sig.level = alpha,
        power = power,
        type = "one.sample"
    )$n

# Print result
cat("Required sample size for one-sample t-test:",
    ceiling(sample_size),
    "\n")
#> Required sample size for one-sample t-test: 34

# Power calculation for a given sample size
calculated_power <-
    pwr.t.test(
        n = ceiling(sample_size),
        d = effect_size,
        sig.level = alpha,
        type = "one.sample"
    )$power
cat("Achieved power with computed sample size:",
    calculated_power,
    "\n")
#> Achieved power with computed sample size: 0.8077775

4.3.2 For Difference of Means, Independent Samples

Case 1: \(\sigma^2_1\) and \(\sigma^2_2\) known

  • \(100(1-\alpha)\%\) confidence interval: \(\bar{X}_1 - \bar{X}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma^2_1}{n_1}+\frac{\sigma^2_2}{n_2}}\)
  • Hypothesis test statistic: \(z= \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)_0}{\sqrt{\frac{\sigma^2_1}{n_1}+\frac{\sigma^2_2}{n_2}}}\)

Case 2: \(\sigma^2\) unknown, variances assumed EQUAL

  • \(100(1-\alpha)\%\) confidence interval: \(\bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2}\sqrt{s^2_p\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}\)
  • Hypothesis test statistic: \(t = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)_0}{\sqrt{s^2_p(\frac{1}{n_1}+\frac{1}{n_2})}}\)
  • Pooled variance: \(s_p^2 = \frac{(n_1 -1)s^2_1 + (n_2-1)s^2_2}{n_1 + n_2 -2}\)
  • Degrees of freedom: \(\gamma = n_1 + n_2 - 2\)

Case 3: \(\sigma^2\) unknown, variances assumed UNEQUAL

  • \(100(1-\alpha)\%\) confidence interval: \(\bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2}\sqrt{\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2}}\)
  • Hypothesis test statistic: \(t = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)_0}{\sqrt{\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2}}}\)
  • Degrees of freedom (Satterthwaite approximation): \(\gamma = \frac{(\frac{s_1^2}{n_1}+\frac{s^2_2}{n_2})^2}{\frac{(\frac{s_1^2}{n_1})^2}{n_1-1}+\frac{(\frac{s_2^2}{n_2})^2}{n_2-1}}\)
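
Both unknown-variance versions are available in R through t.test(); the data below are simulated purely for illustration:

# Two-sample t-tests on simulated data
set.seed(2)
x1 <- rnorm(20, mean = 10, sd = 2)
x2 <- rnorm(25, mean = 11, sd = 3)
t.test(x1, x2, var.equal = TRUE)   # pooled-variance test, df = n1 + n2 - 2
t.test(x1, x2)                     # Welch test with Satterthwaite df (the default)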

4.3.3 For Difference of Means, Paired Samples

  • Confidence interval: \(\bar{D} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}}\)
  • Hypothesis test statistic: \(t = \frac{\bar{D} - D_0}{s_d / \sqrt{n}}\)

where \(\bar{D}\) is the mean of the \(n\) paired differences, \(s_d\) is their standard deviation, and the t critical value has \(n-1\) degrees of freedom.
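
In R, the paired test is t.test() with paired = TRUE; the before/after measurements below are hypothetical:

# Paired t-test on hypothetical before/after measurements
before <- c(12.1, 10.8, 13.5, 11.2, 12.9, 10.4)
after  <- c(12.8, 11.5, 13.9, 12.0, 13.1, 11.2)
t.test(after, before, paired = TRUE)  # equivalent to t.test(after - before)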

4.3.4 For Difference of Two Proportions

The point estimate of the difference in population proportions, \(p_1 - p_2\), is:

\[ \hat{p_1} - \hat{p_2} \]

The variance of the difference in proportions is:

\[ \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2} \]

A \(100(1-\alpha)\%\) confidence interval for the difference in proportions replaces the unknown \(p_1, p_2\) with their sample estimates:

\[ \hat{p_1} - \hat{p_2} \pm z_{\alpha/2} \sqrt{\frac{\hat{p_1}(1-\hat{p_1})}{n_1} + \frac{\hat{p_2}(1-\hat{p_2})}{n_2}} \]

where

  • \(z_{\alpha/2}\): The critical value from the standard normal distribution.

  • \(\hat{p_1}\), \(\hat{p_2}\): Sample proportions.

  • \(n_1\), \(n_2\): Sample sizes.

Sample Size for a Desired Confidence Level and Margin of Error

To achieve a margin of error \(d\) for a given confidence level, the required sample size can be estimated as follows:

  1. With Prior Estimates of \(\hat{p_1}\) and \(\hat{p_2}\): \[ n \approx \frac{z_{\alpha/2}^2 \left[\hat{p_1}(1-\hat{p_1}) + \hat{p_2}(1-\hat{p_2})\right]}{d^2} \]

  2. Without Prior Estimates (assuming maximum variability, \(\hat{p} = 0.5\)): \[ n \approx \frac{z_{\alpha/2}^2}{2d^2} \]

Hypothesis Testing for Difference in Proportions

The test statistic for hypothesis testing depends on the null hypothesis:

  1. When the hypothesized difference \((p_1 - p_2)_0 \neq 0\): \[ z = \frac{(\hat{p_1} - \hat{p_2}) - (p_1 - p_2)_0}{\sqrt{\frac{\hat{p_1}(1-\hat{p_1})}{n_1} + \frac{\hat{p_2}(1-\hat{p_2})}{n_2}}} \]

  2. When \((p_1 - p_2)_0 = 0\) (testing equality of proportions): \[ z = \frac{\hat{p_1} - \hat{p_2}}{\sqrt{\hat{p}(1-\hat{p}) \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \]

where \(\hat{p}\) is the pooled sample proportion:

\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p_1} + n_2\hat{p_2}}{n_1 + n_2} \]
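
In R, the pooled test of equal proportions corresponds to prop.test(); the counts below are assumed for illustration:

# Pooled z-test for equality of two proportions (hypothetical counts)
x <- c(45, 30)    # successes in each group
n <- c(100, 90)   # sample sizes
prop.test(x, n, correct = FALSE)  # the reported X-squared equals z^2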


4.3.5 For Single Proportion

The \(100(1-\alpha)\%\) confidence interval for a population proportion \(p\) is:

\[ \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

Sample Size Determination

  • With Prior Estimate (\(\hat{p}\)): \[ n \approx \frac{z_{\alpha/2}^2 \hat{p}(1-\hat{p})}{d^2} \]

  • Without Prior Estimate: \[ n \approx \frac{z_{\alpha/2}^2}{4d^2} \]

The test statistic for \(H_0: p = p_0\) is:

\[ z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} \]
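
A sketch of this test in base R with hypothetical counts; binom.test() provides the exact small-sample alternative:

# z-test for a single proportion: 40 successes in 120 trials, H0: p = 0.25
p_hat <- 40 / 120
z <- (p_hat - 0.25) / sqrt(0.25 * (1 - 0.25) / 120)  # test statistic from above
2 * (1 - pnorm(abs(z)))                              # two-sided p-value
binom.test(40, 120, p = 0.25)                        # exact alternative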


4.3.6 For Single Variance

For a sample variance \(s^2\) with \(n\) observations, the \(100(1-\alpha)\%\) confidence interval for the population variance \(\sigma^2\) is:

\[ \begin{aligned} 1 - \alpha &= P\left( \chi^2_{1-\alpha/2;n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2;n-1} \right)\\ &=P\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2; n-1}} \leq \sigma^2 \leq \frac{(n-1)s^2}{\chi^2_{1-\alpha/2; n-1}}\right) \end{aligned} \]

Equivalently, the confidence interval can be written as:

\[ \left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}}, \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right) \]

To find confidence limits for \(\sigma\), compute the square root of the interval bounds:

\[ \text{Confidence Interval for } \sigma: \quad \left(\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}}, \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}\right) \]

Hypothesis Testing for Variance

The test statistic for testing a null hypothesis about a population variance (\(\sigma^2_0\)) is:

\[ \chi^2 = \frac{(n-1)s^2}{\sigma^2_0} \]

This test statistic follows a chi-squared distribution with \(n-1\) degrees of freedom under the null hypothesis.
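
These quantities are straightforward to compute in base R; the simulated data and the hypothesized variance below are assumptions for illustration:

# Chi-squared interval and test for a single variance
set.seed(3)
y <- rnorm(30, sd = 2)   # simulated data
n <- length(y)
s2 <- var(y)
sigma2_0 <- 4            # hypothesized variance under H0
ci <- (n - 1) * s2 / qchisq(c(0.975, 0.025), df = n - 1)  # 95% CI for sigma^2
chi2 <- (n - 1) * s2 / sigma2_0                           # test statistic
p_value <- 2 * min(pchisq(chi2, n - 1), 1 - pchisq(chi2, n - 1))  # two-sided
list(ci = ci, chi2 = chi2, p_value = p_value)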

4.3.7 Non-parametric Tests

  • Sign Test: tests the median; no distributional assumptions (ordinal data suffice).
  • Wilcoxon Signed Rank Test: tests symmetry around a value; assumes a symmetric distribution.
  • Wald-Wolfowitz Runs Test: tests for randomness; assumes independent observations.
  • Quantile (or Percentile) Test: tests a specific quantile; no distributional assumptions (ordinal data suffice).

4.3.7.1 Sign Test

The Sign Test is used to test hypotheses about the median of a population, \(\mu_{(0.5)}\), without assuming a specific distribution for the data. This test is ideal for small sample sizes or when normality assumptions are not met.

To test the population median, consider the hypotheses:

  • Null Hypothesis: \(H_0: \mu_{(0.5)} = 0\)
  • Alternative Hypothesis: \(H_a: \mu_{(0.5)} > 0\) (one-sided test)

Steps:

  1. Count Positive and Negative Deviations:

    • Count observations (\(y_i\)) greater than 0: \(s_+\) (number of positive signs).
    • Count observations less than 0: \(s_-\) (number of negative signs).
    • \(s_- = n - s_+\).
  2. Decision Rule:

    • Reject \(H_0\) if \(s_+\) is large (or equivalently, \(s_-\) is small).
    • To determine how large \(s_+\) must be, use the distribution of \(S_+\) under \(H_0\), which is Binomial with \(p = 0.5\).
  3. Null Distribution:
    Under \(H_0\), \(S_+\) follows: \[ S_+ \sim Binomial(n, p = 0.5) \]

  4. Critical Value:
    Reject \(H_0\) if: \[ s_+ \ge b_{n,\alpha} \] where \(b_{n,\alpha}\) is the upper \(\alpha\) critical value of the binomial distribution.

  5. p-value Calculation:
    Compute the one-sided p-value for the observed \(s_+\) as: \[ \text{p-value} = P(S \ge s_+) = \sum_{i=s_+}^{n} \binom{n}{i} \left(\frac{1}{2}\right)^n \]

    Alternatively: \[ P(S \le s_-) = \sum_{i=0}^{s_-} \binom{n}{i} \left(\frac{1}{2}\right)^n \]


Large Sample Normal Approximation

For large \(n\), use a normal approximation to the binomial distribution. Reject \(H_0\) if: \[ s_+ \ge \frac{n}{2} + \frac{1}{2} + z_{\alpha} \sqrt{\frac{n}{4}} \] where \(z_\alpha\) is the critical value for a one-sided test and the \(\frac{1}{2}\) is a continuity correction.

For two-sided tests, use the maximum or minimum of \(s_+\) and \(s_-\):

  • Test statistic: \(s_{\text{max}} = \max(s_+, s_-)\) or \(s_{\text{min}} = \min(s_+, s_-)\)

  • Reject \(H_0\) if \(p\)-value is less than \(\alpha\), where: \[ p\text{-value} = 2 \sum_{i=s_{\text{max}}}^{n} \binom{n}{i} \left(\frac{1}{2}\right)^n = 2 \sum_{i = 0}^{s_{min}} \binom{n}{i} \left( \frac{1}{2} \right)^n \]

Equivalently, reject \(H_0\) if \(s_{max} \ge b_{n,\alpha/2}\).

For large \(n\), the normal approximation uses: \[ z = \frac{s_{\text{max}} - \frac{n}{2} - \frac{1}{2}}{\sqrt{\frac{n}{4}}} \]
Reject \(H_0\) at \(\alpha\) if \(z \ge z_{\alpha/2}\).

Handling zeros in the data is a common issue with the Sign Test:

  1. Random Assignment: Assign zeros randomly to either \(s_+\) or \(s_-\); two researchers analyzing the same data might then reach different conclusions.
  2. Fractional Assignment: Count each zero as \(0.5\) toward both \(s_+\) and \(s_-\); however, the binomial null distribution then no longer applies exactly.
  3. Ignore Zeros: Discard the zeros, noting that this reduces the sample size and hence the power.

# Example Data
data <- c(0.76, 0.82, 0.80, 0.79, 1.06, 0.83, -0.43, -0.34, 3.34, 2.33)

# Count positive signs
s_plus <- sum(data > 0)

# Sample size (this dataset contains no zeros)
n <- length(data)

# Perform a one-sided binomial test
binom.test(s_plus, n, p = 0.5, alternative = "greater")
#> 
#>  Exact binomial test
#> 
#> data:  s_plus and n
#> number of successes = 8, number of trials = 10, p-value = 0.05469
#> alternative hypothesis: true probability of success is greater than 0.5
#> 95 percent confidence interval:
#>  0.4930987 1.0000000
#> sample estimates:
#> probability of success 
#>                    0.8

4.3.7.2 Wilcoxon Signed Rank Test

The Wilcoxon Signed Rank Test is an improvement over the Sign Test as it considers both the magnitude and direction of deviations from the null hypothesis value (e.g., 0). However, this test assumes that the data are symmetrically distributed around the median, unlike the Sign Test.

We test the following hypotheses:

\[ H_0: \mu_{(0.5)} = 0 \\ H_a: \mu_{(0.5)} > 0 \]

This example assumes no ties or duplicate observations in the data.

Procedure for the Signed Rank Test

  1. Rank the Absolute Values:
    • Rank the observations \(y_i\) based on their absolute values.
    • Let \(r_i\) denote the rank of \(y_i\).
    • Since there are no ties, ranks \(r_i\) are uniquely determined and form a permutation of integers \(1, 2, \dots, n\).
  2. Calculate \(w_+\) and \(w_-\):
    • \(w_+\) is the sum of the ranks corresponding to positive values of \(y_i\).
    • \(w_-\) is the sum of the ranks corresponding to negative values of \(y_i\).
    • By definition: \[ w_+ + w_- = \sum_{i=1}^n r_i = \frac{n(n+1)}{2} \]
  3. Decision Rule:
    • Reject \(H_0\) if \(w_+\) is large (or equivalently, if \(w_-\) is small).

Null Distribution of \(W_+\)

Under the null hypothesis, the distributions of \(W_+\) and \(W_-\) are identical and symmetric. The p-value for a one-sided test is:

\[ \text{p-value} = P(W \ge w_+) = P(W \le w_-) \]

An \(\alpha\)-level test rejects \(H_0\) if \(w_+ \ge w_{n,\alpha}\), where \(w_{n,\alpha}\) is the critical value from a table of the null distribution of \(W_+\).
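
The statistic and its exact one-sided p-value can also be computed by hand with base R's psignrank(); the data here anticipate the wilcox.test example below:

# Manual computation of w_plus and its exact one-sided p-value
y <- c(0.76, 0.82, 0.80, 0.79, 1.06, 0.83, -0.43, -0.34, 3.34, 2.33)
r <- rank(abs(y))              # ranks of absolute values (midranks if tied)
w_plus <- sum(r[y > 0])        # sum of ranks attached to positive observations
n <- length(y)
1 - psignrank(w_plus - 1, n)   # P(W >= w_plus) under H0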

For two-sided tests, use:

\[ p\text{-value} = 2P(W \ge w_{max}) = 2P(W \le w_{min}) \]


Normal Approximation for Large Samples

For large \(n\), the null distribution of \(W_+\) can be approximated by a normal distribution:

\[ z = \frac{w_+ - \frac{n(n+1)}{4} - \frac{1}{2}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \]

The test rejects \(H_0\) at level \(\alpha\) if:

\[ w_+ \ge \frac{n(n+1)}{4} + \frac{1}{2} + z_{\alpha} \sqrt{\frac{n(n+1)(2n+1)}{24}} \approx w_{n,\alpha} \]

For a two-sided test, the decision rule uses the maximum or minimum of \(w_+\) and \(w_-\):

  • \(w_{max} = \max(w_+, w_-)\)

  • \(w_{min} = \min(w_+, w_-)\)

The p-value is computed as:

\[ p\text{-value} = 2P(W \ge w_{max}) = 2P(W \le w_{min}) \]


Handling Tied Ranks

If some observations \(|y_i|\) have tied absolute values, assign the average rank (or “midrank”) to all tied values. For example:

  • Suppose \(y_1 = -1\), \(y_2 = 3\), \(y_3 = -3\), and \(y_4 = 5\).
  • The ranks for \(|y_i|\) are:
    • \(|y_1| = 1\): \(r_1 = 1\)
    • \(|y_2| = |y_3| = 3\): \(r_2 = r_3 = \frac{2+3}{2} = 2.5\)
    • \(|y_4| = 5\): \(r_4 = 4\)

# Example Data
data <- c(0.76, 0.82, 0.80, 0.79, 1.06, 0.83, -0.43, -0.34, 3.34, 2.33)

# Perform Wilcoxon Signed Rank Test (exact test)
wilcox_exact <- wilcox.test(data, exact = TRUE)

# Display results
wilcox_exact
#> 
#>  Wilcoxon signed rank exact test
#> 
#> data:  data
#> V = 52, p-value = 0.009766
#> alternative hypothesis: true location is not equal to 0

For large samples, you can use the normal approximation by setting exact = FALSE:

# Perform Wilcoxon Signed Rank Test (normal approximation)
wilcox_normal <- wilcox.test(data, exact = FALSE)

# Display results
wilcox_normal
#> 
#>  Wilcoxon signed rank test with continuity correction
#> 
#> data:  data
#> V = 52, p-value = 0.01443
#> alternative hypothesis: true location is not equal to 0

4.3.7.3 Wald-Wolfowitz Runs Test

The Runs Test is a non-parametric test used to examine the randomness of a sequence. Specifically, it tests whether the order of observations in a sequence is random. This test is useful in detecting non-random patterns, such as trends, clustering, or periodicity.

The hypotheses for the Runs Test are:

  • Null Hypothesis: \(H_0\): The sequence is random.
  • Alternative Hypothesis: \(H_a\): The sequence is not random.

A run is a sequence of consecutive observations of the same type. For example, in the binary sequence + + - - + - + +, there are 5 runs: ++, --, +, -, ++.

Runs can be formed based on any classification criteria, such as:

  • Positive vs. Negative values

  • Above vs. Below the median

  • Success vs. Failure in binary outcomes

Test Statistic

  1. Number of Runs (\(R\)):
    The observed number of runs in the sequence.

  2. Expected Number of Runs (\(E[R]\)):
    Under the null hypothesis of randomness, the expected number of runs is: \[ E[R] = \frac{2 n_1 n_2}{n_1 + n_2} + 1 \] where:

    • \(n_1\): Number of observations in the first category (e.g., positives).
    • \(n_2\): Number of observations in the second category (e.g., negatives).
    • \(n = n_1 + n_2\): Total number of observations.
  3. Variance of Runs (\(\text{Var}[R]\)):
    The variance of the number of runs is given by: \[ \text{Var}[R] = \frac{2 n_1 n_2 (2 n_1 n_2 - n)}{n^2 (n - 1)} \]

  4. Standardized Test Statistic (\(z\)):
    For large samples (\(n \geq 20\)), the test statistic is approximately normally distributed: \[ z = \frac{R - E[R]}{\sqrt{\text{Var}[R]}} \]

Decision Rule

  • Compute the \(z\)-value and compare it to the critical value of the standard normal distribution.
  • For a significance level \(\alpha\):
    • Reject \(H_0\) if \(|z| \ge z_{\alpha/2}\) (two-sided test).
    • Reject \(H_0\) if \(z \ge z_\alpha\) (or \(z \le -z_\alpha\)), depending on the direction of the one-sided alternative.

Steps for Conducting a Runs Test:

  1. Classify the data into two groups (e.g., above/below median, positive/negative).
  2. Count the total number of runs (\(R\)).
  3. Compute \(E[R]\) and \(\text{Var}[R]\) based on \(n_1\) and \(n_2\).
  4. Compute the \(z\)-value for the observed number of runs.
  5. Compare the \(z\)-value to the critical value to decide whether to reject \(H_0\).
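
A base-R sketch of these steps, using the above/below-median classification and the same data as the runs.test example that follows; it reproduces that test's standardized statistic:

# Runs test computed by hand (above/below the median)
x <- c(1.2, -0.5, 3.4, -1.1, 2.8, -0.8, 4.5, 0.7)
above <- x > median(x)           # classify each observation
R <- 1 + sum(diff(above) != 0)   # observed number of runs
n1 <- sum(above); n2 <- sum(!above); n <- n1 + n2
ER <- 2 * n1 * n2 / n + 1        # expected runs under H0
VR <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1))
z <- (R - ER) / sqrt(VR)         # standardized statistic
2 * (1 - pnorm(abs(z)))          # two-sided p-value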

For a numerical dataset where the test is based on values above and below the median:

# Example dataset
data <- c(1.2, -0.5, 3.4, -1.1, 2.8, -0.8, 4.5, 0.7)

library(randtests)
# Perform Runs Test (above/below median)
runs.test(data)
#> 
#>  Runs Test
#> 
#> data:  data
#> statistic = 2.2913, runs = 8, n1 = 4, n2 = 4, n = 8, p-value = 0.02195
#> alternative hypothesis: nonrandomness

The output of the runs.test function includes:

  • runs: the observed number of runs in the sequence.

  • n1, n2: the number of observations in each category (here, above and below the median).

  • statistic: the standardized \(z\)-value.

  • p-value: the probability of observing a number of runs as extreme as the observed one under \(H_0\).

If the p-value is less than \(\alpha\), reject \(H_0\) and conclude that the sequence is not random.

Limitations of the Runs Test

  • The test assumes that observations are independent.

  • For small sample sizes, the test may have limited power.

  • Ties in the data must be resolved by a predefined rule (e.g., treating ties as belonging to one group or excluding them).

4.3.7.4 Quantile (or Percentile) Test

The Quantile Test (also called the Percentile Test) is a non-parametric test used to evaluate whether the proportion of observations falling within a specific quantile matches the expected proportion under the null hypothesis. This test is useful for assessing the distribution of data when specific quantiles (e.g., medians or percentiles) are of interest.

Suppose we want to test whether the true proportion of data below a specified quantile \(q\) matches a given probability \(p\). The hypotheses are:

  • Null Hypothesis: \(H_0\): The true proportion is equal to \(p\).
  • Alternative Hypothesis: \(H_a\): The true proportion is not equal to \(p\) (two-sided), greater than \(p\) (right-tailed), or less than \(p\) (left-tailed).

Test Statistic

The test statistic is based on the observed count of data points below the specified quantile.

  1. Observed Count (\(k\)):
    The number of data points \(y_i\) such that \(y_i \leq q\).

  2. Expected Count (\(E[k]\)):
    The expected number of observations below the quantile \(q\) under \(H_0\) is: \[ E[k] = n \cdot p \]

  3. Variance:
    Under the binomial distribution, the variance is: \[ \text{Var}[k] = n \cdot p \cdot (1 - p) \]

  4. Standardized Test Statistic (\(z\)):
    For large \(n\), the test statistic is approximately normally distributed: \[ z = \frac{k - E[k]}{\sqrt{\text{Var}[k]}} = \frac{k - n \cdot p}{\sqrt{n \cdot p \cdot (1 - p)}} \]

Decision Rule

  1. Compute the \(z\)-value for the observed count.
  2. Compare the \(z\)-value to the critical value of the standard normal distribution:
    • For a two-sided test, reject \(H_0\) if \(|z| \geq z_{\alpha/2}\).
    • For a one-sided test, reject \(H_0\) if \(z \geq z_\alpha\) (right-tailed) or \(z \leq -z_\alpha\) (left-tailed).

Alternatively, calculate the p-value and reject \(H_0\) if the p-value \(\leq \alpha\).

Suppose we have a dataset and want to test whether the proportion of observations below the 50th percentile (median) matches the expected value of \(p = 0.5\). Because the quantile here is computed from the sample itself, the observed count is guaranteed to sit at its expected value; the example simply illustrates the mechanics, and in practice \(q\) would be a prespecified value.

# Example data
data <- c(12, 15, 14, 10, 13, 11, 14, 16, 15, 13)

# Define the quantile to test
quantile_value <- quantile(data, 0.5) # Median
p <- 0.5                             # Proportion under H0

# Count observed values below or equal to the quantile
k <- sum(data <= quantile_value)

# Sample size
n <- length(data)

# Expected count under H0
expected_count <- n * p

# Variance
variance <- n * p * (1 - p)

# Test statistic (z-value)
z <- (k - expected_count) / sqrt(variance)

# Calculate p-value for two-sided test
p_value <- 2 * (1 - pnorm(abs(z)))

# Output results
list(
  quantile_value = quantile_value,
  observed_count = k,
  expected_count = expected_count,
  z_value = z,
  p_value = p_value
)
#> $quantile_value
#>  50% 
#> 13.5 
#> 
#> $observed_count
#> [1] 5
#> 
#> $expected_count
#> [1] 5
#> 
#> $z_value
#> [1] 0
#> 
#> $p_value
#> [1] 1

For a one-sided test (e.g., testing whether the proportion is greater than \(p\)):

# Calculate one-sided p-value
p_value_one_sided <- 1 - pnorm(z)

# Output one-sided p-value
p_value_one_sided
#> [1] 0.5

Interpretation of Results

  • p-value: If the p-value is less than \(\alpha\), reject \(H_0\) and conclude that the proportion of observations below the quantile deviates significantly from \(p\).

  • Quantile Test Statistic (\(z\)): The \(z\)-value indicates how many standard deviations the observed count is from the expected count under the null hypothesis. Large positive or negative \(z\) values suggest non-random deviations.

Assumptions of the Test

  1. Observations are independent.

  2. The sample size is large enough for the normal approximation to the binomial distribution to be valid (\(n \cdot p \geq 5\) and \(n \cdot (1 - p) \geq 5\)).

Limitations of the Test

  • For small sample sizes, the normal approximation may not hold. In such cases, exact binomial tests are more appropriate.

  • The test assumes that the quantile used (e.g., the median) is well-defined and correctly calculated from the data.
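
For the small-sample situation just noted, an exact binomial version of the test is direct in base R; this sketch reuses the data from the example above:

# Exact binomial quantile test (small-sample alternative to the z approximation)
data <- c(12, 15, 14, 10, 13, 11, 14, 16, 15, 13)
q <- quantile(data, 0.5)                           # quantile of interest
binom.test(sum(data <= q), length(data), p = 0.5)  # exact two-sided test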