24.2 Nonparametric ANOVA

When the normality or equal-variance assumptions of ANOVA are not satisfied, we can use nonparametric ANOVA tests, which replace the raw observations with their ranks.


24.2.1 Kruskal-Wallis Test (One-Way Nonparametric ANOVA)

The Kruskal-Wallis test is a generalization of the Wilcoxon rank-sum test to more than two independent samples. It is an alternative to one-way ANOVA when normality is not assumed.

Setup

  • \(a \geq 2\) independent treatments.
  • \(n_i\) is the sample size for the \(i\)-th treatment.
  • \(Y_{ij}\) is the \(j\)-th observation from the \(i\)-th treatment.
  • No assumption of normality.
  • Assume observations are independent random samples from continuous CDFs \(F_1, F_2, \dots, F_a\).

Hypotheses

\[ \begin{aligned} &H_0: F_1 = F_2 = \dots = F_a \quad \text{(All distributions are identical)} \\ &H_a: F_i < F_j \text{ for some } i \neq j \end{aligned} \]

If the \(F_i\) belong to a common location-scale family and differ only in their location parameters \(\theta_i\), the null hypothesis simplifies to:

\[ H_0: \theta_1 = \theta_2 = \dots = \theta_a \]


Procedure

  1. Rank all \(N = \sum_{i=1}^a n_i\) observations together in ascending order, assigning average ranks to ties.
    Let \(r_{ij} = \operatorname{rank}(Y_{ij})\).
    The ranks always sum to:

    \[ \sum_i \sum_j r_{ij} = \frac{N(N+1)}{2} \]

  2. Compute rank sums and averages: \[ r_{i.} = \sum_{j=1}^{n_i} r_{ij}, \quad \bar{r}_{i.} = \frac{r_{i.}}{n_i} \]

  3. Calculate the test statistic:

    \[ \chi_{KW}^2 = \frac{SSTR}{\frac{SSTO}{N-1}} \]

    where:

    • Treatment Sum of Squares: \[ SSTR = \sum_i n_i (\bar{r}_{i.} - \bar{r}_{..})^2 \]
    • Total Sum of Squares: \[ SSTO = \sum_i \sum_j (r_{ij} - \bar{r}_{..})^2 \]
    • Overall Mean Rank: \[ \bar{r}_{..} = \frac{N+1}{2} \]
  4. Compare to a chi-square distribution:

    • Under \(H_0\) and for large \(n_i\) (\(\geq 5\)), \(\chi^2_{KW}\) is approximately \(\chi^2_{a-1}\) distributed.
    • Reject \(H_0\) if: \[ \chi^2_{KW} > \chi^2_{(1-\alpha; a-1)} \]
  5. Exact Test for Small Samples:

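    • Enumerate all possible assignments of the \(N\) ranks to the \(a\) treatments; there are
      \[ \frac{N!}{n_1! n_2! \dots n_a!} \]
      of them, all equally likely under \(H_0\).
    • Evaluate the Kruskal-Wallis statistic for each assignment; the exact p-value is the proportion of assignments whose statistic is at least as large as the observed one.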
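
The procedure above can be sketched in plain Python using only the standard library. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (`average_ranks`, `kw_from_ranks`, `chi2_sf`, `partitions`, `kruskal_wallis`) and the toy data are hypothetical and introduced here only for illustration. The helper `chi2_sf` covers integer degrees of freedom only, and the exact test enumerates every rank assignment, so it is practical only for very small \(N\).

```python
import math
from itertools import combinations


def average_ranks(values):
    """Ascending 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def kw_from_ranks(rank_groups):
    """Kruskal-Wallis statistic SSTR / (SSTO / (N - 1)) from ranks grouped by treatment."""
    n_total = sum(len(g) for g in rank_groups)
    grand_mean = (n_total + 1) / 2
    sstr = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in rank_groups)
    ssto = sum((r - grand_mean) ** 2 for g in rank_groups for r in g)
    return sstr / (ssto / (n_total - 1))  # undefined if every observation is tied


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df (incomplete-gamma recurrence)."""
    z = x / 2
    a = 1.0 if df % 2 == 0 else 0.5
    q = math.exp(-z) if df % 2 == 0 else math.erfc(math.sqrt(z))
    while 2 * a < df:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return q


def partitions(items, sizes):
    """Yield every split of `items` into labelled groups with the given sizes."""
    if len(sizes) == 1:
        yield [list(items)]
        return
    for chosen in combinations(range(len(items)), sizes[0]):
        picked = set(chosen)
        head = [items[i] for i in chosen]
        rest = [items[i] for i in range(len(items)) if i not in picked]
        for tail in partitions(rest, sizes[1:]):
            yield [head] + tail


def kruskal_wallis(samples, exact=False):
    """Return (statistic, p-value); chi-square approximation unless exact=True."""
    pooled = [y for s in samples for y in s]
    ranks = average_ranks(pooled)
    sizes = [len(s) for s in samples]
    rank_groups, pos = [], 0
    for size in sizes:
        rank_groups.append(ranks[pos:pos + size])
        pos += size
    stat = kw_from_ranks(rank_groups)
    if not exact:
        return stat, chi2_sf(stat, len(samples) - 1)
    # Exact test: statistic for each of the N! / (n_1! ... n_a!) rank assignments.
    all_stats = [kw_from_ranks(p) for p in partitions(ranks, sizes)]
    p_value = sum(s >= stat - 1e-9 for s in all_stats) / len(all_stats)
    return stat, p_value


# Toy data (made up for illustration): three treatments, two observations each.
toy = [[1, 2], [3, 4], [5, 6]]
print(kruskal_wallis(toy))              # chi-square approximation
print(kruskal_wallis(toy, exact=True))  # exact test over all 90 assignments
```

For this toy data only 6 of the 90 possible assignments reach the observed statistic, so the exact p-value is \(6/90\). The chi-square approximation gives a noticeably larger p-value, which illustrates why it is not recommended when the \(n_i\) are this small.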

24.2.2 Friedman Test (Nonparametric Two-Way ANOVA)

The Friedman test is a distribution-free alternative to two-way ANOVA when data are measured in a randomized complete block design and normality cannot be assumed.

Setup

  • \(Y_{ij}\) is the response for treatment \(j\) in block \(i\), with \(i = 1, \dots, n\) blocks and \(j = 1, \dots, r\) treatments.
  • No assumption of normality or homogeneity of variance.
  • Let \(F_{ij}\) be the CDF of \(Y_{ij}\).

Hypotheses

\[ \begin{aligned} &H_0: F_{i1} = F_{i2} = \dots = F_{ir} \quad \forall i \quad \text{(Identical distributions within each block)} \\ &H_a: F_{ij} < F_{ij'} \text{ for some } j \neq j' \quad \forall i \end{aligned} \]

For location-scale families, the hypothesis simplifies to:

\[ \begin{aligned} &H_0: \tau_1 = \tau_2 = \dots = \tau_r \\ &H_a: \tau_j > \tau_{j'} \text{ for some } j \neq j' \end{aligned} \]


Procedure

  1. Rank observations within each block separately (ascending order).

    • If there are ties, assign average ranks.
  2. Compute test statistic:

    \[ \chi^2_F = \frac{SSTR}{\frac{SSTR + SSE}{n(r-1)}} \]

    where:

    • Treatment Sum of Squares: \[ SSTR = n \sum_j (\bar{r}_{.j} - \bar{r}_{..})^2 \]
    • Error Sum of Squares: \[ SSE = \sum_i \sum_j (r_{ij} - \bar{r}_{.j})^2 \]
    • Mean Ranks: \[ \bar{r}_{.j} = \frac{\sum_i r_{ij}}{n}, \quad \bar{r}_{..} = \frac{r+1}{2} \]
  3. Alternative Formula (No Ties):

    Without ties, \(SSTR + SSE = nr(r^2 - 1)/12\), so Friedman’s statistic simplifies to:

    \[ \chi^2_F = \left[\frac{12}{nr(r+1)} \sum_j r_{.j}^2\right] - 3n(r+1) \]

    where \(r_{.j} = \sum_i r_{ij}\) is the rank sum for treatment \(j\).

  4. Compare to a chi-square distribution:

    • Under \(H_0\) and for large \(n\), \(\chi^2_F\) is approximately \(\chi^2_{r-1}\) distributed.
    • Reject \(H_0\) if: \[ \chi^2_F > \chi^2_{(1-\alpha; r-1)} \]
  5. Exact Test for Small Samples:

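    • Enumerate all possible within-block ranking permutations, which are equally likely under \(H_0\): \[ (r!)^n \]
    • Evaluate the Friedman statistic for each; the exact p-value is the proportion of permutations whose statistic is at least as large as the observed one.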
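
The Friedman procedure can be sketched the same way in plain Python. Again this is only an illustrative sketch: the function names (`average_ranks`, `friedman_from_ranks`, `friedman_shortcut`, `friedman_exact`) and the toy data are hypothetical. The input is assumed to be a list of blocks, each holding one response per treatment in the same treatment order. The exact test enumerates all \((r!)^n\) within-block permutations, so it is feasible only for small designs.

```python
from itertools import permutations, product


def average_ranks(values):
    """Ascending 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def friedman_from_ranks(rank_rows):
    """SSTR / ((SSTR + SSE) / (n (r - 1))); rank_rows[i][j] is the rank of treatment j in block i."""
    n, r = len(rank_rows), len(rank_rows[0])
    grand_mean = (r + 1) / 2
    col_means = [sum(row[j] for row in rank_rows) / n for j in range(r)]
    sstr = n * sum((m - grand_mean) ** 2 for m in col_means)
    sse = sum((row[j] - col_means[j]) ** 2 for row in rank_rows for j in range(r))
    return sstr / ((sstr + sse) / (n * (r - 1)))


def friedman_shortcut(rank_rows):
    """No-ties shortcut: 12 / (n r (r + 1)) * sum_j r_.j^2 - 3 n (r + 1)."""
    n, r = len(rank_rows), len(rank_rows[0])
    col_sums = [sum(row[j] for row in rank_rows) for j in range(r)]
    return 12 / (n * r * (r + 1)) * sum(s ** 2 for s in col_sums) - 3 * n * (r + 1)


def friedman_exact(data):
    """Return (statistic, exact p-value) by enumerating all (r!)^n rank permutations."""
    rank_rows = [average_ranks(block) for block in data]
    observed = friedman_from_ranks(rank_rows)
    per_block = [list(permutations(row)) for row in rank_rows]
    all_stats = [friedman_from_ranks(rows) for rows in product(*per_block)]
    p_value = sum(s >= observed - 1e-9 for s in all_stats) / len(all_stats)
    return observed, p_value


# Toy data (made up for illustration): 3 blocks (rows) by 3 treatments (columns).
toy = [[1.0, 2.0, 3.0],
       [4.0, 5.0, 6.0],
       [7.0, 8.0, 9.0]]
toy_ranks = [average_ranks(block) for block in toy]
print(friedman_from_ranks(toy_ranks), friedman_shortcut(toy_ranks))
print(friedman_exact(toy))
```

In this toy design every block orders the treatments identically, so the statistic attains its maximum \(n(r-1) = 6\), and only the \(3! = 6\) fully concordant arrangements out of \((3!)^3 = 216\) match it, giving an exact p-value of \(1/36\). The sum-of-squares form and the shortcut agree here because there are no ties; with ties only the sum-of-squares form should be used.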