25.2 Nonparametric ANOVA

25.2.1 Kruskal-Wallis

The Kruskal-Wallis test generalizes the Wilcoxon rank-sum test for 2 independent samples to several independent samples (just as the F-test of one-way ANOVA generalizes the two-sample t-test).

Consider the one-way case:

We have

$a\ge2$ treatments
$n_i$ is the sample size for the $i$ -th treatment
$Y_{ij}$ is the $j$ -th observation from the $i$ -th treatment.
We make no assumption of normality.
We only assume that observations on the $i$ -th treatment are a random sample from the continuous CDF $F_i$ , $i = 1,..,a$ , and are mutually independent.

$\begin{aligned} &H_0: F_1 = F_2 = ... = F_a \\ &H_a: F_i < F_j \text{ for some } i \neq j \end{aligned}$

or, if the distributions are from a location-scale family, $H_0: \theta_1 = \theta_2 = ... = \theta_a$

Procedure

Rank all $N = \sum_{i=1}^a n_i$ observations in ascending order. Let $r_{ij} = \text{rank}(Y_{ij})$ , and note that $\sum_i \sum_j r_{ij} = 1 + 2 + \dots + N = \frac{N(N+1)}{2}$
Calculate the rank sums and averages:
$r_{i.} = \sum_{j=1}^{n_i} r_{ij}$ and $\bar{r}_{i.} = \frac{r_{i.}}{n_i}, i = 1,..,a$
Calculate the test statistic on the ranks: $\chi_{KW}^2 = \frac{SSTR}{\frac{SSTO}{N-1}}$ where $SSTR = \sum_i n_i (\bar{r}_{i.}- \bar{r}_{..})^2$ , $SSTO = \sum_i \sum_j (r_{ij}- \bar{r}_{..})^2$ , and $\bar{r}_{..} = \frac{N+1}{2}$ is the overall mean rank.
For large $n_i$ ( $\ge 5$ observations per treatment), the distribution of the Kruskal-Wallis statistic under $H_0$ is well approximated by a $\chi^2_{a-1}$ distribution. Hence, reject $H_0$ if $\chi^2_{KW} > \chi^2_{(1-\alpha;a-1)}$ .
If sample sizes are small, one can exhaustively work out all possible distinct ways of assigning the $N$ ranks to the observations from the $a$ treatments and calculate the value of the KW statistic in each case ( $\frac{N!}{n_1!..n_a!}$ possible combinations). Under $H_0$ all of these assignments are equally likely.
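
The procedure above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the data are made up, the function name `kruskal_wallis` is hypothetical, and NumPy and SciPy are assumed to be installed. The result is compared with `scipy.stats.kruskal` as a sanity check.

```python
import numpy as np
from scipy import stats


def kruskal_wallis(groups):
    """Kruskal-Wallis statistic computed as SSTR / (SSTO / (N - 1)) on the ranks.

    `groups` is a list of 1-D samples, one per treatment (hypothetical helper).
    """
    sizes = [len(g) for g in groups]
    pooled = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    N = pooled.size
    ranks = stats.rankdata(pooled)      # ascending ranks, midranks if ties occur
    grand_mean = (N + 1) / 2            # overall mean rank

    # Treatment sum of squares on the ranks
    sstr = 0.0
    start = 0
    for n_i in sizes:
        r_i = ranks[start:start + n_i]
        sstr += n_i * (r_i.mean() - grand_mean) ** 2
        start += n_i

    # Total sum of squares on the ranks
    ssto = np.sum((ranks - grand_mean) ** 2)

    stat = sstr / (ssto / (N - 1))
    # Large-sample chi-square approximation with a - 1 degrees of freedom
    p_value = stats.chi2.sf(stat, df=len(groups) - 1)
    return stat, p_value


# Made-up data: a = 3 treatments with n_i = 5 observations each
groups = [
    [6.1, 7.3, 5.8, 8.0, 6.9],
    [8.4, 9.1, 7.7, 9.6, 8.8],
    [5.2, 6.4, 4.9, 7.1, 5.5],
]
stat, p_value = kruskal_wallis(groups)
ref = stats.kruskal(*groups)            # SciPy's implementation, for comparison
print(stat, p_value)
print(ref.statistic, ref.pvalue)
```

The two printed lines should agree up to floating-point error, since the sum-of-squares form on the ranks matches the usual Kruskal-Wallis statistic.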

25.2.2 Friedman Test

When the responses $Y_{ij}, i = 1,..,n, j = 1,..,r$ in a randomized complete block design with $n$ blocks and $r$ treatments are not normally distributed (or do not have constant variance), a nonparametric test is more appropriate.

A distribution-free rank-based test for comparing the treatments in this setting is the Friedman test. Let $F_{ij}$ be the CDF of the random variable $Y_{ij}$ , corresponding to the observed value $y_{ij}$ .

Under the null hypothesis, the $F_{ij}$ are identical across all treatments $j$ within each block $i$ .

$\begin{aligned} &H_0: F_{i1} = F_{i2} = ... = F_{ir} \text{ for all i} \\ &H_a: F_{ij} < F_{ij'} \text{ for some } j \neq j' \text{ for all } i \end{aligned}$

For location parameter distributions, treatment effects can be tested:

$\begin{aligned} &H_0: \tau_1 = \tau_2 = ... = \tau_r \\ &H_a: \tau_j > \tau_{j'} \text{ for some } j \neq j' \end{aligned}$

Procedure

Rank observations from the r treatments separately within each block (in ascending order; if ties, each tied observation is given the mean of ranks involved). Let the ranks be called $r_{ij}$
Calculate the Friedman test statistic
$\chi^2_F = \frac{SSTR}{\frac{SSTR + SSE}{n(r-1)}}$ where $\begin{aligned} SSTR &= n \sum (\bar{r}_{.j}-\bar{r}_{..})^2 \\ SSE &= \sum \sum (r_{ij} - \bar{r}_{.j})^2 \\ \bar{r}_{.j} &= \frac{\sum_i r_{ij}}{n}\\ \bar{r}_{..} &= \frac{r+1}{2} \end{aligned}$

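If there are no ties, then $SSTR + SSE = \frac{nr(r^2-1)}{12}$ , and the statistic can be rewritten in terms of the treatment rank sums $r_{.j} = \sum_i r_{ij}$ as

$\chi^2_{F} = [\frac{12}{nr(r+1)}\sum_j r_{.j}^2] - 3n(r+1)$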

With a large number of blocks, $\chi^2_F$ is approximately $\chi^2_{r-1}$ under $H_0$ . Hence, we reject $H_0$ if $\chi^2_F > \chi^2_{(1-\alpha;r-1)}$ .
The exact null distribution for $\chi^2_F$ can be derived since there are $r!$ possible ways of assigning ranks 1,2,…,r to the r observations within each block. There are n blocks and thus $(r!)^n$ possible assignments of the ranks, which are equally likely when $H_0$ is true.
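
As a rough illustration of the Friedman procedure, the sketch below computes the statistic from within-block ranks, checks it against the standard no-ties shortcut based on rank sums (with $r(r+1)$ in the denominator), and enumerates all $(r!)^n$ rank assignments to obtain an exact p-value. The data are made up, the helper names (`friedman_statistic`, `shortcut_statistic`, `exact_p_value`) are hypothetical, the exact enumeration assumes no ties, and NumPy and SciPy are assumed to be installed.

```python
from itertools import permutations, product

import numpy as np
from scipy import stats


def friedman_statistic(y):
    """Friedman statistic SSTR / ((SSTR + SSE) / (n (r - 1))) on within-block ranks.

    `y` has one row per block (n rows) and one column per treatment (r columns).
    """
    y = np.asarray(y, dtype=float)
    n, r = y.shape
    ranks = stats.rankdata(y, axis=1)   # rank within each block, midranks if ties
    grand_mean = (r + 1) / 2
    treatment_means = ranks.mean(axis=0)
    sstr = n * np.sum((treatment_means - grand_mean) ** 2)
    sse = np.sum((ranks - treatment_means) ** 2)
    return sstr / ((sstr + sse) / (n * (r - 1)))


def shortcut_statistic(rank_sums, n, r):
    """Standard no-ties shortcut based on the treatment rank sums r_.j."""
    return 12.0 / (n * r * (r + 1)) * sum(s ** 2 for s in rank_sums) - 3 * n * (r + 1)


def exact_p_value(observed, n, r):
    """Exact p-value from all (r!)^n equally likely rank assignments (no ties)."""
    block_perms = list(permutations(range(1, r + 1)))
    total = 0
    at_least = 0
    for assignment in product(block_perms, repeat=n):
        rank_sums = [sum(col) for col in zip(*assignment)]
        total += 1
        # Small tolerance guards against floating-point noise in the comparison
        if shortcut_statistic(rank_sums, n, r) >= observed - 1e-9:
            at_least += 1
    return at_least / total


# Made-up data: n = 6 blocks (rows), r = 3 treatments (columns), no ties within a block
y = np.array([
    [7.0, 9.1, 8.2],
    [5.5, 6.8, 6.1],
    [8.3, 9.9, 9.0],
    [6.2, 7.4, 7.9],
    [4.8, 6.0, 5.1],
    [7.7, 8.1, 8.9],
])
n, r = y.shape

stat = friedman_statistic(y)
shortcut = shortcut_statistic(stats.rankdata(y, axis=1).sum(axis=0), n, r)
p_approx = stats.chi2.sf(stat, df=r - 1)    # large-sample chi-square approximation
p_exact = exact_p_value(stat, n, r)         # 6^6 = 46656 assignments enumerated
ref = stats.friedmanchisquare(*y.T)         # SciPy's implementation, for comparison

print(stat, shortcut, ref.statistic)
print(p_approx, p_exact)
```

With only 6 blocks, the chi-square approximation and the exact p-value can differ noticeably, which is why the exact enumeration is worth doing when it is computationally feasible.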