24.2 Nonparametric ANOVA
When the assumptions of normality and equal variance are not satisfied, we can use nonparametric ANOVA tests, which operate on the ranks of the data rather than the raw values.
24.2.1 Kruskal-Wallis Test (One-Way Nonparametric ANOVA)
The Kruskal-Wallis test is a generalization of the Wilcoxon rank-sum test to more than two independent samples. It is an alternative to one-way ANOVA when normality is not assumed.
Setup
- \(a \geq 2\) independent treatments.
- \(n_i\) is the sample size for the \(i\)-th treatment.
- \(Y_{ij}\) is the \(j\)-th observation from the \(i\)-th treatment.
- No assumption of normality.
- Assume observations are independent random samples from continuous CDFs \(F_1, F_2, \dots, F_a\).
Hypotheses
\[ \begin{aligned} &H_0: F_1 = F_2 = \dots = F_a \quad \text{(All distributions are identical)} \\ &H_a: F_i < F_j \text{ for some } i \neq j \end{aligned} \]
If the data come from a location-scale family, the hypothesis simplifies to:
\[ H_0: \theta_1 = \theta_2 = \dots = \theta_a \]
Procedure
Rank all \(N = \sum_{i=1}^a n_i\) observations in ascending order.
Let \(r_{ij} = \text{rank}(Y_{ij})\).
The ranks must sum to: \[ \sum_i \sum_j r_{ij} = \frac{N(N+1)}{2} \]
Compute rank sums and averages: \[ r_{i.} = \sum_{j=1}^{n_i} r_{ij}, \quad \bar{r}_{i.} = \frac{r_{i.}}{n_i} \]
Calculate the test statistic:
\[ \chi_{KW}^2 = \frac{SSTR}{\frac{SSTO}{N-1}} \]
where:
- Treatment Sum of Squares: \[ SSTR = \sum_{i=1}^a n_i (\bar{r}_{i.} - \bar{r}_{..})^2 \]
- Total Sum of Squares: \[ SSTO = \sum_i \sum_j (r_{ij} - \bar{r}_{..})^2 \]
- Overall Mean Rank: \[ \bar{r}_{..} = \frac{N+1}{2} \]
Compare to a chi-square distribution:
- When every \(n_i\) is large (\(\geq 5\)), \(\chi^2_{KW}\) approximately follows \(\chi^2_{a-1}\) under \(H_0\).
- Reject \(H_0\) if: \[ \chi^2_{KW} > \chi^2_{(1-\alpha; a-1)} \]
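The procedure above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names `average_ranks` and `kruskal_wallis` and the data are hypothetical, and ties are given average ranks, which is an added assumption because the setup assumes continuous distributions.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def kruskal_wallis(groups):
    """Compute SSTR / (SSTO / (N - 1)) on the pooled ranks."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    grand_mean = (n_total + 1) / 2  # overall mean rank
    ssto = sum((r - grand_mean) ** 2 for r in ranks)
    sstr = 0.0
    pos = 0
    for g in groups:
        group_ranks = ranks[pos:pos + len(g)]
        pos += len(g)
        mean_rank = sum(group_ranks) / len(g)
        sstr += len(g) * (mean_rank - grand_mean) ** 2
    return sstr / (ssto / (n_total - 1))


# Hypothetical data: three treatments with five observations each.
groups = [
    [6.1, 7.3, 5.8, 6.9, 7.0],
    [8.2, 7.9, 9.1, 8.5, 8.8],
    [5.2, 4.9, 6.0, 5.5, 5.1],
]
stat = kruskal_wallis(groups)
print(round(stat, 2))  # 12.02
```

For these made-up data the rank sums are 39, 65, and 16, giving a statistic of 12.02. That exceeds the 0.95 quantile of \(\chi^2_2\) (about 5.99), so \(H_0\) would be rejected at \(\alpha = 0.05\).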
Exact Test for Small Samples:
- Enumerate all possible assignments of the \(N\) ranks to the \(a\) treatments: \[ \frac{N!}{n_1! n_2! \dots n_a!} \]
- Evaluate the Kruskal-Wallis statistic for each assignment and determine the p-value as the proportion of assignments whose statistic is at least as large as the observed one.
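The exact test can be illustrated by brute-force enumeration. The sketch below assumes no ties (ranks are the integers \(1, \dots, N\)); the function names and the observed ranks are hypothetical, and enumeration is only practical for very small \(N\).

```python
from itertools import combinations


def kw_from_rank_groups(rank_groups):
    """Kruskal-Wallis statistic SSTR / (SSTO / (N - 1)) from grouped ranks."""
    n_total = sum(len(g) for g in rank_groups)
    grand_mean = (n_total + 1) / 2
    sstr = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in rank_groups)
    ssto = sum((r - grand_mean) ** 2 for g in rank_groups for r in g)
    return sstr / (ssto / (n_total - 1))


def rank_assignments(ranks, sizes):
    """Yield every split of distinct `ranks` into groups of the given sizes."""
    if len(sizes) == 1:
        yield [tuple(ranks)]
        return
    for chosen in combinations(ranks, sizes[0]):
        remaining = [r for r in ranks if r not in chosen]
        for rest in rank_assignments(remaining, sizes[1:]):
            yield [chosen] + rest


def exact_kw_pvalue(observed_rank_groups):
    """Proportion of assignments with a statistic >= the observed one."""
    sizes = [len(g) for g in observed_rank_groups]
    all_ranks = sorted(r for g in observed_rank_groups for r in g)
    observed = kw_from_rank_groups(observed_rank_groups)
    count = total = 0
    for assignment in rank_assignments(all_ranks, sizes):
        total += 1
        # Small tolerance guards against floating-point rounding differences.
        if kw_from_rank_groups(assignment) >= observed - 1e-12:
            count += 1
    return observed, count / total, total


# Hypothetical observed ranks for group sizes 3, 3, and 2 (N = 8).
observed, p_value, n_assignments = exact_kw_pvalue([(1, 2, 3), (4, 5, 6), (7, 8)])
print(observed, n_assignments, p_value)
```

With group sizes 3, 3, and 2 the enumeration visits \(8!/(3!\,3!\,2!) = 560\) assignments, matching the count formula.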
24.2.2 Friedman Test (Nonparametric Two-Way ANOVA)
The Friedman test is a distribution-free alternative to two-way ANOVA when data are measured in a randomized complete block design and normality cannot be assumed.
Setup
- \(Y_{ij}\) is the response to treatment \(j\) in block \(i\), with \(n\) blocks (\(i = 1, \dots, n\)) and \(r\) treatments (\(j = 1, \dots, r\)).
- No assumption of normality or homogeneity of variance.
- Let \(F_{ij}\) be the CDF of \(Y_{ij}\).
Hypotheses
\[ \begin{aligned} &H_0: F_{i1} = F_{i2} = \dots = F_{ir} \quad \forall i \quad \text{(Identical distributions within each block)} \\ &H_a: F_{ij} < F_{ij'} \text{ for some } j \neq j' \quad \forall i \end{aligned} \]
For location-scale families, the hypothesis simplifies to:
\[ \begin{aligned} &H_0: \tau_1 = \tau_2 = \dots = \tau_r \\ &H_a: \tau_j > \tau_{j'} \text{ for some } j \neq j' \end{aligned} \]
Procedure
Rank observations within each block separately (ascending order).
- If there are ties, assign average ranks.
Compute test statistic:
\[ \chi^2_F = \frac{SSTR}{\frac{SSTR + SSE}{n(r-1)}} \]
where:
- Treatment Sum of Squares: \[ SSTR = n \sum_{j=1}^r (\bar{r}_{.j} - \bar{r}_{..})^2 \]
- Error Sum of Squares: \[ SSE = \sum_i \sum_j (r_{ij} - \bar{r}_{.j})^2 \]
- Mean Ranks: \[ \bar{r}_{.j} = \frac{\sum_i r_{ij}}{n}, \quad \bar{r}_{..} = \frac{r+1}{2} \]
Alternative Formula (No Ties):
If there are no ties, Friedman’s statistic simplifies to:
\[ \chi^2_F = \left[\frac{12}{nr(r+1)} \sum_j r_{.j}^2\right] - 3n(r+1) \]
where \(r_{.j} = \sum_i r_{ij}\) is the rank sum for treatment \(j\).
Compare to a chi-square distribution:
- For large \(n\), \(\chi^2_F\) approximately follows \(\chi^2_{r-1}\) under \(H_0\).
- Reject \(H_0\) if: \[ \chi^2_F > \chi^2_{(1-\alpha; r-1)} \]
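As a rough sketch of the Friedman computation, the following plain-Python example ranks within each block and evaluates both the sum-of-squares form and the standard no-ties shortcut \(\frac{12}{nr(r+1)} \sum_j r_{.j}^2 - 3n(r+1)\). The function names and the data are hypothetical, with rows taken to be blocks and columns treatments.

```python
def average_ranks(values):
    """Return 1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def friedman(data):
    """Friedman statistic SSTR / ((SSTR + SSE) / (n (r - 1)))."""
    n, r = len(data), len(data[0])
    ranks = [average_ranks(row) for row in data]  # rank within each block
    grand_mean = (r + 1) / 2
    col_means = [sum(ranks[i][j] for i in range(n)) / n for j in range(r)]
    sstr = n * sum((m - grand_mean) ** 2 for m in col_means)
    sse = sum((ranks[i][j] - col_means[j]) ** 2
              for i in range(n) for j in range(r))
    return sstr / ((sstr + sse) / (n * (r - 1)))


def friedman_no_ties(data):
    """Shortcut based on treatment rank sums; valid only without ties."""
    n, r = len(data), len(data[0])
    ranks = [average_ranks(row) for row in data]
    rank_sums = [sum(ranks[i][j] for i in range(n)) for j in range(r)]
    return 12 / (n * r * (r + 1)) * sum(s ** 2 for s in rank_sums) - 3 * n * (r + 1)


# Hypothetical data: 4 blocks (rows) by 3 treatments (columns), no ties.
data = [
    [9.0, 9.5, 5.0],
    [7.1, 8.2, 6.3],
    [6.0, 8.0, 4.0],
    [8.5, 7.4, 5.5],
]
stat = friedman(data)
shortcut = friedman_no_ties(data)
print(stat, shortcut)  # both forms give 6.5
```

Here the treatment rank sums are 9, 11, and 4, and both forms give 6.5. With only four blocks the chi-square approximation is rough, so an exact test would be the safer choice for data this small.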
Exact Test for Small Samples:
- Compute all possible ranking permutations: \[ (r!)^n \]
- Evaluate each Friedman statistic and determine the empirical p-value.
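The exact Friedman test can be sketched the same way, by enumerating every combination of within-block rankings. The example assumes no ties and uses the standard no-ties form of the statistic, \(\frac{12}{nr(r+1)} \sum_j r_{.j}^2 - 3n(r+1)\); the function names and the observed rank table are hypothetical.

```python
from itertools import permutations, product


def friedman_no_ties(rank_table):
    """Friedman statistic from a table of within-block ranks (no ties)."""
    n, r = len(rank_table), len(rank_table[0])
    rank_sums = [sum(row[j] for row in rank_table) for j in range(r)]
    return 12 / (n * r * (r + 1)) * sum(s ** 2 for s in rank_sums) - 3 * n * (r + 1)


def exact_friedman_pvalue(rank_table):
    """Proportion of the (r!)^n rank tables with a statistic >= the observed one."""
    n, r = len(rank_table), len(rank_table[0])
    observed = friedman_no_ties(rank_table)
    within_block = list(permutations(range(1, r + 1)))  # r! rankings per block
    count = total = 0
    for table in product(within_block, repeat=n):
        total += 1
        # Small tolerance guards against floating-point rounding differences.
        if friedman_no_ties(table) >= observed - 1e-12:
            count += 1
    return observed, count / total, total


# Hypothetical within-block ranks: 4 blocks (rows) by 3 treatments (columns).
observed_ranks = [(2, 3, 1), (2, 3, 1), (2, 3, 1), (3, 2, 1)]
observed, p_value, n_tables = exact_friedman_pvalue(observed_ranks)
print(observed, n_tables, p_value)
```

For \(n = 4\) blocks and \(r = 3\) treatments the loop visits \((3!)^4 = 1296\) tables. The count grows very quickly with \(n\) and \(r\), which is why the chi-square approximation is used for larger designs.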