25.2 Nonparametric ANOVA
25.2.1 Kruskal-Wallis
The Kruskal-Wallis test generalizes the Wilcoxon rank-sum test for two independent samples to several independent samples, just as the F-test of one-way ANOVA generalizes the two-sample t-test.
Consider the one-way case:
We have
- $a \ge 2$ treatments
- $n_i$ is the sample size for the $i$-th treatment
- $Y_{ij}$ is the $j$-th observation from the $i$-th treatment
- We make no assumption of normality
- We only assume that the observations on the $i$-th treatment are a random sample from a continuous CDF $F_i$, $i = 1, \dots, a$, and that all observations are mutually independent.
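$$
\begin{aligned}
&H_0: F_1 = F_2 = \dots = F_a \\
&H_a: F_i < F_j \text{ for some } i \neq j
\end{aligned}
$$
or, if the distributions belong to a location-scale family, $H_0: \theta_1 = \theta_2 = \dots = \theta_a$.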
Procedure
- Rank all $N = \sum_{i=1}^{a} n_i$ observations in ascending order. Let $r_{ij} = \text{rank}(Y_{ij})$, and note that $\sum_i \sum_j r_{ij} = 1 + 2 + \dots + N = \frac{N(N+1)}{2}$
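- Calculate the rank sums and averages: $r_{i.} = \sum_{j=1}^{n_i} r_{ij}$ and $\bar{r}_{i.} = \frac{r_{i.}}{n_i}$, $i = 1, \dots, a$
- Calculate the test statistic on the ranks:
$$
\chi^2_{KW} = \frac{SSTR}{SSTO/(N-1)}
$$
where $SSTR = \sum_i n_i (\bar{r}_{i.} - \bar{r}_{..})^2$, $SSTO = \sum_i \sum_j (r_{ij} - \bar{r}_{..})^2$, and $\bar{r}_{..} = \frac{N+1}{2}$ is the overall mean rank.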
- For large $n_i$ (at least 5 observations per treatment), the Kruskal-Wallis statistic approximately follows a $\chi^2_{a-1}$ distribution under $H_0$, that is, when all treatment distributions are identical. Hence, reject $H_0$ if $\chi^2_{KW} > \chi^2(1-\alpha; a-1)$.
- If sample sizes are small, one can exhaustively work out all distinct ways of assigning the $N$ ranks to the observations from the $a$ treatments and calculate the value of the KW statistic in each case (there are $\frac{N!}{n_1! \cdots n_a!}$ possible assignments). Under $H_0$ all of these assignments are equally likely.
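The procedure above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function names (`midranks`, `kruskal_wallis`, `assignments`, `exact_p_value`) and the data are hypothetical, and tied values receive the mean of their ranks, which is an assumption that goes beyond the continuous-CDF setting described above. The exact p-value enumerates all $\frac{N!}{n_1! \cdots n_a!}$ assignments, so it is only practical for small samples.

```python
from itertools import combinations


def midranks(values):
    """Ascending 1-based ranks; tied values get the mean of their ranks (assumption)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2  # mean of 1-based positions pos+1 .. end+1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis(groups):
    """Return SSTR / (SSTO / (N - 1)) computed on the pooled ranks."""
    pooled = [y for g in groups for y in g]
    n_total = len(pooled)
    ranks = midranks(pooled)
    grand_mean = (n_total + 1) / 2
    sstr, start = 0.0, 0
    for g in groups:
        mean_i = sum(ranks[start:start + len(g)]) / len(g)
        sstr += len(g) * (mean_i - grand_mean) ** 2
        start += len(g)
    ssto = sum((r - grand_mean) ** 2 for r in ranks)
    return sstr / (ssto / (n_total - 1))


def assignments(indices, sizes):
    """Yield every split of `indices` into labelled groups of the given sizes."""
    if not sizes:
        yield []
        return
    for chosen in combinations(indices, sizes[0]):
        chosen_set = set(chosen)
        rest = [k for k in indices if k not in chosen_set]
        for tail in assignments(rest, sizes[1:]):
            yield [list(chosen)] + tail


def exact_p_value(groups):
    """Share of equally likely assignments with a statistic at least as large as observed."""
    pooled = [y for g in groups for y in g]
    sizes = [len(g) for g in groups]
    observed = kruskal_wallis(groups)
    count = total = 0
    for split in assignments(list(range(len(pooled))), sizes):
        stat = kruskal_wallis([[pooled[k] for k in grp] for grp in split])
        total += 1
        if stat >= observed - 1e-12:  # small tolerance for floating-point noise
            count += 1
    return count / total, total


# Hypothetical data: three treatments with three observations each.
groups = [[2.1, 3.4, 1.9], [4.8, 5.5, 6.0], [7.2, 8.1, 9.3]]
h = kruskal_wallis(groups)
p_exact, n_assignments = exact_p_value(groups)
print(h, p_exact, n_assignments)
```

For these fully separated hypothetical samples the statistic is 7.2, which exceeds $\chi^2(0.95; 2) \approx 5.99$. Because each $n_i = 3$ falls below the large-sample guideline, the exact p-value (6 of the 1680 assignments, about 0.0036) is the more trustworthy figure here.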
25.2.2 Friedman Test
When the responses $Y_{ij}$ (block $i = 1, \dots, n$, treatment $j = 1, \dots, r$) in a randomized complete block design are not normally distributed (or do not have constant variance), a nonparametric test is more appropriate.
A distribution-free rank-based test for comparing the treatments in this setting is the Friedman test. Let $F_{ij}$ be the CDF of the random variable $Y_{ij}$, corresponding to the observed value $y_{ij}$.
Under the null hypothesis, the $F_{ij}$ are identical for all treatments $j$, separately for each block $i$.
$$
\begin{aligned}
&H_0: F_{i1} = F_{i2} = \dots = F_{ir} \text{ for all } i \\
&H_a: F_{ij} < F_{ij'} \text{ for some } j \neq j' \text{ for all } i
\end{aligned}
$$
For location parameter distributions, treatment effects can be tested:
$$
\begin{aligned}
&H_0: \tau_1 = \tau_2 = \dots = \tau_r \\
&H_a: \tau_j > \tau_{j'} \text{ for some } j \neq j'
\end{aligned}
$$
Procedure
- Rank the observations from the $r$ treatments separately within each block (in ascending order; if there are ties, each tied observation is given the mean of the ranks involved). Let the ranks be called $r_{ij}$
- Calculate the Friedman test statistic
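$$
\chi^2_F = \frac{SSTR}{\dfrac{SSTR + SSE}{n(r-1)}}
$$
where
$$
\begin{aligned}
SSTR &= n \sum_j (\bar{r}_{.j} - \bar{r}_{..})^2 \\
SSE &= \sum_i \sum_j (r_{ij} - \bar{r}_{.j})^2 \\
\bar{r}_{.j} &= \frac{\sum_i r_{ij}}{n} \\
\bar{r}_{..} &= \frac{r+1}{2}
\end{aligned}
$$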
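If there are no ties, it can be rewritten as
$$
\chi^2_F = \left[ \frac{12}{nr(r+1)} \sum_j r_{.j}^2 \right] - 3n(r+1)
$$
where $r_{.j} = \sum_i r_{ij}$ is the rank sum for treatment $j$.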
With a large number of blocks, $\chi^2_F$ is approximately $\chi^2_{r-1}$ under $H_0$. Hence, we reject $H_0$ if $\chi^2_F > \chi^2(1-\alpha; r-1)$.
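As a rough check on the formulas for the Friedman statistic, the sketch below computes it in two ways: from the sums of squares on the within-block ranks, and from the no-ties shortcut based on the treatment rank sums with factor $\frac{12}{nr(r+1)}$. The function names and the data (4 blocks, 3 treatments) are hypothetical. The shortcut is only expected to agree with the general form when no block contains ties.

```python
def midranks(values):
    """Ascending 1-based ranks; tied values get the mean of the ranks involved."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(data):
    """General form: SSTR / ((SSTR + SSE) / (n (r - 1))); data[i][j] = block i, treatment j."""
    n, r = len(data), len(data[0])
    ranks = [midranks(block) for block in data]  # rank separately within each block
    grand_mean = (r + 1) / 2
    col_means = [sum(ranks[i][j] for i in range(n)) / n for j in range(r)]
    sstr = n * sum((m - grand_mean) ** 2 for m in col_means)
    sse = sum((ranks[i][j] - col_means[j]) ** 2 for i in range(n) for j in range(r))
    return sstr / ((sstr + sse) / (n * (r - 1)))


def friedman_shortcut(data):
    """No-ties form: 12 / (n r (r + 1)) * sum of squared rank sums - 3 n (r + 1)."""
    n, r = len(data), len(data[0])
    ranks = [midranks(block) for block in data]
    rank_sums = [sum(ranks[i][j] for i in range(n)) for j in range(r)]
    return 12 / (n * r * (r + 1)) * sum(s ** 2 for s in rank_sums) - 3 * n * (r + 1)


# Hypothetical responses: rows are blocks, columns are treatments.
data = [
    [10, 12, 15],
    [9, 13, 14],
    [11, 10, 16],
    [8, 12, 13],
]
chi2_general = friedman_statistic(data)
chi2_shortcut = friedman_shortcut(data)
print(chi2_general, chi2_shortcut)
```

For this hypothetical data the treatment rank sums are 5, 7, and 12, and both forms give 6.5. That exceeds $\chi^2(0.95; 2) \approx 5.99$, although with only 4 blocks the chi-square approximation should be read with caution.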
The exact null distribution of $\chi^2_F$ can be derived, since there are $r!$ possible ways of assigning the ranks $1, 2, \dots, r$ to the $r$ observations within each block. There are $n$ blocks and thus $(r!)^n$ possible assignments of the ranks, which are equally likely when $H_0$ is true.
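The enumeration described above can be sketched for a small design. The code below lists all $(r!)^n$ rank assignments, evaluates the no-ties form of the statistic (with factor $\frac{12}{nr(r+1)}$ applied to the squared treatment rank sums) for each one, and tabulates the resulting probabilities as exact fractions. The function names are hypothetical, and the approach is only feasible when $(r!)^n$ is small.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product


def friedman_from_ranks(rank_rows):
    """No-ties Friedman statistic for one assignment of ranks (one row per block)."""
    n, r = len(rank_rows), len(rank_rows[0])
    rank_sums = [sum(row[j] for row in rank_rows) for j in range(r)]
    return Fraction(12, n * r * (r + 1)) * sum(s * s for s in rank_sums) - 3 * n * (r + 1)


def exact_null_distribution(n, r):
    """Map each attainable statistic value to its probability under H0."""
    perms = list(permutations(range(1, r + 1)))  # the r! rankings within a block
    counts = Counter()
    for rank_rows in product(perms, repeat=n):   # (r!)^n equally likely assignments
        counts[friedman_from_ranks(rank_rows)] += 1
    total = len(perms) ** n
    return {stat: Fraction(c, total) for stat, c in sorted(counts.items())}


def exact_p_value(observed, n, r):
    """P(statistic >= observed) under the exact null distribution."""
    dist = exact_null_distribution(n, r)
    return sum(prob for stat, prob in dist.items() if stat >= observed)


# n = 4 blocks and r = 3 treatments give 6**4 = 1296 assignments.
dist = exact_null_distribution(4, 3)
p_exact = exact_p_value(Fraction(13, 2), 4, 3)  # observed statistic 6.5
print(len(dist), p_exact, float(p_exact))
```

With $n = 4$ and $r = 3$, a statistic of 6.5 or larger occurs in 54 of the 1296 assignments, so the exact p-value is $1/24 \approx 0.042$.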