Chapter 6 Nonparametric tests
This chapter overviews some well-known nonparametric hypothesis tests.194 The reviewed tests are intended for different purposes, mostly related to: (i) the evaluation of the goodness-of-fit of a distribution model to a dataset; and (ii) the assessment of the relation between two random variables.
A nonparametric test evaluates a null hypothesis $H_0$ against an alternative $H_1$ without assuming any parametric model, neither under $H_0$ nor under $H_1$. Consequently, a nonparametric test is free from the overhead of checking the parametric assumptions that must be verified before applying a parametric test.195 More importantly, it is quite likely that the inspection of these parametric assumptions has a negative outcome, which precludes the subsequent application of a parametric test. The direct applicability and generality of nonparametric tests explain their usefulness in real-data applications.
Nonparametric tests have lower efficiency with respect to optimal parametric tests for specific parametric problems.196 Statistical inference is full of instances of such parametric tests, especially within the context of normal populations.197 For example, given two iid samples $X_{11},\ldots,X_{1n_1}$ and $X_{21},\ldots,X_{2n_2}$ from two normal populations $X_1 \sim \mathcal{N}(\mu_1, \sigma^2)$ and $X_2 \sim \mathcal{N}(\mu_2, \sigma^2)$, the test for the equality of the means,

$$H_0: \mu_1 = \mu_2 \quad \text{vs.} \quad H_1: \mu_1 \neq \mu_2,$$

is optimally carried out using the test statistic

$$T_n := \frac{\bar{X}_1 - \bar{X}_2}{S\sqrt{1/n_1 + 1/n_2}},$$

where

$$S^2 := \frac{1}{n_1 + n_2 - 2}\left(\sum_{i=1}^{n_1} (X_{1i} - \bar{X}_1)^2 + \sum_{i=1}^{n_2} (X_{2i} - \bar{X}_2)^2\right)$$

is the pooled sample variance. The distribution of $T_n$ under $H_0$ is $t_{n_1+n_2-2}$, which is compactly denoted by $T_n \stackrel{H_0}{\sim} t_{n_1+n_2-2}$. For this result to hold, it is key that the two populations are indeed normally distributed, an assumption that may be unrealistic in practice. Recall that, under $H_0$, this test states the equality of the distributions of $X_1$ and $X_2$, since both are assumed normal with common variance $\sigma^2$. A nonparametric alternative is therefore the Kolmogorov–Smirnov test for two samples, to be seen in Section 6.2, which evaluates whether the distributions of $X_1 \sim F_1$ and $X_2 \sim F_2$ are equal:

$$H_0: F_1 = F_2 \quad \text{vs.} \quad H_1: F_1 \neq F_2.$$
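For concreteness, the sketch below (an illustration added here, with arbitrary sample sizes and seed, not taken from the notes) computes $T_n$ and its p-value by hand for two simulated normal samples, checks the result against `t.test()` with pooled variance, and applies the two-sample Kolmogorov–Smirnov test as the nonparametric counterpart.

```r
# A minimal sketch comparing the parametric two-sample t-test with the
# nonparametric two-sample Kolmogorov-Smirnov test (illustrative choices)
set.seed(42)
n1 <- 50; n2 <- 60
x1 <- rnorm(n1, mean = 0, sd = 1)
x2 <- rnorm(n2, mean = 0, sd = 1)

# Pooled sample variance S^2 and test statistic T_n computed "by hand"
S2 <- (sum((x1 - mean(x1))^2) + sum((x2 - mean(x2))^2)) / (n1 + n2 - 2)
Tn <- (mean(x1) - mean(x2)) / (sqrt(S2) * sqrt(1 / n1 + 1 / n2))
p_t <- 2 * pt(abs(Tn), df = n1 + n2 - 2, lower.tail = FALSE)

# The same t-test via t.test(); var.equal = TRUE gives the pooled version
t.test(x1, x2, var.equal = TRUE)$p.value  # matches p_t

# Nonparametric alternative: two-sample Kolmogorov-Smirnov test of F1 = F2
ks.test(x1, x2)$p.value
```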
Finally, the term goodness-of-fit refers to statistical tests that check the adequacy of a model for explaining a sample. For example, a goodness-of-fit test allows one to answer whether a normal model is “acceptable” to describe a given sample $X_1,\ldots,X_n$. Initially, the concept of a goodness-of-fit test was proposed for distribution models, but it was later extended to regression198 and other statistical models,199 although such extensions are not addressed in these notes.
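As a quick illustration (a minimal sketch in base R added here; the sample and the particular tests are arbitrary choices, not prescribed by the notes), one can check whether a normal model is acceptable for a sample by means of, e.g., the Kolmogorov–Smirnov test against a fully specified normal or the Shapiro–Wilk test of normality.

```r
# A minimal sketch of goodness-of-fit testing for a distribution model:
# is a normal model "acceptable" to describe the sample x?
set.seed(42)
x <- rexp(100, rate = 1)  # data that is actually exponential, not normal

# Kolmogorov-Smirnov test of H0: F = N(0, 1) (a fully specified null)
ks.test(x, "pnorm", mean = 0, sd = 1)

# Shapiro-Wilk test of normality (mean and variance left unspecified)
shapiro.test(x)

# Caution: plugging estimated parameters into ks.test(), as in
# ks.test(x, "pnorm", mean(x), sd(x)), invalidates its null distribution
```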
References
194. If necessary, see Section C for an informal review of the main concepts involved in hypothesis testing.↩︎
195. This prior assessment is of key importance to ensure coherence between the real and the assumed data distributions, as the parametric test bases its decision on the latter. An example dramatizes this point. Let $X_1 \sim \mathcal{N}(\mu, \sigma^2)$ and $X_2 \sim \Gamma(\mu/\sigma^2, \mu^2/\sigma^2)$, for $\mu, \sigma^2 > 0$. The cdfs of $X_1$ and $X_2$, $F_1$ and $F_2$, are different for all $\mu, \sigma^2 > 0$, yet $\mathbb{E}[X_1] = \mathbb{E}[X_2]$ and $\mathbb{V}\mathrm{ar}[X_1] = \mathbb{V}\mathrm{ar}[X_2]$. When testing $H_0: F_1 = F_2$, if one assumes that $X_1$ and $X_2$ are normally distributed (which is only partially true, as only $X_1$ is normal), then one can use a t-test with unknown variances. The t-test will “believe” that $H_0$ is true, since $\mathbb{E}[X_1] = \mathbb{E}[X_2]$ and $\mathbb{V}\mathrm{ar}[X_1] = \mathbb{V}\mathrm{ar}[X_2]$, thus having a rejection rate approximately equal to the significance level $\alpha$. However, by construction, $H_0$ is false: the t-test fails to reject $H_0$ because its parametric assumption does not match reality (a small simulation illustrating this point is sketched after these footnotes).↩︎
196. These optimal parametric tests are often obtained by maximum likelihood theory.↩︎
197. See, e.g., Section 6.2 in Molina Peralta and García-Portugués (2025).↩︎
198. See González-Manteiga and Crujeiras (2013) for an exhaustive review of the topic.↩︎
199. For example, there are goodness-of-fit tests for time series models, such as ARMA(p,q) models (see, e.g., Velilla (1994) and references therein).↩︎
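The construction of footnote 195 can be checked with a short Monte Carlo experiment. The following sketch (an illustration added here, with the arbitrary choices $\mu = 1$, $\sigma^2 = 1$, $n = 500$ observations per sample, and $500$ replicates) records the empirical rejection rates of the Welch t-test and the two-sample Kolmogorov–Smirnov test at the $\alpha = 0.05$ level.

```r
# Footnote 195 in action (illustrative sketch): X1 is normal and X2 is gamma
# with the same mean and variance, so F1 != F2 although means/variances match
set.seed(42)
mu <- 1; sigma2 <- 1  # arbitrary choices with mu, sigma2 > 0
n <- 500              # sample size per group
M <- 500              # Monte Carlo replicates
rej_t <- rej_ks <- logical(M)
for (m in 1:M) {
  x1 <- rnorm(n, mean = mu, sd = sqrt(sigma2))
  # Gamma with shape mu^2 / sigma2 and rate mu / sigma2: E[X2] = mu, Var[X2] = sigma2
  x2 <- rgamma(n, shape = mu^2 / sigma2, rate = mu / sigma2)
  rej_t[m] <- t.test(x1, x2)$p.value < 0.05    # t-test with unknown variances
  rej_ks[m] <- ks.test(x1, x2)$p.value < 0.05  # two-sample Kolmogorov-Smirnov
}
mean(rej_t)   # approximately 0.05: the t-test does not "see" that F1 != F2
mean(rej_ks)  # well above 0.05: the KS test detects the difference
```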