9 Non-Parametric Tests

A statistical method is called non-parametric if it makes no assumptions about the underlying population distribution or the sample size.

This is in contrast with most parametric methods in elementary statistics, which assume that the data are quantitative, the population is normally distributed, and the sample size is sufficiently large.

In general, non-parametric tests are less powerful than their parametric counterparts. However, because non-parametric methods make fewer assumptions, they are more flexible, more robust, and applicable to non-quantitative (e.g., ordinal or ranked) data.

9.1 Sign Test

A sign test is used to decide whether a binomial distribution has an equal chance of success and failure, i.e., whether the probability of success is 0.5.

Example

A soft drink company has invented a new drink, and would like to find out if it will be as popular as the existing favorite drink. For this purpose, its research department arranges 18 participants for taste testing. Each participant tries both drinks in random order before giving his or her opinion.

It turns out that 5 of the participants like the new drink better, and the rest prefer the old one. At .05 significance level, can we reject the notion that the two drinks are equally popular?

The null hypothesis is that the drinks are equally popular. Here we apply the binom.test function. As the p-value turns out to be 0.09625, which is greater than the .05 significance level, we do not reject the null hypothesis.

binom.test(5, 18) 
## 
##  Exact binomial test
## 
## data:  5 and 18
## number of successes = 5, number of trials = 18, p-value = 0.09625
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.09694921 0.53480197
## sample estimates:
## probability of success 
##              0.2777778

At .05 significance level, we do not reject the notion that the two drinks are equally popular.
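As a check on what binom.test is doing, the exact two-sided p-value can be reproduced directly from the binomial distribution (a minimal sketch; in practice binom.test handles this for you):

```r
# Under H0, the number of participants preferring the new drink
# follows Binomial(18, 0.5). Because p = 0.5 makes the distribution
# symmetric, the exact two-sided p-value is twice the lower tail
# probability at the observed count of 5.
p_value <- 2 * pbinom(5, size = 18, prob = 0.5)
p_value  # approximately 0.09625, matching binom.test(5, 18)
```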

9.2 Paired Samples Wilcoxon Test

The paired samples Wilcoxon test (also known as the Wilcoxon signed-rank test) is a non-parametric alternative to the paired t-test for comparing paired data. It is used when the differences between pairs are not normally distributed. This section describes how to compute the paired samples Wilcoxon test in R.

Example

Here, we'll use an example data set containing the sales before and after a treatment (a discount).

# Sales before the treatment
before <- c(200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7)
# Sales after the treatment
after <- c(392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2)
# Create a data frame in long format: one row per observation
my_data <- data.frame( 
                group = rep(c("before", "after"), each = 10),
                sales = c(before, after)
                )

We want to know whether there is a significant difference in the median sales before and after the treatment.

library("dplyr")
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
group_by(my_data, group) %>%
  summarise(
    count = n(),
    median = median(sales, na.rm = TRUE),
    IQR = IQR(sales, na.rm = TRUE)
  )
## # A tibble: 2 x 4
##   group  count median   IQR
##   <fctr> <int>  <dbl> <dbl>
## 1 after     10    393  28.8
## 2 before    10    195  12.6

Question: Is there any significant change in sales before and after the treatment?

res <- wilcox.test(sales ~ group, data = my_data, paired = TRUE)
res
## 
##  Wilcoxon signed rank test
## 
## data:  sales by group
## V = 55, p-value = 0.001953
## alternative hypothesis: true location shift is not equal to 0

The p-value of the test is 0.001953, which is less than the significance level alpha = 0.05. We can conclude that the median sales before the treatment are significantly different from the median sales after the treatment.
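The reported V = 55 and p-value can be verified by hand; a sketch of the signed-rank computation on these data:

```r
before <- c(200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7)
after  <- c(392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2)
d <- after - before               # all ten differences are positive
V <- sum(rank(abs(d))[d > 0])     # signed-rank statistic: 1 + 2 + ... + 10 = 55
# With every sign positive, this is the most extreme of the 2^10 equally
# likely sign patterns under H0, so the exact two-sided p-value is 2/2^10.
p <- 2 / 2^10                     # 0.001953125
```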

9.3 Kruskal-Wallis Test

The Kruskal-Wallis test by rank is a non-parametric alternative to the one-way ANOVA test. It extends the two-sample Wilcoxon test to situations with more than two groups, and is recommended when the assumptions of one-way ANOVA are not met.

Example

Here, we’ll use the built-in R data set named PlantGrowth. It contains the weight of plants obtained under a control and two different treatment conditions.

my_data <- PlantGrowth
head(my_data)
##   weight group
## 1   4.17  ctrl
## 2   5.58  ctrl
## 3   5.18  ctrl
## 4   6.11  ctrl
## 5   4.50  ctrl
## 6   4.61  ctrl

Summary by group

group_by(my_data, group) %>%
  summarise(
    count = n(),
    mean = mean(weight, na.rm = TRUE),
    sd = sd(weight, na.rm = TRUE),
    median = median(weight, na.rm = TRUE),
    IQR = IQR(weight, na.rm = TRUE)
  )
## # A tibble: 3 x 6
##   group  count  mean    sd median   IQR
##   <fctr> <int> <dbl> <dbl>  <dbl> <dbl>
## 1 ctrl      10  5.03 0.583   5.15 0.743
## 2 trt1      10  4.66 0.794   4.55 0.662
## 3 trt2      10  5.53 0.443   5.44 0.467

We want to know if plant weight differs significantly between the 3 experimental conditions.

kruskal.test(weight ~ group, data = my_data)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  weight by group
## Kruskal-Wallis chi-squared = 7.9882, df = 2, p-value = 0.01842

As the p-value is less than the significance level 0.05, we can conclude that there are significant differences between the treatment groups.
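For intuition, the chi-squared statistic above can be reconstructed from the rank sums of the groups, with a correction for tied weights (a sketch of what kruskal.test computes internally):

```r
w <- PlantGrowth$weight
g <- PlantGrowth$group
r <- rank(w)                      # mid-ranks; tied values share the average rank
N <- length(w)
rank_sums <- tapply(r, g, sum)    # rank sum per group
n_j <- tapply(r, g, length)      # group sizes (10 each)
H <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_j) - 3 * (N + 1)
t <- table(w)                     # tie correction over tied weight values
H_corrected <- H / (1 - sum(t^3 - t) / (N^3 - N))
H_corrected                       # approximately 7.9882
```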

From the output of the Kruskal-Wallis test, we know that there is a significant difference between groups, but we don’t know which pairs of groups are different.

It’s possible to use the function pairwise.wilcox.test() to calculate pairwise comparisons between group levels with corrections for multiple testing.

pairwise.wilcox.test(PlantGrowth$weight, PlantGrowth$group,
                 p.adjust.method = "BH")
## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot
## compute exact p-value with ties
## 
##  Pairwise comparisons using Wilcoxon rank sum test 
## 
## data:  PlantGrowth$weight and PlantGrowth$group 
## 
##      ctrl  trt1 
## trt1 0.199 -    
## trt2 0.095 0.027
## 
## P value adjustment method: BH

The pairwise comparison shows that only trt1 and trt2 are significantly different (adjusted p < 0.05).
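If you need the significant pairs programmatically (e.g., for reporting), they can be pulled from the p-value matrix that pairwise.wilcox.test() returns; a sketch:

```r
res <- pairwise.wilcox.test(PlantGrowth$weight, PlantGrowth$group,
                            p.adjust.method = "BH")
pv <- res$p.value                        # lower-triangular matrix of adjusted p-values
sig <- which(pv < 0.05, arr.ind = TRUE)  # indices of significant comparisons
data.frame(group1 = rownames(pv)[sig[, "row"]],
           group2 = colnames(pv)[sig[, "col"]],
           p.adj  = pv[sig])             # trt2 vs trt1
```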