6.6 Fisher’s Exact Test

Whereas the chi-squared and G-tests rely on the approximation that the test statistic's distribution approaches \(\chi^2\) as \(n \rightarrow \infty\), Fisher's exact test is an "exact test" in that the p-value is calculated exactly from the hypergeometric distribution. Therefore Fisher's test should apply only to 2 x 2 tables. For some reason, it doesn't.15

The test assumes the row totals \(n_{i+}\) and the column totals \(n_{+j}\) are fixed by study design. It is the usual choice when the chi-squared approximation is unreliable: when more than 20% of cells have an expected count below 5, or any expected cell count is 0.

The famous example of Fisher's exact test is the "lady tasting tea" experiment. A lady claims she can tell whether the milk was poured into the cup before or after the tea. The experiment consists of 8 cups, 4 with milk poured first and 4 with milk poured second, and the lady knows there are 4 of each. She guesses correctly in 6 of the 8 cups.

(tea <- matrix(c(3, 1, 1, 3), nrow = 2, 
       dimnames = list(Guess = c("Milk", "Tea"),
                       Truth = c("Milk", "Tea"))))
##       Truth
## Guess  Milk Tea
##   Milk    3   1
##   Tea     1   3
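
With the table in hand, chisq.test() can report the expected cell counts (it warns, appropriately, that the chi-squared approximation may be incorrect for counts this small):

suppressWarnings(chisq.test(tea)$expected)
##       Truth
## Guess  Milk Tea
##   Milk    2   2
##   Tea     2   2

All four expected counts are 2, well below 5, so an exact test is the right tool here.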

This is a hypergeometric distribution question because you want the probability of 3 or 4 correct identifications in a sample of 4. If \(X\) is the count of successes in a sample of size \(n\) drawn without replacement from a population of size \(N\) containing \(K\) successes, then \(X\) is a random variable with a hypergeometric distribution

\[f_X(k|N, K, n) = \frac{{{K}\choose{k}}{{N-K}\choose{n-k}}}{{N}\choose{n}}.\]

The formula follows from the frequency table of the possible outcomes. \({K \choose k}\) is the number of ways to choose \(k\) successes from the \(K\) successes in the population. \({N-K \choose n-k}\) is the number of ways to fill the remaining \(n-k\) places in the sample from the \(N-K\) non-successes. And the denominator \({N \choose n}\) is the total number of ways to draw a sample of \(n\) from \(N\).

tibble::tribble(
  ~` `, ~Sampled, ~`Not Sampled`,  ~Total,
  "success", "k", "K-k", "K",
  "non-success", "n-k", "(N-K)-(n-k)", "N-K",
  "Total", "n", "N-n", "N"
) %>%
  flextable::flextable() %>%
  flextable::autofit()
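
As a sanity check, R's dhyper() implements this density directly; its argument convention is x = \(k\), m = \(K\), n = \(N-K\), and k = sample size \(n\):

dhyper(x = 3, m = 4, n = 4, k = 4)
## [1] 0.2285714
choose(4, 3) * choose(4, 1) / choose(8, 4)
## [1] 0.2285714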

Function choose() returns the binomial coefficient \({{n}\choose{k}} = \frac{n!}{k!(n-k)!}\). The probability of guessing correctly at least 3 times out of 4 is the sum of the hypergeometric probabilities for \(k = 3\) and \(k = 4\).

k <- 3; n <- 4; K <- 4; N <- 8

choose(K, k) * choose(N-K, n-k) / choose(N, n) +
choose(K, k+1) * choose(N-K, n-(k+1)) / choose(N, n)
## [1] 0.2428571
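
The same tail probability can be written more compactly by summing the hypergeometric density over k = 3 and k = 4:

sum(dhyper(3:4, m = K, n = N - K, k = n))
## [1] 0.2428571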

The hypergeometric distribution function phyper() does this in one call.16

(p <- phyper(q = k - 1, m = K, n = N - K, k = n, lower.tail = FALSE))
## [1] 0.2428571

The p-value from Fisher’s exact test is calculated this way.

fisher.test(tea, alternative = "greater")
## 
##  Fisher's Exact Test for Count Data
## 
## data:  tea
## p-value = 0.2429
## alternative hypothesis: true odds ratio is greater than 1
## 95 percent confidence interval:
##  0.3135693       Inf
## sample estimates:
## odds ratio 
##   6.408309

The odds ratio at the bottom is the odds of a correct guess in the milk-first row divided by the odds in the tea-first row. The sample odds ratio is (3/1) / (1/3) = 9, but the documentation for fisher.test() explains that it reports the conditional maximum likelihood estimate instead.17
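
For comparison, the sample (unconditional) odds ratio computed directly from the table's cross-product:

(tea[1, 1] / tea[1, 2]) / (tea[2, 1] / tea[2, 2])
## [1] 9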


  1. It doesn’t apply to just 2 x 2, so I need to figure out why.↩︎

  2. q = k-1 because phyper(..., lower.tail = FALSE) returns \(P(X > q)\) and we want \(P(X \ge k)\).↩︎

  3. I don’t really know what that means. Brownlee might help.↩︎