6.7 Case Study: Chi-Square, Fisher

A researcher investigates whether males and females enrolled in an Exercise Science degree course differ in the type of exercise they engage in, competitive or non-competitive. They survey 25 males and 25 females.

cs$or2x2 %>% 
  gtsummary::tbl_cross(
    percent = "row",
    label = list(comp ~ "Competitive")
  ) %>%
  gtsummary::add_p()

                   Competitive
                   Yes         No          Total        p-value¹
gender                                                  0.023
    Male           18 (72%)    7 (28%)     25 (100%)
    Female         10 (40%)    15 (60%)    25 (100%)
Total              28 (56%)    22 (44%)    50 (100%)
¹ Pearson's Chi-squared test

Conditions

Use the Chi-Square test of independence for nominal, independent variables. As a rule of thumb, the test requires all expected cell counts to be at least 5. If your data do not meet this condition, consider collapsing some levels. If you collapse all the way to a 2 x 2 cross-tabulation and still do not have expected counts of at least 5 per cell, use Fisher’s exact test.

Use Fisher’s exact test for dichotomous nominal, independent variables. The test is valid for cross-sectional samples, but not for prospective or retrospective samples.
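
The expected-count condition for the chi-square test can be checked before running it. The sketch below is a minimal example, assuming the survey counts from the cross-tabulation above are re-entered by hand into a matrix called counts (a hypothetical name, used here because cs$or2x2 is not reproduced in this section).

```r
# Assumption: counts re-entered from the cross-tabulation (rows = gender, cols = competitive)
counts <- matrix(c(18, 10, 7, 15), nrow = 2,
                 dimnames = list(gender = c("Male", "Female"),
                                 comp = c("Yes", "No")))

# chisq.test() returns the expected counts under independence
expected_counts <- chisq.test(counts, correct = FALSE)$expected
expected_counts

# TRUE when every expected count is at least 5
all(expected_counts >= 5)
```

If the check returns FALSE, collapsing levels or switching to Fisher’s exact test are the options described above.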

Test

Chi-Square

To see what’s going on, here is the chi-square test done by hand. The expected counts are the joint probabilities from the independence model multiplied by the sample size (e.g., \(E(\mathrm{male}, \mathrm{yes}) = \pi_{\mathrm{male}} \times \pi_{\mathrm{yes}} \times n\)).

exer_table <- cs$or2x2 %>% table()
expected <- marginSums(exer_table, 1) %*% t(marginSums(exer_table, 2)) / sum(exer_table)
(X2 <- sum((exer_table - expected)^2 / expected))
## [1] 5.194805
(df <- (2 - 1) * (2 - 1))
## [1] 1
pchisq(X2, df, lower.tail = FALSE)
## [1] 0.02265449

And the G test, a likelihood ratio alternative that uses the same expected counts and degrees of freedom.

(G <- 2 * sum(exer_table * log(exer_table / expected)))
## [1] 5.294731
pchisq(G, df, lower.tail = FALSE)
## [1] 0.02139004
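
One caveat with the by-hand G statistic: a cell with an observed count of zero produces 0 * log(0), which R evaluates as NaN. A minimal sketch of a guarded version follows, assuming the counts are re-entered by hand; g_statistic and counts are hypothetical names, not part of any package.

```r
# Assumption: counts re-entered from the cross-tabulation above
counts <- matrix(c(18, 10, 7, 15), nrow = 2)

g_statistic <- function(observed) {
  # Expected counts under independence: row total * column total / n
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  # Empty cells contribute zero to the sum (the limit of x * log(x) as x -> 0)
  terms <- ifelse(observed > 0, observed * log(observed / expected), 0)
  2 * sum(terms)
}

g_statistic(counts)

# A table with an empty cell still gives a finite statistic
g_statistic(matrix(c(5, 0, 3, 4), nrow = 2))
```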

The chisq.test() function applies the Yates continuity correction by default to 2 x 2 tables to compensate for small cell counts. The correction subtracts 0.5 from each absolute difference \(|O_{ij} - E_{ij}|\) before squaring. Set correct = FALSE to suppress it.

(cs$or2x2_chisq.test <- chisq.test(exer_table, correct = FALSE))
## 
##  Pearson's Chi-squared test
## 
## data:  exer_table
## X-squared = 5.1948, df = 1, p-value = 0.02265
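
For comparison, the Yates-corrected statistic can be reproduced by hand in the same way as the uncorrected one. This is a sketch assuming the counts are re-entered into a matrix; the names counts, expected_counts, X2_yates, and yates_builtin are hypothetical.

```r
# Assumption: counts re-entered from the cross-tabulation above
counts <- matrix(c(18, 10, 7, 15), nrow = 2)
expected_counts <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Subtract 0.5 from each absolute difference before squaring
X2_yates <- sum((abs(counts - expected_counts) - 0.5)^2 / expected_counts)
p_yates <- pchisq(X2_yates, df = 1, lower.tail = FALSE)

# The default call (correct = TRUE) should give the same statistic
yates_builtin <- chisq.test(counts)
c(by_hand = X2_yates, builtin = unname(yates_builtin$statistic))
```

The corrected statistic is smaller than the uncorrected one, so the corrected p-value is larger.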

Calculate the G test with DescTools::GTest().

(cs$or2x2_g.test <- DescTools::GTest(exer_table, correct = "none"))
## 
##  Log likelihood ratio (G-test) test of independence without correction
## 
## data:  exer_table
## G = 5.2947, X-squared df = 1, p-value = 0.02139

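As a side note, if the expected cell count condition is violated, you can estimate the p-value by Monte Carlo simulation instead of relying on the chi-square approximation. Because the p-value is simulated, it varies slightly from run to run.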

chisq.test(exer_table, correct = FALSE, simulate.p.value = TRUE)
## 
##  Pearson's Chi-squared test with simulated p-value (based on 2000
##  replicates)
## 
## data:  exer_table
## X-squared = 5.1948, df = NA, p-value = 0.05097
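
To make the simulated p-value reproducible and less noisy, one option is to fix the random seed and raise the number of replicates with the B argument. The sketch below assumes hand-entered counts; the seed value and the choice of 10,000 replicates are arbitrary illustrations.

```r
# Assumption: counts re-entered from the cross-tabulation above
counts <- matrix(c(18, 10, 7, 15), nrow = 2)

set.seed(2024)  # arbitrary seed, only so that reruns give the same answer
mc_test <- chisq.test(counts, correct = FALSE,
                      simulate.p.value = TRUE, B = 10000)
mc_test$p.value
```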
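
Fisher

The documentation for fisher.test() explains that the p-value is based on the first element of the contingency table “with non-centrality parameter given by the odds ratio.” Conditional on the row and column totals, the first cell follows a non-central hypergeometric distribution whose non-centrality parameter is the odds ratio; under the null hypothesis the odds ratio is 1 and the distribution reduces to the ordinary hypergeometric. The reported estimate is the odds ratio that maximizes this conditional likelihood, which is why the documentation notes that “the conditional Maximum Likelihood Estimate (MLE) rather than the unconditional MLE (the sample odds ratio) is used.” Here the sample odds ratio is (18 x 15) / (7 x 10) = 3.86, slightly larger than the conditional estimate.

The p-value can be calculated by hand from the hypergeometric distribution. Doubling the upper tail works here because both groups have 25 members, which makes the null distribution symmetric.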

phyper(q = exer_table[1, 1] - 1,  # k minus 1
       m = sum(exer_table[1, ]),  # K
       n = sum(exer_table[2, ]),  # N - K
       k = sum(exer_table[, 1]),  # n
       lower.tail = FALSE) * 2
## [1] 0.04500454
(cs$or2x2_fisher.test <- fisher.test(exer_table))
## 
##  Fisher's Exact Test for Count Data
## 
## data:  exer_table
## p-value = 0.045
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##   1.02531 15.01800
## sample estimates:
## odds ratio 
##    3.74678
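
When the margins are not balanced, doubling one tail no longer gives the two-sided p-value. A more general approach sums the probabilities of every table that is no more likely than the observed one. The same hypergeometric probabilities also give the conditional MLE of the odds ratio. The sketch below illustrates both under stated assumptions: the counts are re-entered by hand, all object names are hypothetical, and the numerical search is a simple stand-in rather than the routine fisher.test() uses internally.

```r
# Assumption: counts re-entered from the cross-tabulation above
counts <- matrix(c(18, 10, 7, 15), nrow = 2)

x <- counts[1, 1]        # observed first cell
m <- sum(counts[1, ])    # first row total
n <- sum(counts[2, ])    # second row total
k <- sum(counts[, 1])    # first column total
support <- max(0, k - n):min(k, m)  # possible values of the first cell

# Two-sided p-value: total probability of tables no more likely than the
# observed one (small tolerance guards against floating-point ties)
probs <- dhyper(support, m, n, k)
p_two_sided <- sum(probs[probs <= probs[support == x] * (1 + 1e-7)])

# Conditional log-likelihood of the odds ratio psi, on the log scale:
# P(X = x | psi) is proportional to dhyper(x) * psi^x
log_probs <- dhyper(support, m, n, k, log = TRUE)
cond_loglik <- function(log_psi) {
  w <- log_probs + support * log_psi
  w[support == x] - (max(w) + log(sum(exp(w - max(w)))))
}
fit <- optimize(cond_loglik, interval = c(-10, 10), maximum = TRUE)
conditional_mle <- exp(fit$maximum)

sample_or <- counts[1, 1] * counts[2, 2] / (counts[1, 2] * counts[2, 1])
c(p_value = p_two_sided, conditional_mle = conditional_mle, sample_or = sample_or)
```

The p-value should agree with the one reported by fisher.test(), and the conditional estimate should land close to the reported odds ratio, a little below the sample odds ratio.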

Post-Test: Phi, Cramer’s V

The problem with the chi-square test for association is that it does not measure the strength of any association. Two measures of association are often calculated after the chi-square or Fisher tests.

Cramer’s V is derived from the chi-square statistic. It rescales the statistic to a range of 0 to 1.[1] Values under 0.1 indicate a weak association; values above 0.2 indicate a “moderately strong” association.

\[V = \sqrt{\frac{\chi^2 / n}{\mathrm{min}(I, J) - 1}}\]

# by hand
sqrt(cs$or2x2_chisq.test$statistic / sum(exer_table) / (min(2, 2) - 1))
## X-squared 
## 0.3223292

# from package
(cs$or2x2_v <- rcompanion::cramerV(exer_table))
## Cramer V 
##   0.3223

The Phi Coefficient is defined as

\[\Phi = \frac{AD-BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}\]

where A, B, C, and D are the four values of the 2 x 2 contingency table.

matrix(c("A", "C", "B", "D"), nrow = 2)
##      [,1] [,2]
## [1,] "A"  "B" 
## [2,] "C"  "D"

Similar to a Pearson Correlation Coefficient, a Phi Coefficient takes on values between -1 and 1.

# by hand
det(matrix(exer_table, nrow = 2)) / 
  sqrt(prod(c(marginSums(exer_table, 1), marginSums(exer_table, 2))))
## [1] 0.3223292

# from package
(cs$or2x2_phi <- psych::phi(exer_table))
## [1] 0.32
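
Unlike Cramer’s V, the Phi Coefficient carries a sign, and that sign depends on how the rows and columns are ordered. For a 2 x 2 table its absolute value equals Cramer’s V. The sketch below illustrates this with hand-entered counts; phi_2x2 and counts are hypothetical names.

```r
# Assumption: counts re-entered from the cross-tabulation above
counts <- matrix(c(18, 10, 7, 15), nrow = 2,
                 dimnames = list(gender = c("Male", "Female"),
                                 comp = c("Yes", "No")))

phi_2x2 <- function(tab) {
  # (AD - BC) divided by the square root of the product of the four margins
  (tab[1, 1] * tab[2, 2] - tab[1, 2] * tab[2, 1]) /
    sqrt(prod(rowSums(tab), colSums(tab)))
}

phi_original <- phi_2x2(counts)
phi_swapped <- phi_2x2(counts[, 2:1])  # swap the "Yes" and "No" columns
c(original = phi_original, swapped = phi_swapped)
```

Swapping the columns flips the sign but leaves the magnitude unchanged, so the sign should only be interpreted relative to the chosen level ordering.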

Reporting

A chi-square test for association was conducted between gender and preference for performing competitive sport. All expected cell frequencies were greater than five. There was a statistically significant association between gender and preference for performing competitive sport, \(X^2\)(1) = 5.195, p = 0.023. There was a moderately strong association between gender and preference for performing competitive sport, V = 0.322.

A Fisher’s Exact test was conducted between gender and preference for performing competitive sport. There was a statistically significant association between gender and preference for performing competitive sport, p = 0.045.


  1. This tutorial says both variables should have more than two levels, but doesn’t explain why.