4.5 G-Test

The G-test is a likelihood-ratio test of statistical significance that is increasingly used in place of the chi-squared test. The test statistic is defined as

\[G^2 = 2 \sum O_j \log \left[ \frac{O_j}{E_j} \right]\]

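where \(O_j\) and \(E_j\) are the observed and expected counts in cell \(j\), and the factor of 2 puts the statistic on the same asymptotic scale as the Pearson chi-squared statistic. Under the null hypothesis, \(G^2\) is approximately \(\chi^2\)-distributed, with the same number of degrees of freedom as in the corresponding chi-squared test. In fact, the chi-squared test statistic is the second-order Taylor approximation of \(G^2\), obtained by expanding the natural logarithm around \(O_j / E_j = 1\).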
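
The link to the chi-squared statistic can be sketched in a few lines. The notation \(\delta_j = O_j - E_j\) is introduced here only for this derivation; because observed and expected counts share the same total, \(\sum_j \delta_j = 0\). Expanding the logarithm to second order gives

\[\log \left[ \frac{O_j}{E_j} \right] = \log \left[ 1 + \frac{\delta_j}{E_j} \right] \approx \frac{\delta_j}{E_j} - \frac{\delta_j^2}{2 E_j^2}\]

Substituting into the definition and dropping terms of order \(\delta_j^3\),

\[G^2 = 2 \sum_j (E_j + \delta_j) \log \left[ \frac{O_j}{E_j} \right] \approx 2 \sum_j \left( \delta_j + \frac{\delta_j^2}{2 E_j} \right) = \sum_j \frac{(O_j - E_j)^2}{E_j} = X^2\]

so the two statistics agree closely whenever the observed counts are near their expected values.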

Returning to the phenotype case study in the chi-squared goodness-of-fit test section, you can calculate the \(G^2\) test statistic and probability by hand.

(pheno_g2 <- 2 * sum(pheno_obs * log(pheno_obs / pheno_exp)))
## [1] 9.836806
(pchisq(q = pheno_g2, df = length(pheno_type) - 1, lower.tail = FALSE))
## [1] 0.02000552

This is close to the result from the chi-squared goodness-of-fit test (\(X^2\) = 9.547, p = 0.023). The DescTools::GTest() function conducts a G-test.

DescTools::GTest(pheno_obs, p = pheno_pi)

## 
##  Log likelihood ratio (G-test) goodness of fit test
## 
## data:  pheno_obs
## G = 9.8368, X-squared df = 3, p-value = 0.02001
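
The by-hand calculation can also be wrapped in a small helper. The sketch below is illustrative only: `g_test_gof()` is a hypothetical function name, and `demo_obs` and `demo_pi` are made-up counts and proportions, not the phenotype data from the case study. It also treats cells with zero observed counts as contributing nothing to the sum, which is the usual convention.

```r
# Hypothetical observed counts and null proportions (illustration only;
# these are not the phenotype data from the case study).
demo_obs <- c(A = 45, B = 30, C = 15, D = 10)
demo_pi  <- c(A = 0.40, B = 0.30, C = 0.20, D = 0.10)

# Hypothetical helper: G-test of goodness of fit from first principles.
g_test_gof <- function(obs, p) {
  stopifnot(length(obs) == length(p), abs(sum(p) - 1) < 1e-8)
  expected <- sum(obs) * p
  # Cells with zero observed count contribute 0 (the limit of x * log(x)).
  terms <- ifelse(obs > 0, obs * log(obs / expected), 0)
  g <- 2 * sum(terms)
  df <- length(obs) - 1
  list(
    statistic = g,
    df = df,
    p.value = pchisq(g, df = df, lower.tail = FALSE)
  )
}

demo_g  <- g_test_gof(demo_obs, demo_pi)
demo_x2 <- chisq.test(demo_obs, p = demo_pi)

# The two statistics should be close, since X^2 approximates G^2.
c(G2 = demo_g$statistic, X2 = unname(demo_x2$statistic))
```

Returning a list keeps the statistic, degrees of freedom, and p-value together, so the result can be compared directly against `chisq.test()` or `DescTools::GTest()` output.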

According to the function documentation, the G-test is not usually used for 2x2 tables.

EMT::multinomial.test(o, f, useChisq = TRUE)

## 
##  The model includes 4598126 different events.
## 
##  The chosen number of trials is rather low, should be at least 10 times the numver of events.
## 
## 
##  Exact Multinomial Test, Chisquare
## 
##     Events    chi2Obs    p.value
##    4598126      7.093     0.1479

chisq.test(o, e)

## Warning in chisq.test(o, e): Chi-squared approximation may be incorrect

## 
##  Pearson's Chi-squared test
## 
## data:  o and e
## X-squared = 15, df = 12, p-value = 0.2414
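
The df = 12 reported above is consistent with how `chisq.test()` handles two positional vectors: it coerces them to factors, cross-tabulates them, and tests independence. A goodness-of-fit test instead supplies the null proportions through the `p` argument. A minimal sketch of the difference, using hypothetical counts (`o_demo` and `e_demo` are invented for illustration and are not the `o` and `e` used above):

```r
# Hypothetical data: five categories, 100 observations (illustration only).
o_demo <- c(28, 22, 19, 17, 14)   # observed counts
e_demo <- c(30, 25, 20, 15, 10)   # expected counts under the null

# Goodness of fit: null proportions go in `p`; df = categories - 1.
gof <- chisq.test(o_demo, p = e_demo / sum(e_demo))

# Two positional vectors are coerced to factors and cross-tabulated,
# giving a test of independence on a 5 x 5 table instead.
indep <- suppressWarnings(chisq.test(o_demo, e_demo))

c(gof_df = unname(gof$parameter), indep_df = unname(indep$parameter))
```

With five categories, the goodness-of-fit form has 4 degrees of freedom, while the two-vector form has (5 - 1) x (5 - 1) = 16.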