6.18 Likelihood ratio test vs. Wald test

For binary logistic regression models fit with glm(), the p-values obtained from summary() are, by default, what are referred to as “Wald” tests. Above, we used the test.statistic = "Wald" option in car::Anova() to obtain p-values that would be consistent with those from summary(). An alternative is to use a “likelihood ratio” (LR) test, which may be more accurate. In fact, confint() uses a profile likelihood method to obtain confidence intervals for a model fit with glm() and these correspond to LR hypothesis tests.

The default for car::Anova() is to compute LR test p-values for binary logistic regression models. This works for continuous predictors, binary predictors, and for multiple df tests for categorical predictors. For tests of individual levels of a categorical predictor vs. the reference level, an additional step is needed. After fitting the model using categorical predictors with more than two levels coded as factors to get the multiple df p-values, refit the model after replacing these factor variables with the corresponding sets of binary indicator variables. Using car::Anova() one can then get the corresponding LR tests for the comparisons between levels.

Example 6.3 (continued): What is the association between lifetime marijuana use (mj_lifetime) and age at first use of alcohol (alc_agefirst), adjusted for age (demog_age_cat6), sex (demog_sex), and income (demog_income)? Use LR tests for all the p-values.

# Same model as fit previously
fit.ex6.3.adj <- glm(mj_lifetime ~ alc_agefirst + demog_age_cat6 + demog_sex +
                     demog_income, family = binomial, data = nsduh)

The p-values displayed by summary() and the p-values computed by car::Anova(, type = 3, test.statistic = "Wald") are all Wald tests.

round(summary(fit.ex6.3.adj)$coef, 4)
##                               Estimate Std. Error z value Pr(>|z|)
## (Intercept)                     6.2542     0.5914 10.5759   0.0000
## alc_agefirst                   -0.2754     0.0276 -9.9922   0.0000
## demog_age_cat626-34            -0.2962     0.3286 -0.9012   0.3675
## demog_age_cat635-49            -0.8043     0.2966 -2.7120   0.0067
## demog_age_cat650-64            -0.6899     0.2985 -2.3109   0.0208
## demog_age_cat665+              -1.2748     0.3043 -4.1893   0.0000
## demog_sexMale                  -0.0609     0.1618 -0.3763   0.7067
## demog_income$20,000 - $49,999  -0.5309     0.2664 -1.9927   0.0463
## demog_income$50,000 - $74,999  -0.0793     0.3049 -0.2601   0.7948
## demog_income$75,000 or more    -0.3612     0.2532 -1.4264   0.1538
car::Anova(fit.ex6.3.adj, type = 3, test.statistic = "Wald")
## Analysis of Deviance Table (Type III tests)
## 
## Response: mj_lifetime
##                Df  Chisq           Pr(>Chisq)    
## (Intercept)     1 111.85 < 0.0000000000000002 ***
## alc_agefirst    1  99.84 < 0.0000000000000002 ***
## demog_age_cat6  4  23.01              0.00013 ***
## demog_sex       1   0.14              0.70669    
## demog_income    3   5.44              0.14197    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The following computes a LR test p-value for each predictor.

car::Anova(fit.ex6.3.adj, type = 3, test.statistic = "LR")
## Analysis of Deviance Table (Type III tests)
## 
## Response: mj_lifetime
##                LR Chisq Df           Pr(>Chisq)    
## alc_agefirst      149.6  1 < 0.0000000000000002 ***
## demog_age_cat6     24.0  4             0.000081 ***
## demog_sex           0.1  1                 0.71    
## demog_income        5.5  3                 0.14    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Use the multiple df LR p-values provided here in place of Wald p-values we obtained earlier from car::Anova(, type = 3, test.statistic = "Wald"), and the 1 df LR p-values provided here in place of the Wald p-values we obtained earlier from summary() for continuous and binary predictors. However, we need an additional step to get LR p-values to replace the remaining Wald p-values in summary(), those for comparisons between a level of a categorical predictor with more than two levels and a reference level.

# Create indicator variables
nsduh <- nsduh %>% 
  mutate(age26_34 = as.numeric(demog_age_cat6 == "26-34"),
         age35_49 = as.numeric(demog_age_cat6 == "35-49"),
         age50_64 = as.numeric(demog_age_cat6 == "50-64"),
         age65    = as.numeric(demog_age_cat6 == "65+"),
         income1  = as.numeric(demog_income   == "$20,000 - $49,999"),
         income2  = as.numeric(demog_income   == "$50,000 - $74,999"),
         income3  = as.numeric(demog_income   == "$75,000 or more"))

# Refit model
fit.ex6.3.adj.2 <- glm(mj_lifetime ~ alc_agefirst + 
                         age26_34 + age35_49 + age50_64 + age65 +
                         demog_sex + income1 + income2 + income3,
                       family = binomial, data = nsduh)

# LR p-values
car::Anova(fit.ex6.3.adj.2, type = 3, test.statistic = "LR")
## Analysis of Deviance Table (Type III tests)
## 
## Response: mj_lifetime
##              LR Chisq Df           Pr(>Chisq)    
## alc_agefirst    149.6  1 < 0.0000000000000002 ***
## age26_34          0.8  1               0.3655    
## age35_49          7.8  1               0.0054 ** 
## age50_64          5.6  1               0.0182 *  
## age65            18.8  1             0.000014 ***
## demog_sex         0.1  1               0.7066    
## income1           4.0  1               0.0445 *  
## income2           0.1  1               0.7947    
## income3           2.1  1               0.1506    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In summary, other than in this section we have used Wald tests for simplicity when introducing binary logistic regression. However, LR tests are better. Now that you are familiar with all the steps needed to carry out a binary logistic regression, you should be able to add in this last extra step and use LR tests.