# Chapter 11 Multinomial Logistic Regression

## 11.1 Introduction to Multinomial Logistic Regression

Logistic regression is a technique used when the dependent variable is categorical (or nominal). In binary logistic regression the dependent variable has two categories, whereas in multinomial logistic regression it has more than two.

Examples of binary outcomes: consumers decide to buy or not to buy, a product may pass or fail quality control, a borrower may be a good or a poor credit risk, and an employee may be promoted or not.

## 11.2 Equation

In logistic regression, a logistic transformation of the odds (referred to as the logit) serves as the dependent variable:

$\log (\text{odds})=\operatorname{logit}(p)=\ln \left(\frac{p}{1-p}\right)=a+b_{1} X_{1}+b_{2} X_{2}+b_{3} X_{3}+\ldots$

or

$p=\frac{\exp \left(a+b_{1} X_{1}+b_{2} X_{2}+b_{3} X_{3}+\ldots\right)}{1+\exp \left(a+b_{1} X_{1}+b_{2} X_{2}+b_{3} X_{3}+\ldots\right)}$

Where:

p = the probability that a case is in a particular category,

exp = the exponential function, whose base e is approximately 2.72,

a = the constant of the equation and,

b = the coefficient of the predictor or independent variables.
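As a quick numerical check that the two forms of the equation describe the same relationship, the sketch below turns a linear predictor into a probability and back again. The values a = -1, b1 = 0.5 and X1 = 3 are made-up numbers chosen only for illustration; `plogis()` and `qlogis()` are base R's logistic and logit functions.

```r
# Hypothetical values, for illustration only (not estimated from any data)
a  <- -1
b1 <- 0.5
x1 <- 3

eta <- a + b1 * x1                         # linear predictor, i.e. logit(p)
p_manual  <- exp(eta) / (1 + exp(eta))     # second form of the equation
p_builtin <- plogis(eta)                   # same calculation with base R

logit_back <- log(p_manual / (1 - p_manual))  # first form: recovers eta

c(eta = eta, p = p_manual, logit = logit_back)
```

The last line should show the same number for `eta` and `logit`, which is the sense in which the logit "undoes" the logistic transformation.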

Logits or Log Odds

• Odds values can range from 0 to infinity and tell you how much more likely it is that an observation is a member of the target group rather than a member of the other group.

• Odds = p/(1-p)
• If the probability is 0.80, the odds are 4 to 1 or .80/.20; if the probability is 0.25, the odds are .33 (.25/.75).

• The odds ratio (OR) estimates the multiplicative change in the odds of membership in the target group for a one-unit increase in the predictor. It is calculated by exponentiating the regression coefficient of the predictor, exp(b).

• Assume, as in the earlier example where we predicted accountancy success from a maths competency predictor, that b = 2.69. The odds ratio is then exp(2.69) = 14.73. Therefore the odds of passing are 14.73 times greater for a student who had, for example, a pre-test score of 5 than for a student whose pre-test score was 4.
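The arithmetic in the bullets above can be reproduced in a few lines of R. This is only a sketch of the calculations already described; the probabilities and the coefficient b = 2.69 are the ones quoted in the text.

```r
# Probabilities used in the text
p <- c(0.80, 0.25)
odds <- p / (1 - p)       # odds = p / (1 - p)
odds

# Odds ratio for a one-unit increase in the predictor, with b = 2.69 from the text
b <- 2.69
odds_ratio <- exp(b)
round(odds_ratio, 2)
```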

## 11.3 Hypothesis Test of Coefficients

In logistic regression, two hypotheses are of interest:

• The null hypothesis, under which all the coefficients in the regression equation take the value zero, and

• The alternative hypothesis that the model currently under consideration is accurate and differs significantly from the null of zero, i.e. it predicts significantly better than the chance or random prediction level of the null hypothesis.

Evaluation of Hypothesis

We then work out the likelihood of observing the data we actually did observe under each of these hypotheses. The result is usually a very small number, and to make it easier to handle, the natural logarithm is used, producing a log likelihood (LL). Because the likelihood of a categorical outcome is a product of probabilities, it cannot exceed one, so LLs are negative (or zero at best). The log likelihood is the basis for tests of a logistic model.
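To make the idea concrete, the sketch below computes the log likelihood of a constant-only model by hand. It assumes the program counts reported in the frequency table of Section 11.7.2 (45 general, 105 academic, 50 vocational); a model with only a constant predicts the observed proportion of each category for every student.

```r
# Counts of each program, taken from the frequency table in Section 11.7.2
counts <- c(general = 45, academic = 105, vocational = 50)

# A constant-only model predicts the observed proportions for everyone
p_hat <- counts / sum(counts)

# Log likelihood: sum over students of log(probability of the observed category)
ll_null <- sum(counts * log(p_hat))
ll_null          # negative, as expected for a log likelihood

# -2LL, the quantity used in the likelihood ratio test
-2 * ll_null
```

The absolute value of `ll_null` should agree with the "final value" that `multinom` prints for the intercept-only model in Section 11.7.3, and `-2 * ll_null` with its residual deviance.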

## 11.4 Likelihood Ratio Test

The likelihood ratio test is based on -2LL (minus two times the log likelihood). The test statistic, called the model chi-square, is the -2LL of the baseline model with only a constant in it minus the -2LL of the researcher’s model with predictors.

Significance at the .05 level or lower means the researcher’s model with the predictors is significantly different from the one with the constant only (all ‘b’ coefficients being zero). It measures the improvement in fit that the explanatory variables make compared to the null model.

Chi square is used to assess significance of this ratio (see Model Fitting Information in SPSS output).

• $H_0$: There is no difference between the null model and the final model.

• $H_1$: There is a difference between the null model and the final model.
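The mechanics of the test can be sketched on a small binary example. The code below is an illustration only: it uses R's built-in `mtcars` data (not this chapter's data set) and base R's `glm`, and compares a constant-only model with a one-predictor model. For ungrouped binary data the residual deviance equals -2LL, so the difference in deviances is the model chi-square.

```r
# Illustration on built-in data (not the chapter's data set)
null_fit <- glm(am ~ 1,  data = mtcars, family = binomial)  # constant only
full_fit <- glm(am ~ wt, data = mtcars, family = binomial)  # with a predictor

# Model chi-square: -2LL(null) minus -2LL(full)
chisq <- deviance(null_fit) - deviance(full_fit)

# Degrees of freedom: number of extra coefficients in the full model
df <- df.residual(null_fit) - df.residual(full_fit)

# Upper-tail chi-square probability
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)

c(chisq = chisq, df = df, p = p_value)
```

A p-value at or below .05 would lead us to reject $H_0$ in favour of the model with predictors.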

## 11.5 Checking Assumption: Multicollinearity

To obtain collinearity diagnostics, run a linear regression that treats the categorical dependent variable as if it were continuous, then check the following:

• If the largest VIF (Variance Inflation Factor) is greater than 10, there is cause for concern (Bowerman & O’Connell, 1990).

• Tolerance below 0.1 indicates a serious problem.

• Tolerance below 0.2 indicates a potential problem (Menard, 1995).

• If the condition index is greater than 15, multicollinearity is assumed to be present.
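The VIF and tolerance checks can be sketched in base R without any extra package. The example below uses simulated predictors (hypothetical names `x1`, `x2`, `x3`, with `x2` deliberately built to correlate with `x1`), not the chapter's data. Each VIF is computed as 1 / (1 - R²) from regressing one predictor on the others, and tolerance is its reciprocal. Because only the predictors enter this calculation, treating a categorical outcome as continuous does not affect these two diagnostics.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)   # deliberately correlated with x1
x3 <- rnorm(n)                        # unrelated to the others
X  <- data.frame(x1, x2, x3)

# VIF for each predictor: regress it on the remaining predictors
vif <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})

tolerance <- 1 / vif

round(cbind(VIF = vif, Tolerance = tolerance), 3)
```

In this simulated setting `x1` and `x2` should show clearly larger VIFs than `x3`, which is the pattern the rules of thumb above are meant to flag.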

## 11.6 Features of Multinomial logistic regression

Multinomial logistic regression is used to predict membership of more than two categories. It (basically) works in the same way as binary logistic regression. The analysis breaks the outcome variable down into a series of comparisons between two categories.

E.g., if you have three outcome categories (A, B and C), then the analysis will consist of two comparisons that you choose:

• Compare everything against your first category (e.g. A vs. B and A vs. C),

• Or your last category (e.g. A vs. C and B vs. C),

• Or a custom category (e.g. B vs. A and B vs. C).

The important parts of the analysis and output are much the same as we have just seen for binary logistic regression.
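A small sketch may help show how the two comparisons combine into three category probabilities. The two logits below are made-up values for a hypothetical student (they are not model output); each one is the log odds of a category relative to the chosen reference category, here labelled `academic`.

```r
# Hypothetical logits for one student, each relative to the reference category
eta <- c(general = -0.5, vocational = 0.2)

# The reference category has a logit of 0 by construction
denom <- 1 + sum(exp(eta))
probs <- c(academic = 1 / denom, exp(eta) / denom)

probs
sum(probs)                               # the three probabilities add up to 1

# Each comparison is recovered as a log odds against the reference
log(probs[c("general", "vocational")] / probs["academic"])
```

Choosing a different reference category changes which comparisons are reported, but not the fitted probabilities.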

## 11.7 R Labs: Running Multinomial Logistic Regression in R

### 11.7.1 Understanding the Data: Choice of Programs

The data set (hsbdemo.sav) contains variables on 200 students. The outcome variable is prog, program type (1=general, 2=academic, and 3=vocational). The predictor variables are ses, socioeconomic status (1=low, 2=middle, and 3=high); math, mathematics score; and science, science score. The two scores are continuous variables.

Research Question: When high school students choose a program (general, vocational, or academic), how do their math and science scores and their socioeconomic status (SES) affect their decision?

### 11.7.2 Prepare and review the data

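# Start our example by importing the data into R
library(haven)
# Read the SPSS file (assumes hsbdemo.sav is in the working directory)
hsbdemo <- read_sav("hsbdemo.sav")
hsb <- hsbdemo # Work on a copy of the data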
summary(hsb)
##        id             female           ses            schtyp
##  Min.   :  1.00   Min.   :0.000   Min.   :1.000   Min.   :1.00
##  1st Qu.: 50.75   1st Qu.:0.000   1st Qu.:2.000   1st Qu.:1.00
##  Median :100.50   Median :1.000   Median :2.000   Median :1.00
##  Mean   :100.50   Mean   :0.545   Mean   :2.055   Mean   :1.16
##  3rd Qu.:150.25   3rd Qu.:1.000   3rd Qu.:3.000   3rd Qu.:1.00
##  Max.   :200.00   Max.   :1.000   Max.   :3.000   Max.   :2.00
##       prog            read           write            math
##  Min.   :1.000   Min.   :28.00   Min.   :31.00   Min.   :33.00
##  1st Qu.:2.000   1st Qu.:44.00   1st Qu.:45.75   1st Qu.:45.00
##  Median :2.000   Median :50.00   Median :54.00   Median :52.00
##  Mean   :2.025   Mean   :52.23   Mean   :52.77   Mean   :52.65
##  3rd Qu.:2.250   3rd Qu.:60.00   3rd Qu.:60.00   3rd Qu.:59.00
##  Max.   :3.000   Max.   :76.00   Max.   :67.00   Max.   :75.00
##     science          socst           honors          awards          cid
##  Min.   :26.00   Min.   :26.00   Min.   :0.000   Min.   :0.00   Min.   : 1.00
##  1st Qu.:44.00   1st Qu.:46.00   1st Qu.:0.000   1st Qu.:0.00   1st Qu.: 5.00
##  Median :53.00   Median :52.00   Median :0.000   Median :1.00   Median :10.50
##  Mean   :51.85   Mean   :52.41   Mean   :0.265   Mean   :1.67   Mean   :10.43
##  3rd Qu.:58.00   3rd Qu.:61.00   3rd Qu.:1.000   3rd Qu.:2.00   3rd Qu.:15.00
##  Max.   :74.00   Max.   :71.00   Max.   :1.000   Max.   :7.00   Max.   :20.00
# Load the jmv package for frequency table
library(jmv)

Now let’s do the descriptive analysis

# Use the descriptives function to get the descriptive statistics
descriptives(hsb, vars = vars(ses, prog, math, science), freq = TRUE)
##
##  DESCRIPTIVES
##
##  Descriptives
##  ───────────────────────────────────────────────────────────
##               ses         prog        math        science
##  ───────────────────────────────────────────────────────────
##    N               200         200         200         200
##    Missing           0           0           0           0
##    Mean       2.055000    2.025000    52.64500    51.85000
##    Median     2.000000    2.000000    52.00000    53.00000
##    Minimum    1.000000    1.000000    33.00000    26.00000
##    Maximum    3.000000    3.000000    75.00000    74.00000
##  ───────────────────────────────────────────────────────────
##
##
##  FREQUENCIES
##
##  Frequencies of ses
##  ──────────────────────────────────────────────────
##    Levels    Counts    % of Total    Cumulative %
##  ──────────────────────────────────────────────────
##    1             47      23.50000        23.50000
##    2             95      47.50000        71.00000
##    3             58      29.00000       100.00000
##  ──────────────────────────────────────────────────
##
##
##  Frequencies of prog
##  ──────────────────────────────────────────────────
##    Levels    Counts    % of Total    Cumulative %
##  ──────────────────────────────────────────────────
##    1             45      22.50000        22.50000
##    2            105      52.50000        75.00000
##    3             50      25.00000       100.00000
##  ──────────────────────────────────────────────────
# To see the crosstable, we need CrossTable function from gmodels package
library(gmodels)
# Build a crosstable between ses and prog
CrossTable(hsb$ses, hsb$prog)
##
##
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table:  200
##
##
##              | hsb$prog
##      hsb$ses |         1 |         2 |         3 | Row Total |
## -------------|-----------|-----------|-----------|-----------|
##            1 |        16 |        19 |        12 |        47 |
##              |     2.783 |     1.305 |     0.005 |           |
##              |     0.340 |     0.404 |     0.255 |     0.235 |
##              |     0.356 |     0.181 |     0.240 |           |
##              |     0.080 |     0.095 |     0.060 |           |
## -------------|-----------|-----------|-----------|-----------|
##            2 |        20 |        44 |        31 |        95 |
##              |     0.088 |     0.692 |     2.213 |           |
##              |     0.211 |     0.463 |     0.326 |     0.475 |
##              |     0.444 |     0.419 |     0.620 |           |
##              |     0.100 |     0.220 |     0.155 |           |
## -------------|-----------|-----------|-----------|-----------|
##            3 |         9 |        42 |         7 |        58 |
##              |     1.257 |     4.381 |     3.879 |           |
##              |     0.155 |     0.724 |     0.121 |     0.290 |
##              |     0.200 |     0.400 |     0.140 |           |
##              |     0.045 |     0.210 |     0.035 |           |
## -------------|-----------|-----------|-----------|-----------|
## Column Total |        45 |       105 |        50 |       200 |
##              |     0.225 |     0.525 |     0.250 |           |
## -------------|-----------|-----------|-----------|-----------|
##
## 

### 11.7.3 Run the Multinomial Model using “nnet” package

Below we use the multinom function from the nnet package to estimate a multinomial logistic regression model. There are other functions in other R packages capable of multinomial regression. We chose the multinom function because it does not require the data to be reshaped (as the mlogit package does) and to mirror the example code found in Hilbe’s Logistic Regression Models.

First, we need to choose the level of our outcome that we wish to use as our baseline and specify this in the relevel function. Then, we run our model using multinom. The multinom function does not include p-value calculation for the regression coefficients, so we calculate p-values using Wald tests (here z-tests).

# Load the nnet package, which provides the multinom function
library(nnet)
# Since we are going to use Academic as the reference group, we need to relevel the outcome.
hsb$prog2 <- relevel(as.factor(hsb$prog), ref = 2)
hsb$ses <- as.factor(hsb$ses)
levels(hsb$prog2)
## [1] "2" "1" "3"
# Give the names to each level
levels(hsb$prog2) <- c("academic","general","vocational")
# Run an "only intercept" model
OIM <- multinom(prog2 ~ 1, data = hsb)
## # weights:  6 (2 variable)
## initial  value 219.722458
## final  value 204.096674
## converged
summary(OIM)
## Call:
## multinom(formula = prog2 ~ 1, data = hsb)
##
## Coefficients:
##            (Intercept)
## general     -0.8472980
## vocational  -0.7419374
##
## Std. Errors:
##            (Intercept)
## general      0.1781742
## vocational   0.1718249
##
## Residual Deviance: 408.1933
## AIC: 412.1933
# Run a multinomial model
multi_mo <- multinom(prog2 ~ ses + math + science + math*science, data = hsb,model=TRUE)
## # weights:  21 (12 variable)
## initial  value 219.722458
## iter  10 value 173.831002
## iter  20 value 167.382760
## final  value 166.951813
## converged
summary(multi_mo)
## Call:
## multinom(formula = prog2 ~ ses + math + science + math * science,
##     data = hsb, model = TRUE)
##
## Coefficients:
##            (Intercept)       ses2       ses3       math     science
## general       5.897618 -0.4081497 -1.1254491 -0.1852220  0.01323626
## vocational   22.728283  0.8402168 -0.5605656 -0.5036705 -0.28297703
##            math:science
## general     0.001025283
## vocational  0.006185571
##
## Std. Errors:
##            (Intercept)      ses2      ses3       math    science math:science
## general    0.002304064 0.2613732 0.2134308 0.02694593 0.02953364 0.0004761369
## vocational 0.003856861 0.2959741 0.1984775 0.02681947 0.03142872 0.0004760567
##
## Residual Deviance: 333.9036
## AIC: 357.9036
# Check the Z-score for the model (wald Z)
z <- summary(multi_mo)$coefficients/summary(multi_mo)$standard.errors
z
##            (Intercept)      ses2      ses3       math    science math:science
## general       2559.659 -1.561559 -5.273132  -6.873837  0.4481759     2.153336
## vocational    5892.948  2.838819 -2.824328 -18.780028 -9.0037711    12.993348
# 2-tailed z test
p <- (1 - pnorm(abs(z), 0, 1)) * 2
p
##            (Intercept)        ses2         ses3         math   science
## general              0 0.118391943 1.341147e-07 6.249667e-12 0.6540263
## vocational           0 0.004528089 4.737981e-03 0.000000e+00 0.0000000
##            math:science
## general      0.03129229
## vocational   0.00000000

These are the logit coefficients relative to the reference category (academic). For example, under ‘math’, the -0.185 suggests that for a one-unit increase in math score, the log odds of choosing ‘general’ relative to ‘academic’ go down by 0.185. Because the model contains a math-by-science interaction, this is the effect of math when the science score is 0.
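The Wald test for a single coefficient can be reproduced by hand. The sketch below takes the `ses3` coefficient and standard error for the general-versus-academic comparison from the output above and computes the z statistic, its two-tailed p-value, the odds ratio and an approximate 95% confidence interval. Writing the p-value as `2 * pnorm(-abs(z))` is a small variation on the code above that avoids rounding to exactly 0 when |z| is very large.

```r
# Values copied from the summary(multi_mo) output (general vs. academic, ses3)
b  <- -1.1254491
se <-  0.2134308

z <- b / se                       # Wald z statistic
p <- 2 * pnorm(-abs(z))           # two-tailed p-value

odds_ratio <- exp(b)              # odds ratio for SES = 3 versus SES = 1
ci <- exp(b + c(-1, 1) * qnorm(0.975) * se)   # approximate 95% CI for the OR

round(c(z = z, OR = odds_ratio, lower = ci[1], upper = ci[2]), 4)
p
```

Note that the interval is built on the log-odds scale and then exponentiated, so it is not symmetric around the odds ratio.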

### 11.7.4 Check the model fit information

# The anova function conflicts with jmv's anova function, so we need to detach the jmv package before we use it.
# detach("package:jmv", unload = TRUE)
# Compare our test model with the "only intercept" model
# anova(OIM, multi_mo)

Interpretation of the Model Fit information

• The log-likelihood, reported in the output as the residual deviance (-2LL), is a measure of how much unexplained variability there is in the data. Therefore, the difference or change in deviance indicates how much new variability has been explained by the model.

• The chi-square test tests the decrease in residual deviance from the baseline model (408.1933) to the final model (333.9036), which is a difference of 408.1933 - 333.9036 = 74.29. This change is significant, which means that our final model explains a significant amount of the original variability.

• The likelihood ratio chi-square of 74.29 with a p-value < 0.001 tells us that our model as a whole fits significantly better than an empty or null model (i.e., a model with no predictors).
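Since the `anova` call above is left commented out, the same comparison can be sketched directly from the numbers that `multinom` printed. The residual deviances and the parameter counts (2 for the intercept-only model, 12 for the full model) are taken from the output in Section 11.7.3.

```r
# Values reported by multinom in Section 11.7.3
dev_null <- 408.1933   # residual deviance, intercept-only model (OIM)
dev_full <- 333.9036   # residual deviance, model with predictors (multi_mo)
k_null   <- 2          # number of estimated parameters in OIM
k_full   <- 12         # number of estimated parameters in multi_mo

chisq <- dev_null - dev_full      # likelihood ratio chi-square
df    <- k_full - k_null          # degrees of freedom
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)

c(chisq = chisq, df = df)
p_value
```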

### 11.7.5 Calculate the Goodness of fit

# Check the predicted probability for each program
head(multi_mo$fitted.values,30)
##      academic    general vocational
## 1  0.18801940 0.17122451  0.6407561
## 2  0.12019189 0.10715542  0.7726527
## 3  0.52212681 0.08123771  0.3966355
## 4  0.23683979 0.23125435  0.5319059
## 5  0.10132130 0.12329032  0.7753884
## 6  0.38079544 0.10780793  0.5113966
## 7  0.32321815 0.16454057  0.5122413
## 8  0.09033932 0.08381233  0.8258484
## 9  0.02336687 0.09050704  0.8861261
## 10 0.32321815 0.16454057  0.5122413
## 11 0.16304678 0.29839918  0.5385540
## 12 0.22842326 0.27539161  0.4961851
## 13 0.32747927 0.28141483  0.3911059
## 14 0.05717483 0.12540921  0.8174160
## 15 0.15003741 0.52649953  0.3234631
## 16 0.12638004 0.20495962  0.6686603
## 17 0.25269654 0.39841475  0.3488887
## 18 0.05771613 0.08029142  0.8619924
## 19 0.27404420 0.16131436  0.5646414
## 20 0.25197679 0.20299587  0.5450273
## 21 0.15870561 0.17100945  0.6702849
## 22 0.27404420 0.16131436  0.5646414
## 23 0.16304678 0.29839918  0.5385540
## 24 0.16340250 0.31572103  0.5208765
## 25 0.26080538 0.36665885  0.3725358
## 26 0.74715288 0.12007209  0.1327750
## 27 0.33135572 0.26860688  0.4000374
## 28 0.32321815 0.16454057  0.5122413
## 29 0.40025162 0.32553566  0.2742127
## 30 0.19518342 0.30892507  0.4958915
# We can get the predicted result by using the predict function
head(predict(multi_mo),30)
##  [1] vocational vocational academic   vocational vocational vocational
##  [7] vocational vocational vocational vocational vocational vocational
## [13] vocational vocational general    vocational general    vocational
## [19] vocational vocational vocational vocational vocational vocational
## [25] vocational academic   vocational vocational academic   vocational
## Levels: academic general vocational
# Test the goodness of fit
chisq.test(hsb$prog2,predict(multi_mo))
##
## 	Pearson's Chi-squared test
##
## data:  hsb$prog2 and predict(multi_mo)
## X-squared = 47.841, df = 4, p-value = 1.019e-09

### 11.7.6 Calculate the Pseudo R-Square

# Please take out the "#" sign to run the code
# Load the DescTools package to calculate the R square
# library("DescTools")
# Calculate the R Square
# PseudoR2(multi_mo, which = c("CoxSnell","Nagelkerke","McFadden"))

Interpretation of the R-Square:

• These are three pseudo R-squared values. Logistic regression does not have an equivalent to the R-squared that is found in OLS regression; however, many people have tried to come up with one. Because these statistics do not mean exactly what R-squared means in OLS regression (the proportion of variance of the response variable explained by the predictors), we suggest interpreting them with great caution.

• Cox and Snell’s R-Square imitates multiple R-Square based on ‘likelihood’, but its maximum can be (and usually is) less than 1.0, making it difficult to interpret. Here it is 0.31, which is loosely read as the logistic model explaining about 31% of the variation in the dependent variable.

• The Nagelkerke modification, which does range from 0 to 1, is a more reliable measure of the relationship. Nagelkerke’s R2 will normally be higher than the Cox and Snell measure. In our case it is 0.357, indicating a relationship of 35.7% between the predictors and the prediction.

• McFadden = {LL(null) – LL(full)} / LL(null). In our case it is 0.182, indicating a relationship of 18.2% between the predictors and the prediction.
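The three values quoted above can be checked by hand from the two residual deviances reported in Section 11.7.3 and the sample size of 200. The sketch below uses the usual textbook definitions of the three measures, written in terms of deviance (-2LL); it is meant as a cross-check, not as a replacement for the `PseudoR2` call.

```r
# Values reported earlier in the chapter
n        <- 200        # number of students
dev_null <- 408.1933   # -2LL of the intercept-only model
dev_full <- 333.9036   # -2LL of the model with predictors

# Cox and Snell: 1 - (L_null / L_full)^(2/n)
cox_snell <- 1 - exp(-(dev_null - dev_full) / n)

# Nagelkerke: Cox and Snell divided by its maximum possible value
nagelkerke <- cox_snell / (1 - exp(-dev_null / n))

# McFadden: {LL(null) - LL(full)} / LL(null), equivalently 1 - dev_full / dev_null
mcfadden <- 1 - dev_full / dev_null

round(c(CoxSnell = cox_snell, Nagelkerke = nagelkerke, McFadden = mcfadden), 3)
```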
### 11.7.7 Likelihood Ratio Tests

# Use the lmtest package to run Likelihood Ratio Tests
library(lmtest)
lrtest(multi_mo, "ses") # Chi-Square=12.922,p=0.01166*
## # weights:  15 (8 variable)
## initial  value 219.722458
## iter  10 value 176.645309
## iter  20 value 173.670728
## iter  30 value 173.430200
## final  value 173.412997
## converged
## Likelihood ratio test
##
## Model 1: prog2 ~ ses + math + science + math * science
## Model 2: prog2 ~ math + science + math:science
##   #Df  LogLik Df  Chisq Pr(>Chisq)
## 1  12 -166.95
## 2   8 -173.41 -4 12.922    0.01166 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
lrtest(multi_mo, "math") # Chi-Square=10.613,p=0.004959*
## # weights:  18 (10 variable)
## initial  value 219.722458
## iter  10 value 172.387155
## final  value 172.258318
## converged
## Likelihood ratio test
##
## Model 1: prog2 ~ ses + math + science + math * science
## Model 2: prog2 ~ ses + science + math:science
##   #Df  LogLik Df  Chisq Pr(>Chisq)
## 1  12 -166.95
## 2  10 -172.26 -2 10.613   0.004959 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
lrtest(multi_mo, "science") # Chi-Square=5.3874,p=0.06763
## # weights:  18 (10 variable)
## initial  value 219.722458
## iter  10 value 169.972735
## final  value 169.645522
## converged
## Likelihood ratio test
##
## Model 1: prog2 ~ ses + math + science + math * science
## Model 2: prog2 ~ ses + math + math:science
##   #Df  LogLik Df  Chisq Pr(>Chisq)
## 1  12 -166.95
## 2  10 -169.65 -2 5.3874    0.06763 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
lrtest(multi_mo, "math:science") # Chi-Square=5.249,p=0.072
## # weights:  18 (10 variable)
## initial  value 219.722458
## iter  10 value 170.088834
## final  value 169.576379
## converged
## Likelihood ratio test
##
## Model 1: prog2 ~ ses + math + science + math * science
## Model 2: prog2 ~ ses + math + science
##   #Df  LogLik Df  Chisq Pr(>Chisq)
## 1  12 -166.95
## 2  10 -169.58 -2 5.2491    0.07247 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interpretation of the Likelihood Ratio Tests

• The results of the likelihood ratio tests can be used to ascertain the significance of predictors to the model. This table tells us that SES and math score had significant main effects on program selection, $X^2$(4) = 12.922, p = .012 for SES and $X^2$(2) = 10.613, p = .005 for math.

• These likelihood statistics can be seen as overall statistics that tell us which predictors significantly enable us to predict the outcome category, but they don’t tell us specifically what the effect is. To see this we have to look at the individual parameter estimates.

### 11.7.8 Parameter Estimates

# Let's check our model again
summary(multi_mo)
## Call:
## multinom(formula = prog2 ~ ses + math + science + math * science,
##     data = hsb, model = TRUE)
##
## Coefficients:
##            (Intercept)       ses2       ses3       math     science
## general       5.897618 -0.4081497 -1.1254491 -0.1852220  0.01323626
## vocational   22.728283  0.8402168 -0.5605656 -0.5036705 -0.28297703
##            math:science
## general     0.001025283
## vocational  0.006185571
##
## Std. Errors:
##            (Intercept)      ses2      ses3       math    science math:science
## general    0.002304064 0.2613732 0.2134308 0.02694593 0.02953364 0.0004761369
## vocational 0.003856861 0.2959741 0.1984775 0.02681947 0.03142872 0.0004760567
##
## Residual Deviance: 333.9036
## AIC: 357.9036
# Check the Wald Z again
z <- summary(multi_mo)$coefficients/summary(multi_mo)$standard.errors
z
##            (Intercept)      ses2      ses3       math    science math:science
## general       2559.659 -1.561559 -5.273132  -6.873837  0.4481759     2.153336
## vocational    5892.948  2.838819 -2.824328 -18.780028 -9.0037711    12.993348
# 2-tailed z test
p <- (1 - pnorm(abs(z), 0, 1)) * 2
p
##            (Intercept)        ses2         ses3         math   science
## general              0 0.118391943 1.341147e-07 6.249667e-12 0.6540263
## vocational           0 0.004528089 4.737981e-03 0.000000e+00 0.0000000
##            math:science
## general      0.03129229
## vocational   0.00000000

Note that the table is split into two rows. This is because these parameters compare pairs of outcome categories. We specified the second category (2 = academic) as our reference category; therefore, the first row of the table, labelled General, compares this category against the ‘Academic’ category. The second row of the table, labelled Vocational, also compares this category against the ‘Academic’ category. Because we are just comparing two categories, the interpretation is the same as for binary logistic regression:

# extract the coefficients from the model and exponentiate
exp(coef(multi_mo))
##             (Intercept)      ses2      ses3      math   science math:science
## general    3.641690e+02 0.6648794 0.3245067 0.8309198 1.0133243     1.001026
## vocational 7.426219e+09 2.3168692 0.5708861 0.6043085 0.7535371     1.006205

• The relative log odds of being in the general program versus the academic program will decrease by 1.125 when moving from the lowest level of SES (SES = 1, the reference level of the ses factor) to the highest level of SES (SES = 3), b = -1.125, Wald z = -5.27, p < .001.

• Exp(-1.1254491) = 0.3245067 means that the odds of choosing the general program rather than the academic program for students with the highest level of SES (SES = 3) are 0.325 times the odds for students with the lowest level of SES (SES = 1). Therefore, students with the lowest level of SES tend to choose the general program over the academic program more than students with the highest level of SES.

• The relative log odds of being in the vocational program versus the academic program will decrease by 0.56 when moving from the lowest level of SES (SES = 1) to the highest level of SES (SES = 3), b = -0.56, Wald z = -2.82, p < .01.

• Exp(-0.56) = 0.57 means that the odds of choosing the vocational program rather than the academic program for students with the highest level of SES (SES = 3) are 0.57 times the odds for students with the lowest level of SES (SES = 1). Therefore, students with the lowest level of SES tend to choose the vocational program over the academic program more than students with the highest level of SES.

### 11.7.9 Interpretation of the Predictive Equation

Please check your slides for detailed information. You can find all the values in the R output above.

### 11.7.10 Build a classification table

# Load the summarytools package
library(summarytools)
# Build a classification table (observed in rows, predicted in columns) with the table function
ctable <- table(hsb$prog2,predict(multi_mo))
ctable
##
##              academic general vocational
##   academic         88       4         13
##   general          24      12          9
##   vocational       21       4         25
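A classification table is usually summarised by the share of correctly classified cases. The sketch below re-enters the counts printed above as a matrix (rows are observed programs, columns are predicted programs) and computes the overall accuracy and the proportion correct within each observed program. The object names `ctab`, `accuracy` and `per_class` are introduced here only for illustration.

```r
# Counts copied from the classification table above
programs <- c("academic", "general", "vocational")
ctab <- matrix(c(88,  4, 13,
                 24, 12,  9,
                 21,  4, 25),
               nrow = 3, byrow = TRUE,
               dimnames = list(observed = programs, predicted = programs))

# Overall accuracy: correctly classified cases (the diagonal) over all cases
accuracy <- sum(diag(ctab)) / sum(ctab)
accuracy

# Proportion correctly classified within each observed program
per_class <- diag(ctab) / rowSums(ctab)
round(per_class, 3)
```

With the fitted model object available, the same quantities could be obtained from `ctable` directly in place of the re-entered matrix.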

## 11.8 Supplementary Learning Materials

• Field, A. (2013). Discovering statistics using IBM SPSS statistics (4th ed.). Los Angeles, CA: Sage Publications.

• Agresti, A. (1996). An introduction to categorical data analysis. New York, NY: Wiley & Sons.

• IBM SPSS Regression 22.

• Data files