第 12 章 R for Multinomial Choice

12.1 多元可排序選擇模型(ordered)

資料:German_Health_Care_Utilization.csv

戴入使用套件MASS與資籵

library(MASS)
data.set2<-read.csv("https://raw.githubusercontent.com/tpemartin/github-data/master/German_Health_Care_Utilization.csv")

將應變數改成ordered格式

data.set2$hsat<-as.ordered(data.set2$hsat)

Ordered Logit估計

o.logit<-polr(hsat~age+hhninc+hhkids+educ+married+working,
     data.set2,method="logistic")
library(pscl)
pR2(o.logit)
hitmiss(o.logit)
summary(o.logit)

Ordered Probit估計

o.probit<-polr(hsat~age+married,
               data.set2,method='probit')
pR2(o.probit)
hitmiss(o.probit)
summary(o.probit)

12.2 多元不可排序

12.2.1 兩種常見資料格式

library(mlogit)
library(AER)

Wide shape

每一行是一筆個人資料,其中一個變數載明選擇結果。

data("Fishing", package = "mlogit")
head(Fishing)
Fish <- mlogit.data(Fishing, shape="wide", varying=2:9, choice="mode")
Fish

Long shape

每一行是一個人在一個選項下的選擇結果及相關特徵。

data("TravelMode", package = "AER")
head(TravelMode)
# long shape, the choice is recoceded in variable "choice", alternative variables is "mode"
TM <- mlogit.data(TravelMode, shape = "long", choice = "choice", alt.var = "mode")
TM

12.2.2 Formula

回顧三大區塊解釋變數:
  1. A: xij with constant coefficient: β(xijxi1)
  2. B: zi with alternative varying coefficient: (αjα1)+(γjγ1)zi
  3. C: wij with alternative varying coefficient: δjwijδjwi1.

Formula右邊: ~ A | B | C

以Travel Mode資料為例(TM)

# seven different forumua
f <- mFormula(choice ~ vcost | income + size | travel)
f2 <- mFormula(choice ~ vcost + travel | income + size)
f3 <- mFormula(choice ~ 0 | income)
f4 <- mFormula(choice ~ vcost + travel)
f5 <- mFormula(choice ~ vcost | 0 | travel)
f6 <- mFormula(choice ~ vcost | income + 0 | travel)
f7 <- mFormula(choice ~ 0 | income -1 | travel)

12.2.3 Multinomial logit

Uij=Vij+ϵij 以f6為例:

f6 <- mFormula(choice ~ vcost | income + 0 | travel)

VijVi,air=β(vcostijvcosti,air)+(γjγair)incomei+δjtravelijδairtraveli,air

以f7為例:

f7 <- mFormula(choice ~ 0 | income -1 | travel)

VijVi,air=(γjγair)incomei+δjtravelijδairtraveli,air

ml.TM <- mlogit(f, TM)

coeftest(ml.TM)
## 
## t test of coefficients:
## 
##                   Estimate Std. Error t value Pr(>|t|)
## train:(intercept)  1.12690    0.93492    1.21   0.2295
## bus:(intercept)    0.25583    1.02157    0.25   0.8025
## car:(intercept)   -1.82840    0.89024   -2.05   0.0413
## vcost             -0.00440    0.00703   -0.63   0.5323
## train:income      -0.06397    0.01309   -4.89  2.1e-06
## bus:income        -0.04269    0.01381   -3.09   0.0023
## car:income        -0.01882    0.01165   -1.62   0.1079
## train:size         0.45588    0.25204    1.81   0.0720
## bus:size          -0.07620    0.35353   -0.22   0.8296
## car:size           0.77716    0.22826    3.40   0.0008
## air:travel        -0.03647    0.00681   -5.36  2.3e-07
## train:travel      -0.00742    0.00119   -6.23  2.8e-09
## bus:travel        -0.00669    0.00136   -4.93  1.8e-06
## car:travel        -0.00661    0.00115   -5.76  3.2e-08
##                      
## train:(intercept)    
## bus:(intercept)      
## car:(intercept)   *  
## vcost                
## train:income      ***
## bus:income        ** 
## car:income           
## train:size        .  
## bus:size             
## car:size          ***
## air:travel        ***
## train:travel      ***
## bus:travel        ***
## car:travel        ***
## ---
## Signif. codes:  
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

由於多一單位air通勤時間效用變化-0.0365, 而多花一塊錢會使效用變化-0.0044,故一單位air通勤時間等值於8.2915元。

有了點估計,我們還會想算它的standard error。

取出共變異矩陣

取出vcost與air:travel係數的variance-covariance matrix(V):

vcov(ml.TM)->vcov.TM #利用vcov()取出所有係數的共變異矩陣
vcov.TM[c("vcost","air:travel"),c("vcost","air:travel")]->V
#再取出其中屬於我們要的係數的共變異矩陣

Delta Method

ˆθaN(θ0,V),g(ˆθ)aN(g(θ0),g(θ0)TVg(θ0)) a代表大樣本下的逼近分配。

這裡

θ=[βvcostδair:travel]g(θ)=δair:travelβvcost

g(θ)=[g/βvcostg/δair:travel]=[δair:travelβ2vcost1βvcost]

下面的dg即為g(ˆθ)

delta_air<-ml.TM$coefficients["air:travel"]
beta<-ml.TM$coefficients["vcost"]
g<-delta_air/beta
dg<-matrix(c(-delta_air/(beta^2),1/beta),2,1)

使用Delta method計算Standard Error. t()為矩陣transpose的意思;而%*%是矩陣相乘的運算。

Vnew<-t(dg) %*% V %*% dg
SE_g<-sqrt(Vnew)
SE_g

Delta method介紹

Delta method套件msm::deltamethod()使用說明

12.2.4 Multinomial Probit

ml.TM_probit <- mlogit(f7, TM,
                       probit=TRUE)
summary(ml.TM_probit)

只分析部分選項(除air, 只留下train,bus,car)。

ml.TM_probit2 <- mlogit(f7, TM,
                       probit=TRUE,
                       alt.subset = c("train","bus","car"))
summary(ml.TM_probit2)

ml.TM_probit2$omega$train可查看共變異矩陣估計值: var([ϵbusϵtrainϵcarϵtrain])

ml.TM_probit2$omega$train