Chapter 11 Multinomial choice model
11.1 Ordered choice
Examples:
- self-assessed health satisfaction on a 0-10 scale
- course evaluations on a 1-5 scale
There are J+1 options, numbered 0 to J. An individual's choice depends on the latent variable Y_{i}^{*} such that Y_{i}^{*}=X_{i}'\beta+\epsilon_{i}, where \epsilon has a pdf f(\cdot) and a CDF F(\cdot).
The larger Y_{i}^{*} is, the higher the option number that person i chooses. There must be J thresholds \mu_{0}<\cdots<\mu_{J-1} such that the option of person i is

Y_{i}=\begin{cases}
0 & \text{if }Y_{i}^{*}<\mu_{0},\\
k & \text{if }\mu_{k-1}\le Y_{i}^{*}<\mu_{k},\\
J & \text{if }\mu_{J-1}\le Y_{i}^{*}.
\end{cases}

It follows that

\Pr(Y_{i}=y_{i})=\begin{cases}
\Pr(X_{i}'\beta+\epsilon_{i}<\mu_{0})=F(\mu_{0}-X_{i}'\beta), & y_{i}=0,\\
\Pr(\mu_{k-1}\le X_{i}'\beta+\epsilon_{i}<\mu_{k})=F(\mu_{k}-X_{i}'\beta)-F(\mu_{k-1}-X_{i}'\beta), & 1\le y_{i}=k\le J-1,\\
\Pr(X_{i}'\beta+\epsilon_{i}\ge\mu_{J-1})=1-F(\mu_{J-1}-X_{i}'\beta), & y_{i}=J.
\end{cases}
\label{eq:orderPr}
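A minimal numerical sketch of these probabilities, assuming a logistic F (the parameter values, and the choice of J=3, are made up here purely for illustration):

```python
import numpy as np
from scipy.stats import logistic

def ordered_probs(x, beta, mu, F=logistic.cdf):
    """Pr(Y=k), k=0,...,J, implied by the latent-variable model Y* = x'beta + eps."""
    xb = x @ beta
    # Pad the J thresholds with -inf and +inf so every option is F(upper - xb) - F(lower - xb).
    cuts = np.concatenate(([-np.inf], mu, [np.inf]))
    return np.array([F(cuts[k + 1] - xb) - F(cuts[k] - xb) for k in range(len(cuts) - 1)])

# Hypothetical values: two regressors, J+1 = 4 options, hence J = 3 thresholds.
beta = np.array([0.8, -0.5])
mu = np.array([-1.0, 0.5, 2.0])
x_i = np.array([1.0, 2.0])
p = ordered_probs(x_i, beta, mu)
print(p, p.sum())   # the J+1 probabilities sum to 1
```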
Goodness-of-Fit
Pseudo-R^{2}=1-\ln L/\ln L_{0}, where \ln L_{0} is the log-likelihood of the model with thresholds (intercepts) only.
Prediction: the predicted choice is \hat{y}_{i}=\arg\max_{k}\widehat{\Pr}(Y_{i}=k). Compute the percentage of correct predictions.
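A short sketch of this prediction rule, assuming `probs` is an n-by-(J+1) array of fitted probabilities and `y` the observed choices (both are simulated below only to make the snippet runnable):

```python
import numpy as np

rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=500)   # placeholder fitted Pr(Y_i = k), k = 0,...,3
y = rng.integers(0, 4, size=500)              # placeholder observed choices

y_hat = probs.argmax(axis=1)                  # predicted choice = option with the highest probability
print((y_hat == y).mean())                    # percentage of correct predictions
```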
Suppose a sample has 500 observations with y\in\{1,2,3\}; 30 of them have y=1, 300 have y=2, and 170 have y=3. What is \ln L_{0} under the Probit and under the Logit model?
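A sketch of the computation, reading \ln L_{0} as the log-likelihood of the thresholds-only model: that model fits the three sample shares exactly under either link, so Probit and Logit give the same value,

\ln L_{0}=30\ln\frac{30}{500}+300\ln\frac{300}{500}+170\ln\frac{170}{500}\approx-421.05.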
The likelihood function and MLE
Since Y_{i} is discrete, L(Y_{i})=\Pr(Y_{i}=y_{i}), where \Pr(Y_{i}=y_{i}) is defined by ([eq:orderPr]). The sample log-likelihood function is
\ln L=\sum\ln L(Y_{i})=\sum\ln\Pr(Y_{i}=y_{i}).
The MLE solves for \{\beta,\mu_{0},\cdots,\mu_{J-1}\}.
Write down the explicit definition of the likelihood function.
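A minimal sketch of this maximization for an ordered logit on simulated data (a hedged illustration: the simulated design, variable names, and the reparameterization that keeps the thresholds increasing are all choices made here, not part of the notes):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import logistic

rng = np.random.default_rng(1)

# Simulate from the latent-variable model with J+1 = 4 options and standard logistic errors.
n, beta_true, mu_true = 1000, np.array([1.0, -0.7]), np.array([-1.0, 0.0, 1.5])
X = rng.normal(size=(n, 2))
y_star = X @ beta_true + logistic.rvs(size=n, random_state=rng)
y = np.searchsorted(mu_true, y_star)            # observed option 0,...,J

def neg_loglik(theta):
    beta, raw = theta[:2], theta[2:]
    # Reparameterize so that mu_0 < mu_1 < mu_2 always holds during the search.
    mu = np.cumsum(np.concatenate(([raw[0]], np.exp(raw[1:]))))
    cuts = np.concatenate(([-np.inf], mu, [np.inf]))
    xb = X @ beta
    p = logistic.cdf(cuts[y + 1] - xb) - logistic.cdf(cuts[y] - xb)
    return -np.sum(np.log(p))

res = minimize(neg_loglik, x0=np.zeros(5), method="BFGS")
beta_hat = res.x[:2]
mu_hat = np.cumsum(np.concatenate(([res.x[2]], np.exp(res.x[3:]))))
print(beta_hat, mu_hat)   # should be close to beta_true and mu_true
```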
Marginal Effect
If X_{i} is continuous:
• Before (two options):
\Pr(Y_{i}=0)+\Pr(Y_{i}=1)=1.
\frac{\partial\Pr(Y_{i}=1)}{\partial X_{i}}+\frac{\partial\Pr(Y_{i}=0)}{\partial X_{i}}=0.
• Now (multiple ordered options):
\Pr(Y_{i}=0)+\dots+\Pr(Y_{i}=J)=1.
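For completeness, one can write out the marginal effects implied by the probabilities above (a standard derivation, with f=F'):

\frac{\partial\Pr(Y_{i}=0)}{\partial X_{i}}=-f(\mu_{0}-X_{i}'\beta)\,\beta,\qquad
\frac{\partial\Pr(Y_{i}=k)}{\partial X_{i}}=\left[f(\mu_{k-1}-X_{i}'\beta)-f(\mu_{k}-X_{i}'\beta)\right]\beta,\qquad
\frac{\partial\Pr(Y_{i}=J)}{\partial X_{i}}=f(\mu_{J-1}-X_{i}'\beta)\,\beta,

and, as in the two-option case, these J+1 marginal effects sum to zero (the terms telescope).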
11.2 Unordered choice
Suppose there are three alternatives, A, B, and C, at the same level, and the individual chooses one of the three.
11.2.1 Random Utility
The utility that individual i derives from choosing alternative j is
U_{ij}=V_{ij}+\epsilon_{ij}, where V_{ij} is the explained (systematic) part and \epsilon_{ij} is the error term.
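The choice rule implied by random utility, which the derivation in the next subsection uses, is that individual i picks the alternative with the highest utility:

\Pr(Y_{i}=j)=\Pr(U_{ij}>U_{ik}\text{ for all }k\neq j).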
11.2.2 Multinomial Logit Model
Assume
\epsilon\sim\text{Gumbel distribution}:
\begin{align} f(\epsilon) & = e^{-\epsilon}e^{-e^{-\epsilon}},\\ F(\epsilon) & = e^{-e^{-\epsilon}} \end{align}
The errors are independent across alternatives, i.e., \epsilon_{ij}\perp\epsilon_{ij'} for j\neq j'.
Taking the three alternatives \{A,B,C\} as an example, one can show that \begin{eqnarray*} \Pr(\text{choice}=A) & = & \Pr(U_A>U_B,U_A>U_C)\\ & = & \frac{1}{1+\exp(V_{B}-V_A)+\exp(V_{C}-V_A)}. \end{eqnarray*}
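A small numerical sketch of this formula (the V values are made up): multiplying the numerator and denominator by \exp(V_A) turns the expression into a softmax over (V_A,V_B,V_C), so the probability can be computed either way:

```python
import numpy as np

def mnl_probs(V):
    """Multinomial-logit probabilities: Pr(j) = exp(V_j) / sum_k exp(V_k)."""
    e = np.exp(V - V.max())          # subtract the max for numerical stability
    return e / e.sum()

V = np.array([0.2, 1.0, -0.5])       # hypothetical V_A, V_B, V_C
print(mnl_probs(V)[0])
print(1 / (1 + np.exp(V[1] - V[0]) + np.exp(V[2] - V[0])))   # same number as above
```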
11.2.3 Identification
For U=V+\epsilon, V consists of all the regressors {\bf X}, so that V={\bf X}\beta. However, not all of \beta can be estimated (or, more precisely, identified), since we can only infer differences in V across options. Consider the following setup for V:
V_{ij}=\alpha_j+\beta x_{ij}+\gamma_j z_i+\delta_j w_{ij}+\tau q_{i}.
To what extent can we estimate these parameters? (A numerical check follows the list below.)
- only the differences \alpha_j-\alpha_{k} can be estimated, not their separate levels.
- \beta can be estimated.
- only the differences \gamma_j-\gamma_k can be estimated, not their separate levels.
- all the \delta_j's can be estimated.
- \tau cannot be estimated.
These are exactly the terms that survive when we difference against alternative 1, V_{ij}-V_{i1}:
- x_{ij} with a constant coefficient: \beta(x_{ij}-x_{i1})
- z_i with an alternative-varying coefficient: (\alpha_j-\alpha_1)+(\gamma_j-\gamma_1)z_i
- w_{ij} with an alternative-varying coefficient: \delta_j w_{ij}-\delta_1 w_{i1}.
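A quick numerical check of the level-shift point (made-up numbers): adding the same constant to every alternative's V, e.g. a common intercept or the \tau q_{i} term, leaves the logit probabilities unchanged, which is why such levels are not identified:

```python
import numpy as np

def mnl_probs(V):
    e = np.exp(V - V.max())
    return e / e.sum()

V = np.array([0.2, 1.0, -0.5])   # hypothetical V_iA, V_iB, V_iC
c = 3.7                          # any common shift, e.g. tau * q_i or a shift of all alpha_j
print(mnl_probs(V))
print(mnl_probs(V + c))          # identical probabilities: only differences in V matter
```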
11.2.4 Multinomial Probit
Assume \left[\begin{array}{c} \epsilon_{i0}\\ \epsilon_{i1}\\ \vdots\\ \epsilon_{iJ} \end{array}\right]\sim N(0,\Sigma)
Compared with the Multinomial Logit, the Multinomial Probit additionally requires estimating the variance-covariance matrix \Sigma of the errors across alternatives.
Take the three alternatives A, B, C as an example. Any observed choice only reveals pairwise utility comparisons, so it suffices to look at U_B-U_A and U_C-U_A.
Explain why, whatever the observed choice is, the two random variables U_B-U_A and U_C-U_A carry all the corresponding information.
\begin{align} U_B-U_A &= V_B-V_A+(\epsilon_B-\epsilon_A)\\ U_C-U_A &= V_C-V_A+(\epsilon_C-\epsilon_A) \end{align}
Because of the normality assumption on \epsilon,
\left[\begin{array}{c} \epsilon_{B}-\epsilon_{A}\\ \epsilon_{C}-\epsilon_{A} \end{array}\right]\sim N(0,\tilde{\Sigma})
is also normally distributed.
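To see why, note that the differenced errors are a linear transformation of the original normal vector (here \Sigma denotes the 3\times3 covariance of (\epsilon_A,\epsilon_B,\epsilon_C)):

\left[\begin{array}{c} \epsilon_{B}-\epsilon_{A}\\ \epsilon_{C}-\epsilon_{A} \end{array}\right]=D\left[\begin{array}{c} \epsilon_{A}\\ \epsilon_{B}\\ \epsilon_{C} \end{array}\right],\qquad D=\left[\begin{array}{ccc} -1 & 1 & 0\\ -1 & 0 & 1 \end{array}\right],\qquad \tilde{\Sigma}=D\Sigma D'.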
Multiplying all utilities by a common factor \alpha>0 does not change the choice; that is, if U_{j} and U'_{j}=\alpha U_{j} differ only by such a factor, they imply the same choice.
Let \Theta denote the parameters in V_A-V_B and V_A-V_C, so the likelihood of the model can be written as L(\Theta,\tilde{\Sigma}). Explain why (\Theta,\tilde{\Sigma})=(\Theta_0,\tilde{\Sigma}_0) and (\Theta,\tilde{\Sigma})=(\alpha\Theta_0,\alpha^{2}\tilde{\Sigma}_0) yield the same likelihood value.
When estimating a Multinomial Probit, besides designating one alternative as the base (as in the Multinomial Logit), one must also choose a value of \alpha to satisfy the identification condition. A common choice is \alpha=1/\sigma(\epsilon_B-\epsilon_A), so that the first diagonal element of \tilde{\Sigma} is a variance of one: \tilde{\Sigma}_{11}=1.
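A hedged numerical check of the scale-invariance exercise above, treating \Pr(\text{choice}=A) as the bivariate-normal probability that both utility differences are negative (the parameter values are made up for illustration):

```python
import numpy as np
from scipy.stats import multivariate_normal

theta0 = np.array([0.4, -0.3])      # hypothetical (V_B - V_A, V_C - V_A)
sigma0 = np.array([[1.0, 0.5],
                   [0.5, 2.0]])     # hypothetical cov of (eps_B - eps_A, eps_C - eps_A)

def prob_choose_A(theta, sigma):
    # Pr(U_B - U_A < 0 and U_C - U_A < 0)
    return multivariate_normal(mean=theta, cov=sigma).cdf(np.zeros(2))

alpha = 2.5
print(prob_choose_A(theta0, sigma0))
print(prob_choose_A(alpha * theta0, alpha**2 * sigma0))   # identical likelihood contribution
# Choosing alpha = 1 / sqrt(sigma0[0, 0]) is the usual normalization that sets
# the (1,1) element of the rescaled covariance equal to 1.
```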