Chapter 11 Multinomial choice model

11.1 Ordered choice

  • Self-rated health satisfaction on a 0-10 scale

  • Course evaluations on a 1-5 scale

There are J+1 options, numbered 0 to J. An individual's choice depends on the latent variable Y_i^* such that Y_i^*=X_i\beta+\epsilon_i, where \epsilon has a pdf f(\cdot) and a CDF F(\cdot).

The larger Y_i^* is, the higher the option number chosen. There are J thresholds \mu_0<\cdots<\mu_{J-1} such that the option of person i is

Y_i=\begin{cases} 0 & \text{if } Y_i^*<\mu_0,\\ k & \text{if } \mu_{k-1}\le Y_i^*<\mu_k,\\ J & \text{if } \mu_{J-1}\le Y_i^*. \end{cases}

It follows that

\Pr(Y_i=y_i)=\begin{cases} \Pr(X_i\beta+\epsilon_i<\mu_0)=F(\mu_0-X_i\beta), & y_i=0,\\ \Pr(\mu_{k-1}\le X_i\beta+\epsilon_i<\mu_k)=F(\mu_k-X_i\beta)-F(\mu_{k-1}-X_i\beta), & y_i=k,\ 1\le k\le J-1,\\ \Pr(X_i\beta+\epsilon_i\ge\mu_{J-1})=1-F(\mu_{J-1}-X_i\beta), & y_i=J. \end{cases}
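As a minimal sketch, assuming a logistic F (i.e., ordered logit) and hypothetical thresholds, the choice probabilities above can be computed as follows:

```python
import math

def logistic_cdf(z):
    """Logistic CDF F(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_probs(xb, mu):
    """Choice probabilities for an ordered model with index xb = X_i * beta
    and increasing thresholds mu = [mu_0, ..., mu_{J-1}].
    Returns [Pr(Y=0), ..., Pr(Y=J)]."""
    J = len(mu)
    probs = [logistic_cdf(mu[0] - xb)]            # Pr(Y = 0) = F(mu_0 - xb)
    for k in range(1, J):                          # Pr(Y = k), 1 <= k <= J-1
        probs.append(logistic_cdf(mu[k] - xb) - logistic_cdf(mu[k - 1] - xb))
    probs.append(1.0 - logistic_cdf(mu[-1] - xb))  # Pr(Y = J)
    return probs
```

Because the terms telescope, the J+1 probabilities always sum to one, whatever xb and mu are.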

Goodness-of-Fit

  1. Pseudo-R^2.

  2. Prediction: predicted choice \hat{y}_i=\arg\max_k\Pr(Y_i=k); then compute the percentage of correct predictions.

Suppose a sample has 500 observations and y\in\{1,2,3\}, with 30 observations having y=1, 300 having y=2, and 170 having y=3. What is \ln L_0 under the Probit and Logit models?
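The intercept-only \ln L_0 is the same under Probit and Logit: with constants (thresholds) only, the fitted probabilities are the sample shares, so \ln L_0=\sum_j n_j\ln(n_j/n). A quick computation for the figures above:

```python
import math

# Intercept-only log-likelihood: the fitted probabilities equal the sample
# shares, so ln L0 = sum_j n_j * ln(n_j / n); the link function (probit or
# logit) plays no role once only constants are fitted.
counts = {1: 30, 2: 300, 3: 170}   # observations per category, n = 500
n = sum(counts.values())
ln_L0 = sum(n_j * math.log(n_j / n) for n_j in counts.values())
print(round(ln_L0, 2))  # → -421.05
```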


The likelihood function and MLE

Since Y_{i} is discrete, L(Y_{i})=\Pr(Y_{i}=y_{i}), where \Pr(Y_{i}=y_{i}) is defined by ([eq:orderPr]). The sample log-likelihood function is

\ln L=\sum\ln L(Y_{i})=\sum\ln\Pr(Y_{i}=y_{i}).

The MLE solves for \{\beta,\mu_{0},\cdots,\mu_{J-1}\}.
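A minimal sketch of evaluating \ln L, assuming an ordered logit with a single scalar regressor (the function name and data below are illustrative, not from the text):

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta, mu, X, y):
    """Sample log-likelihood ln L = sum_i ln Pr(Y_i = y_i) for an ordered
    logit with scalar regressors X[i], choices y[i] in {0, ..., J}, and
    thresholds mu = [mu_0, ..., mu_{J-1}]."""
    # Pad with mu_{-1} = -inf and mu_J = +inf so every choice is one case.
    cuts = [-math.inf] + list(mu) + [math.inf]
    ll = 0.0
    for xi, yi in zip(X, y):
        xb = beta * xi
        hi = 1.0 if cuts[yi + 1] == math.inf else logistic_cdf(cuts[yi + 1] - xb)
        lo = 0.0 if cuts[yi] == -math.inf else logistic_cdf(cuts[yi] - xb)
        ll += math.log(hi - lo)  # ln Pr(Y_i = y_i)
    return ll
```

In practice one would maximize this over \{\beta,\mu_0,\cdots,\mu_{J-1}\} with a numerical optimizer, subject to the ordering of the thresholds.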

Write down the explicit definition of the likelihood function.

Marginal Effect

If X_{i} is continuous:

• Before (two options):

\Pr(Y_{i}=0)+\Pr(Y_{i}=1)=1.

\frac{\partial\Pr(Y_{i}=1)}{\partial X_{i}}+\frac{\partial\Pr(Y_{i}=0)}{\partial X_{i}}=0.

• Now (multiple ordered options):

\Pr(Y_{i}=0)+\dots+\Pr(Y_{i}=J)=1, so the marginal effects again sum to zero: \sum_{k=0}^{J}\frac{\partial\Pr(Y_{i}=k)}{\partial X_{i}}=0.
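A numerical check of this adding-up property, assuming an ordered logit with hypothetical parameter values: the marginal effects across all options sum to (numerically) zero.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def probs(x, beta, mu):
    """[Pr(Y=0), ..., Pr(Y=J)] for an ordered logit with index x * beta
    and thresholds mu."""
    p = [logistic_cdf(mu[0] - beta * x)]
    for k in range(1, len(mu)):
        p.append(logistic_cdf(mu[k] - beta * x) - logistic_cdf(mu[k - 1] - beta * x))
    p.append(1.0 - logistic_cdf(mu[-1] - beta * x))
    return p

def marginal_effects(x, beta, mu, h=1e-6):
    """Central finite differences dPr(Y=k)/dx for each option k."""
    up, dn = probs(x + h, beta, mu), probs(x - h, beta, mu)
    return [(u - d) / (2 * h) for u, d in zip(up, dn)]
```

Since the probabilities sum to one at every x, their derivatives must sum to zero; the finite-difference sum differs from zero only by rounding error.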

11.2 Unordered choice

Suppose there are three alternatives A, B, C, and one of the three is chosen at the same level.

11.2.1 Random Utility

The utility of individual i from choosing alternative j:

U_{ij}=V_{ij}+\epsilon_{ij}, where V_{ij} is the explainable part and \epsilon_{ij} is the error term.

11.2.2 Multinomial Logit Model

Assumptions

  1. \epsilon\sim\text{Gumbel distribution}

    \begin{align} f(\epsilon) & = e^{-\epsilon}e^{-e^{-\epsilon}},\\ F(\epsilon) & = e^{-e^{-\epsilon}} \end{align}

  2. The error terms are independent across alternatives, i.e., \epsilon_{ij}\perp\epsilon_{ij'} for j\neq j'.

Taking the three alternatives \{A,B,C\} as an example, we can show \begin{eqnarray*} \Pr(\text{choice}=A) & = & \Pr(U_A>U_B,U_A>U_C)\\ & = & \frac{1}{1+\exp(V_{B}-V_A)+\exp(V_{C}-V_A)}. \end{eqnarray*}
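A minimal sketch of this formula (the max-subtraction is a standard numerical-stability trick, not part of the derivation):

```python
import math

def mnl_probs(V):
    """Multinomial logit choice probabilities given deterministic
    utilities V = {alternative: V_j}. Under Gumbel errors this is
    Pr(j) = exp(V_j) / sum_k exp(V_k), which only depends on
    differences V_j - V_k."""
    m = max(V.values())                          # subtract max for stability
    expV = {j: math.exp(v - m) for j, v in V.items()}
    s = sum(expV.values())
    return {j: e / s for j, e in expV.items()}
```

For example, with V_A=1, V_B=0.5, V_C=0, the probability of A equals 1/(1+\exp(V_B-V_A)+\exp(V_C-V_A)).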

11.2.3 Identification

For U=V+\epsilon, V consists of all the regressors {\bf X}, so that V={\bf X}\beta. However, not all elements of \beta can be estimated (more precisely, identified), since we can only infer differences in V between alternatives. Consider the following setup for V:

V_{ij}=\alpha_j+\beta x_{ij}+\gamma_j z_i+\delta_j w_{ij}+\tau q_{i}.

To what extent can we estimate these parameters?

  1. Only the differences \alpha_j-\alpha_{k} can be estimated, not their separate levels.
  2. \beta can be estimated.
  3. Only the differences \gamma_j-\gamma_k can be estimated, not their separate levels.
  4. All the \delta_j can be estimated.
  5. \tau cannot be estimated.
Taking V_{ij}-V_{i1} as an example, in the end we only have V_{ij}-V_{i1}=(\alpha_j-\alpha_1)+\beta (x_{ij}-x_{i1})+(\gamma_j-\gamma_1) z_i+\delta_j w_{ij}-\delta_1 w_{i1}. This can be split into three blocks:
  1. x_{ij} with a constant coefficient: \beta (x_{ij}-x_{i1})
  2. z_i with an alternative-varying coefficient: (\alpha_j-\alpha_1)+(\gamma_j-\gamma_1) z_i
  3. w_{ij} with an alternative-varying coefficient: \delta_j w_{ij}-\delta_1 w_{i1}.
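A quick numerical illustration of why \tau cannot be estimated: a term common to all alternatives, such as \tau q_i, cancels from every difference V_{ij}-V_{i1}, so adding the same constant to every V_j leaves the logit choice probabilities unchanged. The helper below is illustrative:

```python
import math

def mnl_probs(V):
    """Multinomial logit probabilities for a list of deterministic
    utilities V = [V_0, ..., V_J]."""
    m = max(V)                        # subtract max for numerical stability
    expV = [math.exp(v - m) for v in V]
    s = sum(expV)
    return [e / s for e in expV]

V = [1.2, 0.4, -0.3]
V_shifted = [v + 5.0 for v in V]      # e.g. adding tau * q_i to every option
# mnl_probs(V) and mnl_probs(V_shifted) are identical, so tau is not identified.
```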

11.2.4 Multinomial Probit

Assume \left[\begin{array}{c} \epsilon_{i0}\\ \epsilon_{i1}\\ \vdots\\ \epsilon_{iJ} \end{array}\right]\sim N(0,\Sigma)

Compared with the Multinomial Logit, the Multinomial Probit additionally requires estimating the variance-covariance matrix across alternatives.

Taking the three alternatives A, B, C as an example, any observed choice only reveals the outcomes of pairwise utility comparisons, so it suffices to consider U_B-U_A and U_C-U_A.

Explain why, regardless of the choice outcome, the two random variables U_B-U_A and U_C-U_A are sufficient to represent all the corresponding information.

\begin{align} U_B-U_A &= V_B-V_A+(\epsilon_B-\epsilon_A)\\ U_C-U_A &= V_C-V_A+(\epsilon_C-\epsilon_A) \end{align}

Because of the normality assumption on \epsilon,

\left[\begin{array}{c} \epsilon_{B}-\epsilon_A\\ \epsilon_{C}-\epsilon_A\\ \end{array}\right]\sim N(0,\tilde{\Sigma})

is also normally distributed.

Multiplying all utility functions by the same \alpha>0 does not change the choice outcome; that is, if U_{j} and U'_{j} are related by such a scaling, they imply the same choices.

Let \Theta denote the parameters in V_B-V_A and V_C-V_A, so the model's likelihood function can be written as L(\Theta,\tilde{\Sigma}). Explain why (\Theta,\tilde{\Sigma})=(\Theta_0,\tilde{\Sigma}_0) and (\Theta,\tilde{\Sigma})=(\alpha\Theta_0,\alpha^2\tilde{\Sigma}_0) yield the same likelihood value.

The argument above shows that, without further restrictions on the parameter space, maximum likelihood estimation has infinitely many solutions, i.e., the model is under-identified.
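A simulation sketch of this under-identification, using hypothetical values of (\Theta,\tilde{\Sigma}): scaling \Theta by \alpha and the Cholesky factor of \tilde{\Sigma} by \alpha (hence \tilde{\Sigma} by \alpha^2) leaves every simulated choice, and therefore the simulated probability, unchanged.

```python
import random

def mc_choice_prob(theta, chol, n=20000, seed=0):
    """Monte-Carlo estimate of Pr(A is chosen) in a 3-alternative probit.
    A is chosen iff U_B - U_A < 0 and U_C - U_A < 0, i.e.
    theta[0] + e1 < 0 and theta[1] + e2 < 0, where
    theta = (V_B - V_A, V_C - V_A) and (e1, e2) ~ N(0, Sigma~),
    drawn via the lower-triangular Cholesky factor `chol` of Sigma~."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        e1 = chol[0][0] * z1
        e2 = chol[1][0] * z1 + chol[1][1] * z2
        hits += (theta[0] + e1 < 0) and (theta[1] + e2 < 0)
    return hits / n
```

With the same seed, (\Theta_0,\tilde{\Sigma}_0) and (\alpha\Theta_0,\alpha^2\tilde{\Sigma}_0) flip the same inequalities draw by draw, since \text{sign}(\alpha x)=\text{sign}(x) for \alpha>0, so the two parameterizations are observationally equivalent.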

When estimating the Multinomial Probit, in addition to designating one alternative as the base (as in the Multinomial Logit), one must choose a value of \alpha to satisfy the identification condition. A common choice is \alpha=1/\sigma(\epsilon_B-\epsilon_A), so that the first diagonal variance of \tilde{\Sigma} is normalized to one: \tilde{\Sigma}_{11}=1.