2.2 Functional form selection
So far we have learned that a model can be expressed as a single equation or as a system of equations, and that it can include only two variables (bivariate model) or more than two variables (multivariate model). The variables do not have to be numerical, and they can play different roles.
An econometric model is misspecified if we have chosen an incorrect functional form \(f(x)\) or the wrong variables. In this case, the estimated parameters may not be statistically significant!
The functional form \(f(x)\) can be linear or nonlinear. However, the model is required to be linear in the parameters, whereas the variables may enter either linearly or nonlinearly.
We should always transform nonlinear models into linear ones if possible!
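As an illustration of such a transformation, consider a multiplicative model (a hypothetical example, not one taken from the text) that is nonlinear in the parameter \(\beta_1\). Assuming \(y_i>0\), \(x_i>0\) and \(\beta_0>0\), taking logs of both sides gives a model that is linear in the parameters \(\log\beta_0\) and \(\beta_1\): \[\begin{aligned} y_i=&\beta_0x_i^{\beta_1}e^{u_i} \\ \log y_i=&\log\beta_0+\beta_1\log x_i+u_i \end{aligned}\] The intercept of the transformed model is \(\log\beta_0\), so an estimate of \(\beta_0\) would be obtained by exponentiating the estimated intercept.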
The following bivariate models are linear in the parameters but nonlinear in the variables: \[\begin{equation} \begin{aligned} y_i=&\beta_0+\beta_1x_i^{2}+u_i \\ y_i=&\beta_0+\beta_1\log x_i+u_i \\ y_i=&\beta_0+\beta_1\cos x_i+u_i \end{aligned} \tag{2.2} \end{equation}\]
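The models in (2.2) can be estimated by ordinary least squares once the regressor is transformed, because the parameters still enter linearly. The following is a minimal sketch on simulated data, assuming NumPy is available. The helper `ols`, the sample size, the seed and the "true" parameter values are all hypothetical choices made for illustration, not taken from the text.

```python
import numpy as np

def ols(X, y):
    """Return least-squares coefficients for y = X @ beta + u."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)        # fixed seed, illustrative data only
n = 500
x = rng.uniform(1.0, 10.0, size=n)    # positive values so that log(x) is defined
u = rng.normal(0.0, 0.5, size=n)      # error term

beta0, beta1 = 2.0, 1.5               # hypothetical "true" parameters

# Each label maps to the transformation g(x) used as the regressor.
transforms = {
    "x^2": np.square,
    "log x": np.log,
    "cos x": np.cos,
}

estimates = {}
for label, g in transforms.items():
    z = g(x)                               # nonlinear in the variable x ...
    y = beta0 + beta1 * z + u              # ... but linear in beta0 and beta1
    X = np.column_stack([np.ones(n), z])   # intercept column plus g(x)
    estimates[label] = ols(X, y)
    print(label, estimates[label])
```

In each case the regression is run on the transformed variable \(g(x_i)\), so the estimation step is the same as in an ordinary linear model. Only the interpretation of \(\beta_1\) changes.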
Solution
Models (a) and (c) can be transformed into models that are linear in the parameters, while model (b) cannot. Model (a) requires taking logs, and model (c) requires an inverse transformation of the dependent variable: \[a)~~~\log(y_i)=\beta_0+\beta_1x_i+u_i~~~~~~~~~~c)~~~\frac{1}{y_i}=\beta_0+\beta_1x_i+u_i\]
Solution
Models (a), (c) and (d) are linear in the parameters. Models (b) and (d) are linear in the variables. Thus, model (d) is linear in both the parameters and the variables (the so-called lin-lin model). Specifically, model (b) cannot be transformed into a model that is linear in the parameters, although it is linear in the variables!
Solution
A log transformation is required: \[\log(y_i)=\log(\beta_0)+\beta_1 \log(x_i)+\beta_2 \log(z_i)+u_i\] Parameter \(\beta_1\) gives the \(\%\) change in production in response to a \(1\%\) change in the number of employees, holding capital constant. Parameter \(\beta_2\) gives the \(\%\) change in production in response to a \(1\%\) change in capital, holding the number of employees constant. Since all observed variables are transformed into logs, all changes are interpreted as percentages, free of any measurement units (elasticities).

- Because the variables are transformed, you should be careful when interpreting the slope coefficient!
Model | Equation | Slope coefficient |
---|---|---|
lin-lin | \(y_i = \beta_0 + \beta_1 x_i + u_i\) | \(\beta_1 = \Delta y / \Delta x\) |
log-log | \(\log (y_i) = \beta_0 + \beta_1 \log (x_i) + u_i\) | \(\beta_1 = \%\Delta y / \%\Delta x\) |
log-lin | \(\log (y_i) = \beta_0 + \beta_1 x_i + u_i\) | \(\beta_1 \times 100 \approx \%\Delta y / \Delta x\) |
lin-log | \(y_i = \beta_0 + \beta_1 \log (x_i) + u_i\) | \(\beta_1 / 100 \approx \Delta y / \%\Delta x\) |
polynomial | \(y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^{2} + u_i\) | \(\beta_1 + 2\beta_2 x_0 \approx \Delta y / \Delta x_0\) |
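The interpretations in the table can be written as small helper functions. This is only a sketch: the function names and the coefficient values in the usage lines are hypothetical, and the `exact` option uses the standard result that a change of \(\beta_1 \Delta x\) in \(\log y\) multiplies \(y\) by \(e^{\beta_1 \Delta x}\).

```python
import math

def log_log_effect(beta1, pct_dx=1.0):
    """% change in y for a pct_dx % change in x (elasticity)."""
    return beta1 * pct_dx

def log_lin_effect(beta1, dx=1.0, exact=False):
    """% change in y for a dx-unit change in x (semi-elasticity)."""
    if exact:
        # y is multiplied by exp(beta1 * dx), so the % change is exact
        return (math.exp(beta1 * dx) - 1.0) * 100.0
    return beta1 * dx * 100.0              # approximation from the table

def lin_log_effect(beta1, pct_dx=1.0):
    """Change in y (in units of y) for a pct_dx % change in x."""
    return beta1 / 100.0 * pct_dx

def polynomial_effect(beta1, beta2, x0):
    """Slope of the quadratic model evaluated at the point x0."""
    return beta1 + 2.0 * beta2 * x0

# Hypothetical coefficients, chosen only to show the calls.
print(log_lin_effect(0.05))                # approximate % change
print(log_lin_effect(0.05, exact=True))    # exact % change, slightly larger
print(lin_log_effect(50.0))                # units of y per 1% change in x
print(polynomial_effect(3.0, -0.5, 2.0))   # slope at x0 = 2
```

For small coefficients the approximate and exact log-lin effects are close. The gap widens as \(|\beta_1|\) grows, and in the polynomial model the slope depends on the point \(x_0\) at which it is evaluated.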
Solution
The second-order polynomial is linear in the parameters although it includes the quadratic term \(x^2\). The elasticity coefficient is \(\beta_1\) from the log-log model, while the semi-elasticity is \(\beta_1\) from the log-lin model. The exact \(\%\) change in the variable \(y\) in response to a \(1\)-unit change in the variable \(x\) in the log-lin model is computed as \((e^{\beta_1}-1)100\%\).
Solution
Case a) If the gas price increases by \(1\) USD, gas consumption is expected to decrease by \(13.48\) liters per month. Case b) If the gas price increases by \(1\%\), monthly gas consumption is expected to decrease by \(0.87\%\). Case c) If the gas price increases by \(1\) USD, monthly gas consumption is expected to decrease by approximately \(14\%\).

Regardless of the functional form, you should always keep in mind the parsimony principle with respect to the number of variables on the right-hand side (less is better)!
This principle balances the model's goodness-of-fit against its simplicity to avoid overfitting.
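One common way to see this trade-off, though not necessarily the one intended here, is to compare \(R^2\) with adjusted \(R^2\). The former never decreases when regressors are added, while the latter penalizes each additional parameter. The sketch below uses simulated data and assumes NumPy is available. The helper `r2_and_adjusted`, the seed and the five irrelevant regressors are hypothetical choices made for illustration.

```python
import numpy as np

def r2_and_adjusted(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit; X must include the intercept."""
    n, k = X.shape                          # k counts every column, intercept included
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)              # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - rss / tss
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return r2, adj

rng = np.random.default_rng(1)              # fixed seed, illustrative data only
n = 60
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)      # y depends on x only
noise = rng.normal(size=(n, 5))             # five regressors unrelated to y

X_small = np.column_stack([np.ones(n), x])  # parsimonious model
X_big = np.column_stack([X_small, noise])   # same model plus irrelevant variables

r2_small, adj_small = r2_and_adjusted(X_small, y)
r2_big, adj_big = r2_and_adjusted(X_big, y)
print("small:", r2_small, adj_small)
print("big:  ", r2_big, adj_big)
```

The larger model always fits the sample at least as well in terms of \(R^2\), even though the extra variables carry no information about \(y\). That is why goodness-of-fit alone is a poor guide to model choice.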