6.2 Goodness-of-fit measures
As the number of RHS variables $k$ increases, $R^2$ approaches its maximum, which directly affects the $F$-statistic: $F'$ becomes overestimated (higher than it should be), and the null hypothesis may therefore be mistakenly rejected.
At the same time, the $p$-values of the $t$-statistics increase due to the reduction in degrees of freedom $df$, so the significance level $\alpha$ is exceeded (the null hypothesis of the $t$-test is mistakenly not rejected).
This contradiction is common in small samples, where statistical significance is sensitive to the loss of degrees of freedom.
Adding more irrelevant variables on the RHS creates a misleading impression of the model's goodness-of-fit. Therefore, the adjusted coefficient of determination $\bar{R}^2$ should be used, since it corrects for the loss of degrees of freedom:
$$\bar{R}^2 = 1 - \frac{n-1}{n-k-1}\,(1 - R^2)$$
$\bar{R}^2$ increases by less than $R^2$ when a variable is added and may even be negative.
Choosing the model that gives the highest $\bar{R}^2$ may be dangerous! We should be more concerned with the theoretical relevance of the variables and their statistical significance.
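As a brief illustrative sketch in Python (the sample size, number of regressors, and $R^2$ value below are assumed, not taken from the text), the adjustment can be computed directly from the formula above:

```python
# Minimal sketch: adjusted R^2 from the formula above (all numbers are assumed)
n, k = 30, 4          # hypothetical sample size and number of RHS variables
r2 = 0.62             # hypothetical unadjusted R^2

# R_bar^2 = 1 - (n - 1) / (n - k - 1) * (1 - R^2)
adj_r2 = 1 - (n - 1) / (n - k - 1) * (1 - r2)

print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")

# Adding one more irrelevant regressor (k -> k + 1) raises R^2 only slightly,
# but the heavier degrees-of-freedom correction can make adjusted R^2 fall.
```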
b. Would it be appropriate to use the adjusted R-squared to decide which model fits better?
Solution
The given models are non-nested, meaning that neither of them is a special case of the other. Additionally, the models use different transformations of the dependent variable $y$, making them structurally different. Non-nested models cannot be compared in terms of $R^2$ or adjusted $R^2$.
Other model selection criteria can be used instead, balancing goodness-of-fit against model complexity (number of parameters and/or number of observations):
$$\text{AIC} = -2\,\text{Log.Lik.} + 2k \qquad \text{BIC} = -2\,\text{Log.Lik.} + \log(n)\,k$$
Both the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) penalize goodness-of-fit for model complexity in order to avoid overfitting.
Lower AIC and BIC values indicate a better fit.
Under the assumption of normally distributed error terms, the maximized log-likelihood can be calculated directly from the residual sum of squares:
$$\text{Log.Lik.} = -\frac{n}{2}\left(1 + \log(2\pi) + \log\!\left(\frac{RSS}{n}\right)\right)$$
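Combining the two formulas, a minimal Python sketch can recover AIC and BIC from the residual sum of squares (the sample size, parameter count, and RSS below are illustrative assumptions):

```python
import math

# Illustrative inputs (assumed): sample size, number of parameters, residual sum of squares
n, k, rss = 50, 3, 120.0

# Maximized log-likelihood under normal errors: -(n/2) * (1 + log(2*pi) + log(RSS/n))
log_lik = -(n / 2) * (1 + math.log(2 * math.pi) + math.log(rss / n))

# Information criteria as defined above
aic = -2 * log_lik + 2 * k
bic = -2 * log_lik + math.log(n) * k

print(f"Log.Lik. = {log_lik:.2f}, AIC = {aic:.2f}, BIC = {bic:.2f}")
# The model with the lower AIC/BIC offers the better fit-complexity trade-off.
```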
The Root Mean Square Error (RMSE) is yet another widely used measure of goodness-of-fit.
RMSE measures the average magnitude of the residuals and thus indicates how well the model predicts the dependent variable.
RMSE differs slightly from the regression standard error:
$$\text{RMSE} = \sqrt{\frac{RSS}{n}} \quad \text{versus} \quad \hat{\sigma}_u = \sqrt{\frac{RSS}{n-k-1}}$$
RMSE is expressed in the same units as the dependent variable, and a lower RMSE indicates a better model fit when comparing nested models.
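As a minimal sketch of the distinction (the residuals and number of regressors are hypothetical), only the divisor differs between the two measures:

```python
import math

# Hypothetical residuals from a fitted model with k regressors
residuals = [0.8, -1.1, 0.4, 1.6, -0.9, 0.2, -1.3, 0.7]
n, k = len(residuals), 2

rss = sum(e ** 2 for e in residuals)

rmse = math.sqrt(rss / n)                 # RMSE divides by n
sigma_hat = math.sqrt(rss / (n - k - 1))  # regression standard error divides by n - k - 1

print(f"RMSE = {rmse:.3f}, regression standard error = {sigma_hat:.3f}")
```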