2.1 Variable selection

  • Assuming unidirectional causality, in which \(x\) causes \(y\), several alternative terms are used for the variables \(y\) and \(x\):

TABLE 2.2: Variables terminology
Variable \(y\)      Variable \(x\)
dependent           independent
outcome             predictor
response            explanatory
regressand          regressor
endogenous          exogenous

  • Variables on the right-hand side can have different roles; some may serve as control variables, while others can be multiplied together to form an interaction term

  • When dealing with time-series variables (data observed over time), it is common for the dependent variable \(y\) to also appear, in lagged form, as an independent variable, making it endogenous
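
The two constructions mentioned above, an interaction term and a lagged dependent variable, can be sketched in a few lines of plain Python. The data values and the names `xz`, `y_lag`, `y_cur`, `x_cur` and `rows` are hypothetical and serve only as an illustration. The sketch also assumes the sample is indexed \(t=1,\dots,T\) with no pre-sample value \(y_0\), so one observation is lost when the lag is taken.

```python
# Hypothetical toy data; the numbers are illustrative only.
y = [2.0, 2.5, 3.1, 3.0, 3.6]   # dependent variable observed over time
x = [1.0, 1.2, 0.9, 1.5, 1.4]   # right-hand-side variable
z = [0, 1, 1, 0, 1]             # a 0/1 indicator variable

# Interaction term: element-wise product of two right-hand-side variables.
xz = [x_i * z_i for x_i, z_i in zip(x, z)]

# First lag of y: pair each y_t with y_{t-1}.
# Without a pre-sample value y_0, the first observation is dropped.
y_lag = y[:-1]   # y_{t-1} for t = 2, ..., T
y_cur = y[1:]    # y_t     for t = 2, ..., T
x_cur = x[1:]    # x_t     for t = 2, ..., T

# Each row holds (y_t, y_{t-1}, x_t), ready to be used in a regression.
rows = list(zip(y_cur, y_lag, x_cur))
```

Here `rows` has one element fewer than `y`, which is the usual cost of including a first-order lag.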

Exercise 1. Which variable is endogenous and which one is exogenous in the following equation? What does the subscript \(t\) represent? Which variable is lagged? \[y_t=\beta_0+\beta_1y_{t-1}+\beta_2x_t+u_t, \qquad t=1,2,\dots,T\]
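
To see how an equation of this form generates data, the recursion can be simulated step by step. This is a minimal sketch under stated assumptions: the function name `simulate`, the parameter values, the starting value `y0` and the choice of normally distributed errors \(u_t\) are all hypothetical and are not taken from the text.

```python
import random

def simulate(beta0, beta1, beta2, x, y0, sigma, seed=0):
    """Generate y_1, ..., y_T from y_t = beta0 + beta1*y_{t-1} + beta2*x_t + u_t.

    Assumption: u_t is drawn from a normal distribution with mean 0
    and standard deviation sigma.
    """
    rng = random.Random(seed)
    y_prev = y0                      # assumed pre-sample value y_0
    path = []
    for x_t in x:
        u_t = rng.gauss(0.0, sigma)  # error term for period t
        y_t = beta0 + beta1 * y_prev + beta2 * x_t + u_t
        path.append(y_t)
        y_prev = y_t                 # today's y becomes tomorrow's lag
    return path

# Hypothetical inputs; sigma=0.0 switches the error term off.
x = [1.0, 1.2, 0.9, 1.5, 1.4]
y = simulate(beta0=0.5, beta1=0.6, beta2=1.0, x=x, y0=2.0, sigma=0.0)
```

Setting `sigma` to a positive value adds random noise to each period, and the same `seed` reproduces the same path.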

Exercise 2. Which variable is endogenous and which one is exogenous in the following system of equations? How many parameters do we need to estimate? Write the system of two equations in matrix form.

\[y_t=\beta_{1,0}+\beta_{1,1}y_{t-1}+\beta_{1,2}x_{t-1}+u_{1,t}\] \[x_t=\beta_{2,0}+\beta_{2,1}y_{t-1}+\beta_{2,2}x_{t-1}+u_{2,t}\]

Exercise 3. Is the following model bivariate or multivariate? What does the subscript \(i\) represent? Which terms are variables and which are parameters? Which variables are known (observed) and which are unknown? \[y_i=\alpha+\beta x_i+\gamma z_i+u_i, \qquad i=1,2,\dots,n\]

Exercise 4. Which parameter represents the constant term, and which one represents the interaction term? Explain the interaction term, assuming that \(y=\) income, \(x=\) years of work experience and \(z=\) gender (\(1\) for males and \(0\) for females). \[y_i=\alpha+\beta x_i+\gamma z_i+ \lambda (x_i z_i) + u_i\]