# Chapter 11 Simultaneous Equations Models

rm(list=ls()) #Removes all items in Environment!
library(systemfit)
library(broom) #for glance() and tidy()
library(PoEdata) #for PoE4 dataset
library(knitr) #for kable()

New package: systemfit (Henningsen and Hamann 2015).

Simultaneous equations are models with more than one response variable, where the solution is determined by an equilibrium among opposing forces. The econometric problem is similar to the endogeneity problem studied in the previous chapter, because the mutual interaction between dependent variables can be considered a form of endogeneity. The typical example of an economic simultaneous equation problem is the supply and demand model, where price and quantity are interdependent and are determined by the interaction between supply and demand.

Usually, an economic model such as a system of demand and supply equations includes several of the dependent (endogenous) variables in each equation. Such a model is called the structural form of the model. If the structural form is transformed such that each equation shows one dependent variable as a function of only exogenous independent variables, the new form is called the reduced form. The reduced form can be estimated by least squares, while the structural form cannot, because it includes endogenous variables on its right-hand side.
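The difference between the two forms can be illustrated with a small simulation. The sketch below is only an illustration under made-up assumptions: the variable names (`x`, `z`), the parameter values, and the sample size are hypothetical and are not taken from any dataset used in this chapter.

```r
# Simulated supply-and-demand system (all parameter values are made up).
set.seed(42)
n  <- 5000
x  <- rnorm(n)   # exogenous demand shifter
z  <- rnorm(n)   # exogenous supply shifter
ed <- rnorm(n)   # demand shock
es <- rnorm(n)   # supply shock

# Structural form:
#   demand: q = 10 - 1.0*p + 2*x + ed
#   supply: q =  2 + 0.5*p - 1*z + es
# Equating demand and supply and solving for p gives the reduced form for p.
p <- (8 + 2*x + z + ed - es) / 1.5
q <- 2 + 0.5*p - z + es
sim <- data.frame(q, p, x, z)

# OLS on the structural demand equation: p is correlated with ed,
# so the estimated price coefficient drifts away from its true value of -1.
demand.ols <- lm(q ~ p + x, data = sim)

# OLS on the reduced form: all regressors are exogenous,
# so the estimates approach 8/1.5, 2/1.5, and 1/1.5.
p.red <- lm(p ~ x + z, data = sim)

coef(demand.ols)
coef(p.red)
```

In this design the least squares estimate of the demand slope stays far from -1 no matter how large the sample is, while the reduced form coefficients settle near their true values.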

The necessary condition for identification requires that, for the problem to have a solution, each equation in the structural form of the system omits at least one exogenous variable that is present in the other equations.
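This condition can be checked mechanically from the model formulas. The following is a minimal sketch: `excluded_exog()` is a hypothetical helper written for this illustration (it is not part of systemfit), and the formulas borrow the variable names of the truffles example presented later in this chapter.

```r
# Hypothetical helper: for each equation, list the exogenous variables
# of the system that the equation leaves out.
excluded_exog <- function(eqs, exog) {
  lapply(eqs, function(f) setdiff(exog, all.vars(f)))
}

eqs  <- list(demand = q ~ p + ps + di,   # omits pf
             supply = q ~ p + pf)        # omits ps and di
exog <- c("ps", "di", "pf")              # exogenous variables of the system

excl <- excluded_exog(eqs, exog)
excl

# Necessary condition as stated above: every equation omits at least one.
sapply(excl, length) >= 1
```

The check only covers the necessary condition; it says nothing about whether the omitted variables are strong enough to identify the equation in practice.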

Simultaneous equations are the object of package systemfit in $$R$$, with the function systemfit(), which requires the following main arguments: formula= a list describing the equations of the system; method= the desired (appropriate) method of estimation, which can be one of “OLS”, “WLS”, “SUR”, “2SLS”, “W2SLS”, or “3SLS” (we have only studied OLS, WLS, and 2SLS so far); inst= the instrumental variables, given as a one-sided model formula (or a list of such formulas, one for each equation); all the exogenous variables in the system must be in this list.

The following example uses the dataset $$truffles$$, where $$q$$ is quantity of truffles traded, $$p$$ is the market price, $$ps$$ is the price of a substitute, $$di$$ is income, and $$pf$$ is a measure of costs of production. The structural demand and supply equations (Equations \ref{eq:trufflestrD10} and \ref{eq:trufflestrS10}) are formulated based on economic theory; quantity and price are endogenous, and all the other variables are considered exogenous.

$\begin{equation} q=\alpha_{1}+\alpha_{2}p+\alpha_{3}ps+\alpha_{4}di+ e_{d} \label{eq:trufflestrD10} \end{equation}$ $\begin{equation} q=\beta_{1}+\beta_{2}p+\beta_{3}pf+e_{s} \label{eq:trufflestrS10} \end{equation}$
data("truffles", package="PoEdata")
D <- q~p+ps+di
S <- q~p+pf
sys <- list(D,S)
instr <- ~ps+di+pf
truff.sys <- systemfit(sys, inst=instr,
method="2SLS", data=truffles)
summary(truff.sys)
##
## systemfit results
## method: 2SLS
##
##         N DF    SSR detRCov  OLS-R2 McElroy-R2
## system 60 53 692.47  49.803 0.43896    0.80741
##
##      N DF     SSR     MSE   RMSE       R2   Adj R2
## eq1 30 26 631.917 24.3045 4.9300 -0.02395 -0.14210
## eq2 30 27  60.555  2.2428 1.4976  0.90188  0.89461
##
## The covariance matrix of the residuals
##         eq1    eq2
## eq1 24.3045 2.1694
## eq2  2.1694 2.2428
##
## The correlations of the residuals
##         eq1     eq2
## eq1 1.00000 0.29384
## eq2 0.29384 1.00000
##
##
## 2SLS estimates for 'eq1' (equation 1)
## Model Formula: q ~ p + ps + di
## Instruments: ~ps + di + pf
##
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4.27947    5.54388 -0.7719  0.44712
## p           -0.37446    0.16475 -2.2729  0.03154 *
## ps           1.29603    0.35519  3.6488  0.00116 **
## di           5.01398    2.28356  2.1957  0.03724 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.92996 on 26 degrees of freedom
## Number of observations: 30 Degrees of Freedom: 26
## SSR: 631.91714 MSE: 24.30451 Root MSE: 4.92996
## Multiple R-Squared: -0.02395 Adjusted R-Squared: -0.1421
##
##
## 2SLS estimates for 'eq2' (equation 2)
## Model Formula: q ~ p + pf
## Instruments: ~ps + di + pf
##
##              Estimate Std. Error t value  Pr(>|t|)
## (Intercept) 20.032802   1.223115  16.378 1.554e-15 ***
## p            0.337982   0.024920  13.563 1.434e-13 ***
## pf          -1.000909   0.082528 -12.128 1.946e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.49759 on 27 degrees of freedom
## Number of observations: 30 Degrees of Freedom: 27
## SSR: 60.55457 MSE: 2.24276 Root MSE: 1.49759
## Multiple R-Squared: 0.90188 Adjusted R-Squared: 0.89461

The output of the systemfit() function shows the estimates by structural equation: eq1 is the demand function, where, as expected, price has a negative sign, and eq2 is the supply equation, with a positive sign for price.
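To see what the 2SLS method does in principle, the two stages can be reproduced by hand with lm(). This is a sketch on simulated data with made-up parameter values (the names `x`, `z`, and `p.hat` are hypothetical); it is not a description of how systemfit() is implemented internally.

```r
# Simulated market: true demand is q = 10 - p + 2*x + ed (made-up values).
set.seed(123)
n  <- 5000
x  <- rnorm(n)   # exogenous variable included in demand
z  <- rnorm(n)   # exogenous variable excluded from demand (supply shifter)
ed <- rnorm(n)   # demand shock
es <- rnorm(n)   # supply shock
p  <- (8 + 2*x + z + ed - es) / 1.5   # equilibrium price
q  <- 10 - p + 2*x + ed               # equilibrium quantity
sim <- data.frame(q, p, x, z)

# Stage 1: regress the endogenous regressor on ALL exogenous variables.
stage1 <- lm(p ~ x + z, data = sim)
sim$p.hat <- fitted(stage1)

# Stage 2: replace p by its fitted values in the structural equation.
stage2 <- lm(q ~ p.hat + x, data = sim)
coef(stage2)   # the p.hat coefficient estimates the demand slope
```

The point estimates from the second stage are the 2SLS estimates, but the standard errors reported by this second lm() call are not valid because they are based on the wrong residuals; a dedicated routine such as systemfit() should be used for inference.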

By estimating the reduced form equations using OLS, one can determine the effects of changes in exogenous variables on the equilibrium market price and quantity, while the structural equations show the effects of such changes on the quantity demanded and on the quantity supplied, respectively. Estimating the structural equations by such methods as 2SLS is, in fact, estimating the market demand and supply curves, which is extremely useful for economic analysis. Estimating the reduced forms, while useful for prediction, does not allow for deep analysis: it only gives the equilibrium point, not the whole curves.
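The link between the two forms can be made explicit for the truffles model. Equating the demand and supply equations and solving for price, then substituting back for quantity, gives the following reduced form; this is a worked derivation from the two structural equations stated earlier, using the same symbols.

```latex
\begin{aligned}
p &= \frac{(\beta_{1}-\alpha_{1}) - \alpha_{3}\,ps - \alpha_{4}\,di + \beta_{3}\,pf + (e_{s}-e_{d})}{\alpha_{2}-\beta_{2}} \\[4pt]
q &= \frac{(\alpha_{2}\beta_{1}-\alpha_{1}\beta_{2}) - \alpha_{3}\beta_{2}\,ps - \alpha_{4}\beta_{2}\,di + \alpha_{2}\beta_{3}\,pf + (\alpha_{2}e_{s}-\beta_{2}e_{d})}{\alpha_{2}-\beta_{2}}
\end{aligned}
```

Each reduced form coefficient mixes demand and supply parameters, which is why the reduced form describes only the equilibrium. Note also that the ratio of the $pf$ coefficient in the $q$ equation to the $pf$ coefficient in the $p$ equation equals $\alpha_{2}$, provided $\beta_{3}\neq 0$: the variable excluded from the demand equation is what allows the demand slope to be recovered.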

Q.red <- lm(q~ps+di+pf, data=truffles)
P.red <- lm(p~ps+di+pf, data=truffles) #price, not quantity, is the response
kable(tidy(Q.red), digits=4,
caption="Reduced form for quantity")

Table 11.1: Reduced form for quantity

| term | estimate | std.error | statistic | p.value |
|:------------|--------:|--------:|--------:|-------:|
| (Intercept) | 7.8951 | 3.2434 | 2.4342 | 0.0221 |
| ps | 0.6564 | 0.1425 | 4.6051 | 0.0001 |
| di | 2.1672 | 0.7005 | 3.0938 | 0.0047 |
| pf | -0.5070 | 0.1213 | -4.1809 | 0.0003 |

kable(tidy(P.red), digits=4,
caption="Reduced form for price")

Table 11.2: Reduced form for price

| term | estimate | std.error | statistic | p.value |
|:------------|--------:|--------:|--------:|-------:|
| (Intercept) | -32.5124 | 7.9842 | -4.0721 | 0.0004 |
| ps | 1.7081 | 0.3509 | 4.8682 | 0.0000 |
| di | 7.6025 | 1.7243 | 4.4089 | 0.0002 |
| pf | 1.3539 | 0.2985 | 4.5356 | 0.0001 |

Tables 11.1 and 11.2 show that all the exogenous variables have significant effects on the equilibrium quantity and price and have the expected signs.

The $$fultonfish$$ dataset provides another demand and supply example where the simultaneous equations method can be applied. The purpose of this example is to emphasize that the exogenous variables that are key for identification must be statistically significant. Otherwise, the structural equation that needs to be identified by those variables cannot be reliably estimated. The remaining equations in the structural system are, however, not affected.

$\begin{equation} log(quan)=\alpha_{1}+\alpha_{2}log(price)+\alpha_{3}mon+\alpha_{4}tue+\alpha_{5}wed+\alpha_{6}thu+e_{D} \label{eq:fishDstr10} \end{equation}$ $\begin{equation} log(quan)=\beta_{1}+\beta_{2}log(price)+\beta_{3}stormy+e_{S} \label{eq:fishSstr10} \end{equation}$

In the $$fultonfish$$ example, the endogenous variables are $$lprice$$, the log of price, and $$lquan$$, the log of quantity; the exogenous variables are the indicator variables for the day of the week and an indicator for whether the catching day was stormy. The identification variable for the demand equation is $$stormy$$, which shows up only in the supply equation; the identification variables for the supply equation are $$mon$$, $$tue$$, $$wed$$, and $$thu$$, which show up only in the demand equation.

$\begin{equation} log(quan)=\pi_{11}+\pi_{21}mon+\pi_{31}tue+\pi_{41}wed+\pi_{51}thu+\pi_{61}stormy+\nu_{1} \label{eq:fishredQ10} \end{equation}$ $\begin{equation} log(price)=\pi_{12}+\pi_{22}mon+\pi_{32}tue+\pi_{42}wed+\pi_{52}thu+\pi_{62}stormy+\nu_{2} \label{eq:fishredP10} \end{equation}$

Now, let us consider the reduced form equations (Equations \ref{eq:fishredQ10} and \ref{eq:fishredP10}). Since the endogenous variable that appears on the right-hand side of the structural demand and supply equations (Equations \ref{eq:fishDstr10} and \ref{eq:fishSstr10}) is $$price$$, the $$price$$ reduced equation (Equation \ref{eq:fishredP10}) is essential for evaluating the identification state of the model. Let us focus on this equation. The supply equation can be identified only if at least one weekday indicator is significant, and the demand equation can be identified only if $$stormy$$ is significant. Thus, if the weekday indicators are all insignificant but $$stormy$$ is significant, the supply equation is not identified but the demand equation is; if at least one weekday indicator turns out significant but $$stormy$$ turns out insignificant, the demand equation is not identified but the supply equation is.
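Because the supply equation depends on the weekday indicators as a group, a joint F-test is a natural complement to the individual t-tests. The sketch below uses simulated stand-in data, not the $$fultonfish$$ dataset: the variable names mirror the example, but the sample size, coefficients, and error variance are made up, and only $$stormy$$ is given a true effect on log price.

```r
# Simulated stand-in for a price reduced form (NOT the fultonfish data):
# 'stormy' truly affects log price, the weekday dummies do not.
set.seed(1)
n   <- 200
day <- sample(c("mon", "tue", "wed", "thu", "fri"), n, replace = TRUE)
sim <- data.frame(
  mon = as.numeric(day == "mon"),
  tue = as.numeric(day == "tue"),
  wed = as.numeric(day == "wed"),
  thu = as.numeric(day == "thu"),
  stormy = rbinom(n, 1, 0.3)
)
sim$lprice <- -0.3 + 0.35 * sim$stormy + rnorm(n, sd = 0.4)

full     <- lm(lprice ~ mon + tue + wed + thu + stormy, data = sim)
no.days  <- lm(lprice ~ stormy, data = sim)                 # drops weekdays
no.storm <- lm(lprice ~ mon + tue + wed + thu, data = sim)  # drops stormy

# Joint F-test of the four weekday indicators (relevant for the supply equation).
days.test <- anova(no.days, full)
# F-test of stormy (relevant for the demand equation).
storm.test <- anova(no.storm, full)

days.test[["Pr(>F)"]][2]
storm.test[["Pr(>F)"]][2]
```

A small p-value in the first test would indicate that the weekday indicators are jointly relevant for price, which is what the supply equation needs; a small p-value in the second test plays the same role for the demand equation.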

data("fultonfish", package="PoEdata")
fishQ.ols <- lm(lquan~mon+tue+wed+thu+stormy, data=fultonfish)
kable(tidy(fishQ.ols), digits=4,
caption="Reduced 'Q' equation for the fultonfish example")

Table 11.3: Reduced ‘Q’ equation for the fultonfish example

| term | estimate | std.error | statistic | p.value |
|:------------|--------:|--------:|--------:|-------:|
| (Intercept) | 8.8101 | 0.1470 | 59.9225 | 0.0000 |
| mon | 0.1010 | 0.2065 | 0.4891 | 0.6258 |
| tue | -0.4847 | 0.2011 | -2.4097 | 0.0177 |
| wed | -0.5531 | 0.2058 | -2.6875 | 0.0084 |
| thu | 0.0537 | 0.2010 | 0.2671 | 0.7899 |
| stormy | -0.3878 | 0.1437 | -2.6979 | 0.0081 |

fishP.ols <- lm(lprice~mon+tue+wed+thu+stormy, data=fultonfish)
kable(tidy(fishP.ols), digits=4,
caption="Reduced 'P' equation for the fultonfish example")

Table 11.4: Reduced ‘P’ equation for the fultonfish example

| term | estimate | std.error | statistic | p.value |
|:------------|--------:|--------:|--------:|-------:|
| (Intercept) | -0.2717 | 0.0764 | -3.5569 | 0.0006 |
| mon | -0.1129 | 0.1073 | -1.0525 | 0.2950 |
| tue | -0.0411 | 0.1045 | -0.3937 | 0.6946 |
| wed | -0.0118 | 0.1069 | -0.1106 | 0.9122 |
| thu | 0.0496 | 0.1045 | 0.4753 | 0.6356 |
| stormy | 0.3464 | 0.0747 | 4.6387 | 0.0000 |

The relevant equation for evaluating identification is shown in Table 11.4, which is the price reduced equation. The results show that the weekday indicators are not significant, which will make the 2SLS estimation of the supply equation unreliable; the coefficient on $$stormy$$ is significant, thus the estimation of the (structural) demand equation will be reliable. The following code sequence and output show the 2SLS estimates of the demand and supply (the structural) equations.

fish.D <- lquan~lprice+mon+tue+wed+thu
fish.S <- lquan~lprice+stormy
fish.eqs <- list(fish.D, fish.S)
fish.ivs <- ~mon+tue+wed+thu+stormy
fish.sys <- systemfit(fish.eqs, method="2SLS",
inst=fish.ivs, data=fultonfish)
summary(fish.sys)
##
## systemfit results
## method: 2SLS
##
##          N  DF    SSR detRCov  OLS-R2 McElroy-R2
## system 222 213 109.61  0.1073 0.09424   -0.59781
##
##       N  DF    SSR     MSE    RMSE      R2  Adj R2
## eq1 111 105 52.090 0.49610 0.70434 0.13912 0.09813
## eq2 111 108 57.522 0.53261 0.72980 0.04936 0.03176
##
## The covariance matrix of the residuals
##         eq1     eq2
## eq1 0.49610 0.39614
## eq2 0.39614 0.53261
##
## The correlations of the residuals
##         eq1     eq2
## eq1 1.00000 0.77065
## eq2 0.77065 1.00000
##
##
## 2SLS estimates for 'eq1' (equation 1)
## Model Formula: lquan ~ lprice + mon + tue + wed + thu
## Instruments: ~mon + tue + wed + thu + stormy
##
##              Estimate Std. Error t value  Pr(>|t|)
## (Intercept)  8.505911   0.166167 51.1890 < 2.2e-16 ***
## lprice      -1.119417   0.428645 -2.6115  0.010333 *
## mon         -0.025402   0.214774 -0.1183  0.906077
## tue         -0.530769   0.208000 -2.5518  0.012157 *
## wed         -0.566351   0.212755 -2.6620  0.008989 **
## thu          0.109267   0.208787  0.5233  0.601837
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.70434 on 105 degrees of freedom
## Number of observations: 111 Degrees of Freedom: 105
## SSR: 52.09032 MSE: 0.4961 Root MSE: 0.70434
## Multiple R-Squared: 0.13912 Adjusted R-Squared: 0.09813
##
##
## 2SLS estimates for 'eq2' (equation 2)
## Model Formula: lquan ~ lprice + stormy
## Instruments: ~mon + tue + wed + thu + stormy
##
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  8.6283544  0.3889702 22.1826   <2e-16 ***
## lprice       0.0010593  1.3095470  0.0008   0.9994
## stormy      -0.3632461  0.4649125 -0.7813   0.4363
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7298 on 108 degrees of freedom
## Number of observations: 111 Degrees of Freedom: 108
## SSR: 57.52184 MSE: 0.53261 Root MSE: 0.7298
## Multiple R-Squared: 0.04936 Adjusted R-Squared: 0.03176

In the output of the 2SLS estimation, eq1 is the demand equation and eq2 is the supply equation. As we have seen, the demand equation is identified, i.e., reliably estimated, while the supply equation is not. A solution might be to find better instruments than the weekday indicators to identify the supply equation. Finding valid instruments is, however, a difficult task in many problems.

### References

Henningsen, Arne, and Jeff D. Hamann. 2015. Systemfit: Estimating Systems of Simultaneous Equations. https://CRAN.R-project.org/package=systemfit.