ECOM30001/ECOM90001
Tutorial 11
Simultaneous Equations Bias
Instrumental Variables (continued)


This tutorial reviews some concepts for the basic linear model, using the econometrics software package R. Specifically, the tutorial reviews simultaneous equations bias and estimation by instrumental variables (two-stage least squares).

This tutorial requires one (1) data file: fultonfish.csv

This file can be obtained from the Canvas subject page.

In addition, the file tut11.R provides the program code (R script file) necessary to complete this tutorial. The R script file uses the following packages, which need to be installed prior to running this program:

stargazer: for easily generating summary statistics for an R data file
ggplot2: for easily producing graphs in R
car: for easily conducting hypothesis tests in R
lmtest: for easily conducting the Ramsey RESET test in R
rio: for easily importing data into R
sandwich: for easily calculating robust (Huber-White) heteroskedasticity-consistent standard errors in R
AER: for easily estimating models by the method of Instrumental Variables (IV) in R

These can be installed directly in RStudio from the packages tab or by using the command install.packages() and inserting the name of the package in the brackets.

Question 1

[Hill, Griffiths, Lim 5th Ed., pp. 542-544]

This question reviews example 11.2 in the textbook.

Hint: In all regressions, use the White (Huber-White) standard errors.

Consider the following daily demand function for whiting (fish) at the Fulton fish market:

$$\ln Q_t = \alpha_0 + \alpha_1 \ln P_t + \alpha_2 MON_t + \alpha_3 TUES_t + \alpha_4 WED_t + \alpha_5 THU_t + \varepsilon_{dt}$$

where $Q_t$ represents the quantity sold (in pounds) and $P_t$ represents the average daily price per pound.
The remaining variables are indicator variables for the days of the workweek, with Friday as the omitted category. The supply equation is:

$$\ln Q_t = \beta_0 + \beta_1 \ln P_t + \beta_2 STORMY_t + \varepsilon_{st}$$

The variable STORMY is an indicator variable denoting stormy weather in the previous three days. We expect $\beta_2$ to be negative: stormy weather reduces the supply of fish brought to the market.

The data file fultonfish.csv contains daily observations on the price of whiting, the quantity sold, and weather conditions, from December 2, 1991 until May 8, 1992.

(a)

Does the demand equation satisfy the necessary condition for identification? Why or why not?

Solution

There are $M=2$ endogenous variables ($Q_t$ and $P_t$), so the necessary condition for identification requires that at least $(M-1)=1$ variable be excluded from the demand function. The variable stormy is omitted from the demand function, so it satisfies this condition for identification.
Stormy conditions shift the supply function, relative to a fixed demand function (since it does not contain the stormy variable), tracing out the demand curve.

(b)

Does the supply equation satisfy the necessary condition for identification? Why or why not?

Solution

There are $M=2$ endogenous variables ($Q_t$ and $P_t$), so the necessary condition for identification requires that at least $(M-1)=1$ variable be excluded from the supply function.
The day of the week indicator variables are omitted from the supply function so the necessary condition for identification is satisfied.
The demand function shifts daily around a fixed supply curve (since it does not contain the day of the week controls), tracing out the supply curve.
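
The counting argument in parts (a) and (b) can be sketched in a few lines of base R. This is only an illustration of the order (necessary) condition; the object names `exog_all`, `exog_in_eq`, and `order_condition` are made up for this sketch and do not appear in tut11.R.

```r
# Order (necessary) condition: an equation in a system with M endogenous
# variables must exclude at least M - 1 of the system's exogenous variables.
exog_all   <- c("mon", "tue", "wed", "thu", "stormy")  # all exogenous variables
exog_in_eq <- list(
  demand = c("mon", "tue", "wed", "thu"),  # exogenous variables in the demand equation
  supply = c("stormy")                     # exogenous variables in the supply equation
)
M <- 2  # endogenous variables: lnquan and lnprice

order_condition <- sapply(exog_in_eq, function(included) {
  n_excluded <- length(setdiff(exog_all, included))  # excluded exogenous variables
  n_excluded >= M - 1
})
print(order_condition)
```

Both equations pass the check, but the order condition is only necessary: whether the excluded variables actually shift the other curve has to be tested in the reduced form.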

(c)

The reduced-form equations for the demand-supply system are given by:

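$$\ln Q_t = \pi_{Q0} + \pi_{Q1} MON_t + \pi_{Q2} TUES_t + \pi_{Q3} WED_t + \pi_{Q4} THU_t + \pi_{Q5} STORMY_t + \upsilon_{Qt}$$

$$\ln P_t = \pi_{P0} + \pi_{P1} MON_t + \pi_{P2} TUES_t + \pi_{P3} WED_t + \pi_{P4} THU_t + \pi_{P5} STORMY_t + \upsilon_{Pt}$$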

Estimate these reduced forms by OLS.

Run the following R code given in tut11.R (click on the Code button to see and/or copy and paste the R code chunk):

Code
#------------------------------------------
# Reduced Form for lnq, with robust standard errors
#---------------------------------------
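# load the packages used in this script
library(rio)        # import()
library(sandwich)   # vcovHC()
library(lmtest)     # coeftest(), waldtest()
library(car)        # linearHypothesis()
library(AER)        # ivreg()
library(stargazer)  # stargazer()
# read data file into R
fish <- import("fultonfish.csv")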
# endogenous variables : lnprice, lnquant
# exogenous variables: days of the week: {mon, tue, wed, thu}
# exogenous variable: stormy         

reduced_lnq <- lm(lnquan ~ mon + tue + wed + thu +  stormy, data = fish)
# print(summary(reduced_lnq))
demdf10 <-df.residual(reduced_lnq)
# Robust standard errors
# Adjust standard errors using sandwich package
cov5 <- vcovHC(reduced_lnq, type = "HC1")
reduced_lnq_se_r <-coeftest(reduced_lnq, vcov=cov5)
# print(reduced_lnq_se_r)
reduced_lnq_robust    <- sqrt(diag(cov5))                      # robust (HC1) standard errors for the OLS estimates
# Adjust F statistic 
wald_reduced_lnq_r <- waldtest(reduced_lnq, vcov = cov5)       # Sample F test statistic using cov5 varcov matrix
# print(wald_reduced_lnq_r)
fstat10 <- round(wald_reduced_lnq_r$"F"[2], digits=4)          # Sample value of F stat
pvalf10 <- round(wald_reduced_lnq_r$'Pr(>F)'[2], digits=4)     # p value of F test
numdf10 <- abs(wald_reduced_lnq_r$"Df"[2])    
#-----------------------------------------------
# Reduced Form for lnp, with robust standard errors
#--------------------------------------------
reduced_lnp <- lm(lnprice ~ mon + tue + wed + thu + stormy, data = fish)
# print(summary(reduced_lnp))
demdf12 <-df.residual(reduced_lnp)
# Robust standard errors
# Adjust standard errors using sandwich package
cov6         <- vcovHC(reduced_lnp, type = "HC1")
reduced_lnp_se_r <-coeftest(reduced_lnp, vcov=cov6)
# print(reduced_lnp_se_r)
reduced_lnp_robust    <- sqrt(diag(cov6))                      # robust (HC1) standard errors for the OLS estimates
# Adjust F statistic 
wald_reduced_lnp_r <- waldtest(reduced_lnp, vcov = cov6)      # Sample F test statistic using cov6 varcov matrix
# print(wald_reduced_lnp_r)
fstat12 <- round(wald_reduced_lnp_r$"F"[2], digits=4)          # Sample value of F stat
pvalf12 <- round(wald_reduced_lnp_r$'Pr(>F)'[2], digits=4)     # p value of F test
numdf12 <- abs(wald_reduced_lnp_r$"Df"[2])    
#------------------------------------

The estimated reduced-form equations for ln Qt and ln Pt are reported below:

Code
stargazer(reduced_lnq,reduced_lnp, type = "html", dep.var.labels=c("(Log) Quantity", "(Log) Price"),
          covariate.labels=c("Intercept", "Monday", "Tuesday", "Wednesday", "Thursday", "Stormy"), 
          column.labels = c("(Robust)", "(Robust)"),
          se        = list(reduced_lnq_robust, reduced_lnp_robust),
          omit.stat = "f",
          add.lines = list(c("F Statistic", fstat10, fstat12),
                           c("F p value", pvalf10, pvalf12),
                           c("F num df", numdf10, numdf12),
                           c("F dem df", demdf10, demdf12)),
                   digits=4, align=TRUE,
          intercept.bottom=FALSE,
                 star.cutoffs = c(0.05, 0.01, 0.001))
                                        Dependent variable:
                                  (Log) Quantity    (Log) Price
                                     (Robust)         (Robust)
                                       (1)              (2)
Intercept                           8.8101***        -0.2717**
                                    (0.1174)         (0.0952)
Monday                              0.1010           -0.1129
                                    (0.1978)         (0.1154)
Tuesday                            -0.4847*          -0.0411
                                    (0.1939)         (0.1164)
Wednesday                          -0.5531**         -0.0118
                                    (0.1986)         (0.1135)
Thursday                            0.0537            0.0496
                                    (0.1720)         (0.1156)
Stormy                             -0.3878**          0.3464***
                                    (0.1426)         (0.0723)
F Statistic                         5.9278            5.8996
F p value                           0.0001            0.0001
F num df                            5                 5
F dem df                            105               105
Observations                        111               111
R2                                  0.1934            0.1789
Adjusted R2                         0.1550            0.1398
Residual Std. Error (df = 105)      0.6818            0.3542
Note:                               *p<0.05; **p<0.01; ***p<0.001
Figure 1: Reduced Forms for ln Qt and ln Pt
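
The robust standard errors in parentheses come from `vcovHC(..., type = "HC1")`. As a rough illustration of what that adjustment does, the sketch below computes the HC1 covariance matrix by hand in base R, under the usual definition of HC1 as the White sandwich estimator scaled by n/(n-k). The data are simulated and hypothetical, not the Fulton fish data, and the object names are chosen only for this example.

```r
# Simulated (hypothetical) data with heteroskedastic errors
set.seed(123)
n <- 200
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n, sd = exp(0.5 * x))
fit <- lm(y ~ x)

X <- model.matrix(fit)        # n x k regressor matrix
e <- residuals(fit)           # OLS residuals
k <- ncol(X)

bread <- solve(crossprod(X))  # (X'X)^{-1}
meat  <- crossprod(X * e)     # X' diag(e^2) X: each row of X is scaled by its residual
vcov_hc1 <- (n / (n - k)) * bread %*% meat %*% bread

robust_se <- sqrt(diag(vcov_hc1))  # robust standard errors
print(robust_se)
```

If the sandwich package is installed, the result can be compared with `sqrt(diag(sandwich::vcovHC(fit, type = "HC1")))`.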

i)

Test the hypothesis that necessary conditions for identification of the supply function are satisfied.

Solution

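Aside: Recall that for the 2SLS estimator we replace the endogenous variable $\ln P_t$ with the predicted values $\widehat{\ln P_t}$ from the reduced form for $\ln P_t$.
Suppose that the coefficients on the day of the week indicator variables were jointly zero, so $\hat{\pi}_{P1}=\hat{\pi}_{P2}=\hat{\pi}_{P3}=\hat{\pi}_{P4}=0$. The predicted values become:

$$\widehat{\ln P_t} = \hat{\pi}_{P0} + \hat{\pi}_{P5} STORMY_t$$

But if we replace $\ln P_t$ with $\widehat{\ln P_t}$ in the supply equation there will be exact collinearity. The estimated supply equation would be:

$$\begin{aligned}
\ln Q_t &= \beta_0 + \beta_1 \widehat{\ln P_t} + \beta_2 STORMY_t + \varepsilon_{st} \\
&= \beta_0 + \beta_1 \{\hat{\pi}_{P0} + \hat{\pi}_{P5} STORMY_t\} + \beta_2 STORMY_t + \varepsilon_{st} \\
&= \{\beta_0 + \beta_1 \hat{\pi}_{P0}\} + \{\beta_1 \hat{\pi}_{P5} + \beta_2\} STORMY_t + \varepsilon_{st} \\
&= \gamma_0 + \gamma_1 STORMY_t + \varepsilon_{st}
\end{aligned}$$

where $\gamma_0 = (\beta_0 + \beta_1 \hat{\pi}_{P0})$ and $\gamma_1 = (\beta_1 \hat{\pi}_{P5} + \beta_2)$.

So only estimates of $\gamma_0$ and $\gamma_1$ are identified. We are unable to obtain separate estimates of the structural parameters $\beta_1$ and $\beta_2$.

Now if the coefficients on the day of the week variables (that is, $\pi_{P1}$, $\pi_{P2}$, $\pi_{P3}$, and $\pi_{P4}$ in the reduced form for $\ln P_t$) are not identically zero but jointly insignificant, there will be a situation of almost exact collinearity. The estimated 2SLS coefficients $\hat{\beta}_1$ and $\hat{\beta}_2$ (in the structural supply function) will be imprecisely estimated and only weakly identified.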
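
The exact-collinearity case described above can be reproduced in a small base R sketch. The data are simulated and the first-stage coefficients are picked purely for illustration; the point is only that when the fitted log price is an exact linear function of stormy, `lm()` cannot separate the two coefficients and reports one of them as `NA`.

```r
# Hypothetical data: 'stormy' alternates 0/1 and lnq is simulated
set.seed(42)
n      <- 111
stormy <- rep(c(0, 1), length.out = n)
lnq    <- 8.5 - 0.4 * stormy + rnorm(n, sd = 0.7)

# Fitted log price when the day-of-week coefficients are all zero:
# an exact linear function of stormy (coefficients chosen for illustration)
lnp_hat <- -0.27 + 0.35 * stormy

# Second stage of the supply equation: lnp_hat and stormy are perfectly collinear
second_stage <- lm(lnq ~ lnp_hat + stormy)
print(coef(second_stage))  # the coefficient on stormy is reported as NA
```

Only the intercept and one slope can be estimated, matching the result that only the two composite parameters are identified.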

For the reduced form for $\ln P_t$ consider the null hypothesis:

$$H_0: \pi_{P1}=\pi_{P2}=\pi_{P3}=\pi_{P4}=0$$

against the alternative $H_A$ that at least one of these parameters is non-zero. The test statistic will follow an F-distribution with $(M, N-K-1)$ degrees of freedom.

Here the number of restrictions is $M=4$ and the model degrees of freedom are $(111-6)=105$.
The F critical value is $F_c \approx 2.46$.
The decision rule is to reject $H_0$ if the sample value of the F test statistic exceeds the critical value $F_c$.
Alternatively, reject $H_0$ if the p-value for the sample value of the test statistic is less than $\alpha=0.05$.
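
The critical value and decision rule can be checked directly with base R's F-distribution functions. This is a minimal sketch; the variable names are illustrative, and `f_sample` is the robust F statistic reported in Figure 2.

```r
alpha  <- 0.05
num_df <- 4        # number of restrictions (M)
den_df <- 111 - 6  # N - K - 1

# 5% critical value of the F(4, 105) distribution
f_crit <- qf(1 - alpha, df1 = num_df, df2 = den_df)
print(round(f_crit, 2))

# Decision rule applied to the sample F statistic from Figure 2
f_sample <- 0.7426
p_value  <- pf(f_sample, df1 = num_df, df2 = den_df, lower.tail = FALSE)
reject   <- f_sample > f_crit
print(c(p_value = p_value, reject = reject))
```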

Code
#------------------------------------
# Is Supply Equation Identified?
# Test Days of Week in reduced form for lnp
hnull_1 <- c("mon=0", "tue = 0", "wed=0", "thu=0")
linearHypothesis(reduced_lnp, hnull_1, vcov=cov6)

Linear hypothesis test:
mon = 0
tue = 0
wed = 0
thu = 0

Model 1: restricted model
Model 2: lnprice ~ mon + tue + wed + thu + stormy

Note: Coefficient covariance matrix supplied.

  Res.Df Df      F Pr(>F)
1    109                 
2    105  4 0.7426 0.5651
Figure 2: F Test for Significance of Day of Week Indicators in Reduced Form: ln Pt


The output provides a sample F-statistic of F=0.7426 with a p-value of 0.5651.

Since the p-value is larger than the desired level of significance we do not reject the null hypothesis.
Even if we could reject H0 our rule of thumb requires a value for the F test statistic of at least 10 in order to avoid the weak identification problem.
In practice, the supply equation is not identified in this example.

ii)

Test the hypothesis that necessary conditions for identification of the demand function are satisfied.

Solution

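Aside: Recall that for the 2SLS estimator we replace the endogenous variable $\ln P_t$ with the predicted values $\widehat{\ln P_t}$ from the reduced form for $\ln P_t$. Suppose that the coefficient on the stormy weather variable was zero, so $\hat{\pi}_{P5}=0$.
The predicted values become:

$$\widehat{\ln P_t} = \hat{\pi}_{P0} + \hat{\pi}_{P1} MON_t + \hat{\pi}_{P2} TUES_t + \hat{\pi}_{P3} WED_t + \hat{\pi}_{P4} THU_t$$

But if we replace $\ln P_t$ with $\widehat{\ln P_t}$ in the demand equation there will be exact collinearity. The estimated demand equation would be:

$$\begin{aligned}
\ln Q_t &= \alpha_0 + \alpha_1 \widehat{\ln P_t} + \alpha_2 MON_t + \alpha_3 TUES_t + \alpha_4 WED_t + \alpha_5 THU_t + \varepsilon_{dt} \\
&= \alpha_0 + \alpha_1 \{\hat{\pi}_{P0} + \hat{\pi}_{P1} MON_t + \hat{\pi}_{P2} TUES_t + \hat{\pi}_{P3} WED_t + \hat{\pi}_{P4} THU_t\} + \alpha_2 MON_t + \alpha_3 TUES_t + \alpha_4 WED_t + \alpha_5 THU_t + \varepsilon_{dt} \\
&= \{\alpha_0 + \alpha_1 \hat{\pi}_{P0}\} + \{\alpha_2 + \alpha_1 \hat{\pi}_{P1}\} MON_t + \{\alpha_3 + \alpha_1 \hat{\pi}_{P2}\} TUES_t + \{\alpha_4 + \alpha_1 \hat{\pi}_{P3}\} WED_t + \{\alpha_5 + \alpha_1 \hat{\pi}_{P4}\} THU_t + \varepsilon_{dt} \\
&= \gamma_0 + \gamma_1 MON_t + \gamma_2 TUES_t + \gamma_3 WED_t + \gamma_4 THU_t + \varepsilon_{dt}
\end{aligned}$$

where:

$$\begin{aligned}
\gamma_0 &= \alpha_0 + \alpha_1 \hat{\pi}_{P0} \\
\gamma_1 &= \alpha_2 + \alpha_1 \hat{\pi}_{P1} \\
\gamma_2 &= \alpha_3 + \alpha_1 \hat{\pi}_{P2} \\
\gamma_3 &= \alpha_4 + \alpha_1 \hat{\pi}_{P3} \\
\gamma_4 &= \alpha_5 + \alpha_1 \hat{\pi}_{P4}
\end{aligned}$$

So only estimates of $\gamma_0$, $\gamma_1$, $\gamma_2$, $\gamma_3$, and $\gamma_4$ are identified. We are unable to obtain a separate estimate of the structural parameter of interest $\alpha_1$.

Now if the coefficient on the stormy variable (that is, $\pi_{P5}$ in the reduced form for $\ln P_t$) is not identically zero but statistically insignificant, there will be a situation of almost exact collinearity.
The estimated 2SLS coefficient of interest $\hat{\alpha}_1$ (in the structural demand equation) will be imprecisely estimated and only weakly identified.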

The requirement for identification is that the coefficient on the variable stormy be statistically significant in the reduced form for $\ln P_t$. In this case, either a t-test or an F-test is sufficient. For the reduced form for $\ln P_t$ consider the null hypothesis:

$$H_0: \pi_{P5}=0$$

against the alternative $H_A: \pi_{P5} \neq 0$.

The test statistic will follow an F-distribution with $(M, N-K-1)$ degrees of freedom. Here the number of restrictions is $M=1$ and the model degrees of freedom are $(111-6)=105$.
The F critical value is $F_c \approx 3.92$.
The decision rule is to reject $H_0$ if the sample value of the F test statistic exceeds the critical value $F_c$. Alternatively, reject $H_0$ if the p-value for the sample value of the test statistic is less than $\alpha=0.05$.

Code
#------------------------------------
# Is Demand Equation Identified?
# Test stormy in reduced form for lnp
# can do simple t-test or F test
# print(summary(reduced_lnp))
hnull_2 <- c("stormy=0")
linearHypothesis(reduced_lnp, hnull_2, vcov=cov6)

Linear hypothesis test:
stormy = 0

Model 1: restricted model
Model 2: lnprice ~ mon + tue + wed + thu + stormy

Note: Coefficient covariance matrix supplied.

  Res.Df Df      F      Pr(>F)    
1    106                          
2    105  1 22.929 0.000005537 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Figure 3: F Test for Significance of Stormy Weather in Reduced Form: lnPt

The R output from the Wald test of $H_0: \pi_{P5}=0$ is given in Figure 3.

The output provides a sample F-statistic of F=22.929 with a p-value of 0.0000. Since the p-value is less than the desired level of significance we reject the null hypothesis.
The variable stormy is statistically significant in the reduced form for $\ln P_t$, and the F-statistic exceeds the rule-of-thumb value of 10, so the demand function is 'practically' identified.

Alternatively consider a two-sided t-test of the null hypothesis $H_0: \pi_{P5}=0$.
The test statistic will follow a t-distribution with $(N-K-1)$ degrees of freedom. Here the degrees of freedom are $(111-6)=105$.
The t critical value is $t_c \approx 1.96$.
The decision rule is to reject $H_0$ if $t>t_c$ or $t<-t_c$.
Alternatively, reject $H_0$ if the p-value for the sample value of the test statistic is less than $\alpha=0.05$.

The sample value of the test statistic (using the robust standard error) is calculated as:

$$t = \frac{\hat{\pi}_{P5}-0}{se(\hat{\pi}_{P5})} = \frac{0.346406}{0.072343} = 4.7884$$

Since $t>t_c$ we reject the null hypothesis. This agrees with the F test in Figure 3, since $t^2 = 4.7884^2 \approx 22.929$.
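
The t-test for the stormy coefficient, and its link to the F statistic in Figure 3, can be verified with a few lines of base R. This is a sketch using the estimate and robust standard error quoted above; the variable names are illustrative.

```r
pi_hat <- 0.346406   # estimated coefficient on stormy in the reduced form for lnprice
se_rob <- 0.072343   # robust (HC1) standard error
df     <- 111 - 6    # N - K - 1

t_stat  <- (pi_hat - 0) / se_rob
t_crit  <- qt(0.975, df)              # exact two-sided 5% critical value with 105 df
p_value <- 2 * pt(-abs(t_stat), df)   # two-sided p-value

print(c(t = t_stat, t_squared = t_stat^2, t_crit = t_crit, p_value = p_value))
```

With a single restriction the squared t statistic equals the F statistic, so the two tests always give the same decision.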

(d)

Estimate the demand equation by 2SLS. Comment on the magnitude of the estimated price elasticity.

Solution

Run the following code:

Code
demand_ols <- lm(lnquan ~ lnprice + mon + tue + wed+ thu, data=fish)
# print(summary(demand_ols))
demdf8 <-df.residual(demand_ols)
# Robust standard errors
# Adjust standard errors using sandwich package
cov4         <- vcovHC(demand_ols, type = "HC1")
demand_ols_se_r <-coeftest(demand_ols, vcov=cov4)
# print(demand_ols_se_r)
demand_ols_robust    <- sqrt(diag(cov4))                            # robust (HC1) standard errors for the OLS estimates
# Adjust F statistic 
wald_demand_ols_r <- waldtest(demand_ols, vcov = cov4, test="F")    # Sample F test statistic using cov4 varcov matrix
# print(wald_demand_ols_r)
fstat8 <- round(wald_demand_ols_r$"F"[2], digits=4)          # Sample value of F stat
pvalf8 <- round(wald_demand_ols_r$'Pr(>F)'[2], digits=4)     # p value of F test
numdf8 <- abs(wald_demand_ols_r$"Df"[2])   
#####################
# 2SLS for Demand Equation, with robust standard errors
# lnp is endogenous
# Instrument Set: mon tue wed thu stormy
#--------------------------------------
demand_iv <- ivreg(lnquan ~ lnprice + mon + tue + wed + thu |
                     mon + tue + wed + thu  + stormy, data=fish)
# print(summary(demand_iv))
demdf14 <-df.residual(demand_iv)
# Robust standard errors
# Adjust standard errors using sandwich package
cov7 <- vcovHC(demand_iv, type = "HC1")
demand_iv_se_r <-coeftest(demand_iv, vcov=cov7)
# print(demand_iv_se_r)
demand_iv_robust    <- sqrt(diag(cov7))                                  # robust (HC1) standard errors for the IV estimates
# Adjust F statistic 
wald_demand_iv_r <- waldtest(demand_iv, vcov = cov7, test="F")           # Sample F test statistic using cov7 varcov matrix
# print(wald_demand_iv_r)
fstat14 <- round(wald_demand_iv_r$"F"[2], digits=4)                      # Sample value of F stat
pvalf14 <- round(wald_demand_iv_r$'Pr(>F)'[2], digits=4)                 # p value of F test
numdf14 <- abs(wald_demand_iv_r$"Df"[2])
Code
stargazer(demand_ols,demand_iv, type = "html", dep.var.labels=c("(Log) Quantity"),
          covariate.labels=c("Intercept", "(Log) Price", "Monday",
                             "Tuesday", "Wednesday", "Thursday"), 
          column.labels = c("(Robust)", "(Robust)"),
          se        = list(demand_ols_robust, demand_iv_robust),
          omit.stat = "f",
          add.lines = list(c("F Statistic", fstat8, fstat14),
                           c("F p value", pvalf8, pvalf14),
                           c("F num df", numdf8, numdf14),
                           c("F dem df", demdf8, demdf14)),
          digits=4, align=TRUE,
          title = "Figure 4: Two Stage Least Squares Results for the Demand Function",
          intercept.bottom=FALSE,
          star.cutoffs = c(0.05, 0.01, 0.001))
Figure 4: Two Stage Least Squares Results for the Demand Function

                                        Dependent variable:
                                          (Log) Quantity
                                     OLS          instrumental variable
                                   (Robust)            (Robust)
                                     (1)                 (2)
Intercept                         8.6069***           8.5059***
                                  (0.1183)            (0.1521)
(Log) Price                      -0.5625***          -1.1194*
                                  (0.1522)            (0.4432)
Monday                            0.0143             -0.0254
                                  (0.2057)            (0.2214)
Tuesday                          -0.5162**           -0.5308**
                                  (0.1897)            (0.2021)
Wednesday                        -0.5554**           -0.5664**
                                  (0.1937)            (0.2069)
Thursday                          0.0816              0.1093
                                  (0.1620)            (0.1784)
F Statistic                       9.4007              4.7195
F p value                         0                   0.0006
F num df                          5                   5
F dem df                          105                 105
Observations                      111                 111
R2                                0.2205              0.1391
Adjusted R2                       0.1834              0.0981
Residual Std. Error (df = 105)    0.6702              0.7043
Note:                             *p<0.05; **p<0.01; ***p<0.001

The OLS estimation results (with robust standard errors) provide an estimate of the price elasticity of demand of -0.5625, so a 1% increase in price is associated with a 0.56% decrease in quantity demanded. However, in the presence of simultaneity bias, the OLS estimator will be both biased and inconsistent. Consider the null hypothesis $H_0: \alpha_1=-1$ against the alternative $H_A: \alpha_1 \neq -1$. The test statistic will follow a t-distribution with $(N-K-1)$ degrees of freedom. Here the degrees of freedom are $(111-6)=105$. The t critical value is $t_c \approx 1.96$. The decision rule is to reject $H_0$ if $t>t_c$ or $t<-t_c$. Alternatively, reject $H_0$ if the p-value for the sample value of the test statistic is less than $\alpha=0.05$.

The sample value of the test statistic (using the robust standard error) is calculated as:

$$t = \frac{a_1+1}{se(a_1)} = \frac{-0.5625+1}{0.1522} = 2.8747$$

Since $t>t_c$ we reject the null hypothesis. Based on the sample value of the test statistic, we would also reject the null hypothesis $H_0: \alpha_1 \leq -1$ against the alternative $H_A: \alpha_1 > -1$. Based on the OLS results, the sample evidence is consistent with an estimated elasticity of demand that is inelastic (an absolute value between 0 and 1).

The 2SLS estimation results are reported in Figure 4. The estimated coefficient on $\ln P$ represents an elasticity, so a 1% increase in price is associated with a 1.12% decrease in quantity demanded.
The p-value for a two-sided t-test about zero is 0.0130, so the estimate is statistically significant at the 5% level of significance. Consider the null hypothesis $H_0: \alpha_1=-1$ against the alternative $H_A: \alpha_1 \neq -1$. The test statistic will follow a t-distribution with $(N-K-1)$ degrees of freedom. Here the degrees of freedom are $(111-6)=105$. The t critical value is $t_c \approx 1.96$. The decision rule is to reject $H_0$ if $t>t_c$ or $t<-t_c$. Alternatively, reject $H_0$ if the p-value for the sample value of the test statistic is less than $\alpha=0.05$.

The sample value of the test statistic (using the robust standard error) is calculated as:

$$t = \frac{a_1+1}{se(a_1)} = \frac{-1.1194+1}{0.4432} = -0.2694$$

Since $-t_c<t<t_c$ we do not reject the null hypothesis. Based on the IV results, the sample evidence is consistent with an estimated elasticity of demand that is unitary elastic (an absolute value of exactly 1). Based on the sample value of the test statistic, we would also not reject the null hypothesis $H_0: \alpha_1 \leq -1$ against the alternative $H_A: \alpha_1 > -1$.
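
The two tests of unit elasticity can be reproduced with a small helper function in base R. This is a sketch that uses the rounded estimates and standard errors from Figure 4, so the last decimal places may differ slightly from the values in the text; the function name `t_vs_unit_elastic` is made up for this example.

```r
# t statistic for H0: alpha_1 = null_value (default: unit elastic demand, -1)
t_vs_unit_elastic <- function(estimate, se, null_value = -1) {
  (estimate - null_value) / se
}

df     <- 111 - 6
t_crit <- qt(0.975, df)   # exact two-sided 5% critical value with 105 df

t_ols <- t_vs_unit_elastic(-0.5625, 0.1522)   # OLS estimate and robust se
t_iv  <- t_vs_unit_elastic(-1.1194, 0.4432)   # 2SLS estimate and robust se

results <- data.frame(
  estimator = c("OLS", "2SLS"),
  t_stat    = round(c(t_ols, t_iv), 4),
  reject_H0 = abs(c(t_ols, t_iv)) > t_crit
)
print(results)
```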

Since the OLS estimator of the price elasticity of demand ($a_1$) will generally be upward biased (less negative) as a result of simultaneity bias, we would draw different conclusions regarding the elasticity of demand depending on whether we ignore this simultaneity bias (OLS) or use the 2SLS estimator.
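
The direction of the simultaneity bias can be illustrated with a small simulation in base R. Everything here is hypothetical: the structural parameters, error variances, and sample size are chosen for illustration and are not estimates from the Fulton data. 2SLS is done by hand in two `lm()` steps, so the second-stage standard errors are not valid and only the point estimates should be compared.

```r
set.seed(2024)
n <- 5000

# Hypothetical structural parameters
alpha0 <- 8.5; alpha1 <- -1.0                 # demand: q = alpha0 + alpha1 * p + e_d
beta0  <- 8.0; beta1  <-  0.5; beta2 <- -0.5  # supply: q = beta0 + beta1 * p + beta2 * stormy + e_s

stormy <- rbinom(n, 1, 0.3)
e_d    <- rnorm(n, sd = 0.5)
e_s    <- rnorm(n, sd = 0.5)

# Market equilibrium: set demand equal to supply, solve for p, then get q from demand
p <- (beta0 - alpha0 + beta2 * stormy + e_s - e_d) / (alpha1 - beta1)
q <- alpha0 + alpha1 * p + e_d

# OLS on the demand equation ignores that p is correlated with e_d
ols <- lm(q ~ p)

# 2SLS by hand: first stage p on stormy, second stage q on the fitted values
first_stage  <- lm(p ~ stormy)
p_hat        <- fitted(first_stage)
second_stage <- lm(q ~ p_hat)

print(c(true = alpha1,
        ols  = coef(ols)[["p"]],
        tsls = coef(second_stage)[["p_hat"]]))
```

In this setup the OLS slope is pulled towards zero (less negative) while the 2SLS slope stays close to the true value, which mirrors the pattern in Figure 4.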

Question 2

The labour market outcomes for partnered women are of great interest to governments, economists, financial planners, and many other interested players in the economy. Consider the following labour demand equation for partnered women:

$$\ln \text{wage}_i = \alpha_0 + \alpha_1 \text{hours}_i + \alpha_2 \text{educ}_i + \alpha_3 \text{exper}_i + \alpha_4 \text{exper}_i^2 + \alpha_5 \text{union}_i + \varepsilon_{wi} \qquad (1)$$

where:

$\ln \text{wage}_i$ = (log) hourly wage of individual i
$\text{hours}_i$ = weekly hours of work of individual i
$\text{educ}_i$ = completed educational attainment of individual i
$\text{exper}_i$ = years of labour market experience of individual i
$\text{union}_i$ = 1 if individual i is a member of a union, 0 otherwise

Consider the following labour supply function for partnered women:

$$\text{hours}_i = \beta_0 + \beta_1 \ln \text{wage}_i + \beta_2 \text{educ}_i + \beta_3 \text{childlt6}_i + \beta_4 \text{childge6}_i + \beta_5 \text{nlinc}_i + \varepsilon_{hi} \qquad (2)$$

where:

$\text{childlt6}_i$ = 1 if the youngest child in the household of individual i is < 6, 0 otherwise
$\text{childge6}_i$ = 1 if the youngest child in the household of individual i is ≥ 6, 0 otherwise
$\text{nlinc}_i$ = household income from all sources, excluding the employment income of individual i, in thousands of dollars

The reduced form equations for this system are given by:

$$\ln \text{wage}_i = \pi_{w0} + \pi_{w1} \text{educ}_i + \pi_{w2} \text{exper}_i + \pi_{w3} \text{exper}_i^2 + \pi_{w4} \text{union}_i + \pi_{w5} \text{childlt6}_i + \pi_{w6} \text{childge6}_i + \pi_{w7} \text{nlinc}_i + \upsilon_{wi}$$

$$\text{hours}_i = \pi_{h0} + \pi_{h1} \text{educ}_i + \pi_{h2} \text{exper}_i + \pi_{h3} \text{exper}_i^2 + \pi_{h4} \text{union}_i + \pi_{h5} \text{childlt6}_i + \pi_{h6} \text{childge6}_i + \pi_{h7} \text{nlinc}_i + \upsilon_{hi}$$

You have available a sample containing 2,867 observations that includes data on all the variables defined above.

a)

Consider the labour supply function (2). Do you think the condition $COV[\ln \text{wage}, \varepsilon_h \mid \text{educ}, \text{childlt6}, \text{childge6}, \text{nlinc}]=0$ is likely to be satisfied?
Clearly explain why or why not. Outline three possible reasons why this condition might not be satisfied. Explain the consequences for the OLS estimator if this condition is not satisfied.

Solution

The condition might not be satisfied as a result of:

  • Measurement Error: Measurement error in wages will induce a correlation between (observed) wages and the unobservable determinants of hours. Our prior is that the labour supply curve is upward sloping (when the substitution effect dominates the income effect) so $\beta_1>0$. In the case of classical measurement error, the OLS estimates of $\beta_1$ will be biased downward, towards zero (attenuation bias).

  • Omitted Variable Bias: There are likely (omitted) variables that are correlated both with wages and with the unobservable determinants of hours. For example, unobserved variables such as individual ability, motivation, or personality are likely correlated with wages and correlated with the unobserved tastes for work ($\varepsilon_h$).

  • Simultaneous Equation Bias: Equilibrium wages and equilibrium hours are likely determined jointly within a system of demand and supply. Wage is an endogenous variable and likely correlated with the error term $\varepsilon_h$ in model (2). Generally, the direction of the bias in the OLS estimator for the labour supply function will be difficult to determine. However, if the labour supply curve is upward sloping ($\beta_1>0$) and the labour demand function is downward sloping ($\alpha_1<0$), we would expect the OLS estimate of $\beta_1$ in the labour supply function to be biased downward.

If $COV(\ln \text{wage}, \varepsilon_h \mid X) \neq 0$, the OLS estimator will be biased and inconsistent: the bias does not disappear even in large samples. The OLS estimators for all of the parameters in model (2) are biased, not just the estimator for the parameter associated with the endogenous variable ln wage.
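
The attenuation bias from classical measurement error, the first reason listed above, can be seen in a short simulation. This is a hypothetical sketch in base R: the variables are simulated stand-ins, the true slope of 2 is arbitrary, and the measurement error has the same variance as the true regressor, so the OLS slope on the mismeasured variable should be roughly half the true slope.

```r
set.seed(1)
n <- 5000

wage_true <- rnorm(n)                        # true (unobserved) regressor, variance 1
hours     <- 1 + 2 * wage_true + rnorm(n)    # outcome generated from the true regressor
wage_obs  <- wage_true + rnorm(n)            # observed with classical measurement error, variance 1

b_true <- coef(lm(hours ~ wage_true))[["wage_true"]]   # infeasible regression on the true regressor
b_obs  <- coef(lm(hours ~ wage_obs))[["wage_obs"]]     # feasible regression on the mismeasured regressor

print(c(slope_true_regressor = b_true, slope_observed_regressor = b_obs))
```

Increasing `n` does not move the second slope back towards 2, which is what inconsistency means in practice.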

Figure 5: Wald Test of Hypotheses $H_0: \pi_{h5}=\pi_{h6}=\pi_{h7}=0$


Figure 6: Wald Test of Hypotheses $H_0: \pi_{h1}=\pi_{h5}=\pi_{h6}=\pi_{h7}=0$


Figure 7: Wald Test of Hypotheses $H_0: \pi_{h1}=\pi_{h2}=\pi_{h3}=0$


b)

Clearly explain whether the labour demand equation (1) satisfies the necessary condition for identification. Why or why not?
Using the information contained in Figure 5, Figure 6, Figure 7, or Figure 8, test the hypothesis that the necessary condition(s) for identification of the labour demand function (1) are satisfied, at the 5% level of significance.
Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your conclusion.

Solution

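Aside: Recall that the 2SLS estimator for the labour demand function replaces the endogenous variable hours with the predicted values $\widehat{\text{hours}}$ from the reduced form for hours. Suppose that the coefficients on the variables excluded from the demand function (childlt6, childge6, and nlinc) were jointly zero, so $\pi_{h5}=\pi_{h6}=\pi_{h7}=0$ in the reduced form for hours. The predicted values become:

$$\widehat{\text{hours}}_i = \hat{\pi}_{h0} + \hat{\pi}_{h1} \text{educ}_i + \hat{\pi}_{h2} \text{exper}_i + \hat{\pi}_{h3} \text{exper}_i^2 + \hat{\pi}_{h4} \text{union}_i$$

In this case, the estimated structural demand equation (1) would be:

$$\begin{aligned}
\ln \text{wage}_i &= \alpha_0 + \alpha_1 \widehat{\text{hours}}_i + \alpha_2 \text{educ}_i + \alpha_3 \text{exper}_i + \alpha_4 \text{exper}_i^2 + \alpha_5 \text{union}_i + \varepsilon_{wi} \\
&= \alpha_0 + \alpha_1 \{\hat{\pi}_{h0} + \hat{\pi}_{h1} \text{educ}_i + \hat{\pi}_{h2} \text{exper}_i + \hat{\pi}_{h3} \text{exper}_i^2 + \hat{\pi}_{h4} \text{union}_i\} + \alpha_2 \text{educ}_i + \alpha_3 \text{exper}_i + \alpha_4 \text{exper}_i^2 + \alpha_5 \text{union}_i + \varepsilon_{wi} \\
&= (\alpha_0 + \alpha_1 \hat{\pi}_{h0}) + (\alpha_2 + \alpha_1 \hat{\pi}_{h1}) \text{educ}_i + (\alpha_3 + \alpha_1 \hat{\pi}_{h2}) \text{exper}_i + (\alpha_4 + \alpha_1 \hat{\pi}_{h3}) \text{exper}_i^2 + (\alpha_5 + \alpha_1 \hat{\pi}_{h4}) \text{union}_i + \varepsilon_{wi} \\
&= \gamma_0 + \gamma_1 \text{educ}_i + \gamma_2 \text{exper}_i + \gamma_3 \text{exper}_i^2 + \gamma_4 \text{union}_i + \varepsilon_{wi}
\end{aligned}$$

where:

$$\begin{aligned}
\gamma_0 &= \alpha_0 + \alpha_1 \hat{\pi}_{h0} \\
\gamma_1 &= \alpha_2 + \alpha_1 \hat{\pi}_{h1} \\
\gamma_2 &= \alpha_3 + \alpha_1 \hat{\pi}_{h2} \\
\gamma_3 &= \alpha_4 + \alpha_1 \hat{\pi}_{h3} \\
\gamma_4 &= \alpha_5 + \alpha_1 \hat{\pi}_{h4}
\end{aligned}$$

Only the parameters $\gamma_0$, $\gamma_1$, $\gamma_2$, $\gamma_3$, and $\gamma_4$ are (econometrically) identified. We are unable to obtain separate estimates of the structural parameters of interest $\alpha_0$, $\alpha_1$, $\alpha_2$, $\alpha_3$, $\alpha_4$, and $\alpha_5$. Importantly, an estimate of $\alpha_1$, which is related to the slope of the labour demand function, cannot be obtained.

Now consider the case where the reduced form coefficients on the variables excluded from the demand function (that is, $\pi_{h5}$, $\pi_{h6}$, and $\pi_{h7}$) are not identically zero but jointly insignificant. The estimated 2SLS coefficients in the structural demand function will be imprecisely estimated and only weakly identified.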

Consider the joint test of the null hypothesis $H_0: \pi_{h5}=\pi_{h6}=\pi_{h7}=0$ against the alternative hypothesis that at least one of the coefficients on these variables is non-zero.
The test statistic will follow an F distribution with 3 numerator degrees of freedom and $(2{,}867-8)=2{,}859$ denominator degrees of freedom.
The decision rule is to reject $H_0$ if the sample value of the test statistic exceeds the critical value $F_c$. Alternatively, at the 5% level of significance, reject $H_0$ if the p value of the sample test statistic is less than 0.05.
Figure 5 provides the value of the F test statistic for the null hypothesis that all of the excluded variables are jointly insignificant in the reduced form for hours. The value of the test statistic is 30.247 with a p value of 0.0000. Since the p value is less than 0.05, we reject the null hypothesis.
In practice, the demand equation is identified in this example.
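
Both versions of the decision rule can be checked in base R. This is a minimal sketch using the F statistic quoted above for Figure 5; the variable names are illustrative.

```r
alpha    <- 0.05
num_df   <- 3           # number of excluded variables being tested
den_df   <- 2867 - 8    # observations minus reduced-form parameters
f_sample <- 30.247      # sample F statistic from Figure 5

f_crit  <- qf(1 - alpha, df1 = num_df, df2 = den_df)                      # 5% critical value
p_value <- pf(f_sample, df1 = num_df, df2 = den_df, lower.tail = FALSE)   # upper-tail p value

reject_by_critical_value <- f_sample > f_crit
reject_by_p_value        <- p_value < alpha
print(c(f_crit = f_crit, p_value = p_value))
```

The two rules always agree: the statistic exceeds the critical value exactly when the p value falls below the significance level.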

c)

Clearly explain whether the labour supply equation (2) satisfies the necessary condition for identification. Why or why not?
Using the information contained in Figure 5, Figure 6, Figure 7, or Figure 8, test the hypothesis that the necessary condition(s) for identification of the labour supply function (2) are satisfied, at the 5% level of significance. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your conclusion.

Solution

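Aside: Recall that the 2SLS estimator for the labour supply function replaces the endogenous variable lnwage with the predicted values $\widehat{\ln \text{wage}}$ from the reduced form for lnwage. Suppose that the coefficients on the variables excluded from the supply function (exper, exper squared, and union) were jointly zero, so $\pi_{w2}=\pi_{w3}=\pi_{w4}=0$. The predicted values for the reduced form for lnwage become:

$$\widehat{\ln \text{wage}}_i = \hat{\pi}_{w0} + \hat{\pi}_{w1} \text{educ}_i + \hat{\pi}_{w5} \text{childlt6}_i + \hat{\pi}_{w6} \text{childge6}_i + \hat{\pi}_{w7} \text{nlinc}_i$$

In this case, the estimated structural supply equation (2) would be:

$$\begin{aligned}
\text{hours}_i &= \beta_0 + \beta_1 \widehat{\ln \text{wage}}_i + \beta_2 \text{educ}_i + \beta_3 \text{childlt6}_i + \beta_4 \text{childge6}_i + \beta_5 \text{nlinc}_i + \varepsilon_{hi} \\
&= \beta_0 + \beta_1 \{\hat{\pi}_{w0} + \hat{\pi}_{w1} \text{educ}_i + \hat{\pi}_{w5} \text{childlt6}_i + \hat{\pi}_{w6} \text{childge6}_i + \hat{\pi}_{w7} \text{nlinc}_i\} + \beta_2 \text{educ}_i + \beta_3 \text{childlt6}_i + \beta_4 \text{childge6}_i + \beta_5 \text{nlinc}_i + \varepsilon_{hi} \\
&= (\beta_0 + \beta_1 \hat{\pi}_{w0}) + (\beta_2 + \beta_1 \hat{\pi}_{w1}) \text{educ}_i + (\beta_3 + \beta_1 \hat{\pi}_{w5}) \text{childlt6}_i + (\beta_4 + \beta_1 \hat{\pi}_{w6}) \text{childge6}_i + (\beta_5 + \beta_1 \hat{\pi}_{w7}) \text{nlinc}_i + \varepsilon_{hi} \\
&= \delta_0 + \delta_1 \text{educ}_i + \delta_2 \text{childlt6}_i + \delta_3 \text{childge6}_i + \delta_4 \text{nlinc}_i + \varepsilon_{hi}
\end{aligned}$$

where:

$$\begin{aligned}
\delta_0 &= \beta_0 + \beta_1 \hat{\pi}_{w0} \\
\delta_1 &= \beta_2 + \beta_1 \hat{\pi}_{w1} \\
\delta_2 &= \beta_3 + \beta_1 \hat{\pi}_{w5} \\
\delta_3 &= \beta_4 + \beta_1 \hat{\pi}_{w6} \\
\delta_4 &= \beta_5 + \beta_1 \hat{\pi}_{w7}
\end{aligned}$$

Only the parameters $\delta_0$, $\delta_1$, $\delta_2$, $\delta_3$, and $\delta_4$ are (econometrically) identified. We are unable to obtain separate estimates of the structural parameters of interest $\beta_0$, $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$, and $\beta_5$.
Importantly, an estimate of $\beta_1$, which is related to the slope of the labour supply function, cannot be obtained.

Now consider the case where the reduced form coefficients on the variables excluded from the supply function (that is, $\pi_{w2}$, $\pi_{w3}$, and $\pi_{w4}$) are not identically zero but jointly insignificant. The estimated 2SLS coefficients in the structural supply function will be imprecisely estimated and only weakly identified.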

Consider the joint test of the null hypothesis $H_0: \pi_{w2}=\pi_{w3}=\pi_{w4}=0$ against the alternative hypothesis that at least one of the coefficients on these variables is non-zero. The test statistic will follow an F distribution with 3 numerator degrees of freedom and $(2{,}867-8)=2{,}859$ denominator degrees of freedom.
The decision rule is to reject $H_0$ if the sample value of the test statistic exceeds the critical value $F_c$. Alternatively, at the 5% level of significance, reject $H_0$ if the p value of the sample test statistic is less than 0.05.
Figure 8 provides the value of the F test statistic for the null hypothesis that all of the excluded variables are jointly insignificant in the reduced form for lnwage. The value of the test statistic is 50.925 with a p value of 0.0000. Since the p value is less than 0.05, we reject the null hypothesis. In practice, the labour supply equation is identified in this example.

d)

The labour supply equation (2) was estimated by the method of Two-Stage Least Squares (2SLS) and the results are reported in Figure 9.

Figure 9: 2SLS Regression Results (with robust standard errors) for Model 2


What is the interpretation of the parameter estimate $\hat{\beta}_3$ in Figure 9?

Solution

The variable childlt6 is an indicator (dummy) variable such that:

$$\beta_3 = E[\text{hours} \mid \text{childlt6}=1, X] - E[\text{hours} \mid \text{childlt6}=0, X]$$

so $\beta_3$ is the average difference in hours of work for partnered women with (at least) one child under 6 in their household, relative to partnered women who do not have (at least) one child under 6 in their household, controlling for hourly wage, education, and non-labour income.
The estimate $\hat{\beta}_3=-3.0832$ implies that, on average, partnered women with (at least) one child under 6 in their household work approximately three (3) hours less per week, relative to partnered women who do not have (at least) one child under 6 in their household, controlling for hourly wage, education, and non-labour income.
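
The difference-in-means interpretation of a dummy coefficient can be checked on simulated data. This sketch is hypothetical and deliberately simplified: it uses OLS with no other controls, where the coefficient equals the raw difference in group means exactly. With controls, as in model (2), the coefficient is instead the difference in conditional means.

```r
set.seed(7)
n <- 500

childlt6 <- rbinom(n, 1, 0.4)                       # hypothetical indicator variable
hours    <- 38 - 3 * childlt6 + rnorm(n, sd = 5)    # simulated weekly hours

fit        <- lm(hours ~ childlt6)
diff_means <- mean(hours[childlt6 == 1]) - mean(hours[childlt6 == 0])

print(c(coefficient = coef(fit)[["childlt6"]], diff_in_means = diff_means))
```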