8.3 Asymptotic normality
Let \(\theta\) denote a scalar parameter of the GWN model for returns and let \(\hat{\theta}\) denote the plug-in estimator of \(\theta\). In Chapter 7 we stated that \(\hat{\theta}\) is asymptotically normally distributed
\[\begin{equation*} \hat{\theta}\sim N(\theta,\widehat{\mathrm{se}}(\hat{\theta})^{2}), \end{equation*}\]
for large enough \(T\), where \(\widehat{\mathrm{se}}(\hat{\theta})\) denotes the estimated standard error of \(\hat{\theta}.\) This result is justified by the CLT. Let \(f(\theta)\) denote a real-valued continuous and differentiable scalar function of \(\theta\). Using a technique called the delta method, we can show that the plug-in estimator \(f(\hat{\theta})\) is also asymptotically normally distributed:
\[\begin{equation*} f(\hat{\theta}) \sim N(f(\theta),\widehat{\mathrm{se}}(f(\hat{\theta}))^{2}), \end{equation*}\]
for large enough \(T\), where \(\widehat{\mathrm{se}}(f(\hat{\theta}))\) is the estimated standard error for \(f(\hat{\theta})\). The delta method shows us how to compute \(\widehat{\mathrm{se}}(f(\hat{\theta}))^2\):
\[\begin{equation} \widehat{\mathrm{se}}(f(\hat{\theta}))^{2} = f'(\hat{\theta})^2 \times \widehat{\mathrm{se}}(\hat{\theta})^{2}, \end{equation}\]
where \(f'(\hat{\theta})\) denotes the derivative of \(f(\theta)\) evaluated at \(\hat{\theta}\). Then,
\[\begin{equation} \widehat{\mathrm{se}}(f(\hat{\theta})) = \sqrt{f'(\hat{\theta})^2 \times \widehat{\mathrm{se}}(\hat{\theta})^{2}}. \end{equation}\]
In Chapter 7 we stated that, for large enough \(T\),
\[\begin{equation*} \hat{\sigma}^2 \sim N(\sigma^2, \mathrm{se}(\hat{\sigma}^2)^2), ~ \hat{\sigma} \sim N(\sigma,\mathrm{se}(\hat{\sigma})^2), \end{equation*}\]
where \(\mathrm{se}(\hat{\sigma}^2)^2 = 2\sigma^4/T\) and \(\mathrm{se}(\hat{\sigma})^2 = \sigma^2/2T\). The formula for \(\mathrm{se}(\hat{\sigma})^2\) is the result of the delta method for the function \(f(\sigma^2)= \sqrt{\sigma^2}=\sigma\) applied to the asymptotic distribution of \(\hat{\sigma}^2\). To see this, first note that
\[\begin{equation*} f'(\sigma^2) = \frac{1}{2} (\sigma^2)^{-1/2}, \end{equation*}\]
so that \(f'(\sigma^2)^2 = \frac{1}{4}(\sigma^2)^{-1}\). Then,
\[\begin{align*} \mathrm{se}(f(\hat{\sigma}^2))^{2} = f'(\sigma^2)^2 \times \mathrm{se}(\hat{\sigma}^2)^2 & = \frac{1}{4}(\sigma^2)^{-1} \times 2\sigma^4/T \\ & = \frac{1}{2}\sigma^2/T = \sigma^2/2T \\ & = \mathrm{se}(\hat{\sigma})^2. \end{align*}\]
The estimated squared standard error replaces \(\sigma^2\) with its estimate \(\hat{\sigma}^2\) and is given by \[\begin{equation*} \widehat{\mathrm{se}}(\hat{\sigma})^2 = \hat{\sigma}^2/2T. \end{equation*}\]
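As a quick numerical check of this calculation, the following sketch (using a hypothetical estimate \(\hat{\sigma}^2 = 0.01\) and sample size \(T = 172\), chosen for illustration only) applies the scalar delta method formula directly and compares it with the analytic result \(\hat{\sigma}^2/2T\):
sigma2hat = 0.01                                   # hypothetical estimated variance
n.obs = 172                                        # hypothetical sample size T
se.sigma2hat = sqrt(2*sigma2hat^2/n.obs)           # estimated se of sigma^2-hat
fprime = 0.5*sigma2hat^(-0.5)                      # f'(sigma^2) for f(x) = sqrt(x)
se.sigmahat.delta = sqrt(fprime^2*se.sigma2hat^2)  # delta method se of sigma-hat
se.sigmahat.direct = sqrt(sigma2hat/(2*n.obs))     # analytic formula sigma-hat^2/2T
c(se.sigmahat.delta, se.sigmahat.direct)           # the two values agree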
\(\blacksquare\)
Now suppose \(\theta\) is a \(k \times 1\) vector of GWN model parameters, and \(f:\mathbb{R}^k \rightarrow \mathbb{R}\) is a continuous and differentiable function of \(\theta\). Define the \(k \times 1\) gradient function \(g(\theta)\) as:
\[\begin{equation} g(\theta) = \frac{\partial f(\theta)}{\partial \theta} = \left( \begin{array}{c} \frac{\partial f(\theta)}{\partial \theta_1} \\ \vdots \\ \frac{\partial f(\theta)}{\partial \theta_k} \end{array} \right). \end{equation}\]
Assume that \(\hat{\theta}\) is asymptotically normally distributed:
\[\begin{equation*} \hat{\theta} \sim N(\theta, \widehat{\mathrm{var}}(\hat{\theta})), \end{equation*}\]
where \(\widehat{\mathrm{var}}(\hat{\theta})\) is the \(k \times k\) estimated variance-covariance matrix of \(\hat{\theta}\). Then the delta method shows us that \(f(\hat{\theta})\) is asymptotically normally distributed:
\[\begin{equation*} f(\hat{\theta}) \sim N(f(\theta), \widehat{\mathrm{se}}(f(\hat{\theta}))^{2}), \end{equation*}\]
where
\[\begin{equation*} \widehat{\mathrm{se}}(f(\hat{\theta}))^{2} = g(\hat{\theta})'\widehat{\mathrm{var}}(\hat{\theta})g(\hat{\theta}). \end{equation*}\]
Let \(\theta = (\mu, \sigma)'\) and consider \(f_1(\theta) = q_{\alpha}^R = \mu + \sigma q_{\alpha}^Z\) (normal simple return quantile). The gradient function \(g_1(\theta)\) for \(f_1(\theta)\) is:
\[\begin{align*} g_1(\theta) = \frac{\partial f_1(\theta)}{\partial \theta} = \left( \begin{array}{c} \frac{\partial f_1(\theta)}{\partial \mu} \\ \frac{\partial f_1(\theta)}{\partial \sigma} \end{array} \right) = \left( \begin{array}{c} 1 \\ q_{\alpha}^Z \end{array} \right). \end{align*}\]
Regarding \(\mathrm{var}(\hat{\theta})\), recall from Chapter 7, in the GWN model: \[\begin{equation*} \left(\begin{array}{c} \hat{\mu}\\ \hat{\sigma}^2 \end{array}\right)\sim N\left(\left(\begin{array}{c} \mu\\ \sigma^2 \end{array}\right),\left(\begin{array}{cc} \mathrm{se}(\hat{\mu})^{2} & 0\\ 0 & \mathrm{se}(\hat{\sigma}^2)^{2} \end{array}\right)\right) \end{equation*}\]
for large enough \(T\). We show below, using the delta method, that
\[\begin{equation*} \left(\begin{array}{c} \hat{\mu}\\ \hat{\sigma} \end{array}\right)\sim N\left(\left(\begin{array}{c} \mu\\ \sigma \end{array}\right),\left(\begin{array}{cc} \mathrm{se}(\hat{\mu})^{2} & 0\\ 0 & \mathrm{se}(\hat{\sigma})^{2} \end{array}\right)\right) \end{equation*}\]
Hence,
\[\begin{equation*} \mathrm{var}(\hat{\theta}) = \left(\begin{array}{cc} \mathrm{se}(\hat{\mu})^{2} & 0\\ 0 & \mathrm{se}(\hat{\sigma})^{2} \end{array}\right) = \left(\begin{array}{cc} \frac{\sigma^2}{T} & 0\\ 0 & \frac{\sigma^2}{2T} \end{array}\right) \end{equation*}\]
Then,
\[\begin{align*} g_1(\theta)'\mathrm{var}(\hat{\theta})g_1(\theta) &= \left( \begin{array}{cc} 1 & q_{\alpha}^Z \end{array} \right) \left(\begin{array}{cc} \mathrm{se}(\hat{\mu})^{2} & 0\\ 0 & \mathrm{se}(\hat{\sigma})^{2} \end{array}\right) \left( \begin{array}{c} 1 \\ q_{\alpha}^Z \end{array} \right) \\ &= \mathrm{se}(\hat{\mu})^{2} + (q_{\alpha}^Z)^2 \mathrm{se}(\hat{\sigma})^{2} \\ & =\frac{\sigma^{2}}{T}+\frac{\left(q_{\alpha}^{Z}\right)^{2}\sigma^{2}}{2T}\\ & =\frac{\sigma^{2}}{T}\left[1+\frac{1}{2}\left(q_{\alpha}^{Z}\right)^{2}\right] \\ &= \mathrm{se}(\hat{q}_{\alpha}^R)^2. \end{align*}\]
The standard error for \(\hat{q}_{\alpha}^{R}\) is: \[\begin{equation*} \mathrm{se}(\hat{q}_{\alpha}^{R})=\sqrt{\mathrm{var}(\hat{q}_{\alpha}^{R})}=\frac{\sigma}{\sqrt{T}}\left[1+\frac{1}{2}\left(q_{\alpha}^{Z}\right)^{2}\right]^{1/2}.\tag{8.5} \end{equation*}\]
The formula (8.5) shows that \(\mathrm{se}(\hat{q}_{\alpha}^{R})\) increases with \(\sigma\) and \(q_{\alpha}^{Z}\), and decreases with \(T\). In addition, for fixed \(T\), \(\mathrm{se}(\hat{q}_{\alpha}^{R})\rightarrow\infty\) as \(\alpha\rightarrow0\) or as \(\alpha\rightarrow1\) because \(q_{\alpha}^{Z}\rightarrow-\infty\) as \(\alpha\rightarrow0\) and \(q_{\alpha}^{Z}\rightarrow\infty\) as \(\alpha\rightarrow1\). Hence, we have very large estimation error in \(\hat{q}_{\alpha}^{R}\) when \(\alpha\) is very close to zero or one (for fixed values of \(\sigma\) and \(T\)). This makes intuitive sense as we have very few observations in the extreme tails of the empirical distribution of the data for a fixed sample size and so we expect a larger estimation error.
Replacing \(\sigma\) with its estimate \(\hat{\sigma}\) gives the estimated standard error for \(\hat{q}_{\alpha}^{R}\):
\[\begin{equation*} \widehat{\mathrm{se}}(\hat{q}_{\alpha}^{R})=\frac{\hat{\sigma}}{\sqrt{T}}\left[1+\frac{1}{2}\left(q_{\alpha}^{Z}\right)^{2}\right]^{1/2}. \end{equation*}\]
Using the above results, the sampling distribution of \(\hat{q}_{\alpha}^{R}\) can be approximated by the normal distribution: \[\begin{equation} \hat{q}_{\alpha}^{R}\sim N(q_{\alpha}^{R},~\widehat{\mathrm{se}}(\hat{q}_{\alpha}^{R})^{2}),\tag{8.6} \end{equation}\] for large enough \(T\).
The estimates of \(q_{\alpha}^{R}\), \(\mathrm{se}(\hat{q}_{\alpha}^{R})\), and 95% confidence intervals, for \(\alpha=0.05,\,0.01\) and \(0.001\), from the simple monthly returns for Microsoft are:
qhat.05 = muhatS + sigmahatS*qnorm(0.05)
qhat.01 = muhatS + sigmahatS*qnorm(0.01)
qhat.001 = muhatS + sigmahatS*qnorm(0.001)
seQhat.05 = (sigmahatS/sqrt(n.obs))*sqrt(1 + 0.5*qnorm(0.05)^2)
seQhat.01 = (sigmahatS/sqrt(n.obs))*sqrt(1 + 0.5*qnorm(0.01)^2)
seQhat.001 = (sigmahatS/sqrt(n.obs))*sqrt(1 + 0.5*qnorm(0.001)^2)
lowerQhat.05 = qhat.05 - 2*seQhat.05
upperQhat.05 = qhat.05 + 2*seQhat.05
lowerQhat.01 = qhat.01 - 2*seQhat.01
upperQhat.01 = qhat.01 + 2*seQhat.01
lowerQhat.001 = qhat.001 - 2*seQhat.001
upperQhat.001 = qhat.001 + 2*seQhat.001
ans = cbind(c(qhat.05, qhat.01, qhat.001),
c(seQhat.05, seQhat.01, seQhat.001),
c(lowerQhat.05, lowerQhat.01, lowerQhat.001),
c(upperQhat.05, upperQhat.01, upperQhat.001))
colnames(ans) = c("Estimates", "Std Errors", "2.5%", "97.5%")
rownames(ans) = c("q.05", "q.01", "q.001")
ans
## Estimates Std Errors 2.5% 97.5%
## q.05 -0.158 0.0119 -0.182 -0.134
## q.01 -0.227 0.0149 -0.257 -0.197
## q.001 -0.304 0.0186 -0.342 -0.267
For Microsoft the values of \(\widehat{\mathrm{se}}(\hat{q}_{\alpha}^{R})\) are about 1.2%, 1.5%, and 1.9% for \(\alpha=0.05\), \(\alpha=0.01\) and \(\alpha=0.001\), respectively. The 95% confidence interval widths increase as \(\alpha\) decreases, indicating more uncertainty about the true value of \(q_{\alpha}^R\) for smaller values of \(\alpha\). For example, the 95% confidence interval for \(q_{0.01}^R\) indicates that the true loss with \(1\%\) probability (one month out of 100 months) is between \(19.7\%\) and \(25.7\%\).
\(\blacksquare\)
Let \(\theta = (\mu, \sigma)'\) and consider the functions \(f_2(\theta) = \mathrm{VaR}_{\alpha}^N = -W_0(\mu + \sigma q_{\alpha}^Z)\) (normal VaR), \(f_3(\theta) = \mathrm{VaR}_{\alpha}^{LN} = -W_0(\exp((\mu + \sigma q_{\alpha}^Z)) - 1)\) (log-normal VaR), and \(f_4(\theta) = \mathrm{SR} = \frac{\mu-r_f}{\sigma}\) (Sharpe Ratio). It is straightforward to show (see end-of-chapter exercises) that the gradient functions are:
\[\begin{align*} g_2(\theta) & = \frac{\partial f_2(\theta)}{\partial \theta} = \left( \begin{array}{c} -W_0 \\ -W_0 q_{\alpha}^Z \end{array} \right), \\ g_3(\theta) & = \frac{\partial f_3(\theta)}{\partial \theta} = \left( \begin{array}{c} -W_0 \exp(\mu + \sigma q_{\alpha}^Z) \\ -W_0 \exp(\mu + \sigma q_{\alpha}^Z)q_{\alpha}^Z \end{array} \right), \\ g_4(\theta) & = \frac{\partial f_4(\theta)}{\partial \theta} = \left( \begin{array}{c} \frac{1}{\sigma} \\ -(\sigma)^{-2}(\mu - r_f) \end{array} \right). \end{align*}\]
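These gradients are easy to check numerically. The sketch below compares the analytic gradient of the Sharpe ratio function \(f_4\) with a numerical derivative computed by the numDeriv package; the values of \(\mu\), \(\sigma\), and \(r_f\) are hypothetical and for illustration only:
library(numDeriv)
# hypothetical parameter values for illustration only
mu = 0.01; sigma = 0.10; r.f = 0.0025
f4 = function(theta) (theta[1] - r.f)/theta[2]   # Sharpe ratio as a function of theta = (mu, sigma)'
grad(f4, c(mu, sigma))                           # numerical gradient
c(1/sigma, -(mu - r.f)/sigma^2)                  # analytic gradient g4(theta)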
The delta method variances are:
\[\begin{align*} g_2(\theta)'\mathrm{var}(\hat{\theta})g_2(\theta) &= W_0^2 \mathrm{se}(\hat{q}_{\alpha}^R)^2 = \mathrm{se} \left( \widehat{\mathrm{VaR}}_{\alpha}^{N} \right )^2 \\ g_3(\theta)'\mathrm{var}(\hat{\theta})g_3(\theta) &= W_0^2 \exp(q_{\alpha}^R)^2 \mathrm{se}(\hat{q}_{\alpha}^R)^2 = \mathrm{se} \left( \widehat{\mathrm{VaR}}_{\alpha}^{LN} \right )^2 \\ g_4(\theta)'\mathrm{var}(\hat{\theta})g_4(\theta) &= \frac{1}{T}\left( 1 + \frac{1}{2}\mathrm{SR}^2 \right ) = \mathrm{se}(\widehat{\mathrm{SR}})^2. \end{align*}\]
The corresponding standard errors are:
\[\begin{align*} \mathrm{se} \left( \widehat{\mathrm{VaR}}_{\alpha}^{N} \right ) &= W_0 \mathrm{se}(\hat{q}_{\alpha}^R) \\ \mathrm{se} \left( \widehat{\mathrm{VaR}}_{\alpha}^{LN} \right ) &= W_0 \exp(q_{\alpha}^R) \mathrm{se}(\hat{q}_{\alpha}^R) \\ \mathrm{se}(\widehat{\mathrm{SR}}) &= \sqrt{\frac{1}{T}\left( 1 + \frac{1}{2}\mathrm{SR}^2 \right ) }, \end{align*}\]
where
\[\begin{equation*} \mathrm{se}(\hat{q}_{\alpha}^R) = \sqrt{\frac{\sigma^{2}}{T}\left[1+\frac{1}{2}\left(q_{\alpha}^{Z}\right)^{2}\right]}. \end{equation*}\]
We make the following remarks:
- \(\mathrm{se}(\widehat{\mathrm{VaR}}_{\alpha}^{N})\) and \(\mathrm{se}(\widehat{\mathrm{VaR}}_{\alpha}^{LN})\) increase with \(W_{0}\), \(\sigma\), and \(|q_{\alpha}^{Z}|\), and decrease with \(T\).
- For fixed values of \(W_{0}\), \(\sigma\), and \(T\), \(\mathrm{se}(\widehat{\mathrm{VaR}}_{\alpha}^N)\) and \(\mathrm{se}(\widehat{\mathrm{VaR}}_{\alpha}^{LN})\) become very large as \(\alpha\) approaches zero. That is, for very small loss probabilities we will have very bad estimates of \(\mathrm{VaR}_{\alpha}^N\) and \(\mathrm{VaR}_{\alpha}^{LN}\).
- \(\mathrm{se}(\widehat{\mathrm{SR}})\) increases with \(\mathrm{SR}^2\) and decreases with \(T\).
The practically useful estimated standard errors replace the unknown values of \(q_{\alpha}^R\) and \(\mathrm{SR}\) with their estimated values \(\hat{q}_{\alpha}^R\) and \(\widehat{\mathrm{SR}}\) and are given by:
\[\begin{align} \mathrm{\widehat{se}} \left( \widehat{\mathrm{VaR}}_{\alpha}^{N} \right ) &= W_0 \mathrm{\widehat{se}}(\hat{q}_{\alpha}^R) \tag{8.7}\\ \mathrm{\widehat{se}} \left( \widehat{\mathrm{VaR}}_{\alpha}^{LN} \right ) &= W_0 \exp(\hat{q}_{\alpha}^R) \mathrm{\widehat{se}}(\hat{q}_{\alpha}^R) \tag{8.8} \\ \mathrm{\widehat{se}}(\widehat{\mathrm{SR}}) &= \sqrt{\frac{1}{T}\left( 1 + \frac{1}{2}\widehat{\mathrm{SR}}^2 \right ) } \end{align}\]
\(\blacksquare\)
Consider a \(\$100,000\) investment for one month in Microsoft. The estimates of \(\mathrm{VaR}_{\alpha}^{N}\), its standard error, and \(95\%\) confidence intervals, for \(\alpha=0.05\) and \(\alpha=0.01\) are:
W0=100000
qhat.05 = muhatS + sigmahatS*qnorm(0.05)
qhat.01 = muhatS + sigmahatS*qnorm(0.01)
VaR.N.05 = -qhat.05*W0
seVaR.N.05 = W0*seQhat.05
VaR.N.01 = -qhat.01*W0
seVaR.N.01 = W0*seQhat.01
lowerVaR.N.05 = VaR.N.05 - 2*seVaR.N.05
upperVaR.N.05 = VaR.N.05 + 2*seVaR.N.05
lowerVaR.N.01 = VaR.N.01 - 2*seVaR.N.01
upperVaR.N.01 = VaR.N.01 + 2*seVaR.N.01
ans = cbind(c(VaR.N.05, VaR.N.01),
c(seVaR.N.05, seVaR.N.01),
c(lowerVaR.N.05, lowerVaR.N.01),
c(upperVaR.N.05,upperVaR.N.01))
colnames(ans) = c("Estimate", "Std Error", "2.5%", "97.5%")
rownames(ans) = c("VaR.N.05", "VaR.N.01")
ans
## Estimate Std Error 2.5% 97.5%
## VaR.N.05 15780 1187 13405 18154
## VaR.N.01 22696 1490 19717 25676
The estimated values of \(\mathrm{VaR}_{.05}^N\) and \(\mathrm{VaR}_{.01}^N\) are \(\$15,780\) and \(\$22,696\), respectively, with estimated standard errors of \(\$1,187\) and \(\$1,490\). The estimated standard errors are fairly small compared with the estimated VaR values, so we can say that VaR is estimated fairly precisely here. Notice that the estimated standard error for \(\mathrm{VaR}_{.01}^N\) is about \(26\%\) larger than the estimated standard error for \(\mathrm{VaR}_{.05}^N\), indicating we have less precision for estimating VaR for very small values of \(\alpha\).
The estimates of \(\mathrm{VaR}_{\alpha}^{LN}\) and its standard error for \(\alpha=0.05\) and \(\alpha=0.01\) are:
qhat.05 = muhatC + sigmahatC*qnorm(0.05)
qhat.01 = muhatC + sigmahatC*qnorm(0.01)
seQhat.05 = (sigmahatC/sqrt(n.obs))*sqrt(1 + 0.5*qnorm(0.05)^2)
seQhat.01 = (sigmahatC/sqrt(n.obs))*sqrt(1 + 0.5*qnorm(0.01)^2)
VaR.LN.05 = -W0*(exp(qhat.05)-1)
seVaR.LN.05 = W0*exp(qhat.05)*seQhat.05
VaR.LN.01 = -W0*(exp(qhat.01)-1)
seVaR.LN.01 = W0*exp(qhat.01)*seQhat.01
lowerVaR.LN.05 = VaR.LN.05 - 2*seVaR.LN.05
upperVaR.LN.05 = VaR.LN.05 + 2*seVaR.LN.05
lowerVaR.LN.01 = VaR.LN.01 - 2*seVaR.LN.01
upperVaR.LN.01 = VaR.LN.01 + 2*seVaR.LN.01
ans = cbind(c(VaR.LN.05, VaR.LN.01),
c(seVaR.LN.05, seVaR.LN.01),
c(lowerVaR.LN.05, lowerVaR.LN.01),
c(upperVaR.LN.05,upperVaR.LN.01))
colnames(ans) = c("Estimate", "Std Error", "2.5%", "97.5%")
rownames(ans) = c("VaR.LN.05", "VaR.LN.01")
ans
## Estimate Std Error 2.5% 97.5%
## VaR.LN.05 14846 998 12850 16842
## VaR.LN.01 20467 1170 18128 22807
The estimates of \(\mathrm{VaR}_{\alpha}^{LN}\) (and its standard error) are slightly smaller than the estimates of \(\mathrm{VaR}_{\alpha}^{N}\) (and its standard error) due to the positive skewness in the log-normal distribution.
\(\blacksquare\)
The estimated monthly SR for Microsoft (using \(r_f = 0.0025\)), its standard error and 95% confidence interval are computed using:
SRhat = (muhatS - r.f)/sigmahatS
seSRhat = sqrt((1/n.obs)*(1 + 0.5*(SRhat^2)))
lowerSR = SRhat - 2*seSRhat
upperSR = SRhat + 2*seSRhat
ans = c(SRhat, seSRhat, lowerSR, upperSR)
names(ans) = c("Estimate", "Std Error", "2.5%", "97.5%")
ans
## Estimate Std Error 2.5% 97.5%
##   0.0655   0.0763  -0.0871   0.2182
The estimated monthly SR for Microsoft is small, its estimated standard error is larger than the estimate itself, and the 95% confidence interval contains both negative and positive values. Clearly, the SR is not estimated very well.
\(\blacksquare\)
8.3.1 The numerical delta method
The delta method is an advanced statistical technique and requires a bit of calculus to implement, especially for nonlinear vector-valued functions. It can be easy to make mistakes when working out the math! Fortunately, in R there is an easy way to implement the delta method that does not require you to work out the math. The car function deltaMethod implements the delta method numerically, utilizing the stats function D(), which implements symbolic derivatives. The arguments expected by deltaMethod are:
## function (object, g., vcov., func = g., constants, level = 0.95,
## rhs = NULL, ..., envir = parent.frame())
## NULL
Here, object is the vector of named estimated parameters, \(\hat{\theta}\); g. is a quoted string giving the function of the parameter estimates to be evaluated, \(f(\theta)\); and vcov. is the named estimated covariance matrix of the coefficient estimates, \(\widehat{\mathrm{var}}(\hat{\theta})\). The function returns an object of class “deltaMethod” for which there is a print method. The “deltaMethod” object is essentially a data.frame with columns giving the parameter estimate(s), the delta method estimated standard error(s), and lower and upper confidence limits. The following example illustrates how to use deltaMethod for the example functions.
We first create the named inputs \(\hat{\theta}\) and \(\widehat{\mathrm{var}}(\hat{\theta})\) from the GWN model for simple returns:
thetahatS = c(muhatS, sigmahatS)
names(thetahatS) = c("mu", "sigma")
var.thetahatS = matrix(c(se.muhatS^2,0,0,se.sigmahatS^2), 2, 2, byrow=TRUE)
rownames(var.thetahatS) = colnames(var.thetahatS) = names(thetahatS)
thetahatS
## mu sigma
## 0.00915 0.10150
var.thetahatS
## mu sigma
## mu 5.99e-05 0.00e+00
## sigma 0.00e+00 2.99e-05
We do the same for the inputs from the GWN model for continuously compounded returns:
thetahatC = c(muhatC, sigmahatC)
names(thetahatC) = c("mu", "sigma")
var.thetahatC = matrix(c(se.muhatC^2,0,0,se.sigmahatC^2), 2, 2, byrow=TRUE)
rownames(var.thetahatC) = colnames(var.thetahatC) = names(thetahatC)
It is important to name the elements in thetahat and var.thetahat as these names will be used in defining \(f(\theta)\). To implement the numerical delta method for the normal quantile, we need to supply a quoted string specifying the function \(f_1(\theta) = \mu + \sigma q_{\alpha}^Z\). Here we use the string "mu+sigma*q.05", where q.05 = qnorm(0.05). For \(\alpha = 0.05\), the R code is:
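(A sketch of the call; the argument names follow the deltaMethod signature shown above, and dm1 and q.05 are the names used in the discussion below.)
library(car)
q.05 = qnorm(0.05)
# delta method for f1(theta) = mu + sigma*q.05 using the named estimates
# and estimated covariance matrix created above
dm1 = deltaMethod(thetahatS, g. = "mu + sigma*q.05", vcov. = var.thetahatS)
class(dm1)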
## [1] "deltaMethod" "data.frame"
The returned object dm1 is of class “deltaMethod”, inheriting from “data.frame”, and has named columns:
## [1] "Estimate" "SE" "2.5 %" "97.5 %"
The print method shows the results of applying the delta method:
## Estimate SE 2.5 % 97.5 %
## mu + sigma * q.05 -0.1578 0.0119 -0.1811 -0.13
The results match the analytic calculations performed above.
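The remaining example functions are handled in the same way. Sketches of the corresponding calls are given below; they assume the constants W0 and r.f defined earlier in the chapter, and the log-normal VaR uses the continuously compounded return inputs thetahatC and var.thetahatC:
# normal VaR for simple returns
dm2 = deltaMethod(thetahatS, g. = "-W0*(mu + sigma*q.05)", vcov. = var.thetahatS)
# log-normal VaR for continuously compounded returns
dm3 = deltaMethod(thetahatC, g. = "-W0*(exp(mu + sigma*q.05) - 1)", vcov. = var.thetahatC)
# Sharpe ratio for simple returns
dm4 = deltaMethod(thetahatS, g. = "(mu - r.f)/sigma", vcov. = var.thetahatS)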
The delta method for the normal VaR function \(f_2(\theta)\) is:
## Estimate SE 2.5 % 97.5 %
## -W0 * (mu + sigma * q.05) 15780 1187 13453 18106
The delta method for the log-normal VaR function \(f_3(\theta)\) is:
## Estimate SE 2.5 % 97.5 %
## -W0 * (exp(mu + sigma * q.05) - 1) 14846 998 12890 16802
The delta method for the Sharpe ratio function \(f_4(\theta)\) is
## Estimate SE 2.5 % 97.5 %
## (mu - r.f)/sigma 0.0655 0.0763 -0.0841 0.22
\(\blacksquare\)
8.3.2 The delta method for vector-valued functions (advanced)
Let \(\theta\) denote a \(k \times 1\) vector of GWN model parameters. For example, \(\theta = (\mu, \sigma^2)'\). Assume that \(\hat{\theta}\) is asymptotically normally distributed: \[\begin{equation} \hat{\theta} \sim N( \theta, \widehat{\mathrm{var}}(\hat{\theta})), \end{equation}\] for large enough \(T\), where \(\widehat{\mathrm{var}}(\hat{\theta})\) is the \(k \times k\) estimated variance-covariance matrix of \(\hat{\theta}\). Sometimes we need to consider the asymptotic distribution of a vector-valued function of \(\theta\). Let \(f:\mathbb{R}^k \rightarrow \mathbb{R}^j\) where \(j \le k\) be a continuous and differentiable function, and define the \(j \times 1\) vector \(\eta=f(\theta)\). It is useful to express \(\eta\) as:
\[\begin{equation} \eta = \begin{pmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_j \end{pmatrix} = \begin{pmatrix} f_1(\theta) \\ f_2(\theta) \\ \vdots \\ f_j(\theta) \end{pmatrix}. \end{equation}\]
For the first example let \(\theta=(\sigma_1^2, \sigma_2^2)'\), and define the \(2 \times 1\) vector of volatilities: \(\eta = f(\theta)=(\sigma_1, \sigma_2)'\). Then, \(\eta_1 = f_1(\theta) = \sqrt{\sigma_1^2} = \sigma_1\) and \(\eta_2 = f_2(\theta) = \sqrt{\sigma_2^2} = \sigma_2\).
For the second example, let \(\theta\) = \((\mu_1,\mu_2,\sigma_1,\sigma_2)'\) and define the \(2 \times 1\) vector of Sharpe ratios:
\[\begin{equation} \eta = f(\theta) = \begin{pmatrix} \frac{\mu_1 - r_f}{\sigma_1} \\ \frac{\mu_2 - r_f}{\sigma_2} \end{pmatrix} = \begin{pmatrix} \mathrm{SR}_1 \\ \mathrm{SR}_2 \end{pmatrix}, \end{equation}\]
where \(r_f\) denotes the constant risk-free rate. Then \[\begin{equation} \eta_1 = f_1(\theta) = \frac{\mu_1 - r_f}{\sigma_1} = \mathrm{SR}_1, ~ \eta_2 = f_2(\theta) = \frac{\mu_2 - r_f}{\sigma_2} = \mathrm{SR}_2. \end{equation}\]
\(\blacksquare\)
Define the \(j \times k\) Jacobian matrix: \[\begin{equation} \mathbf{G}(\theta)' = \begin{pmatrix} \frac{\partial f_1(\theta)}{\partial \theta_1} & \frac{\partial f_1(\theta)}{\partial \theta_2} & \cdots & \frac{\partial f_1(\theta)}{\partial \theta_k} \\ \frac{\partial f_2(\theta)}{\partial \theta_1} & \frac{\partial f_2(\theta)}{\partial \theta_2} & \cdots & \frac{\partial f_2(\theta)}{\partial \theta_k} \\ \vdots & \vdots & \ldots & \vdots \\ \frac{\partial f_j(\theta)}{\partial \theta_1} & \frac{\partial f_j(\theta)}{\partial \theta_2} & \cdots & \frac{\partial f_j(\theta)}{\partial \theta_k} \\ \end{pmatrix} \end{equation}\]
Then, by the delta method, the asymptotic normal distribution of \(\hat{\eta} = f(\hat{\theta})\) is: \[\begin{equation} \hat{\eta} \sim N(\eta, \widehat{\mathrm{var}}(\hat{\eta})), \end{equation}\] where \[\begin{equation} \widehat{\mathrm{var}}(\hat{\eta}) = \mathbf{G}(\hat{\theta})'\widehat{\mathrm{var}}(\hat{\theta})\mathbf{G}(\hat{\theta}).\tag{8.9} \end{equation}\]
Consider the first example function where \(\theta = (\sigma_1^2,\sigma_2^2)'\) and \(\eta = f(\theta)=(\sigma_1, \sigma_2)'\). Now, using the chain-rule:
\[\begin{align*} \frac{\partial f_1(\theta)}{\partial \sigma_1^2} & = \frac{1}{2}\sigma_1^{-1}, ~ \frac{\partial f_1(\theta)}{\partial \sigma_2^2} = 0 \\ \frac{\partial f_2(\theta)}{\partial \sigma_1^2} & = 0, ~ \frac{\partial f_2(\theta)}{\partial \sigma_2^2} = \frac{1}{2}\sigma_2^{-1} \end{align*}\]
The Jacobian matrix is
\[\begin{equation} \mathbf{G}(\theta)' = \begin{pmatrix} \frac{1}{2}\sigma_1^{-1} & 0 \\ 0 & \frac{1}{2}\sigma_2^{-1} \end{pmatrix} \end{equation}\]
From Chapter 7, Proposition 7.9,
\[\begin{equation} \widehat{\mathrm{var}}(\hat{\theta}) = \frac{1}{T}\begin{pmatrix} 2\hat{\sigma}_1^4 & 2\hat{\sigma}_{12}^2 \\ 2\hat{\sigma}_{12}^2 & 2\hat{\sigma}_2^4 \end{pmatrix} \end{equation}\]
Then
\[\begin{align*} \widehat{\mathrm{var}}(\hat{\eta}) & = \mathbf{G}(\hat{\theta})'\widehat{\mathrm{var}}(\hat{\theta})\mathbf{G}(\hat{\theta}) \\ & = \begin{pmatrix} \frac{1}{2}\hat{\sigma}_1^{-1} & 0 \\ 0 & \frac{1}{2}\hat{\sigma}_2^{-1} \end{pmatrix} \frac{1}{T}\begin{pmatrix} 2\hat{\sigma}_1^4 & 2\hat{\sigma}_{12}^2 \\ 2\hat{\sigma}_{12}^2 & 2\hat{\sigma}_2^4 \end{pmatrix} \begin{pmatrix} \frac{1}{2}\hat{\sigma}_1^{-1} & 0 \\ 0 & \frac{1}{2}\hat{\sigma}_2^{-1} \end{pmatrix} \\ &= \frac{1}{T} \begin{pmatrix} \frac{1}{2}\hat{\sigma}_1^2 & \frac{1}{2}\hat{\sigma}_{12}^2 \hat{\sigma}_{1}^{-1} \hat{\sigma}_{2}^{-1}\\ \frac{1}{2}\hat{\sigma}_{12}^2 \hat{\sigma}_{1}^{-1} \hat{\sigma}_{2}^{-1} & \frac{1}{2}\hat{\sigma}_2^2 \end{pmatrix} = \frac{1}{T} \begin{pmatrix} \frac{1}{2}\hat{\sigma}_1^2 & \frac{1}{2}\hat{\sigma}_{12} \hat{\rho}_{12} \\ \frac{1}{2}\hat{\sigma}_{12} \hat{\rho}_{12} & \frac{1}{2}\hat{\sigma}_2^2 \end{pmatrix}. \end{align*}\]
\(\blacksquare\)
Consider the second example function where \(\theta =(\mu_1,\mu_2, \sigma_1, \sigma_2)'\) and \(\eta = f(\theta)=(f_1(\theta), f_2(\theta))'\) where \[\begin{equation} f_1(\theta) = \frac{\mu_1 - r_f}{\sigma_1} = \mathrm{SR}_1, ~ f_2(\theta) = \frac{\mu_2 - r_f}{\sigma_2} = \mathrm{SR}_2. \end{equation}\]
Here, we want to use the delta method to get the joint distribution of \(\eta = (\mathrm{SR}_1, \mathrm{SR}_2)'\). Now, using the chain-rule:
\[\begin{align*} \frac{\partial f_1(\theta)}{\partial \mu_1} & = \sigma_1^{-1}, ~ \frac{\partial f_1(\theta)}{\partial \mu_2} = 0, ~ & \frac{\partial f_1(\theta)}{\partial \sigma_1} = -\sigma_1^{-2}(\mu_1 - r_f), ~ & \frac{\partial f_1(\theta)}{\partial \sigma_2} = 0, \\ \frac{\partial f_2(\theta)}{\partial \mu_1} & = 0, ~ \frac{\partial f_2(\theta)}{\partial \mu_2} = \sigma_2^{-1}, ~ & \frac{\partial f_2(\theta)}{\partial \sigma_1} = 0, ~ & \frac{\partial f_2(\theta)}{\partial \sigma_2} = -\sigma_2^{-2}(\mu_2 - r_f). \end{align*}\]
Then, using \(\mathrm{SR}_{i} = (\mu_i - r_f)/\sigma_i ~ (i=1,2)\), we have:
\[\begin{equation} \mathbf{G}(\theta)' = \begin{pmatrix} \sigma_1^{-1} & 0 & -\sigma_1^{-1} \mathrm{SR}_1 & 0 \\ 0 & \sigma_2^{-1} & 0 & -\sigma_2^{-1}\mathrm{SR}_2 \end{pmatrix} \end{equation}\]
Using Proposition 7.10 and the results from the previous example we have:
\[\begin{equation} \mathrm{var}(\hat{\theta}) = \frac{1}{T} \begin{pmatrix} \sigma_1^2 & \sigma_{12} & 0 & 0 \\ \sigma_{12} & \sigma_2^2 & 0 & 0 \\ 0 & 0 & \frac{1}{2}\sigma_1^2 & \frac{1}{2} \sigma_{12}^2 \sigma_1^{-1} \sigma_2^{-1} \\ 0 & 0 & \frac{1}{2} \sigma_{12}^2 \sigma_1^{-1} \sigma_2^{-1} & \frac{1}{2}\sigma_2^2 \end{pmatrix}. \end{equation}\]
Then, after some straightforward matrix algebra calculations (see end-of-chapter exercises) we get:
\[\begin{align} \widehat{\mathrm{var}}(\hat{\eta}) & = \mathbf{G}(\hat{\theta})'\widehat{\mathrm{var}}(\hat{\theta})\mathbf{G}(\hat{\theta}) \\ &= \begin{pmatrix} \widehat{\mathrm{var}}(\widehat{\mathrm{SR}}_1) & \widehat{\mathrm{cov}}(\widehat{\mathrm{SR}}_1, \widehat{\mathrm{SR}}_2) \\ \widehat{\mathrm{cov}}(\widehat{\mathrm{SR}}_1, \widehat{\mathrm{SR}}_2) & \widehat{\mathrm{var}}(\widehat{\mathrm{SR}}_2) \end{pmatrix} \\ & = \begin{pmatrix} \frac{1}{T}(1+\frac{1}{2}\widehat{\mathrm{SR}}_1^2) & \frac{\hat{\rho}_{12}}{T}(1 + \frac{1}{2}\hat{\rho}_{12}\widehat{\mathrm{SR}}_1 \widehat{\mathrm{SR}}_2) \\ \frac{\hat{\rho}_{12}}{T}(1 + \frac{1}{2}\hat{\rho}_{12} \widehat{\mathrm{SR}}_1\widehat{\mathrm{SR}}_2) & \frac{1}{T}(1+\frac{1}{2}\widehat{\mathrm{SR}}_2^2) \end{pmatrix}. \end{align}\]
The diagonal elements match the delta method variances for the estimated Sharpe ratio derived earlier. The off-diagonal covariance term shows that the two estimated Sharpe ratios are correlated. We explore this correlation in more depth in the end-of-chapter exercises.
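As a numerical check of this result, the sketch below evaluates \(\mathbf{G}(\hat{\theta})'\widehat{\mathrm{var}}(\hat{\theta})\mathbf{G}(\hat{\theta})\) directly for hypothetical estimates of the two assets' means, volatilities, and correlation (all values are for illustration only) and compares it with the closed-form matrix above:
# hypothetical estimates for two assets (illustration only)
n.obs = 172; r.f = 0.0025
mu1 = 0.010; mu2 = 0.005
sig1 = 0.10; sig2 = 0.08
rho12 = 0.3; sig12 = rho12*sig1*sig2
SR1 = (mu1 - r.f)/sig1; SR2 = (mu2 - r.f)/sig2
# G is the 2 x 4 Jacobian G(theta)' for theta = (mu1, mu2, sigma1, sigma2)'
G = rbind(c(1/sig1, 0, -SR1/sig1, 0),
          c(0, 1/sig2, 0, -SR2/sig2))
# estimated var(theta-hat) from Proposition 7.10
V = (1/n.obs)*rbind(c(sig1^2, sig12, 0, 0),
                    c(sig12, sig2^2, 0, 0),
                    c(0, 0, 0.5*sig1^2, 0.5*sig12^2/(sig1*sig2)),
                    c(0, 0, 0.5*sig12^2/(sig1*sig2), 0.5*sig2^2))
G %*% V %*% t(G)   # delta method covariance matrix of (SR1-hat, SR2-hat)
# closed-form expression for comparison
rbind(c((1 + 0.5*SR1^2)/n.obs, rho12*(1 + 0.5*rho12*SR1*SR2)/n.obs),
      c(rho12*(1 + 0.5*rho12*SR1*SR2)/n.obs, (1 + 0.5*SR2^2)/n.obs))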
\(\blacksquare\)
8.3.3 The delta method explained
The delta method deduces the asymptotic distribution of \(f(\hat{\theta})\) using a first order Taylor series approximation of \(f(\hat{\theta})\) evaluated at the true value \(\theta\). For simplicity, assume that \(\theta\) is a scalar parameter. Then, the first order Taylor series expansion gives
\[\begin{align*} f(\hat{\theta}) & = f(\theta) + f'(\theta)(\hat{\theta} - \theta) + \text{remainder} \\ & = f(\theta) + f'(\theta)\hat{\theta} - f'(\theta)\theta + \text{remainder} \\ & \approx f(\theta) + f'(\theta)\hat{\theta} - f'(\theta)\theta. \end{align*}\]
The first order Taylor series expansion shows that \(f(\hat{\theta})\) is an approximate linear function of the random variable \(\hat{\theta}\). The values \(\theta\), \(f(\theta)\), and \(f'(\theta)\) are constants. Assuming that, for large enough T,
\[\begin{equation*} \hat{\theta}\sim N(\theta,\widehat{\mathrm{se}}(\hat{\theta})^{2}), \end{equation*}\]
it follows that
\[\begin{equation*} f(\hat{\theta}) \sim N(f(\theta), f'(\theta)^2 \widehat{\mathrm{se}}(\hat{\theta})^{2}) \end{equation*}\]
Since \(\hat{\theta}\) is consistent for \(\theta\) we can replace \(f'(\theta)^2\) with \(f'(\hat{\theta})^2\) giving the practically useful result
\[\begin{equation*} f(\hat{\theta}) \sim N(f(\theta), f'(\hat{\theta})^2 \widehat{\mathrm{se}}(\hat{\theta})^{2}) \end{equation*}\]
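A small simulation illustrates this result for the function \(f(\sigma^2)=\sqrt{\sigma^2}\) considered earlier. The sketch below (hypothetical values of \(\sigma\) and \(T\), chosen for illustration only) draws \(\hat{\sigma}^2\) from its asymptotic normal distribution, transforms to \(\hat{\sigma}\), and compares the Monte Carlo standard deviation with the delta method value \(\sigma/\sqrt{2T}\):
set.seed(123)
# hypothetical GWN parameters (illustration only)
sigma = 0.10; n.obs = 172; n.sim = 10000
# draw sigma^2-hat from its asymptotic normal distribution
sigma2hat = rnorm(n.sim, mean = sigma^2, sd = sqrt(2*sigma^4/n.obs))
sigmahat = sqrt(sigma2hat)        # f(sigma^2-hat) = sigma-hat
sd(sigmahat)                      # Monte Carlo standard deviation of sigma-hat
sigma/sqrt(2*n.obs)               # delta method standard error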
Note: the 95% confidence interval computed by deltaMethod uses the multiplier 1.96 instead of the rule-of-thumb value of 2 used in the analytic calculations.