8.3 Asymptotic normality

Let $\theta$ denote a scalar parameter of the GWN model for returns, and let $\hat{\theta}$ denote the plug-in estimator of $\theta$. In Chapter 7 we stated that $\hat{\theta}$ is asymptotically normally distributed:

$$\hat{\theta} \sim N\big(\theta,\; \widehat{\mathrm{se}}(\hat{\theta})^2\big),$$

for large enough $T$, where $\widehat{\mathrm{se}}(\hat{\theta})$ denotes the estimated standard error of $\hat{\theta}$. This result is justified by the CLT. Let $f(\theta)$ denote a real-valued, continuous, and differentiable scalar function of $\theta$. Using a technique called the delta method, we can show that the plug-in estimator $f(\hat{\theta})$ is also asymptotically normally distributed:

$$f(\hat{\theta}) \sim N\big(f(\theta),\; \widehat{\mathrm{se}}(f(\hat{\theta}))^2\big),$$

for large enough $T$, where $\widehat{\mathrm{se}}(f(\hat{\theta}))$ is the estimated standard error for $f(\hat{\theta})$. The delta method shows us how to compute $\widehat{\mathrm{se}}(f(\hat{\theta}))^2$:

$$\widehat{\mathrm{se}}(f(\hat{\theta}))^2 = f'(\hat{\theta})^2 \times \widehat{\mathrm{se}}(\hat{\theta})^2,$$

where $f'(\hat{\theta})$ denotes the derivative of $f(\theta)$ evaluated at $\hat{\theta}$. Then,

$$\widehat{\mathrm{se}}(f(\hat{\theta})) = \sqrt{f'(\hat{\theta})^2 \times \widehat{\mathrm{se}}(\hat{\theta})^2}.$$

Example 8.1 (Delta method standard error for $\hat{\sigma}$)

In Chapter 7 we stated that, for large enough $T$,

$$\hat{\sigma}^2 \sim N\big(\sigma^2,\; \mathrm{se}(\hat{\sigma}^2)^2\big), \quad \hat{\sigma} \sim N\big(\sigma,\; \mathrm{se}(\hat{\sigma})^2\big),$$

where $\mathrm{se}(\hat{\sigma}^2)^2 = 2\sigma^4/T$ and $\mathrm{se}(\hat{\sigma})^2 = \sigma^2/2T$. The formula for $\mathrm{se}(\hat{\sigma})^2$ is the result of the delta method for the function $f(\sigma^2) = \sqrt{\sigma^2} = \sigma$ applied to the asymptotic distribution of $\hat{\sigma}^2$. To see this, first note that by the chain rule

$$f'(\sigma^2) = \frac{1}{2}(\sigma^2)^{-1/2},$$

so that $f'(\sigma^2)^2 = \frac{1}{4}(\sigma^2)^{-1}$. Then,

$$\mathrm{se}(f(\hat{\sigma}^2))^2 = f'(\sigma^2)^2 \times \mathrm{se}(\hat{\sigma}^2)^2 = \frac{1}{4}(\sigma^2)^{-1} \times \frac{2\sigma^4}{T} = \frac{1}{2}\frac{\sigma^2}{T} = \frac{\sigma^2}{2T} = \mathrm{se}(\hat{\sigma})^2.$$

The estimated squared standard error replaces $\sigma^2$ with its estimate $\hat{\sigma}^2$ and is given by $\widehat{\mathrm{se}}(\hat{\sigma})^2 = \hat{\sigma}^2/2T$.
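
As a quick numerical check of this calculation, the following sketch applies the scalar delta method formula to hypothetical values of $\hat{\sigma}^2$ and $T$ (chosen to be roughly in line with the Microsoft estimates used later in this section) and confirms that it reproduces the closed-form result $\hat{\sigma}^2/2T$:

sigma2hat = 0.0103                            # hypothetical estimate of sigma^2
T.obs = 172                                   # hypothetical sample size
se.sigma2hat = sqrt(2*sigma2hat^2/T.obs)      # se(sigma2hat) = sqrt(2*sigma^4/T)
fprime = 0.5*sigma2hat^(-0.5)                 # derivative of sqrt() evaluated at sigma2hat
se.delta = sqrt(fprime^2*se.sigma2hat^2)      # delta method standard error for sigmahat
se.direct = sqrt(sigma2hat/(2*T.obs))         # closed-form sqrt(sigma^2/(2T))
c(delta = se.delta, direct = se.direct)       # the two values agree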

Now suppose $\theta$ is a $k \times 1$ vector of GWN model parameters, and $f: \mathbb{R}^k \to \mathbb{R}$ is a continuous and differentiable function of $\theta$. Define the $k \times 1$ gradient vector $g(\theta)$ as:

$$g(\theta) = \frac{\partial f(\theta)}{\partial \theta} = \begin{pmatrix} \frac{\partial f(\theta)}{\partial \theta_1} \\ \vdots \\ \frac{\partial f(\theta)}{\partial \theta_k} \end{pmatrix}.$$

Assume that $\hat{\theta}$ is asymptotically normally distributed:

$$\hat{\theta} \sim N\big(\theta,\; \widehat{\mathrm{var}}(\hat{\theta})\big),$$

where $\widehat{\mathrm{var}}(\hat{\theta})$ is the $k \times k$ estimated variance-covariance matrix of $\hat{\theta}$. Then the delta method shows that $f(\hat{\theta})$ is asymptotically normally distributed:

$$f(\hat{\theta}) \sim N\big(f(\theta),\; \widehat{\mathrm{se}}(f(\hat{\theta}))^2\big),$$

where

$$\widehat{\mathrm{se}}(f(\hat{\theta}))^2 = g(\hat{\theta})'\, \widehat{\mathrm{var}}(\hat{\theta})\, g(\hat{\theta}).$$
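
In R, the quadratic form $g(\hat{\theta})'\, \widehat{\mathrm{var}}(\hat{\theta})\, g(\hat{\theta})$ is a one-line matrix product. A minimal sketch, using the gradient for the normal quantile example below with $\alpha = 0.05$ and the Microsoft values for $\widehat{\mathrm{var}}(\hat{\theta})$ reported later in this section:

ghat = c(1, qnorm(0.05))                      # gradient of mu + sigma*q.05 evaluated at thetahat
Vhat = diag(c(5.99e-05, 2.99e-05))            # estimated var(thetahat) for (mu, sigma)
se2 = as.numeric(t(ghat) %*% Vhat %*% ghat)   # delta method variance
sqrt(se2)                                     # delta method standard error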

Example 8.2 (Delta method standard error for $\hat{q}_\alpha^R$)

Let $\theta = (\mu, \sigma)'$ and consider $f_1(\theta) = q_\alpha^R = \mu + \sigma q_\alpha^Z$ (the normal simple return quantile). The gradient $g_1(\theta)$ of $f_1(\theta)$ is:

$$g_1(\theta) = \frac{\partial f_1(\theta)}{\partial \theta} = \begin{pmatrix} \frac{\partial f_1(\theta)}{\partial \mu} \\ \frac{\partial f_1(\theta)}{\partial \sigma} \end{pmatrix} = \begin{pmatrix} 1 \\ q_\alpha^Z \end{pmatrix}.$$

Regarding $\widehat{\mathrm{var}}(\hat{\theta})$, recall from Chapter 7 that in the GWN model

$$\begin{pmatrix} \hat{\mu} \\ \hat{\sigma}^2 \end{pmatrix} \sim N\left( \begin{pmatrix} \mu \\ \sigma^2 \end{pmatrix}, \begin{pmatrix} \mathrm{se}(\hat{\mu})^2 & 0 \\ 0 & \mathrm{se}(\hat{\sigma}^2)^2 \end{pmatrix} \right)$$

for large enough $T$. We show below, using the delta method, that

$$\begin{pmatrix} \hat{\mu} \\ \hat{\sigma} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu \\ \sigma \end{pmatrix}, \begin{pmatrix} \mathrm{se}(\hat{\mu})^2 & 0 \\ 0 & \mathrm{se}(\hat{\sigma})^2 \end{pmatrix} \right).$$

Hence,

$$\mathrm{var}(\hat{\theta}) = \begin{pmatrix} \mathrm{se}(\hat{\mu})^2 & 0 \\ 0 & \mathrm{se}(\hat{\sigma})^2 \end{pmatrix} = \begin{pmatrix} \frac{\sigma^2}{T} & 0 \\ 0 & \frac{\sigma^2}{2T} \end{pmatrix}.$$

Then,

$$\begin{aligned} g_1(\theta)'\, \mathrm{var}(\hat{\theta})\, g_1(\theta) &= \begin{pmatrix} 1 & q_\alpha^Z \end{pmatrix} \begin{pmatrix} \mathrm{se}(\hat{\mu})^2 & 0 \\ 0 & \mathrm{se}(\hat{\sigma})^2 \end{pmatrix} \begin{pmatrix} 1 \\ q_\alpha^Z \end{pmatrix} = \mathrm{se}(\hat{\mu})^2 + (q_\alpha^Z)^2\, \mathrm{se}(\hat{\sigma})^2 \\ &= \frac{\sigma^2}{T} + (q_\alpha^Z)^2 \frac{\sigma^2}{2T} = \frac{\sigma^2}{T}\left[1 + \frac{1}{2}(q_\alpha^Z)^2\right] = \mathrm{se}(\hat{q}_\alpha^R)^2. \end{aligned}$$

The standard error for $\hat{q}_\alpha^R$ is:

$$\mathrm{se}(\hat{q}_\alpha^R) = \sqrt{\mathrm{var}(\hat{q}_\alpha^R)} = \frac{\sigma}{\sqrt{T}}\left[1 + \frac{1}{2}(q_\alpha^Z)^2\right]^{1/2}. \tag{8.5}$$

Formula (8.5) shows that $\mathrm{se}(\hat{q}_\alpha^R)$ increases with $\sigma$ and $|q_\alpha^Z|$, and decreases with $T$. In addition, for fixed $T$, $\mathrm{se}(\hat{q}_\alpha^R) \to \infty$ as $\alpha \to 0$ or as $\alpha \to 1$, because $q_\alpha^Z \to -\infty$ as $\alpha \to 0$ and $q_\alpha^Z \to \infty$ as $\alpha \to 1$. Hence, we have very large estimation error in $\hat{q}_\alpha^R$ when $\alpha$ is very close to zero or one (for fixed values of $\sigma$ and $T$). This makes intuitive sense: for a fixed sample size there are very few observations in the extreme tails of the empirical distribution of the data, so we expect larger estimation error there.

Replacing $\sigma$ with its estimate $\hat{\sigma}$ gives the estimated standard error for $\hat{q}_\alpha^R$:

$$\widehat{\mathrm{se}}(\hat{q}_\alpha^R) = \frac{\hat{\sigma}}{\sqrt{T}}\left[1 + \frac{1}{2}(q_\alpha^Z)^2\right]^{1/2}.$$

Using the above results, the sampling distribution of $\hat{q}_\alpha^R$ can be approximated by the normal distribution $\hat{q}_\alpha^R \sim N\big(q_\alpha^R,\; \widehat{\mathrm{se}}(\hat{q}_\alpha^R)^2\big)$, for large enough $T$.

Example 8.3 (Applying the delta method to $\hat{q}_\alpha^R$ for Microsoft)

The estimates of $q_\alpha^R$, $\widehat{\mathrm{se}}(\hat{q}_\alpha^R)$, and 95% confidence intervals, for $\alpha = 0.05$, $0.01$, and $0.001$, computed from the simple monthly returns on Microsoft are:

# normal quantile estimates for alpha = 0.05, 0.01, and 0.001
qhat.05 = muhatS + sigmahatS*qnorm(0.05)
qhat.01 = muhatS + sigmahatS*qnorm(0.01)
qhat.001 = muhatS + sigmahatS*qnorm(0.001)
# delta method standard errors
seQhat.05 = (sigmahatS/sqrt(n.obs))*sqrt(1 + 0.5*qnorm(0.05)^2)
seQhat.01 = (sigmahatS/sqrt(n.obs))*sqrt(1 + 0.5*qnorm(0.01)^2)
seQhat.001 = (sigmahatS/sqrt(n.obs))*sqrt(1 + 0.5*qnorm(0.001)^2)
# approximate 95% confidence intervals (2*SE rule of thumb)
lowerQhat.05 = qhat.05 - 2*seQhat.05
upperQhat.05 = qhat.05 + 2*seQhat.05
lowerQhat.01 = qhat.01 - 2*seQhat.01
upperQhat.01 = qhat.01 + 2*seQhat.01
lowerQhat.001 = qhat.001 - 2*seQhat.001
upperQhat.001 = qhat.001 + 2*seQhat.001
ans = cbind(c(qhat.05,  qhat.01, qhat.001),
            c(seQhat.05, seQhat.01, seQhat.001),
            c(lowerQhat.05, lowerQhat.01, lowerQhat.001),
            c(upperQhat.05, upperQhat.01, upperQhat.001))
colnames(ans) = c("Estimates", "Std Errors", "2.5%", "97.5%")
rownames(ans) = c("q.05", "q.01", "q.001")
ans
##       Estimates Std Errors   2.5%  97.5%
## q.05     -0.158     0.0119 -0.182 -0.134
## q.01     -0.227     0.0149 -0.257 -0.197
## q.001    -0.304     0.0186 -0.342 -0.267

For Microsoft the values of $\widehat{\mathrm{se}}(\hat{q}_\alpha^R)$ are about 1.2%, 1.5%, and 1.9% for $\alpha = 0.05$, $\alpha = 0.01$, and $\alpha = 0.001$, respectively. The 95% confidence interval widths increase as $\alpha$ decreases, indicating more uncertainty about the true value of $q_\alpha^R$ for smaller values of $\alpha$. For example, the 95% confidence interval for $q_{0.01}^R$ indicates that the true loss with 1% probability (one month out of 100 months) is between 19.7% and 25.7%.

Example 8.4 (Delta method standard errors for $\widehat{\mathrm{VaR}}_\alpha^N$, $\widehat{\mathrm{VaR}}_\alpha^{LN}$, and $\widehat{\mathrm{SR}}$)

Let $\theta = (\mu, \sigma)'$ and consider the functions $f_2(\theta) = \mathrm{VaR}_\alpha^N = -W_0(\mu + \sigma q_\alpha^Z)$ (normal VaR), $f_3(\theta) = \mathrm{VaR}_\alpha^{LN} = -W_0\left(\exp(\mu + \sigma q_\alpha^Z) - 1\right)$ (log-normal VaR), and $f_4(\theta) = \mathrm{SR} = \frac{\mu - r_f}{\sigma}$ (Sharpe ratio). It is straightforward to show (see end-of-chapter exercises) that the gradient vectors are:

$$g_2(\theta) = \frac{\partial f_2(\theta)}{\partial \theta} = \begin{pmatrix} -W_0 \\ -W_0 q_\alpha^Z \end{pmatrix}, \quad g_3(\theta) = \frac{\partial f_3(\theta)}{\partial \theta} = \begin{pmatrix} -W_0 \exp(\mu + \sigma q_\alpha^Z) \\ -W_0 \exp(\mu + \sigma q_\alpha^Z)\, q_\alpha^Z \end{pmatrix}, \quad g_4(\theta) = \frac{\partial f_4(\theta)}{\partial \theta} = \begin{pmatrix} \sigma^{-1} \\ -\sigma^{-2}(\mu - r_f) \end{pmatrix}.$$

The delta method variances are:

$$\begin{aligned} g_2(\theta)'\, \mathrm{var}(\hat{\theta})\, g_2(\theta) &= W_0^2\, \mathrm{se}(\hat{q}_\alpha^R)^2 = \mathrm{se}(\widehat{\mathrm{VaR}}_\alpha^N)^2, \\ g_3(\theta)'\, \mathrm{var}(\hat{\theta})\, g_3(\theta) &= W_0^2 \exp(q_\alpha^R)^2\, \mathrm{se}(\hat{q}_\alpha^R)^2 = \mathrm{se}(\widehat{\mathrm{VaR}}_\alpha^{LN})^2, \\ g_4(\theta)'\, \mathrm{var}(\hat{\theta})\, g_4(\theta) &= \frac{1}{T}\left(1 + \frac{1}{2}\mathrm{SR}^2\right) = \mathrm{se}(\widehat{\mathrm{SR}})^2. \end{aligned}$$

The corresponding standard errors are:

$$\begin{aligned} \mathrm{se}(\widehat{\mathrm{VaR}}_\alpha^N) &= W_0\, \mathrm{se}(\hat{q}_\alpha^R), \\ \mathrm{se}(\widehat{\mathrm{VaR}}_\alpha^{LN}) &= W_0 \exp(q_\alpha^R)\, \mathrm{se}(\hat{q}_\alpha^R), \\ \mathrm{se}(\widehat{\mathrm{SR}}) &= \sqrt{\frac{1}{T}\left(1 + \frac{1}{2}\mathrm{SR}^2\right)}, \end{aligned}$$

where

$$\mathrm{se}(\hat{q}_\alpha^R) = \frac{\sigma}{\sqrt{T}}\left[1 + \frac{1}{2}(q_\alpha^Z)^2\right]^{1/2}.$$

We make the following remarks:

  1. $\mathrm{se}(\widehat{\mathrm{VaR}}_\alpha^N)$ and $\mathrm{se}(\widehat{\mathrm{VaR}}_\alpha^{LN})$ increase with $W_0$, $\sigma$, and $|q_\alpha^Z|$, and decrease with $T$.

  2. For fixed values of $W_0$, $\sigma$, and $T$, $\mathrm{se}(\widehat{\mathrm{VaR}}_\alpha^N)$ and $\mathrm{se}(\widehat{\mathrm{VaR}}_\alpha^{LN})$ become very large as $\alpha$ approaches zero. That is, for very small loss probabilities we will have very imprecise estimates of $\mathrm{VaR}_\alpha^N$ and $\mathrm{VaR}_\alpha^{LN}$.

  3. $\mathrm{se}(\widehat{\mathrm{SR}})$ increases with $\mathrm{SR}^2$ and decreases with $T$.

  4. The practically useful estimated standard errors replace the unknown values of $q_\alpha^R$ and $\mathrm{SR}$ with their estimates $\hat{q}_\alpha^R$ and $\widehat{\mathrm{SR}}$ and are given by (see the short R sketch after this list):

$$\begin{aligned} \widehat{\mathrm{se}}(\widehat{\mathrm{VaR}}_\alpha^N) &= W_0\, \widehat{\mathrm{se}}(\hat{q}_\alpha^R), \\ \widehat{\mathrm{se}}(\widehat{\mathrm{VaR}}_\alpha^{LN}) &= W_0 \exp(\hat{q}_\alpha^R)\, \widehat{\mathrm{se}}(\hat{q}_\alpha^R), \\ \widehat{\mathrm{se}}(\widehat{\mathrm{SR}}) &= \sqrt{\frac{1}{T}\left(1 + \frac{1}{2}\widehat{\mathrm{SR}}^2\right)}. \end{aligned}$$
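
The estimated standard error formulas above are easy to wrap in small helper functions; the sketch below is one minimal way to do so (the function names are illustrative and not part of any package):

se.qhat = function(sigmahat, T.obs, alpha) {
  # estimated se of the normal return quantile estimate
  (sigmahat/sqrt(T.obs))*sqrt(1 + 0.5*qnorm(alpha)^2)
}
se.VaR.N = function(W0, sigmahat, T.obs, alpha) {
  # estimated se of normal VaR
  W0*se.qhat(sigmahat, T.obs, alpha)
}
se.VaR.LN = function(W0, muhat, sigmahat, T.obs, alpha) {
  # estimated se of log-normal VaR (muhat, sigmahat estimated from cc returns)
  qhat = muhat + sigmahat*qnorm(alpha)
  W0*exp(qhat)*se.qhat(sigmahat, T.obs, alpha)
}
se.SR = function(SRhat, T.obs) {
  # estimated se of the Sharpe ratio
  sqrt((1/T.obs)*(1 + 0.5*SRhat^2))
}

These helpers reproduce the calculations done by hand in the next two examples.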

Example 8.5 (Applying the delta method to $\widehat{\mathrm{VaR}}_\alpha^N$ and $\widehat{\mathrm{VaR}}_\alpha^{LN}$ for Microsoft)

Consider a $100,000 investment in Microsoft held for one month. The estimates of $\mathrm{VaR}_\alpha^N$, its standard error, and 95% confidence intervals for $\alpha = 0.05$ and $\alpha = 0.01$ are:

W0 = 100000   # initial investment of $100,000
qhat.05 = muhatS + sigmahatS*qnorm(0.05)
qhat.01 = muhatS + sigmahatS*qnorm(0.01)
VaR.N.05 = -qhat.05*W0
seVaR.N.05 = W0*seQhat.05
VaR.N.01 = -qhat.01*W0
seVaR.N.01 = W0*seQhat.01
lowerVaR.N.05 = VaR.N.05 - 2*seVaR.N.05
upperVaR.N.05 = VaR.N.05 + 2*seVaR.N.05
lowerVaR.N.01 = VaR.N.01 - 2*seVaR.N.01
upperVaR.N.01 = VaR.N.01 + 2*seVaR.N.01
ans = cbind(c(VaR.N.05, VaR.N.01),
            c(seVaR.N.05, seVaR.N.01),
            c(lowerVaR.N.05, lowerVaR.N.01),
            c(upperVaR.N.05,upperVaR.N.01))
colnames(ans) = c("Estimate", "Std Error", "2.5%", "97.5%")
rownames(ans) = c("VaR.N.05", "VaR.N.01")
ans
##          Estimate Std Error  2.5% 97.5%
## VaR.N.05    15780      1187 13405 18154
## VaR.N.01    22696      1490 19717 25676

The estimated values of $\mathrm{VaR}_{0.05}^N$ and $\mathrm{VaR}_{0.01}^N$ are $15,780 and $22,696, respectively, with estimated standard errors of $1,187 and $1,490. The estimated standard errors are fairly small relative to the estimated VaR values, so VaR is estimated fairly precisely here. Notice that the estimated standard error for $\widehat{\mathrm{VaR}}_{0.01}^N$ is about 26% larger than that for $\widehat{\mathrm{VaR}}_{0.05}^N$, indicating that we have less precision when estimating VaR for very small values of $\alpha$.

The estimates of $\mathrm{VaR}_\alpha^{LN}$ and its standard error for $\alpha = 0.05$ and $\alpha = 0.01$, computed from the GWN model estimates for continuously compounded returns, are:

qhat.05 = muhatC + sigmahatC*qnorm(0.05)
qhat.01 = muhatC + sigmahatC*qnorm(0.01)
seQhat.05 = (sigmahatC/sqrt(n.obs))*sqrt(1 + 0.5*qnorm(0.05)^2)
seQhat.01 = (sigmahatC/sqrt(n.obs))*sqrt(1 + 0.5*qnorm(0.01)^2)
VaR.LN.05 = -W0*(exp(qhat.05)-1)
seVaR.LN.05 = W0*exp(qhat.05)*seQhat.05
VaR.LN.01 = -W0*(exp(qhat.01)-1)
seVaR.LN.01 = W0*exp(qhat.01)*seQhat.01
lowerVaR.LN.05 = VaR.LN.05 - 2*seVaR.LN.05
upperVaR.LN.05 = VaR.LN.05 + 2*seVaR.LN.05
lowerVaR.LN.01 = VaR.LN.01 - 2*seVaR.LN.01
upperVaR.LN.01 = VaR.LN.01 + 2*seVaR.LN.01
ans = cbind(c(VaR.LN.05, VaR.LN.01),
            c(seVaR.LN.05, seVaR.LN.01),
            c(lowerVaR.LN.05, lowerVaR.LN.01),
            c(upperVaR.LN.05,upperVaR.LN.01))
colnames(ans) = c("Estimate", "Std Error", "2.5%", "97.5%")
rownames(ans) = c("VaR.LN.05", "VaR.LN.01")
ans
##           Estimate Std Error  2.5% 97.5%
## VaR.LN.05    14846       998 12850 16842
## VaR.LN.01    20467      1170 18128 22807

The estimates of $\mathrm{VaR}_\alpha^{LN}$ (and its standard error) are slightly smaller than the estimates of $\mathrm{VaR}_\alpha^N$ (and its standard error) due to the positive skewness in the log-normal distribution.

Example 8.6 (Applying the delta method to the Sharpe ratio for Microsoft)

The estimated monthly Sharpe ratio for Microsoft (using $r_f = 0.0025$), its standard error, and a 95% confidence interval are computed using:

# Sharpe ratio estimate, delta method standard error, and 2*SE confidence interval
SRhat = (muhatS - r.f)/sigmahatS
seSRhat = sqrt((1/n.obs)*(1 + 0.5*(SRhat^2)))
lowerSR = SRhat - 2*seSRhat
upperSR = SRhat + 2*seSRhat
ans = c(SRhat, seSRhat, lowerSR, upperSR)
names(ans) = c("Estimate", "Std Error", "2.5%", "97.5%")
ans
##  Estimate Std Error      2.5%     97.5% 
##    0.0655    0.0763   -0.0871    0.2182

The estimated monthly Sharpe ratio for Microsoft is small and positive, with a large estimated standard error and a 95% confidence interval that contains both negative and positive values. Clearly, the Sharpe ratio is not estimated very precisely.

8.3.1 The numerical delta method

The delta method is an advanced statistical technique and requires a bit of calculus to implement, especially for nonlinear vector-valued functions, and it is easy to make mistakes when working out the math. Fortunately, R provides an easy way to implement the delta method that does not require you to work out the derivatives by hand. The function deltaMethod() from the car package automates the delta method, using the stats function D() to compute the required derivatives symbolically. The arguments expected by deltaMethod() are:

args(car:::deltaMethod.default)
## function (object, g., vcov., func = g., constants, level = 0.95, 
##     rhs = NULL, ..., envir = parent.frame()) 
## NULL

Here, object is the vector of named parameter estimates, $\hat{\theta}$; g. is a quoted string giving the function of the parameter estimates to be evaluated, $f(\theta)$; and vcov. is the named estimated covariance matrix of the parameter estimates, $\widehat{\mathrm{var}}(\hat{\theta})$. The function returns an object of class “deltaMethod” for which there is a print method. The “deltaMethod” object is essentially a data.frame with columns giving the parameter estimate(s), the delta method estimated standard error(s), and lower and upper confidence limits. The following example illustrates how to use deltaMethod() for the example functions considered above.

Example 8.7 (Numerical delta method standard errors for example data)

We first create the named inputs $\hat{\theta}$ and $\widehat{\mathrm{var}}(\hat{\theta})$ from the GWN model estimates for simple returns:

thetahatS = c(muhatS, sigmahatS)
names(thetahatS) = c("mu", "sigma")
var.thetahatS = matrix(c(se.muhatS^2,0,0,se.sigmahatS^2), 2, 2, byrow=TRUE)
rownames(var.thetahatS) = colnames(var.thetahatS) = names(thetahatS)
thetahatS
##      mu   sigma 
## 0.00915 0.10150
var.thetahatS
##             mu    sigma
## mu    5.99e-05 0.00e+00
## sigma 0.00e+00 2.99e-05

We do the same for the inputs from the GWN model for continuously compounded returns:

thetahatC = c(muhatC, sigmahatC)
names(thetahatC) = c("mu", "sigma")
var.thetahatC = matrix(c(se.muhatC^2,0,0,se.sigmahatC^2), 2, 2, byrow=TRUE)
rownames(var.thetahatC) = colnames(var.thetahatC) = names(thetahatC)

It is important to name the elements in thetahat and var.thetahat, as these names will be used in defining $f(\theta)$. To implement the numerical delta method for the normal quantile, we need to supply a quoted string specifying the function $f_1(\theta) = \mu + \sigma q_\alpha^Z$. Here we use the string "mu+sigma*q.05", where q.05 = qnorm(0.05). For $\alpha = 0.05$, the R code is:

q.05 = qnorm(0.05)
dm1 = deltaMethod(thetahatS, "mu+sigma*q.05", var.thetahatS)
class(dm1)
## [1] "deltaMethod" "data.frame"

The returned object dm1 is of class “deltaMethod”, inheriting from “data.frame”, and has named columns:

names(dm1)
## [1] "Estimate" "SE"       "2.5 %"    "97.5 %"

The print method shows the results of applying the delta method:

dm1
##                   Estimate      SE   2.5 % 97.5 %
## mu + sigma * q.05  -0.1578  0.0119 -0.1811  -0.13

The results match the analytic calculations performed above.¹

The delta method for the normal VaR function $f_2(\theta)$ is:

dm2 = deltaMethod(thetahatS, "-W0*(mu+sigma*q.05)", var.thetahatS)
dm2
##                           Estimate    SE 2.5 % 97.5 %
## -W0 * (mu + sigma * q.05)    15780  1187 13453  18106

The delta method for the log-normal VaR function $f_3(\theta)$ is:

dm3 = deltaMethod(thetahatC, "-W0*(exp(mu+sigma*q.05)-1)", var.thetahatC)
dm3
##                                    Estimate    SE 2.5 % 97.5 %
## -W0 * (exp(mu + sigma * q.05) - 1)    14846   998 12890  16802

The delta method for the Sharpe ratio function $f_4(\theta)$ is:

dm4 = deltaMethod(thetahatS, "(mu-r.f)/sigma", var.thetahatS)
dm4
##                  Estimate      SE   2.5 % 97.5 %
## (mu - r.f)/sigma   0.0655  0.0763 -0.0841   0.22

8.3.2 The delta method for vector-valued functions (advanced)

Let $\theta$ denote a $k \times 1$ vector of GWN model parameters; for example, $\theta = (\mu, \sigma^2)'$. Assume that $\hat{\theta}$ is asymptotically normally distributed, $\hat{\theta} \sim N\big(\theta, \widehat{\mathrm{var}}(\hat{\theta})\big)$, for large enough $T$, where $\widehat{\mathrm{var}}(\hat{\theta})$ is the $k \times k$ estimated variance-covariance matrix of $\hat{\theta}$. Sometimes we need to consider the asymptotic distribution of a vector-valued function of $\theta$. Let $f: \mathbb{R}^k \to \mathbb{R}^j$, where $j \le k$, be a continuous and differentiable function, and define the $j \times 1$ vector $\eta = f(\theta)$. It is useful to express $\eta$ as:

$$\eta = \begin{pmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_j \end{pmatrix} = \begin{pmatrix} f_1(\theta) \\ f_2(\theta) \\ \vdots \\ f_j(\theta) \end{pmatrix}.$$

Example 8.8 (Example vector functions of GWN model parameters)

For the first example, let $\theta = (\sigma_1^2, \sigma_2^2)'$ and define the $2 \times 1$ vector of volatilities $\eta = f(\theta) = (\sigma_1, \sigma_2)'$. Then $\eta_1 = f_1(\theta) = \sqrt{\sigma_1^2} = \sigma_1$ and $\eta_2 = f_2(\theta) = \sqrt{\sigma_2^2} = \sigma_2$.

For the second example, let $\theta = (\mu_1, \mu_2, \sigma_1, \sigma_2)'$ and define the $2 \times 1$ vector of Sharpe ratios:

$$\eta = f(\theta) = \begin{pmatrix} \frac{\mu_1 - r_f}{\sigma_1} \\ \frac{\mu_2 - r_f}{\sigma_2} \end{pmatrix} = \begin{pmatrix} \mathrm{SR}_1 \\ \mathrm{SR}_2 \end{pmatrix},$$

where $r_f$ denotes the constant risk-free rate. Then $\eta_1 = f_1(\theta) = \frac{\mu_1 - r_f}{\sigma_1} = \mathrm{SR}_1$ and $\eta_2 = f_2(\theta) = \frac{\mu_2 - r_f}{\sigma_2} = \mathrm{SR}_2$.

Define the $j \times k$ Jacobian matrix:

$$G(\theta) = \frac{\partial f(\theta)}{\partial \theta'} = \begin{pmatrix} \frac{\partial f_1(\theta)}{\partial \theta_1} & \frac{\partial f_1(\theta)}{\partial \theta_2} & \cdots & \frac{\partial f_1(\theta)}{\partial \theta_k} \\ \frac{\partial f_2(\theta)}{\partial \theta_1} & \frac{\partial f_2(\theta)}{\partial \theta_2} & \cdots & \frac{\partial f_2(\theta)}{\partial \theta_k} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_j(\theta)}{\partial \theta_1} & \frac{\partial f_j(\theta)}{\partial \theta_2} & \cdots & \frac{\partial f_j(\theta)}{\partial \theta_k} \end{pmatrix}.$$

Then, by the delta method, the asymptotic normal distribution of $\hat{\eta} = f(\hat{\theta})$ is $\hat{\eta} \sim N\big(\eta, \widehat{\mathrm{var}}(\hat{\eta})\big)$, where

$$\widehat{\mathrm{var}}(\hat{\eta}) = G(\hat{\theta})\, \widehat{\mathrm{var}}(\hat{\theta})\, G(\hat{\theta})'.$$

Example 8.9 (Asymptotic distributions of example functions)

Consider the first example function, where $\theta = (\sigma_1^2, \sigma_2^2)'$ and $\eta = f(\theta) = (\sigma_1, \sigma_2)'$. Using the chain rule:

$$\frac{\partial f_1(\theta)}{\partial \sigma_1^2} = \frac{1}{2}\sigma_1^{-1}, \quad \frac{\partial f_1(\theta)}{\partial \sigma_2^2} = 0, \quad \frac{\partial f_2(\theta)}{\partial \sigma_1^2} = 0, \quad \frac{\partial f_2(\theta)}{\partial \sigma_2^2} = \frac{1}{2}\sigma_2^{-1}.$$

The Jacobian matrix is

$$G(\theta) = \begin{pmatrix} \frac{1}{2}\sigma_1^{-1} & 0 \\ 0 & \frac{1}{2}\sigma_2^{-1} \end{pmatrix}.$$

From Chapter 7, Proposition 7.9,

$$\widehat{\mathrm{var}}(\hat{\theta}) = \frac{1}{T}\begin{pmatrix} 2\hat{\sigma}_1^4 & 2\hat{\sigma}_{12}^2 \\ 2\hat{\sigma}_{12}^2 & 2\hat{\sigma}_2^4 \end{pmatrix}.$$

Then

$$\begin{aligned} \widehat{\mathrm{var}}(\hat{\eta}) &= G(\hat{\theta})\, \widehat{\mathrm{var}}(\hat{\theta})\, G(\hat{\theta})' = \begin{pmatrix} \frac{1}{2}\hat{\sigma}_1^{-1} & 0 \\ 0 & \frac{1}{2}\hat{\sigma}_2^{-1} \end{pmatrix} \frac{1}{T}\begin{pmatrix} 2\hat{\sigma}_1^4 & 2\hat{\sigma}_{12}^2 \\ 2\hat{\sigma}_{12}^2 & 2\hat{\sigma}_2^4 \end{pmatrix} \begin{pmatrix} \frac{1}{2}\hat{\sigma}_1^{-1} & 0 \\ 0 & \frac{1}{2}\hat{\sigma}_2^{-1} \end{pmatrix} \\ &= \frac{1}{T}\begin{pmatrix} \frac{1}{2}\hat{\sigma}_1^2 & \frac{1}{2}\hat{\sigma}_{12}^2\hat{\sigma}_1^{-1}\hat{\sigma}_2^{-1} \\ \frac{1}{2}\hat{\sigma}_{12}^2\hat{\sigma}_1^{-1}\hat{\sigma}_2^{-1} & \frac{1}{2}\hat{\sigma}_2^2 \end{pmatrix} = \frac{1}{T}\begin{pmatrix} \frac{1}{2}\hat{\sigma}_1^2 & \frac{1}{2}\hat{\sigma}_{12}\hat{\rho}_{12} \\ \frac{1}{2}\hat{\sigma}_{12}\hat{\rho}_{12} & \frac{1}{2}\hat{\sigma}_2^2 \end{pmatrix}. \end{aligned}$$
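
The same covariance matrix can be obtained without working out the Jacobian by hand. The following minimal sketch uses the numDeriv package (not used elsewhere in this chapter) and hypothetical estimates:

library(numDeriv)                             # for a numerical Jacobian
f = function(theta) sqrt(theta)               # maps (sigma1^2, sigma2^2) to (sigma1, sigma2)
thetahat = c(0.0103, 0.0060)                  # hypothetical variance estimates
sig12hat = 0.0040                             # hypothetical covariance estimate
T.obs = 172
V.thetahat = (1/T.obs)*matrix(c(2*thetahat[1]^2, 2*sig12hat^2,
                                2*sig12hat^2, 2*thetahat[2]^2), 2, 2)
G = jacobian(f, thetahat)                     # numerical version of G(thetahat)
V.etahat = G %*% V.thetahat %*% t(G)          # delta method covariance matrix
sqrt(diag(V.etahat))                          # standard errors for sigma1hat and sigma2hat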

Example 8.10 (Asymptotic distributions of example functions, continued)

Consider the second example function, where $\theta = (\mu_1, \mu_2, \sigma_1, \sigma_2)'$ and $\eta = f(\theta) = (f_1(\theta), f_2(\theta))'$ with $f_1(\theta) = \frac{\mu_1 - r_f}{\sigma_1} = \mathrm{SR}_1$ and $f_2(\theta) = \frac{\mu_2 - r_f}{\sigma_2} = \mathrm{SR}_2$.

Here, we want to use the delta method to get the joint distribution of $\hat{\eta} = (\widehat{\mathrm{SR}}_1, \widehat{\mathrm{SR}}_2)'$. Taking partial derivatives:

$$\begin{aligned} \frac{\partial f_1(\theta)}{\partial \mu_1} &= \sigma_1^{-1}, \quad \frac{\partial f_1(\theta)}{\partial \mu_2} = 0, \quad \frac{\partial f_1(\theta)}{\partial \sigma_1} = -\sigma_1^{-2}(\mu_1 - r_f), \quad \frac{\partial f_1(\theta)}{\partial \sigma_2} = 0, \\ \frac{\partial f_2(\theta)}{\partial \mu_1} &= 0, \quad \frac{\partial f_2(\theta)}{\partial \mu_2} = \sigma_2^{-1}, \quad \frac{\partial f_2(\theta)}{\partial \sigma_1} = 0, \quad \frac{\partial f_2(\theta)}{\partial \sigma_2} = -\sigma_2^{-2}(\mu_2 - r_f). \end{aligned}$$

Then, using $\mathrm{SR}_i = (\mu_i - r_f)/\sigma_i$ $(i = 1, 2)$, we have:

$$G(\theta) = \begin{pmatrix} \sigma_1^{-1} & 0 & -\sigma_1^{-1}\mathrm{SR}_1 & 0 \\ 0 & \sigma_2^{-1} & 0 & -\sigma_2^{-1}\mathrm{SR}_2 \end{pmatrix}.$$

Using Proposition 7.10 and the results from the previous example, we have:

$$\mathrm{var}(\hat{\theta}) = \frac{1}{T}\begin{pmatrix} \sigma_1^2 & \sigma_{12} & 0 & 0 \\ \sigma_{12} & \sigma_2^2 & 0 & 0 \\ 0 & 0 & \frac{1}{2}\sigma_1^2 & \frac{1}{2}\sigma_{12}^2\sigma_1^{-1}\sigma_2^{-1} \\ 0 & 0 & \frac{1}{2}\sigma_{12}^2\sigma_1^{-1}\sigma_2^{-1} & \frac{1}{2}\sigma_2^2 \end{pmatrix}.$$

Then, after some straightforward matrix algebra (see end-of-chapter exercises), we get:

$$\begin{aligned} \widehat{\mathrm{var}}(\hat{\eta}) &= G(\hat{\theta})\, \widehat{\mathrm{var}}(\hat{\theta})\, G(\hat{\theta})' = \begin{pmatrix} \widehat{\mathrm{var}}(\widehat{\mathrm{SR}}_1) & \widehat{\mathrm{cov}}(\widehat{\mathrm{SR}}_1, \widehat{\mathrm{SR}}_2) \\ \widehat{\mathrm{cov}}(\widehat{\mathrm{SR}}_1, \widehat{\mathrm{SR}}_2) & \widehat{\mathrm{var}}(\widehat{\mathrm{SR}}_2) \end{pmatrix} \\ &= \begin{pmatrix} \frac{1}{T}\left(1 + \frac{1}{2}\widehat{\mathrm{SR}}_1^2\right) & \frac{\hat{\rho}_{12}}{T}\left(1 + \frac{1}{2}\hat{\rho}_{12}\widehat{\mathrm{SR}}_1\widehat{\mathrm{SR}}_2\right) \\ \frac{\hat{\rho}_{12}}{T}\left(1 + \frac{1}{2}\hat{\rho}_{12}\widehat{\mathrm{SR}}_1\widehat{\mathrm{SR}}_2\right) & \frac{1}{T}\left(1 + \frac{1}{2}\widehat{\mathrm{SR}}_2^2\right) \end{pmatrix}. \end{aligned}$$

The diagonal elements match the delta method variances for the estimated Sharpe ratio derived earlier. The off-diagonal covariance term shows that the two estimated Sharpe ratios are correlated whenever $\hat{\rho}_{12} \ne 0$. We explore this correlation in more depth in the end-of-chapter exercises; a quick numerical check of the matrix algebra is sketched below.
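
The matrix algebra can be verified numerically by forming $G(\hat{\theta})$ and $\widehat{\mathrm{var}}(\hat{\theta})$ directly and multiplying them out. A minimal sketch using hypothetical estimates for two assets:

# hypothetical estimates for two assets
mu = c(0.009, 0.012); sig = c(0.10, 0.08)
sig12 = 0.004; rho12 = sig12/(sig[1]*sig[2])
r.f = 0.0025; T.obs = 172
SR = (mu - r.f)/sig
# Jacobian G(theta) for theta = (mu1, mu2, sigma1, sigma2)
G = rbind(c(1/sig[1], 0, -SR[1]/sig[1], 0),
          c(0, 1/sig[2], 0, -SR[2]/sig[2]))
# var(thetahat) from Proposition 7.10 (block diagonal in means and volatilities)
V = (1/T.obs)*rbind(c(sig[1]^2, sig12, 0, 0),
                    c(sig12, sig[2]^2, 0, 0),
                    c(0, 0, 0.5*sig[1]^2, 0.5*sig12^2/(sig[1]*sig[2])),
                    c(0, 0, 0.5*sig12^2/(sig[1]*sig[2]), 0.5*sig[2]^2))
V.eta = G %*% V %*% t(G)                      # delta method covariance of (SR1hat, SR2hat)
# closed-form expressions derived above
closed = matrix(c((1/T.obs)*(1 + 0.5*SR[1]^2),
                  (rho12/T.obs)*(1 + 0.5*rho12*SR[1]*SR[2]),
                  (rho12/T.obs)*(1 + 0.5*rho12*SR[1]*SR[2]),
                  (1/T.obs)*(1 + 0.5*SR[2]^2)), 2, 2)
all.equal(V.eta, closed)                      # TRUE up to numerical precision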

8.3.3 The delta method explained

The delta method deduces the asymptotic distribution of $f(\hat{\theta})$ from a first-order Taylor series approximation of $f(\hat{\theta})$ around the true value $\theta$. For simplicity, assume that $\theta$ is a scalar parameter. The first-order Taylor series expansion gives

$$\begin{aligned} f(\hat{\theta}) &= f(\theta) + f'(\theta)(\hat{\theta} - \theta) + \text{remainder} \\ &= f(\theta) + f'(\theta)\hat{\theta} - f'(\theta)\theta + \text{remainder} \\ &\approx f(\theta) + f'(\theta)\hat{\theta} - f'(\theta)\theta. \end{aligned}$$

The first-order Taylor series expansion shows that $f(\hat{\theta})$ is an approximately linear function of the random variable $\hat{\theta}$. The values $\theta$, $f(\theta)$, and $f'(\theta)$ are constants. Assuming that, for large enough $T$,

$$\hat{\theta} \sim N\big(\theta,\; \widehat{\mathrm{se}}(\hat{\theta})^2\big),$$

it follows that

$$f(\hat{\theta}) \sim N\big(f(\theta),\; f'(\theta)^2\, \widehat{\mathrm{se}}(\hat{\theta})^2\big).$$

Since $\hat{\theta}$ is consistent for $\theta$, we can replace $f'(\theta)^2$ with $f'(\hat{\theta})^2$, giving the practically useful result

$$f(\hat{\theta}) \sim N\big(f(\theta),\; f'(\hat{\theta})^2\, \widehat{\mathrm{se}}(\hat{\theta})^2\big).$$
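
A small simulation makes the Taylor series argument concrete: for GWN data the sampling distribution of $\hat{\sigma} = \sqrt{\hat{\sigma}^2}$ is approximately normal with standard deviation close to the delta method value $\sigma/\sqrt{2T}$. The sketch below is illustrative only and uses hypothetical parameter values:

# Monte Carlo illustration of the delta method for f(sigma^2) = sqrt(sigma^2)
set.seed(123)
mu = 0.01; sigma = 0.10; T.obs = 172; n.sim = 10000
sigmahat = replicate(n.sim, {
  r = rnorm(T.obs, mean = mu, sd = sigma)     # simulate GWN returns
  sd(r)                                       # sample standard deviation as the estimate of sigma
})
c(mc.se = sd(sigmahat),                       # Monte Carlo standard deviation of sigmahat
  delta.se = sigma/sqrt(2*T.obs))             # delta method standard error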


  1. The 95% confidence interval computed by deltaMethod uses the multiplier 1.96 instead of the rule-of-thumb value of 2 used in the analytic calculations.