9.3 Tests for Individual Parameters: t-tests and z-scores

In this section, we present test statistics for testing hypotheses that certain model coefficients equal specific values. In some cases, we can derive exact tests. For an exact test, the pdf of the test statistic assuming the null hypothesis is true is known exactly for a finite sample size \(T\). More generally, however, we rely on asymptotic tests. For an asymptotic test, the pdf of the test statistic assuming the null hypothesis is true is not known exactly for a finite sample size \(T\), but can be approximated by a known pdf. The approximation is justified by the Central Limit Theorem (CLT) and the approximation becomes exact as the sample size becomes infinitely large.

9.3.1 Exact tests under normality of data

In the GWN model (9.1), consider testing the hypothesis that the mean return, \(\mu_{i}\), is equal to a specific value \(\mu_{i}^{0}\):

\[\begin{equation} H_{0}:\mu_{i}=\mu_{i}^{0}.\tag{9.3} \end{equation}\]

For example, an investment analyst may have provided an expected return forecast of \(\mu_{i}^{0}\) and we would like to see whether past data are consistent with such a forecast.

The alternative hypothesis can be either two-sided or one-sided. The two-sided alternative is: \[\begin{equation} H_{1}:\mu_{i}\neq\mu_{i}^{0}.\tag{9.4} \end{equation}\] Two-sided alternatives are used when we don’t care about the sign of \(\mu_{i}-\mu_{i}^{0}\) under the alternative. With one-sided alternatives, we care about the sign of \(\mu_{i}-\mu_{i}^{0}\) under the alternative: \[\begin{equation} H_{1}:\mu_{i}>\mu_{i}^{0}\,\textrm{ or }\,H_{1}:\mu_{i}<\mu_{i}^{0}.\tag{9.5} \end{equation}\] How do we come up with a test statistic \(S\) for testing (9.3) against (9.4) or (9.5)? We generally use two criteria: (1) we know the pdf of \(S\) assuming (9.3) is true; and (2) the value of \(S\) should be big if the alternative hypotheses (9.4) or (9.5) are true.

9.3.1.1 Two-sided test

To simplify matters, let’s assume that the value of \(\sigma_{i}\) in (9.1) is known and does not need to be estimated. This assumption is unrealistic and will be relaxed later. Consider testing (9.3) against the two-sided alternative (9.4) using a 5% significance level. Two-sided alternatives are more commonly used than one-sided alternatives. Let \(\{R_{it}\}_{t=1}^{T}\) denote a random sample from the GWN model. We estimate \(\mu_{i}\) using the sample mean:

\[\begin{equation} \hat{\mu}_{i}=\frac{1}{T}\sum_{t=1}^{T}R_{it}.\tag{9.6} \end{equation}\]

Under the null hypothesis (9.3), it is assumed that \(\mu_{i}=\mu_{i}^{0}\). Hence, if the null hypothesis is true then \(\hat{\mu}_{i}\) computed from data should be close to \(\mu_{i}^{0}\). If \(\hat{\mu}_{i}\) is far from \(\mu_{i}^{0}\) (either above or below \(\mu_{i}^{0}\)) then this evidence casts doubt on the validity of the null hypothesis. Now, there is estimation error in (9.6) which is measured by:

\[\begin{equation} \mathrm{se}(\hat{\mu}_{i})=\frac{\sigma_{i}}{\sqrt{T}}.\tag{9.7} \end{equation}\]

Because we assume \(\sigma_{i}\) is known, we also know \(\mathrm{se}(\hat{\mu}_{i})\). Because of estimation error, we don’t expect \(\hat{\mu}_{i}\) to equal \(\mu_{i}^{0}\) even if the null hypothesis (9.3) is true. How far from \(\mu_{i}^{0}\) can \(\hat{\mu}_{i}\) be if the null hypothesis is true? To answer this question, recall from Chapter 7 that the exact (finite sample) pdf of \(\hat{\mu}_{i}\) is the normal distribution:

\[\begin{equation} \hat{\mu}_{i}\sim N(\mu_{i},\mathrm{se}(\hat{\mu}_{i})^{2})=N\left(\mu_{i},\frac{\sigma_{i}^{2}}{T}\right).\tag{9.8} \end{equation}\]

Under the null hypothesis (9.3), the normal distribution for \(\hat{\mu}_{i}\) is centered at \(\mu_{i}^{0}\):

\[\begin{equation} \hat{\mu}_{i}\sim N(\mu_{i}^{0},\mathrm{se}(\hat{\mu}_{i})^{2})=N\left(\mu_{i}^{0},\frac{\sigma_{i}^{2}}{T}\right).\tag{9.9} \end{equation}\]

From properties of the normal distribution we have:

\[\begin{equation} \Pr\left(\mu_{i}^{0}-1.96\times\mathrm{se}(\hat{\mu}_{i})\leq\hat{\mu}_{i}\leq\mu_{i}^{0}+1.96\times\mathrm{se}(\hat{\mu}_{i})\right)=0.95.\tag{9.10} \end{equation}\]

Hence, if the null hypothesis (9.3) is true we only expect to see \(\hat{\mu}_{i}\) more than \(1.96\) values of \(\mathrm{se}(\hat{\mu}_{i})\) away from \(\mu_{i}^{0}\) with probability 0.05 (see the footnote at the end of this section). Therefore, it makes intuitive sense to base a test statistic for testing (9.3) on a measure of distance between \(\hat{\mu}_{i}\) and \(\mu_{i}^{0}\) relative to \(\mathrm{se}(\hat{\mu}_{i})\). Such a statistic is the z-score:

\[\begin{equation} z_{\mu=\mu^{0}}=\frac{\hat{\mu}_{i}-\mu_{i}^{0}}{\mathrm{se}(\hat{\mu}_{i})}=\frac{\hat{\mu}_{i}-\mu_{i}^{0}}{\sigma_{i}/\sqrt{T}}.\tag{9.11} \end{equation}\]

Assuming the null hypothesis (9.3) is true, from (9.9) it follows that: \[ z_{\mu=\mu^{0}}\sim N(0,1). \]

The intuition for using the z-score (9.11) to test (9.3) is straightforward. If \(z_{\mu=\mu^{0}}\approx0\) then \(\hat{\mu}_{i}\approx\mu_{i}^{0}\) and (9.3) should not be rejected. In contrast, if \(z_{\mu=\mu^{0}}>1.96\) or \(z_{\mu=\mu^{0}}<-1.96\) then \(\hat{\mu}_{i}\) is more than \(1.96\) values of \(\mathrm{se}(\hat{\mu}_{i})\) away from \(\mu_{i}^{0}\). From (9.10), this is very unlikely (less than 5% probability) if (9.3) is true. In this case, there is strong data evidence against (9.3) and we should reject it. Notice that the condition \(z_{\mu=\mu^{0}}>1.96\textrm{ or }z_{\mu=\mu^{0}}<-1.96\) can be simplified as the condition \(\left|z_{\mu=\mu^{0}}\right|>1.96\). When \(\left|z_{\mu=\mu^{0}}\right|\) is big, larger than 1.96, we have data evidence against (9.3). Using the value \(1.96\) to determine the rejection region for the test ensures that the significance level (probability of Type I error) of the test is exactly 5%:

\[\begin{align*} \Pr(\textrm{Reject } H_{0}|H_{0}\,\textrm{is true}) &= \Pr(S>1.96|\mu_{i}=\mu_{i}^{0}) \\ &=\Pr\left(\left|z_{\mu=\mu^{0}}\right|>1.96|\mu_{i}=\mu_{i}^{0}\right)=.05 \end{align*}\]

Hence, our formal test statistic for testing (9.3) against (9.4) is

\[\begin{equation} S=\left|z_{\mu=\mu^{0}}\right|. \tag{9.12} \end{equation}\]

The 5% critical value is \(cv_{.05}=1.96\), and we reject (9.3) at the 5% significance level if \(S>1.96\).
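
The claim that the rejection rule \(S>1.96\) has a 5% probability of Type I error can be checked by simulation. The following is a minimal sketch, assuming the GWN model with known \(\sigma\); the values of `mu0`, `sigma`, `n.obs`, `n.sim` and the seed are illustrative assumptions and not taken from the text.

```r
# Monte Carlo check of the 5% rejection rule when H0 is true.
# All parameter values here are illustrative assumptions.
set.seed(42)
mu0 = 0.05      # true mean, so H0: mu = mu0 holds in every sample
sigma = 0.10    # known standard deviation
n.obs = 60      # sample size T
n.sim = 10000   # number of hypothetical samples

se.muhat = sigma/sqrt(n.obs)

# compute S = |z-score| in each simulated sample
S.sim = replicate(n.sim, {
  ret.sim = mu0 + rnorm(n.obs, sd = sigma)
  abs((mean(ret.sim) - mu0)/se.muhat)
})

# fraction of samples in which H0 is (wrongly) rejected
reject.rate = mean(S.sim > qnorm(0.975))
reject.rate
```

The rejection frequency should be close to 0.05, with the gap shrinking as `n.sim` increases.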

In summary, the steps for using \(S=\left|z_{\mu=\mu^{0}}\right|\) to test (9.3) against (9.4) are:

  1. Set the significance level \(\alpha=\Pr(\textrm{Reject } H_{0}|H_{0}\,\textrm{is true})\) and determine the critical value \(cv_{\alpha}\) such that \(\Pr(S>cv_{\alpha})=\alpha\). Now,

\[\begin{align*} \Pr(S>cv_{\alpha}) &= \Pr\left(\left|z_{\mu=\mu^{0}}\right|>cv_{\alpha}\right) \\ &= \Pr\left(z_{\mu=\mu^{0}}>cv_{\alpha}\right)+\Pr\left(z_{\mu=\mu^{0}}<-cv_{\alpha}\right)=\alpha \end{align*}\]

which implies that \(cv_{\alpha}=-q_{\alpha/2}^{Z}=q_{1-\alpha/2}^{Z}\), where \(q_{\alpha/2}^{Z}\) denotes the \(\frac{\alpha}{2}-\)quantile of \(Z\sim N(0,1)\). For example, if \(\alpha=.05\) then \(cv_{.05}=-q_{.025}^{Z}=1.96.\)

  2. Reject (9.3) at the \(\alpha\times100\%\) significance level if \(S>cv_{\alpha}\). For example, if \(\alpha=.05\) then reject (9.3) at the 5% level if \(S>1.96\).

  3. Equivalently, reject (9.3) at the \(\alpha\times100\%\) significance level if the p-value for \(S\) is less than \(\alpha\). Here, the p-value is defined as the significance level at which the test is just rejected. Let \(Z\sim N(0,1)\). The p-value is computed as: \[\begin{eqnarray} \textrm{p-value} & = & \Pr(|Z|>S)=\Pr(Z>S)+\Pr(Z<-S)\\ & = & 2\times \Pr(Z>S)=2\times(1-\Pr(Z<S)) \tag{9.13}. \end{eqnarray}\]
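
The steps above can be collected into a small helper function. This is only a sketch: the function name `z.test.two.sided` and its arguments are hypothetical and not part of any package, and it assumes \(\sigma\) is known.

```r
# Two-sided z-test of H0: mu = mu0 with known sigma.
# z.test.two.sided is a hypothetical helper written for illustration.
z.test.two.sided = function(x, mu0, sigma, alpha = 0.05) {
  se.muhat = sigma/sqrt(length(x))     # standard error (9.7)
  z.score = (mean(x) - mu0)/se.muhat   # z-score (9.11)
  S = abs(z.score)                     # test statistic (9.12)
  cv = qnorm(1 - alpha/2)              # critical value
  p.value = 2*(1 - pnorm(S))           # two-sided p-value (9.13)
  list(z.score = z.score, S = S, cv = cv,
       p.value = p.value, reject = S > cv)
}

# illustrative data (seed and parameter values are assumptions)
set.seed(1)
x = 0.05 + rnorm(60, sd = 0.10)
res = z.test.two.sided(x, mu0 = 0.05, sigma = 0.10)
res
```

The `reject` flag and the comparison `p.value < alpha` always give the same decision, which is the equivalence stated in the last step.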

Example 4.1 (Testing hypothesis with the z-score using simulated data: two-sided tests)

Assume returns follow the GWN model

\[\begin{eqnarray*} R_{t} & = & 0.05+\epsilon_{t},\,t=1,\ldots,60,\\ \epsilon_{t} & \sim & GWN(0,(0.10)^{2}). \end{eqnarray*}\]

and we are interested in testing the following hypotheses \[\begin{align*} H_{0}:\mu=0.05\,\,vs.\,\,H_{1}:\mu\neq.05 \\ H_{0}:\mu=0.06\,\,vs.\,\,H_{1}:\mu\neq.06 \\ H_{0}:\mu=0.10\,\,vs.\,\,H_{1}:\mu\neq.10 \\ \end{align*}\]

using the test statistic (9.12) with a 5% significance level. The 5% critical value is \(cv_{.05}=1.96\). One hypothetical sample from the hypothesized model is simulated using:

set.seed(123)
n.obs = 60
mu = 0.05
sigma = 0.10
ret.sim = mu + rnorm(n.obs, sd = sigma)

The estimate of \(\mu\) and the value of \(\mathrm{se}(\hat{\mu})\) are:

muhat = mean(ret.sim)
se.muhat = sigma/sqrt(n.obs)
c(muhat, se.muhat)
## [1] 0.0566 0.0129

The test statistic (9.12) for testing \(H_0:\mu=0.05\) computed from this sample is:

z.score.05 = (muhat - 0.05)/se.muhat
S = abs(z.score.05)
S
## [1] 0.508

The z-score tells us that the estimated mean, \(\hat{\mu}=0.0566\), is 0.508 values of \(\mathrm{se}(\hat{\mu})\) away from the hypothesized value \(\mu^0=0.05\). This evidence does not contradict \(H_0:\mu=0.05\). Since \(S=0.508 < 1.96\), we do not reject \(H_0:\mu=0.05\) at the 5% significance level. The p-value of the test using (9.13) is:

p.value = 2*(1 - pnorm(S))
p.value
## [1] 0.611

Here, the p-value of 0.611 is greater than the significance level \(\alpha = 0.05\) so we do not reject the null at the 5% significance level. The p-value tells us that we would reject \(H_0:\mu=0.05\) only at a significance level of 61.1% or higher.

The z-score information for testing \(H_0:\mu=0.06\) is:

z.score.06 = (muhat - 0.06)/se.muhat
S = abs(z.score.06)
S
## [1] 0.266

Here, the z-score indicates that \(\hat{\mu}=0.0566\) is just 0.266 values of \(\mathrm{se}(\hat{\mu})\) away from the hypothesized value \(\mu^0=0.06\). This evidence also does not contradict \(H_0:\mu=0.06\). Since \(S=0.266 < 1.96\), we do not reject \(H_0:\mu=0.06\) at the 5% significance level. The p-value of the test is:

p.value = 2*(1 - pnorm(S))
p.value
## [1] 0.79

The large p-value shows that the data do not provide sufficient evidence to reject \(H_0:\mu=0.06\).

Last, the z-score information for testing \(H_0:\mu=0.10\) is:

z.score.10 = (muhat - 0.10)/se.muhat
S = abs(z.score.10)
S
## [1] 3.36

Now, the z-score indicates that \(\hat{\mu}=0.0566\) is 3.36 values of \(\mathrm{se}(\hat{\mu})\) away from the hypothesized value \(\mu^0=0.10\). This evidence is not in support of \(H_0:\mu=0.10\). Since \(S=3.36 > 1.96\), we reject \(H_0:\mu=0.10\) at the 5% significance level. The p-value of the test is:

p.value = 2*(1 - pnorm(S))
p.value
## [1] 0.000766

Here, the p-value is much less than 5%, supporting the rejection of \(H_0:\mu=0.10\).

\(\blacksquare\)

9.3.1.2 One-sided test

Here, we consider testing the null (9.3) against the one-sided alternative (9.5). For expositional purposes, consider \(H_{1}:\mu_{i} > \mu_{i}^{0}\). This alternative can be equivalently represented as \(H_{1}:\mu_{i} - \mu_{i}^{0} > 0\). That is, under the alternative hypothesis the sign of the difference \(\mu_{i} - \mu_{i}^{0}\) is positive. This is why a test against a one-sided alternative is sometimes called a test for sign. As in the previous sub-section, assume that \(\sigma_i\) is known and let the significance level be \(5\%\).

The natural test statistic is the z-score (9.11): \[ S = z_{\mu = \mu^0}. \]

The intuition is straightforward. If \(z_{\mu=\mu^0} \approx 0\) then \(\hat{\mu}_i \approx \mu_i^0\) and the null should not be rejected. However, if the one-sided alternative is true then we would expect to see \(\hat{\mu}_i > \mu_i^0\) and \(z_{\mu=\mu^0} > 0\). How big \(z_{\mu=\mu^0}\) needs to be for us to reject the null depends on the significance level. Since under the null \(z_{\mu=\mu^0} \sim N(0,1) = Z\), with a \(5\%\) significance level

\[ \Pr(\text{Reject } H_0 | H_0 \text{ is true}) = \Pr(z_{\mu=\mu^0} > 1.645) = \Pr(Z > 1.645) = 0.05. \] Hence, our \(5\%\) one-sided critical value is \(cv_{.05}=q_{.95}^Z = 1.645\), and we reject the null at the \(5\%\) significance level if \(z_{\mu=\mu^0} >1.645\).

In general, the steps for using \(z_{\mu=\mu^0}\) to test (9.3) against the one-sided alternative \(H_{1}:\mu_{i} > \mu_{i}^{0}\) are:

  1. Set the significance level \(\alpha=\Pr(\textrm{Reject } H_{0}|H_{0}\,\textrm{is true})\) and determine the critical value \(cv_{\alpha}\) such that \(\Pr(z_{\mu=\mu^0}>cv_{\alpha})=\alpha\). Since under the null \(z_{\mu=\mu^0} \sim N(0,1) = Z\), \(cv_{\alpha} = q_{1-\alpha}^Z\).

  2. Reject the null hypothesis (9.3) at the \(\alpha \times 100\%\) significance level if \(z_{\mu=\mu^0} > cv_{\alpha}\).

  3. Equivalently, reject (9.3) at the \(\alpha \times 100\%\) significance level if the one-sided p-value is less than \(\alpha\), where

\[ \text{p-value} = \Pr(Z > z_{\mu=\mu^0}) = 1 - \Pr(Z \le z_{\mu=\mu^0}) \]
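
The one-sided steps can be sketched in the same way. The helper `z.test.one.sided` is hypothetical, and the inputs in the call (an estimate of 0.08 from 60 observations with known \(\sigma=0.10\)) are made-up values chosen only to illustrate the calculation.

```r
# One-sided z-test of H0: mu = mu0 against H1: mu > mu0 with known se.
# z.test.one.sided is a hypothetical helper written for illustration.
z.test.one.sided = function(muhat, mu0, se.muhat, alpha = 0.05) {
  z.score = (muhat - mu0)/se.muhat
  cv = qnorm(1 - alpha)            # one-sided critical value q_{1-alpha}^Z
  p.value = 1 - pnorm(z.score)     # Pr(Z > z-score)
  c(z.score = z.score, cv = cv, p.value = p.value,
    reject = as.numeric(z.score > cv))
}

# made-up inputs: muhat = 0.08, T = 60, sigma = 0.10
out = z.test.one.sided(muhat = 0.08, mu0 = 0.05, se.muhat = 0.10/sqrt(60))
out
```

Unlike the two-sided test, a large negative z-score never leads to rejection here, because only large positive values are evidence for \(H_{1}:\mu_{i}>\mu_{i}^{0}\).
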

Example 2.8 (Testing hypothesis with the z-score using simulated data: one-sided tests)

Here, we use the simulated GWN return data from the previous example to test (9.3) against the one-sided alternative \(H_{1}:\mu_{i} > \mu_{i}^{0}\) using a 5% significance level.

The z-scores and one-sided p-values for testing \(H_0:\mu=0.05\), \(H_0:\mu=0.06\), and \(H_0:\mu=0.10\), computed from the simulated sample are:

z.score.05 = (muhat - 0.05)/se.muhat
p.value.05 = 1 - pnorm(z.score.05)
z.score.06 = (muhat - 0.06)/se.muhat
p.value.06 = 1 - pnorm(z.score.06)
z.score.10 = (muhat - 0.10)/se.muhat
p.value.10 = 1 - pnorm(z.score.10)
ans = rbind(c(z.score.05, p.value.05),
            c(z.score.06, p.value.06),
            c(z.score.10, p.value.10))
colnames(ans) = c("Z-score", "P-value")
rownames(ans) = c("H0:mu=0.05", "H0:mu=0.06","H0:mu=0.10")
ans
##            Z-score P-value
## H0:mu=0.05   0.508   0.306
## H0:mu=0.06  -0.266   0.605
## H0:mu=0.10  -3.365   1.000

Here, we do not reject any of the null hypotheses in favor of the one-sided alternative at the 5% significance level.

9.3.2 Exact tests with \(\sigma\) unknown

In practice, \(\sigma_{i}\) is unknown and is estimated with \(\hat{\sigma}_{i}\), and so the exact tests based on the z-score described above are not feasible. Fortunately, an exact test is still available. Instead of using the z-score (9.11), we use the t-ratio (or t-score)

\[\begin{equation} t_{\mu=\mu^0}=\frac{\hat{\mu}_{i}-\mu_{i}^0}{\widehat{\mathrm{se}}(\hat{\mu}_{i})}=\frac{\hat{\mu}_{i}-\mu_{i}^0}{\hat{\sigma}_{i}/\sqrt{T}}.\tag{9.14} \end{equation}\]

Assuming the null hypothesis (9.3) is true, Proposition 7.7 tells us that (9.14) is distributed Student’s t with \(T-1\) degrees of freedom and is denoted by the random variable \(t_{T-1}\). The steps for using the z-score and the t-ratio are the same for evaluating (9.3), but now we use critical values and p-values from \(t_{T-1}\). Our test statistic for the two-sided alternative \(H_1:\mu_i \ne \mu_i^0\) is: \[\begin{equation} S = \left|t_{\mu=\mu^{0}}\right|, \tag{9.15} \end{equation}\]

and our test statistic for the one-sided alternative \(H_1:\mu_i > \mu_i^0\) is

\[\begin{equation} S = t_{\mu=\mu^{0}}. \tag{9.16} \end{equation}\]

Our critical values are determined from the quantiles of the Student’s t with \(T-1\) degrees of freedom, \(t_{T-1}(1-\alpha/2)\). For example, if \(\alpha=0.05\) and \(T-1=60\) then the two-sided critical value is \(cv_{.05} = t_{60}(0.975)=2\) (which can be verified using the R function qt()). The two-sided p-value is computed as:

\[\begin{eqnarray*} \textrm{p-value} & = & \Pr(|t_{T-1}|>S)=\Pr(t_{T-1}>S)+\Pr(t_{T-1}<-S)\\ & = & 2\times \Pr(t_{T-1}>S)=2\times(1-\Pr(t_{T-1}<S)). \end{eqnarray*}\]

The one-sided critical value is \(cv_{.05} = t_{60}(0.95)=1.67\), and the one-sided p-value is computed as:

\[ \text{p-value} = \Pr(t_{T-1} > S) = 1 - \Pr(t_{T-1} \le S). \]

As the sample size gets larger \(\hat{\sigma}_{i}\) gets closer to \(\sigma_{i}\) and the Student’s t distribution gets closer to the normal distribution. Decisions using the t-ratio and the z-score are almost the same for \(T \ge 60\).
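
To see how quickly the Student’s t critical values approach the normal critical value, one can tabulate them for a few degrees of freedom. This is a small illustration using the base R functions `qt()` and `qnorm()`; the particular values in `df.vals` are arbitrary choices.

```r
# Two-sided 5% critical values: Student's t versus standard normal.
df.vals = c(5, 10, 30, 60, 120, 1000)   # arbitrary degrees of freedom
cv.t = qt(0.975, df = df.vals)          # t critical values
cv.z = qnorm(0.975)                     # normal critical value
cbind(df = df.vals, cv.t = cv.t, cv.z = cv.z, diff = cv.t - cv.z)
```

The t critical value is always larger than the normal one, and the difference shrinks as the degrees of freedom grow.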

Example 2.9 (Testing hypothesis with the t-ratio using simulated data)

We repeat the hypothesis testing from the previous example, this time using the t-ratio (9.14) instead of the z-score. Using \(T-1=59\) the 5% critical value for the two-sided test is

qt(0.975, df=59)
## [1] 2

To compute the t-ratios we first estimate \(\mathrm{se}(\hat{\mu})\):

sigmahat = sd(ret.sim)
sehat.muhat = sigmahat/sqrt(n.obs)
c(sehat.muhat, se.muhat)
## [1] 0.0118 0.0129

Here, \(\widehat{\mathrm{se}}(\hat{\mu}) = 0.0118 < 0.0129 = \mathrm{se}(\hat{\mu})\) and so the t-ratios will be slightly larger in absolute value than the z-scores. The t-ratios and test statistics for the three hypotheses are:

t.ratio.05 = (muhat - 0.05)/sehat.muhat
t.ratio.06 = (muhat - 0.06)/sehat.muhat
t.ratio.10 = (muhat - 0.10)/sehat.muhat
S.05 = abs(t.ratio.05)
S.06 = abs(t.ratio.06)
S.10 = abs(t.ratio.10)
ans = c(S.05, S.06, S.10)
names(ans) = c("S.05", "S.06", "S.10")
ans
##  S.05  S.06  S.10 
## 0.558 0.293 3.696

As expected, the test statistics computed from the t-ratios are slightly larger than those computed from the z-scores. The first two statistics are less than 2 and the third is bigger than 2, so we reach the same decisions as before. The p-values of the three tests are:

S.vals = c(S.05, S.06, S.10)
p.values = 2*(1 - pt(S.vals, df=59))
p.values
## [1] 0.578747 0.770895 0.000482

Here, the p-values computed from \(t_{59}\) are very similar to those computed from \(Z\sim N(0,1)\).

\(\blacksquare\)
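
The t-ratio calculations can also be cross-checked against the base R function `t.test()`, which carries out the same one-sample test. The sketch below regenerates the simulated sample so that it is self-contained, and it compares the built-in result with the manual formula (9.14) for \(H_0:\mu=0.10\).

```r
# Cross-check the manual t-ratio with stats::t.test().
# The sample is regenerated with the same seed and parameters as above.
set.seed(123)
n.obs = 60
ret.sim = 0.05 + rnorm(n.obs, sd = 0.10)

# built-in one-sample, two-sided t-test of H0: mu = 0.10
tt = t.test(ret.sim, mu = 0.10, alternative = "two.sided")

# manual t-ratio (9.14) and two-sided p-value from t with T-1 df
t.manual = (mean(ret.sim) - 0.10)/(sd(ret.sim)/sqrt(n.obs))
p.manual = 2*(1 - pt(abs(t.manual), df = n.obs - 1))

c(t.test = unname(tt$statistic), manual = t.manual,
  p.test = tt$p.value, p.manual = p.manual)
```

Setting `alternative = "greater"` in `t.test()` gives the one-sided test against \(H_1:\mu>\mu^0\).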

We can derive an exact test for testing hypotheses about the value of \(\mu_{i}\) based on the z-score or the t-ratio, but we cannot derive exact tests for the values of \(\sigma_{i}\) or for the values of \(\rho_{ij}\) based on z-scores. Exact tests for these parameters are much more complicated. While t-ratios for the values of \(\sigma_{i}\) or for the values of \(\rho_{ij}\) do not have exact t-distributions in finite samples, as the sample size gets large the distributions of the t-ratios get closer and closer to the standard normal distribution due to the CLT. This motivates the use of so-called asymptotic z-scores discussed in the next sub-section.

9.3.3 Z-scores under asymptotic normality of estimators

Let \(\hat{\theta}\) denote an estimator for \(\theta\). Here, we allow \(\theta\) to be a GWN model parameter or a function of GWN model parameters. For example, in the GWN model \(\theta\) could be \(\mu_{i}\), \(\sigma_{i}\), \(\rho_{ij}\), \(q_{\alpha}^R\), \(\mathrm{VaR}_{\alpha}\) or \(\mathrm{SR}_i\). As we have seen, the CLT (and the delta method if needed) justifies the asymptotic normal distribution:

\[\begin{equation} \hat{\theta}\sim N(\theta,\widehat{\mathrm{se}}(\hat{\theta})^{2}),\tag{9.17} \end{equation}\]

for large enough sample size \(T\), where \(\widehat{\mathrm{se}}(\hat{\theta})\) is the estimated standard error for \(\hat{\theta}\). Consider testing:

\[\begin{equation} H_{0}:\theta=\theta_{0}\text{ vs. }H_{1}:\theta\neq\theta_{0}.\tag{9.18} \end{equation}\]

Under \(H_{0},\) the asymptotic normality result (9.17) implies that the z-score for testing (9.18) has a standard normal distribution for large enough sample size \(T\):

\[\begin{equation} z_{\theta=\theta_{0}}=\frac{\hat{\theta}-\theta_{0}}{\widehat{\mathrm{se}}(\hat{\theta})}\sim N(0,1)=Z.\tag{9.19} \end{equation}\]

The intuition for using the z-score (9.19) is straightforward. If \(z_{\theta=\theta_{0}}\approx0\) then \(\hat{\theta}\approx\theta_{0},\) and \(H_{0}:\theta=\theta_{0}\) should not be rejected. On the other hand, if \(|z_{\theta=\theta_{0}}|>2\), say, then \(\hat{\theta}\) is more than \(2\) values of \(\widehat{\mathrm{se}}(\hat{\theta})\) away from \(\theta_{0}.\) This is very unlikely if \(\theta=\theta_{0}\) because \(\hat{\theta}\sim N(\theta_{0},\mathrm{\widehat{se}}(\hat{\theta})^{2}),\) so \(H_{0}:\theta=\theta_{0}\) should be rejected. Therefore, the test statistic for testing (9.18) is \(S = |z_{\theta=\theta_{0}}|\).

The steps for using the z-score (9.19) with its critical value to test the hypotheses (9.18) are:

  1. Set the significance level \(\alpha\) of the test and determine the two-sided critical value \(cv_{\alpha/2}\). Using (9.17), the critical value, \(cv_{\alpha/2},\) is determined using: \[\begin{align*} \Pr(|Z| & \geq cv_{\alpha/2})=\alpha\\ & \Rightarrow cv_{\alpha/2}=-q_{\alpha/2}^{Z}=q_{1-\alpha/2}^{Z}, \end{align*}\] where \(q_{\alpha/2}^{Z}\) denotes the \(\frac{\alpha}{2}-\)quantile of \(N(0,1)\). A commonly used significance level is \(\alpha=0.05\) and the corresponding critical value is \(cv_{.025}=-q_{.025}^{Z}=q_{.975}^{Z}=1.96\approx2\).

  2. Reject (9.18) at the \(100\times\alpha\)% significance level if: \[ S = |z_{\theta=\theta_{0}}|=\left\vert \frac{\hat{\theta}-\theta_{0}}{\widehat{\mathrm{se}}(\hat{\theta})}\right\vert >cv_{\alpha/2}. \] If the significance level is \(\alpha=0.05\), then reject (9.18) at the 5% level using the rule-of-thumb: \[ S=|z_{\theta=\theta_{0}}|>2. \]

The steps for using the z-score (9.19) with its p-value to test the hypotheses (9.18) are:

  1. Determine the two-sided p-value. The p-value of the two-sided test is the significance level at which the test is just rejected. From (9.17), the two-sided p-value is defined by \[\begin{equation} \textrm{p-value}=\Pr\left(|Z|>|z_{\theta=\theta_{0}}|\right)=2\times\left(1-\Pr\left(Z\leq|z_{\theta=\theta_{0}}|\right)\right).\tag{9.20} \end{equation}\]
  2. Reject (9.18) at the \(100\times\alpha\)% significance level if the p-value (9.20) is less than \(\alpha\).
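
Because these steps only need an estimate, its estimated standard error and the hypothesized value, they can be wrapped in one generic function. The sketch below is illustrative: `asymp.z.test` is a hypothetical name, and the estimates and standard errors passed to it are made-up numbers.

```r
# Generic asymptotic z-test of H0: theta = theta0 (two-sided).
# asymp.z.test is a hypothetical helper; inputs may be vectors.
asymp.z.test = function(theta.hat, se.hat, theta0, alpha = 0.05) {
  z.score = (theta.hat - theta0)/se.hat   # z-score (9.19)
  S = abs(z.score)
  cv = qnorm(1 - alpha/2)                 # exact normal critical value
  p.value = 2*(1 - pnorm(S))              # two-sided p-value (9.20)
  data.frame(z.score = z.score, S = S, cv = cv,
             p.value = p.value, reject = S > cv)
}

# made-up estimates and standard errors for two parameters
theta.hat = c(a = 0.30, b = 0.55)
se.hat = c(a = 0.07, b = 0.05)
res = asymp.z.test(theta.hat, se.hat, theta0 = 0.5)
res
```

The function uses the exact quantile from `qnorm()` rather than the rule-of-thumb value of 2, so decisions can differ from the rule of thumb when \(S\) falls between 1.96 and 2.
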

Example 9.1 (Using the z-score to test hypothesis about \(\mu\) in the GWN model)

Consider using the z-score (9.19) to test \(H_{0}:\,\mu_{i}=0\,vs.\,H_{1}:\mu_{i}\neq0\) (\(i=\)Microsoft, Starbucks, S&P 500) using a 5% significance level. First, calculate the GWN model estimates for \(\mu_{i}\):

n.obs = nrow(gwnMonthlyRetC) 
muhat.vals = apply(gwnMonthlyRetC, 2, mean) 
muhat.vals
##    MSFT    SBUX   SP500 
## 0.00413 0.01466 0.00169

Next, calculate the estimated standard errors:

sigmahat.vals = apply(gwnMonthlyRetC, 2, sd) 
se.muhat = sigmahat.vals/sqrt(n.obs) 
se.muhat
##    MSFT    SBUX   SP500 
## 0.00764 0.00851 0.00370

Then calculate the z-scores and test statistics:

z.scores = muhat.vals/se.muhat 
S.vals = abs(z.scores)
S.vals
##  MSFT  SBUX SP500 
## 0.540 1.722 0.457

Since the absolute values of all of the z-scores are less than two, we do not reject \(H_{0}:\,\mu_{i}=0\) at the 5% level for any asset.

The p-values for all of the test statistics are computed using:

2*(1-pnorm(S.vals))
##   MSFT   SBUX  SP500 
## 0.5893 0.0851 0.6480

Since all p-values are greater than \(\alpha=0.05\), we do not reject \(H_{0}:\,\mu_{i}=0\) at the 5% level for any asset. The p-value for Starbucks is the smallest at 0.0851. Here, we can reject \(H_{0}:\,\mu_{SBUX}=0\) at the 8.51% level.

\(\blacksquare\)

Example 4.6 (Using the z-score to test hypothesis about \(\rho\) in the GWN model)

Consider using the z-score (9.19) to test the hypotheses:

\[\begin{equation} H_{0}:\,\rho_{ij}=0.5\,vs.\,H_{1}:\rho_{ij}\neq0.5.\tag{9.21} \end{equation}\]

using a 5% significance level. Here, we use the result from the GWN model that for large enough \(T\): \[ \hat{\rho}_{ij}\sim N\left(\rho_{ij},\,\widehat{\mathrm{se}}(\hat{\rho}_{ij})^2\right),\,\widehat{\mathrm{se}}(\hat{\rho}_{ij})=\frac{1-\hat{\rho}_{ij}^{2}}{\sqrt{T}}. \] Then the z-score for testing (9.21) has the form: \[\begin{equation} z_{\rho_{ij}=0.5}=\frac{\hat{\rho}_{ij}-0.5}{\widehat{\mathrm{se}}(\hat{\rho}_{ij})}=\frac{\hat{\rho}_{ij}-0.5}{\left(1-\hat{\rho}_{ij}^{2}\right)/\sqrt{T}}.\tag{9.22} \end{equation}\] To compute the z-scores, first, calculate the GWN model estimates for \(\rho_{ij}\):

corhat.mat = cor(gwnMonthlyRetC) 
rhohat.vals = corhat.mat[lower.tri(corhat.mat)] 
names(rhohat.vals) = c("MSFT.SBUX", "MSFT.SP500", "SBUX.SP500") 
rhohat.vals
##  MSFT.SBUX MSFT.SP500 SBUX.SP500 
##      0.341      0.617      0.457

Next, calculate estimated standard errors:

se.rhohat = (1 - rhohat.vals^2)/sqrt(n.obs) 
se.rhohat
##  MSFT.SBUX MSFT.SP500 SBUX.SP500 
##     0.0674     0.0472     0.0603

Then calculate the z-scores (9.22) and test statistics:

z.scores = (rhohat.vals - 0.5)/se.rhohat 
S.vals = abs(z.scores)
S.vals
##  MSFT.SBUX MSFT.SP500 SBUX.SP500 
##      2.361      2.482      0.706

Here, the absolute values of the z-scores for \(\rho_{MSFT,SBUX}\) and \(\rho_{MSFT,SP500}\) are greater than 2 whereas the absolute value of the z-score for \(\rho_{SBUX,SP500}\) is less than 2. Hence, for the pairs (MSFT, SBUX) and (MSFT, SP500) we can reject the null (9.21) at the 5% level but for the pair (SBUX, SP500) we cannot reject the null at the 5% level. The p-values for the test statistics are:

2*(1-pnorm(S.vals))
##  MSFT.SBUX MSFT.SP500 SBUX.SP500 
##     0.0182     0.0130     0.4804

Here, the p-values for the pairs (MSFT,SBUX) and (MSFT,SP500) are less than 0.05 and the p-value for the pair (SBUX, SP500) is much greater than 0.05.

\(\blacksquare\)

Example 2.17 (Using the z-score to test hypothesis about asset Sharpe ratios in the GWN model)

Consider using the z-score to test the hypotheses:

\[ H_0: \mathrm{SR}_i = \frac{\mu_i - r_f}{\sigma_i} = 0 \text{ vs. } H_1: \mathrm{SR}_i > 0 \] using a 5% significance level. In Chapter 8 we used the delta method to show that, for large enough \(T\):

\[ \widehat{\mathrm{SR}}_i \sim N\left(\mathrm{SR}_i, \widehat{\mathrm{se}}\left(\widehat{\mathrm{SR}}_i\right)^2\right), \widehat{\mathrm{se}}\left(\widehat{\mathrm{SR}}_i\right) = \frac{1}{\sqrt{T}}\sqrt{1+\frac{1}{2}\widehat{\mathrm{SR}}_i^2} \] Then, the z-score has the form

\[ z_{\mathrm{SR}=0} = \frac{\widehat{\mathrm{SR}}_i - 0}{\widehat{\mathrm{se}}\left(\widehat{\mathrm{SR}}_i\right)} = \frac{\widehat{\mathrm{SR}}_i}{\frac{1}{\sqrt{T}}\sqrt{1+\frac{1}{2}\widehat{\mathrm{SR}}_i^2}}. \]
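
The Sharpe ratio z-score can be computed along the following lines. This is a minimal sketch on simulated data, not the example data: the sample size, the return parameters, the seed and the monthly risk-free rate `r.f` are all assumptions made for illustration.

```r
# Sharpe ratio z-score for H0: SR = 0 vs H1: SR > 0 on simulated returns.
# All inputs below are illustrative assumptions, not the example data.
set.seed(123)
n.obs = 120                                # assumed sample size T
r.f = 0.0025                               # assumed monthly risk-free rate
ret.sim = 0.01 + rnorm(n.obs, sd = 0.05)   # simulated monthly returns

SR.hat = (mean(ret.sim) - r.f)/sd(ret.sim)       # estimated Sharpe ratio
se.SR.hat = sqrt(1 + 0.5*SR.hat^2)/sqrt(n.obs)   # delta-method standard error
z.score = SR.hat/se.SR.hat                       # z-score for SR = 0
p.value = 1 - pnorm(z.score)                     # one-sided p-value
c(SR.hat = SR.hat, se = se.SR.hat, z.score = z.score, p.value = p.value)
```

Since the alternative is one-sided, the null is rejected at the 5% level when the z-score exceeds \(q_{.95}^Z=1.645\), or equivalently when the one-sided p-value is less than 0.05.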

\(\blacksquare\)

9.3.4 Relationship between hypothesis tests and confidence intervals

Consider testing the hypotheses (9.18) at the 5% significance level using the z-score (9.19). The rule-of-thumb decision rule is to reject \(H_{0}:\,\theta=\theta_{0}\) if \(|z_{\theta=\theta_{0}}|>2\). This implies that:

\[\begin{eqnarray*} \frac{\hat{\theta}-\theta_{0}}{\widehat{\mathrm{se}}(\hat{\theta})} & > & 2\,\mathrm{~ or }\,\frac{\hat{\theta}-\theta_{0}}{\widehat{\mathrm{se}}(\hat{\theta})}<-2, \end{eqnarray*}\]

which further implies that: \[ \theta_{0}<\hat{\theta}-2\times\widehat{\mathrm{se}}(\hat{\theta})\,\mathrm{\,or}\,\,\theta_{0}>\hat{\theta}+2\times\widehat{\mathrm{se}}(\hat{\theta}). \] Recall the definition of an approximate 95% confidence interval for \(\theta\): \[ \hat{\theta}\pm2\times\widehat{\mathrm{se}}(\hat{\theta})=\left[\hat{\theta}-2\times\widehat{\mathrm{se}}(\hat{\theta}),\,\,\hat{\theta}+2\times\widehat{\mathrm{se}}(\hat{\theta})\right]. \] Notice that if \(|z_{\theta=\theta_{0}}|>2\) then \(\theta_{0}\) does not lie in the approximate 95% confidence interval for \(\theta\). Hence, we can reject \(H_{0}:\,\theta=\theta_{0}\) at the 5% significance level if \(\theta_{0}\) does not lie in the 95% confidence interval for \(\theta\). This result allows us to have a deeper understanding of the 95% confidence interval for \(\theta\): it contains all values of \(\theta_{0}\) for which we cannot reject \(H_{0}:\,\theta=\theta_{0}\) at the 5% significance level.

This duality between two-sided hypothesis tests and confidence intervals makes hypothesis testing for individual parameters particularly simple. As part of estimation, we calculate estimated standard errors and form approximate 95% confidence intervals. Then when we look at the approximate 95% confidence interval, it gives us all values of \(\theta_{0}\) for which we cannot reject \(H_{0}:\,\theta=\theta_{0}\) at the 5% significance level (approximately).
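
The duality can be verified numerically by checking, over a grid of hypothesized values, that the rule \(|z_{\theta=\theta_{0}}|>2\) rejects exactly those \(\theta_{0}\) that fall outside the approximate 95% confidence interval. In this sketch the estimate and standard error are the rounded MSFT/SBUX correlation values reported earlier, and the grid `theta0.grid` is an arbitrary choice.

```r
# Check that "reject if |z| > 2" matches "theta0 outside the 95% CI".
theta.hat = 0.341    # rounded estimate from the correlation example
se.hat = 0.0674      # rounded estimated standard error
theta0.grid = seq(0, 1, by = 0.05)   # arbitrary grid of hypothesized values

# decision based on the z-score rule of thumb
z.scores = (theta.hat - theta0.grid)/se.hat
reject.z = abs(z.scores) > 2

# decision based on the approximate 95% confidence interval
lower = theta.hat - 2*se.hat
upper = theta.hat + 2*se.hat
outside.ci = (theta0.grid < lower) | (theta0.grid > upper)

# the two decision rules agree for every value on the grid
all(reject.z == outside.ci)
```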

Example 7.1 (Hypothesis testing for \(\mu_{i}\), \(\rho_{ij}\) and Sharpe ratios using 95% confidence intervals)

For the example data, the approximate 95% confidence intervals for \(\mu_{i}\) for Microsoft, Starbucks and the S&P 500 index are:

lower = muhat.vals - 2*se.muhat 
upper = muhat.vals + 2*se.muhat 
cbind(lower, upper)
##          lower   upper
## MSFT  -0.01116 0.01941
## SBUX  -0.00237 0.03168
## SP500 -0.00570 0.00908

Consider testing \(H_{0}:\mu_{i}=0\) at the 5% level. Here we see that \(\mu_{i}^{0}=0\) lies in all of the 95% confidence intervals and so we do not reject \(H_{0}:\mu_{i}=0\) at the 5% level for any asset. Each interval gives the values of \(\mu_{i}^{0}\) for which we cannot reject \(H_{0}:\mu_{i}=\mu_{i}^{0}\) at the 5% level. For example, for Microsoft we cannot reject the hypothesis (at the 5% level) that \(\mu_{MSFT}\) is as small as -0.011 or as large as 0.019.

Next, the approximate 95% confidence intervals for \(\rho_{ij}\) for the pairs \(\rho_{MSFT,SBUX}\), \(\rho_{MSFT,SP500}\), and \(\rho_{SBUX,SP500}\) are:

lower = rhohat.vals - 2*se.rhohat 
upper = rhohat.vals + 2*se.rhohat 
cbind(lower, upper)
##            lower upper
## MSFT.SBUX  0.206 0.476
## MSFT.SP500 0.523 0.712
## SBUX.SP500 0.337 0.578

Consider testing \(H_{0}:\rho_{ij}=0.5\) for all pairs of assets. Here, we see that \(\rho_{ij}^{0}=0.5\) is not in the approximate 95% confidence intervals for \(\rho_{MSFT,SBUX}\) and \(\rho_{MSFT,SP500}\) but is in the approximate 95% confidence interval for \(\rho_{SBUX,SP500}\). Hence, we can reject \(H_{0}:\rho_{ij}=0.5\) at the 5% significance level for \(\rho_{MSFT,SBUX}\) and \(\rho_{MSFT,SP500}\) but not for \(\rho_{SBUX,SP500}\).

  • Add SR info here

\(\blacksquare\)


  1. This means that in an infinite number of hypothetical samples from the GWN model in which \(H_{0}\) is true only 5 percent of the samples produce an estimate \(\hat{\mu}_{i}\) that is more than \(1.96\) values of \(\mathrm{se}(\hat{\mu}_{i})\) away from \(\mu_{i}^{0}\).