## 9.3 Tests for Individual Parameters: t-tests and z-scores

In this section, we present test statistics for testing hypotheses
that certain model coefficients equal specific values. In some cases,
we can derive *exact tests*. For an exact test, the pdf of the
test statistic assuming the null hypothesis is true is known exactly
for a finite sample size \(T\). More generally, however, we rely on
*asymptotic tests*. For an asymptotic test, the pdf of the test
statistic assuming the null hypothesis is true is not known exactly
for a finite sample size \(T\), but can be approximated by a known
pdf. The approximation is justified by the Central Limit Theorem (CLT)
and the approximation becomes exact as the sample size becomes infinitely
large.

### 9.3.1 Exact tests under normality of data

In the GWN model (9.1), consider testing the hypothesis that the mean return, \(\mu_{i}\), is equal to a specific value \(\mu_{i}^{0}\):

\[\begin{equation} H_{0}:\mu_{i}=\mu_{i}^{0}.\tag{9.3} \end{equation}\]

For example, an investment analyst may have provided an expected return forecast of \(\mu_{i}^{0}\) and we would like to see if past data was consistent with such a forecast.

The alternative hypothesis can be either two-sided or one-sided. The two-sided alternative is:

\[\begin{equation} H_{1}:\mu_{i}\neq\mu_{i}^{0}.\tag{9.4} \end{equation}\]

Two-sided alternatives are used when we don’t care about the sign of \(\mu_{i}-\mu_{i}^{0}\) under the alternative. With one-sided alternatives, we care about the sign of \(\mu_{i}-\mu_{i}^{0}\) under the alternative:

\[\begin{equation} H_{1}:\mu_{i}>\mu_{i}^{0}\,\textrm{ or }\,H_{1}:\mu_{i}<\mu_{i}^{0}.\tag{9.5} \end{equation}\]

How do we come up with a test statistic \(S\) for testing (9.3) against (9.4) or (9.5)? We generally use two criteria: (1) we know the pdf of \(S\) assuming (9.3) is true; and (2) the value of \(S\) should be big if the alternative hypothesis (9.4) or (9.5) is true.

#### 9.3.1.1 Two-sided test

To simplify matters, let’s assume that the value of \(\sigma_{i}\) in (9.1) is known and does not need to be estimated. This assumption is unrealistic and will be relaxed later. Consider testing (9.3) against the two-sided alternative (9.4) using a 5% significance level. Two-sided alternatives are more commonly used than one-sided alternatives. Let \(\{R_{it}\}_{t=1}^{T}\) denote a random sample from the GWN model. We estimate \(\mu_{i}\) using the sample mean:

\[\begin{equation} \hat{\mu}_{i}=\frac{1}{T}\sum_{t=1}^{T}R_{it}.\tag{9.6} \end{equation}\]

Under the null hypothesis (9.3), it is assumed that \(\mu_{i}=\mu_{i}^{0}\). Hence, if the null hypothesis is true then \(\hat{\mu}_{i}\) computed from data should be close to \(\mu_{i}^{0}\). If \(\hat{\mu}_{i}\) is far from \(\mu_{i}^{0}\) (either above or below \(\mu_{i}^{0}\)) then this evidence casts doubt on the validity of the null hypothesis. Now, there is estimation error in (9.6) which is measured by:

\[\begin{equation} \mathrm{se}(\hat{\mu}_{i})=\frac{\sigma_{i}}{\sqrt{T}}.\tag{9.7} \end{equation}\]

Because we assume \(\sigma_{i}\) is known, we also know \(\mathrm{se}(\hat{\mu}_{i})\). Because of estimation error, we don’t expect \(\hat{\mu}_{i}\) to equal \(\mu_{i}^{0}\) even if the null hypothesis (9.3) is true. How far from \(\mu_{i}^{0}\) can \(\hat{\mu}_{i}\) be if the null hypothesis is true? To answer this question, recall from Chapter 7 that the exact (finite sample) pdf of \(\hat{\mu}_{i}\) is the normal distribution:

\[\begin{equation} \hat{\mu}_{i}\sim N(\mu_{i},\mathrm{se}(\hat{\mu}_{i})^{2})=N\left(\mu_{i},\frac{\sigma_{i}^{2}}{T}\right).\tag{9.8} \end{equation}\]

Under the null hypothesis (9.3), the normal distribution for \(\hat{\mu}_{i}\) is centered at \(\mu_{i}^{0}\):

\[\begin{equation} \hat{\mu}_{i}\sim N(\mu_{i}^{0},\mathrm{se}(\hat{\mu}_{i})^{2})=N\left(\mu_{i}^{0},\frac{\sigma_{i}^{2}}{T}\right).\tag{9.9} \end{equation}\]

From properties of the normal distribution we have:

\[\begin{equation} \Pr\left(\mu_{i}^{0}-1.96\times\mathrm{se}(\hat{\mu}_{i})\leq\hat{\mu}_{i}\leq\mu_{i}^{0}+1.96\times\mathrm{se}(\hat{\mu}_{i})\right)=0.95.\tag{9.10} \end{equation}\]

Hence, if the null hypothesis (9.3) is true we only
expect to see \(\hat{\mu}_{i}\) more than \(1.96\) values of \(\mathrm{se}(\hat{\mu}_{i})\)
away from \(\mu_{i}^{0}\) with probability 0.05.^{51} Therefore, it makes intuitive sense to base a test statistic for
testing (9.3) on a measure of distance between \(\hat{\mu}_{i}\)
and \(\mu_{i}^{0}\) relative to \(\mathrm{se}(\hat{\mu}_{i})\).
Such a statistic is the *z-score*:

\[\begin{equation} z_{\mu=\mu^{0}}=\frac{\hat{\mu}_{i}-\mu_{i}^{0}}{\mathrm{se}(\hat{\mu}_{i})}=\frac{\hat{\mu}_{i}-\mu_{i}^{0}}{\sigma_{i}/\sqrt{T}}.\tag{9.11} \end{equation}\]

Assuming the null hypothesis (9.3) is true, from (9.9) it follows that: \[ z_{\mu=\mu^{0}}\sim N(0,1). \]

The intuition for using the z-score (9.11) to test (9.3) is straightforward. If \(z_{\mu=\mu^{0}}\approx0\) then \(\hat{\mu}_{i}\approx\mu_{i}^{0}\) and (9.3) should not be rejected. In contrast, if \(z_{\mu=\mu^{0}}>1.96\) or \(z_{\mu=\mu^{0}}<-1.96\) then \(\hat{\mu}_{i}\) is more than \(1.96\) values of \(\mathrm{se}(\hat{\mu}_{i})\) away from \(\mu_{i}^{0}\). From (9.10), this is very unlikely (less than 5% probability) if (9.3) is true. In this case, there is strong data evidence against (9.3) and we should reject it. Notice that the condition \(z_{\mu=\mu^{0}}>1.96\textrm{ or }z_{\mu=\mu^{0}}<-1.96\) can be simplified as the condition \(\left|z_{\mu=\mu^{0}}\right|>1.96\). When \(\left|z_{\mu=\mu^{0}}\right|\) is big, larger than 1.96, we have data evidence against (9.3). Using the value \(1.96\) to determine the rejection region for the test ensures that the significance level (probability of Type I error) of the test is exactly 5%:

\[\begin{align*} \Pr(\textrm{Reject } H_{0}|H_{0}\,\textrm{is true}) &= \Pr(S>1.96|\mu_{i}=\mu_{i}^{0}) \\ &=\Pr\left(\left|z_{\mu=\mu^{0}}\right|>1.96|\mu_{i}=\mu_{i}^{0}\right)=.05 \end{align*}\]

Hence, our formal test statistic for testing (9.3) against (9.4) is

\[\begin{equation} S=\left|z_{\mu=\mu^{0}}\right|. \tag{9.12} \end{equation}\]

The 5% critical value is \(cv_{.05}=1.96\), and we reject (9.3) at the 5% significance level if \(S>1.96\).

In summary, the steps for using \(S=\left|z_{\mu=\mu^{0}}\right|\) to test (9.3) against (9.4) are:

- Set the significance level \(\alpha=\Pr(\textrm{Reject } H_{0}|H_{0}\,\textrm{is true})\) and determine the critical value \(cv_{\alpha}\) such that \(\Pr(S>cv_{\alpha})=\alpha\). Now,

\[\begin{align*} \Pr(S>cv_{\alpha}) &= \Pr\left(\left|z_{\mu=\mu^{0}}\right|>cv_{\alpha}\right) \\ &= \Pr\left(z_{\mu=\mu^{0}}>cv_{\alpha}\right)+\Pr\left(z_{\mu=\mu^{0}}<-cv_{\alpha}\right)=\alpha \end{align*}\]

which implies that \(cv_{\alpha}=-q_{\alpha/2}^{Z}=q_{1-\alpha/2}^{Z}\), where \(q_{\alpha/2}^{Z}\) denotes the \(\frac{\alpha}{2}-\)quantile of \(Z\sim N(0,1)\). For example, if \(\alpha=.05\) then \(cv_{.05}=-q_{.025}^{Z}=1.96.\)

- Reject (9.3) at the \(\alpha\times100\%\) significance level if \(S>cv_{\alpha}\). For example, if \(\alpha=.05\) then reject (9.3) at the 5% level if \(S>1.96\).

- Equivalently, reject (9.3) at the \(\alpha\times100\%\) significance level if the p-value for \(S\) is less than \(\alpha\). Here, the *p-value* is defined as the significance level at which the null hypothesis is just rejected. Let \(Z\sim N(0,1)\). The p-value is computed as: \[\begin{eqnarray} \textrm{p-value} & = & \Pr(|Z|>S)=\Pr(Z>S)+\Pr(Z<-S)\\ & = & 2\times \Pr(Z>S)=2\times(1-\Pr(Z<S)). \tag{9.13} \end{eqnarray}\]
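The steps above can be sketched in R as follows. This is a minimal illustration under the assumption that \(\sigma_{i}\) is known; the function name `z.test.two.sided`, its arguments, and the input values in the usage lines are hypothetical and are not taken from the examples in this chapter.

```r
# Two-sided z-test of H0: mu = mu0 when sigma is known (illustrative sketch).
# muhat: sample mean; mu0: hypothesized mean; sigma: known standard deviation;
# n.obs: sample size T; alpha: significance level.
z.test.two.sided = function(muhat, mu0, sigma, n.obs, alpha = 0.05) {
  se.muhat = sigma/sqrt(n.obs)        # standard error, as in (9.7)
  z.score = (muhat - mu0)/se.muhat    # z-score, as in (9.11)
  S = abs(z.score)                    # test statistic, as in (9.12)
  cv = qnorm(1 - alpha/2)             # critical value q_{1-alpha/2}^Z
  p.value = 2*(1 - pnorm(S))          # two-sided p-value, as in (9.13)
  list(z.score = z.score, S = S, cv = cv, p.value = p.value,
       reject = S > cv)
}

# Hypothetical inputs chosen only to exercise the function
res = z.test.two.sided(muhat = 0.08, mu0 = 0.05, sigma = 0.10, n.obs = 60)
res$cv
res$reject
```

The two decision rules agree by construction: `S > cv` holds exactly when `p.value < alpha`.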

**Example 4.1 (Testing hypothesis with the z-score using simulated data: two-sided tests)**

Assume returns follow the GWN model

\[\begin{eqnarray*} R_{t} & = & 0.05+\epsilon_{t},\,t=1,\ldots,60,\\ \epsilon_{t} & \sim & GWN(0,(0.10)^{2}). \end{eqnarray*}\]

and we are interested in testing the following hypotheses \[\begin{align*} H_{0}:\mu=0.05\,\,vs.\,\,H_{1}:\mu\neq.05 \\ H_{0}:\mu=0.06\,\,vs.\,\,H_{1}:\mu\neq.06 \\ H_{0}:\mu=0.10\,\,vs.\,\,H_{1}:\mu\neq.10 \\ \end{align*}\]

using the test statistic (9.12) with a 5% significance level. The 5% critical value is \(cv_{.05}=1.96\). One hypothetical sample from the hypothesized model is simulated using:
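A minimal sketch of such a simulation is given below. The seed value is an assumption made only so that the draws are reproducible, so the simulated sample need not match the numbers reported in this example exactly; the variable names are illustrative.

```r
# Simulate T = 60 returns from R_t = 0.05 + e_t with e_t ~ iid N(0, 0.10^2).
# The seed is an assumed value, used only for reproducibility.
set.seed(123)
n.obs = 60
mu = 0.05
sigma = 0.10
ret.sim = mu + rnorm(n.obs, sd = sigma)
```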

The estimate of \(\mu\) and the value of \(\mathrm{se}(\hat{\mu})\) are:

`## [1] 0.0566 0.0129`

The test statistic (9.12) for testing \(H_0:\mu=0.05\) computed from this sample is:

`## [1] 0.508`

The z-score tells us that the estimated mean, \(\hat{\mu}=0.0566\), is 0.508 values of \(\mathrm{se}(\hat{\mu})\) away from the hypothesized value \(\mu^0=0.05\). This evidence does not contradict \(H_0:\mu=0.05\). Since \(S=0.508 < 1.96\), we do not reject \(H_0:\mu=0.05\) at the 5% significance level. The p-value of the test using (9.13) is:

`## [1] 0.611`

Here, the p-value of 0.611 is greater than the significance level \(\alpha = 0.05\) so we do not reject the null at the 5% significance level. The p-value tells us that we would reject \(H_0:\mu=0.05\) at the 61.1% significance level.

The z-score information for testing \(H_0:\mu=0.06\) is:

`## [1] 0.266`

Here, the z-score indicates that \(\hat{\mu}=0.0566\) is just 0.266 values of \(\mathrm{se}(\hat{\mu})\) away from the hypothesized value \(\mu^0=0.06\). This evidence also does not contradict \(H_0:\mu=0.06\). Since \(S=0.266 < 1.96\), we do not reject \(H_0:\mu=0.06\) at the 5% significance level. The p-value of the test is:

`## [1] 0.79`

The large p-value shows that there is insufficient evidence in the data to reject \(H_0:\mu=0.06\).

Last, the z-score information for testing \(H_0:\mu=0.10\) is:

`## [1] 3.36`

Now, the z-score indicates that \(\hat{\mu}=0.0566\) is 3.36 values of \(\mathrm{se}(\hat{\mu})\) away from the hypothesized value \(\mu^0=0.10\). This evidence is not in support of \(H_0:\mu=0.10\). Since \(S=3.36 > 1.96\), we reject \(H_0:\mu=0.10\) at the 5% significance level. The p-value of the test is:

`## [1] 0.000766`

Here, the p-value is much less than 5%, supporting the rejection of \(H_0:\mu=0.10\).
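As a check on the three decisions, the p-values can be recomputed from the test statistics reported above. The sketch below takes the rounded statistics as inputs, so the resulting p-values agree with the reported ones only up to rounding; the object names are illustrative.

```r
# Test statistics S = |z| as reported above for mu0 = 0.05, 0.06 and 0.10
S = c("H0:mu=0.05" = 0.508, "H0:mu=0.06" = 0.266, "H0:mu=0.10" = 3.36)
cv = qnorm(0.975)            # 5% two-sided critical value
p.value = 2*(1 - pnorm(S))   # two-sided p-values, as in (9.13)
# One row per hypothesis: statistic, critical value, p-value and decision
data.frame(S, cv, p.value, reject = S > cv)
```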

\(\blacksquare\)

#### 9.3.1.2 One-sided test

Here, we consider testing the null (9.3) against the one-sided alternative (9.5). For expositional purposes, consider \(H_{1}:\mu_{i} > \mu_{i}^{0}\). This alternative can be equivalently represented as \(H_{1}:\mu_{i} - \mu_{i}^{0} > 0\). That is, under the alternative hypothesis the sign of the difference \(\mu_{i} - \mu_{i}^{0}\) is positive. This is why a test against a one-sided alternative is sometimes called a *test for sign*. As in the previous sub-section, assume that \(\sigma_i\) is known and let the significance level be \(5\%\).

The natural test statistic is the z-score (9.11): \[ S = z_{\mu = \mu^0}. \]

The intuition is straightforward. If \(z_{\mu=\mu^0} \approx 0\) then \(\hat{\mu}_i \approx \mu_i^0\) and the null should not be rejected. However, if the one-sided alternative is true then we would expect to see \(\hat{\mu}_i > \mu_i^0\) and \(z_{\mu=\mu^0} > 0\). How big \(z_{\mu=\mu^0}\) needs to be for us to reject the null depends on the significance level. Since under the null \(z_{\mu=\mu^0} \sim N(0,1) = Z\), with a \(5\%\) significance level

\[ \Pr(\text{Reject } H_0 | H_0 \text{ is true}) = \Pr(z_{\mu=\mu^0} > 1.645) = \Pr(Z > 1.645) = 0.05. \] Hence, our \(5\%\) one-sided critical value is \(cv_{.05}=q_{.95}^Z = 1.645\), and we reject the null at the \(5\%\) significance level if \(z_{\mu=\mu^0} >1.645\).

In general, the steps for using \(z_{\mu=\mu^0}\) to test (9.3) against the one-sided alternative \(H_{1}:\mu_{i} > \mu_{i}^{0}\) are:

- Set the significance level \(\alpha=\Pr(\textrm{Reject } H_{0}|H_{0}\,\textrm{is true})\) and determine the critical value \(cv_{\alpha}\) such that \(\Pr(z_{\mu=\mu^0}>cv_{\alpha})=\alpha\). Since under the null \(z_{\mu=\mu^0} \sim N(0,1) = Z\), \(cv_{\alpha} = q_{1-\alpha}^Z\).

- Reject the null hypothesis (9.3) at the \(\alpha \times 100\%\) significance level if \(z_{\mu=\mu^0} > cv_{\alpha}\).

- Equivalently, reject (9.3) at the \(\alpha \times 100\%\) significance level if the one-sided p-value is less than \(\alpha\), where \[ \textrm{p-value} = \Pr(Z > z_{\mu=\mu^0}) = 1 - \Pr(Z \le z_{\mu=\mu^0}). \]

**Example 2.8 (Testing hypothesis with the z-score using simulated data: one-sided tests)**

Here, we use the simulated GWN return data from the previous example to test (9.3) against the one-sided alternative \(H_{1}:\mu_{i} > \mu_{i}^{0}\) using a 5% significance level.

The z-scores and one-sided p-values for testing \(H_0:\mu=0.05\), \(H_0:\mu=0.06\), and \(H_0:\mu=0.10\), computed from the simulated sample are:

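```
# muhat and se.muhat are the sample mean and its standard error
# computed from the simulated sample in the previous example
z.score.05 = (muhat - 0.05)/se.muhat
p.value.05 = 1 - pnorm(z.score.05)   # one-sided p-value Pr(Z > z)
z.score.06 = (muhat - 0.06)/se.muhat
p.value.06 = 1 - pnorm(z.score.06)
z.score.10 = (muhat - 0.10)/se.muhat
p.value.10 = 1 - pnorm(z.score.10)
# collect the results in a table with one row per null hypothesis
ans = rbind(c(z.score.05, p.value.05),
            c(z.score.06, p.value.06),
            c(z.score.10, p.value.10))
colnames(ans) = c("Z-score", "P-value")
rownames(ans) = c("H0:mu=0.05", "H0:mu=0.06","H0:mu=0.10")
ans
```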

```
##            Z-score P-value
## H0:mu=0.05   0.508   0.306
## H0:mu=0.06  -0.266   0.605
## H0:mu=0.10  -3.365   1.000
```

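Here, none of the z-scores exceeds the one-sided critical value of 1.645 (equivalently, all of the one-sided p-values exceed 0.05), so we do not reject any of the null hypotheses in favor of the one-sided alternative at the 5% significance level.

\(\blacksquare\)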

### 9.3.2 Exact tests with \(\sigma\) unknown

In practice, \(\sigma_{i}\) is unknown and must be estimated with \(\hat{\sigma}_{i}\), and so the exact tests based on the z-score described above are not feasible. Fortunately, an exact test is still available. Instead of using the z-score (9.11), we use the t-ratio (or t-score)

\[\begin{equation} t_{\mu=\mu^0}=\frac{\hat{\mu}_{i}-\mu_{i}^0}{\widehat{\mathrm{se}}(\hat{\mu}_{i})}=\frac{\hat{\mu}_{i}-\mu_{i}^0}{\hat{\sigma}_{i}/\sqrt{T}}.\tag{9.14} \end{equation}\]

Assuming the null hypothesis (9.3) is true, Proposition 7.7 tells us that (9.14) is distributed Student’s t with \(T-1\) degrees of freedom and is denoted by the random variable \(t_{T-1}\). The steps for using the t-ratio to evaluate (9.3) are the same as those for the z-score, but now we use critical values and p-values from \(t_{T-1}\). Our test statistic for the two-sided alternative \(H_1:\mu_i \ne \mu_i^0\) is: \[\begin{equation} S = \left|t_{\mu=\mu^{0}}\right|, \tag{9.15} \end{equation}\]

and our test statistic for the one-sided alternative \(H_1:\mu_i > \mu_i^0\) is

\[\begin{equation} S = t_{\mu=\mu^{0}}. \tag{9.16} \end{equation}\]

Our critical values are determined from the quantiles of the Student’s t with \(T-1\) degrees of freedom. For the two-sided test, the critical value is \(t_{T-1}(1-\alpha/2)\). For example, if \(\alpha=0.05\) and \(T-1=60\) then the two-sided critical value is \(cv_{.05} = t_{60}(0.975)=2\) (which can be verified using the R function `qt()`). The two-sided p-value is computed as:

\[\begin{eqnarray*} \textrm{p-value} & = & \Pr(|t_{T-1}|>S)=\Pr(t_{T-1}>S)+\Pr(t_{T-1}<-S)\\ & = & 2\times \Pr(t_{T-1}>S)=2\times(1-\Pr(t_{T-1}<S)). \end{eqnarray*}\]

The one-sided critical value is \(cv_{.05} = t_{60}(0.95)=1.67\), and the one-sided p-value is computed as:

\[ \text{p-value} = \Pr(t_{T-1} > S) = 1 - \Pr(t_{T-1} \le S). \]

As the sample size gets larger \(\hat{\sigma}_{i}\) gets closer to \(\sigma_{i}\) and the Student’s t distribution gets closer to the normal distribution. Decisions using the t-ratio and the z-score are almost the same for \(T \ge 60\).
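The closeness of the Student’s t and normal critical values at this sample size can be checked directly with `qt()` and `qnorm()`. The sketch below uses 60 degrees of freedom, as in the illustration above; the object names are illustrative.

```r
alpha = 0.05
df = 60
# Two-sided critical values: Student's t versus standard normal
cv.two.sided = c(t = qt(1 - alpha/2, df), normal = qnorm(1 - alpha/2))
# One-sided critical values: Student's t versus standard normal
cv.one.sided = c(t = qt(1 - alpha, df), normal = qnorm(1 - alpha))
rbind(cv.two.sided, cv.one.sided)
```

The t critical values are slightly larger than the normal ones, so the t-based test is slightly more conservative.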

**Example 2.9 (Testing hypothesis with the t-ratio using simulated data)**

We repeat the hypothesis testing from the previous example, this time using the t-ratio (9.14) instead of the z-score. Using \(T-1=59\) the 5% critical value for the two-sided test is

`## [1] 2`

To compute the t-ratios we first estimate \(\mathrm{se}(\hat{\mu})\):

`## [1] 0.0118 0.0129`

Here, \(\widehat{\mathrm{se}}(\hat{\mu}) = 0.0118 < 0.0129 = \mathrm{se}(\hat{\mu})\) and so the t-ratios will be slightly larger than the z-scores. The t-ratios and test statistics for the three hypotheses are:

```
t.ratio.05 = (muhat - 0.05)/sehat.muhat
t.ratio.06 = (muhat - 0.06)/sehat.muhat
t.ratio.10 = (muhat - 0.10)/sehat.muhat
S.05 = abs(t.ratio.05)
S.06 = abs(t.ratio.06)
S.10 = abs(t.ratio.10)
ans = c(S.05, S.06, S.10)
names(ans) = c("S.05", "S.06", "S.10")
ans
```

```
##  S.05  S.06  S.10
## 0.558 0.293 3.696
```

As expected, the test statistics computed from the t-ratios are slightly larger than those computed from the z-scores. The first two statistics are less than 2 and the third is bigger than 2, so we reach the same decisions as before. The p-values of the three tests are:

`## [1] 0.578747 0.770895 0.000482`

Here, the p-values computed from \(t_{59}\) are very similar to those computed from \(Z\sim N(0,1)\).
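The comparison can be sketched as follows, taking the rounded test statistics reported above as inputs (so the p-values match the reported ones only up to rounding). The object names are illustrative.

```r
# Test statistics |t| as reported above for mu0 = 0.05, 0.06 and 0.10
S = c(S.05 = 0.558, S.06 = 0.293, S.10 = 3.696)
df = 59
cv = qt(0.975, df)                # 5% two-sided critical value from t_59
p.value.t = 2*(1 - pt(S, df))     # p-values from Student's t
p.value.z = 2*(1 - pnorm(S))      # p-values from N(0,1), for comparison
data.frame(S, p.value.t, p.value.z, reject = S > cv)
```

Because the Student’s t has fatter tails than the normal, each t-based p-value is somewhat larger than its normal counterpart for the same statistic.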

\(\blacksquare\)

We can derive an exact test for testing hypotheses about the value of \(\mu_{i}\) based on the z-score or the t-ratio, but we cannot derive exact tests for the values of \(\sigma_{i}\) or \(\rho_{ij}\) based on z-scores. Exact tests for these parameters are much more complicated. While t-ratios for the values of \(\sigma_{i}\) or \(\rho_{ij}\) do not have exact t-distributions in finite samples, as the sample size gets large the distributions of the t-ratios get closer and closer to the normal distribution due to the CLT. This motivates the use of so-called asymptotic z-scores discussed in the next sub-section.

### 9.3.3 Z-scores under asymptotic normality of estimators

Let \(\hat{\theta}\) denote an estimator for \(\theta\). Here, we allow \(\theta\) to be a GWN model parameter or a function of GWN model parameters. For example, in the GWN model \(\theta\) could be \(\mu_{i}\), \(\sigma_{i}\), \(\rho_{ij}\), \(q_{\alpha}^R\), \(\mathrm{VaR}_{\alpha}\) or \(\mathrm{SR}_i\). As we have seen, the CLT (and the delta method if needed) justifies the asymptotic normal distribution:

\[\begin{equation} \hat{\theta}\sim N(\theta,\widehat{\mathrm{se}}(\hat{\theta})^{2}),\tag{9.17} \end{equation}\]

for large enough sample size \(T\), where \(\widehat{\mathrm{se}}(\hat{\theta})\) is the estimated standard error for \(\hat{\theta}\). Consider testing:

\[\begin{equation} H_{0}:\theta=\theta_{0}\text{ vs. }H_{1}:\theta\neq\theta_{0}.\tag{9.18} \end{equation}\]

Under \(H_{0},\) the asymptotic normality result (9.17) implies that the z-score for testing (9.18) has a standard normal distribution for large enough sample size \(T\):

\[\begin{equation} z_{\theta=\theta_{0}}=\frac{\hat{\theta}-\theta_{0}}{\widehat{\mathrm{se}}(\hat{\theta})}\sim N(0,1)=Z.\tag{9.19} \end{equation}\]

The intuition for using the z-score (9.19) is straightforward. If \(z_{\theta=\theta_{0}}\approx0\) then \(\hat{\theta}\approx\theta_{0},\) and \(H_{0}:\theta=\theta_{0}\) should not be rejected. On the other hand, if \(|z_{\theta=\theta_{0}}|>2\), say, then \(\hat{\theta}\) is more than \(2\) values of \(\widehat{\mathrm{se}}(\hat{\theta})\) away from \(\theta_{0}.\) This is very unlikely if \(\theta=\theta_{0}\) because \(\hat{\theta}\sim N(\theta_{0},\widehat{\mathrm{se}}(\hat{\theta})^{2}),\) so \(H_{0}:\theta=\theta_{0}\) should be rejected. Therefore, the test statistic for testing (9.18) is \(S = |z_{\theta=\theta_{0}}|\).

The steps for using the z-score (9.19) with its critical value to test the hypotheses (9.18) are:

- Set the significance level \(\alpha\) of the test and determine the two-sided critical value \(cv_{\alpha/2}\). Using (9.17), the critical value, \(cv_{\alpha/2},\) is determined using: \[\begin{align*} \Pr(|Z| & \geq cv_{\alpha/2})=\alpha\\ & \Rightarrow cv_{\alpha/2}=-q_{\alpha/2}^{Z}=q_{1-\alpha/2}^{Z}, \end{align*}\] where \(q_{\alpha/2}^{Z}\) denotes the \(\frac{\alpha}{2}-\)quantile of \(N(0,1)\). A commonly used significance level is \(\alpha=0.05\) and the corresponding critical value is \(cv_{.025}=-q_{.025}^{Z}=q_{.975}^{Z}=1.96\approx2\).

- Reject (9.18) at the \(100\times\alpha\)% significance level if: \[ S = |z_{\theta=\theta_{0}}|=\left\vert \frac{\hat{\theta}-\theta_{0}}{\widehat{\mathrm{se}}(\hat{\theta})}\right\vert >cv_{\alpha/2}. \] If the significance level is \(\alpha=0.05\), then reject (9.18) at the 5% level using the rule-of-thumb: \[ S=|z_{\theta=\theta_{0}}|>2. \]

The steps for using the z-score (9.19) with its p-value to test the hypotheses (9.18) are:

- Determine the two-sided p-value. The p-value of the two-sided test is the significance level at which the test is just rejected. From (9.17), the two-sided p-value is defined by \[\begin{equation} \textrm{p-value}=\Pr\left(|Z|>|z_{\theta=\theta_{0}}|\right)=2\times\left(1-\Pr\left(Z\leq|z_{\theta=\theta_{0}}|\right)\right).\tag{9.20} \end{equation}\]
- Reject (9.18) at the \(100\times\alpha\)% significance level if the p-value (9.20) is less than \(\alpha\).
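Both sets of steps can be collected in a small helper. The sketch below is illustrative only: the function name `z.test.asymptotic`, its arguments, and the input values are hypothetical, and the estimate and its estimated standard error are assumed to have been computed elsewhere.

```r
# Asymptotic two-sided z-test of H0: theta = theta0 (illustrative sketch).
# theta.hat: estimate; se.hat: estimated standard error; theta0: null value.
z.test.asymptotic = function(theta.hat, se.hat, theta0, alpha = 0.05) {
  z.score = (theta.hat - theta0)/se.hat   # z-score, as in (9.19)
  S = abs(z.score)                        # test statistic
  cv = qnorm(1 - alpha/2)                 # two-sided critical value
  p.value = 2*(1 - pnorm(S))              # two-sided p-value, as in (9.20)
  c(z.score = z.score, S = S, cv = cv, p.value = p.value)
}

# Hypothetical estimate 0.30 with standard error 0.10, tested against 0.50
res = z.test.asymptotic(theta.hat = 0.30, se.hat = 0.10, theta0 = 0.50)
res
```

With these inputs the estimate is two standard errors below the null value, so the statistic slightly exceeds the exact 5% critical value of 1.96 and the p-value is just under 0.05.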

**Example 9.1 (Using the z-score to test hypothesis about \(\mu\) in the GWN model)**

Consider using the z-score (9.19) to test \(H_{0}:\,\mu_{i}=0\,vs.\,H_{1}:\mu_{i}\neq0\) (\(i=\)Microsoft, Starbucks, S&P 500) using a 5% significance level. First, calculate the GWN model estimates for \(\mu_{i}\):

```
## MSFT SBUX SP500
## 0.00413 0.01466 0.00169
```

Next, calculate the estimated standard errors:

```
## MSFT SBUX SP500
## 0.00764 0.00851 0.00370
```

Then calculate the z-scores and test statistics:

```
## MSFT SBUX SP500
## 0.540 1.722 0.457
```

Since the absolute values of all of the z-scores are less than two, we do not reject \(H_{0}:\,\mu_{i}=0\) at the 5% level for any asset.

The p-values for all of the test statistics are computed using:

```
## MSFT SBUX SP500
## 0.5893 0.0851 0.6480
```

Since all p-values are greater than \(\alpha=0.05\), we do not reject \(H_{0}:\,\mu_{i}=0\) at the 5% level for any asset. The p-value for Starbucks is the smallest at 0.0851. Here, we can reject \(H_{0}:\,\mu_{SBUX}=0\) at the 8.51% level.
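The calculations in this example can be sketched from the reported estimates and standard errors. Because the inputs below are the rounded values printed above, the results agree with the reported z-scores and p-values only up to rounding; the object names are illustrative.

```r
# Estimates of mu and estimated standard errors as reported above
muhat.vals = c(MSFT = 0.00413, SBUX = 0.01466, SP500 = 0.00169)
se.muhat.vals = c(MSFT = 0.00764, SBUX = 0.00851, SP500 = 0.00370)
# z-scores for H0: mu = 0, test statistics and two-sided p-values
z.scores = (muhat.vals - 0)/se.muhat.vals
S = abs(z.scores)
p.values = 2*(1 - pnorm(S))
rbind(S, p.values)
```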

\(\blacksquare\)

**Example 4.6 (Using the z-score to test hypothesis about \(\rho\) in the GWN model)**

Consider using the z-score (9.19) to test the hypotheses:

\[\begin{equation} H_{0}:\,\rho_{ij}=0.5\,vs.\,H_{1}:\rho_{ij}\neq0.5.\tag{9.21} \end{equation}\]

using a 5% significance level. Here, we use the result from the GWN model that for large enough \(T\): \[ \hat{\rho}_{ij}\sim N\left(\rho_{ij},\,\widehat{\mathrm{se}}(\hat{\rho}_{ij})^2\right),\,\widehat{\mathrm{se}}(\hat{\rho}_{ij})=\frac{1-\hat{\rho}_{ij}^{2}}{\sqrt{T}}. \] Then the z-score for testing (9.21) has the form: \[\begin{equation} z_{\rho_{ij}=0.5}=\frac{\hat{\rho}_{ij}-0.5}{\widehat{\mathrm{se}}(\hat{\rho}_{ij})}=\frac{\hat{\rho}_{ij}-0.5}{\left(1-\hat{\rho}_{ij}^{2}\right)/\sqrt{T}}.\tag{9.22} \end{equation}\] To compute the z-scores, first, calculate the GWN model estimates for \(\rho_{ij}\):

```
corhat.mat = cor(gwnMonthlyRetC)
rhohat.vals = corhat.mat[lower.tri(corhat.mat)]
names(rhohat.vals) = c("MSFT.SBUX", "MSFT.SP500", "SBUX.SP500")
rhohat.vals
```

```
## MSFT.SBUX MSFT.SP500 SBUX.SP500
## 0.341 0.617 0.457
```

Next, calculate estimated standard errors:

```
## MSFT.SBUX MSFT.SP500 SBUX.SP500
## 0.0674 0.0472 0.0603
```

Then calculate the z-scores (9.22) and test statistics:

```
## MSFT.SBUX MSFT.SP500 SBUX.SP500
## 2.361 2.482 0.706
```

Here, the absolute values of the z-scores for \(\rho_{MSFT,SBUX}\) and \(\rho_{MSFT,SP500}\) are greater than 2 whereas the absolute value of the z-score for \(\rho_{SBUX,SP500}\) is less than 2. Hence, for the pairs (MSFT, SBUX) and (MSFT, SP500) we can reject the null (9.21) at the 5% level but for the pair (SBUX, SP500) we cannot reject the null at the 5% level. The p-values for the test statistics are:

```
## MSFT.SBUX MSFT.SP500 SBUX.SP500
## 0.0182 0.0130 0.4804
```

Here, the p-values for the pairs (MSFT,SBUX) and (MSFT,SP500) are less than 0.05 and the p-value for the pair (SBUX, SP500) is much greater than 0.05.
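A sketch of these calculations, using the rounded correlation estimates and standard errors printed above as inputs (so the results match the reported values only approximately), is:

```r
# Correlation estimates and estimated standard errors as reported above
rhohat.vals = c(MSFT.SBUX = 0.341, MSFT.SP500 = 0.617, SBUX.SP500 = 0.457)
se.rhohat = c(MSFT.SBUX = 0.0674, MSFT.SP500 = 0.0472, SBUX.SP500 = 0.0603)
# z-scores (9.22) for H0: rho = 0.5, test statistics and two-sided p-values
z.scores = (rhohat.vals - 0.5)/se.rhohat
S = abs(z.scores)
p.values = 2*(1 - pnorm(S))
reject = S > qnorm(0.975)   # decision at the 5% level
data.frame(z.scores, S, p.values, reject)
```

The signs of the z-scores show the direction of the departure from 0.5: the estimate for (MSFT, SBUX) is below 0.5 while the estimate for (MSFT, SP500) is above it.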

\(\blacksquare\)

**Example 2.17 (Using the z-score to test hypothesis about asset Sharpe ratios in the GWN model)**

Consider using the z-score to test the hypotheses:

\[ H_0: \mathrm{SR}_i = \frac{\mu_i - r_f}{\sigma_i} = 0 \text{ vs. } H_1: \mathrm{SR}_i > 0 \] using a 5% significance level. In Chapter 8 we used the delta method to show that, for large enough \(T\):

\[ \widehat{\mathrm{SR}}_i \sim N\left(\mathrm{SR}_i, \widehat{\mathrm{se}}\left(\widehat{\mathrm{SR}}_i\right)^2\right), \quad \widehat{\mathrm{se}}\left(\widehat{\mathrm{SR}}_i\right) = \frac{1}{\sqrt{T}}\sqrt{1+\frac{1}{2}\widehat{\mathrm{SR}}_i^2}. \] Then, the z-score has the form

\[ z_{\mathrm{SR}=0} = \frac{\widehat{\mathrm{SR}}_i - 0}{\widehat{\mathrm{se}}\left(\widehat{\mathrm{SR}}_i\right)} = \frac{\widehat{\mathrm{SR}}_i}{\frac{1}{\sqrt{T}}\sqrt{1+\frac{1}{2}\widehat{\mathrm{SR}}_i^2}}. \]
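The z-score above is simple to compute once an estimated Sharpe ratio is available. The sketch below is illustrative only: the function name `SR.z.score` and the inputs (an estimated Sharpe ratio of 0.15 from 100 observations) are hypothetical values, not estimates from the example data.

```r
# z-score for H0: SR = 0 using the delta-method standard error above.
# SR.hat: estimated Sharpe ratio; n.obs: sample size T.
SR.z.score = function(SR.hat, n.obs) {
  se.SR.hat = sqrt(1 + 0.5*SR.hat^2)/sqrt(n.obs)
  SR.hat/se.SR.hat
}

# Hypothetical inputs
z.SR = SR.z.score(SR.hat = 0.15, n.obs = 100)
p.value = 1 - pnorm(z.SR)     # one-sided p-value for H1: SR > 0
reject = z.SR > qnorm(0.95)   # one-sided 5% critical value is 1.645
c(z.SR = z.SR, p.value = p.value)
reject
```

Because the alternative is one-sided, the decision uses the critical value 1.645 and the one-sided p-value rather than their two-sided counterparts.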

\(\blacksquare\)

### 9.3.4 Relationship between hypothesis tests and confidence intervals

Consider testing the hypotheses (9.18) at the 5% significance level using the z-score (9.19). The rule-of-thumb decision rule is to reject \(H_{0}:\,\theta=\theta_{0}\) if \(|z_{\theta=\theta_{0}}|>2\). This implies that:

\[\begin{eqnarray*} \frac{\hat{\theta}-\theta_{0}}{\widehat{\mathrm{se}}(\hat{\theta})} & > & 2\,\mathrm{~ or }\,\frac{\hat{\theta}-\theta_{0}}{\widehat{\mathrm{se}}(\hat{\theta})}<-2, \end{eqnarray*}\]

which further implies that:

\[ \theta_{0}<\hat{\theta}-2\times\widehat{\mathrm{se}}(\hat{\theta})\,\mathrm{\,or}\,\,\theta_{0}>\hat{\theta}+2\times\widehat{\mathrm{se}}(\hat{\theta}). \]

Recall the definition of an approximate 95% confidence interval for \(\theta\):

\[ \hat{\theta}\pm2\times\widehat{\mathrm{se}}(\hat{\theta})=\left[\hat{\theta}-2\times\widehat{\mathrm{se}}(\hat{\theta}),\,\,\hat{\theta}+2\times\widehat{\mathrm{se}}(\hat{\theta})\right]. \]

Notice that if \(|z_{\theta=\theta_{0}}|>2\) then \(\theta_{0}\) does not lie in the approximate 95% confidence interval for \(\theta\). Hence, we can reject \(H_{0}:\,\theta=\theta_{0}\) at the 5% significance level if \(\theta_{0}\) does not lie in the 95% confidence interval for \(\theta\). This result allows us to have a deeper understanding of the 95% confidence interval for \(\theta\): it contains all values of \(\theta_{0}\) for which we cannot reject \(H_{0}:\,\theta=\theta_{0}\) at the 5% significance level.

This duality between two-sided hypothesis tests and confidence intervals makes hypothesis testing for individual parameters particularly simple. As part of estimation, we calculate estimated standard errors and form approximate 95% confidence intervals. Then when we look at the approximate 95% confidence interval, it gives us all values of \(\theta_{0}\) for which we cannot reject \(H_{0}:\,\theta=\theta_{0}\) at the 5% significance level (approximately).

**Example 7.1 (Hypothesis testing for \(\mu_{i}\), \(\rho_{ij}\) and Sharpe ratios using 95% confidence intervals)**

For the example data, the approximate 95% confidence intervals for \(\mu_{i}\) for Microsoft, Starbucks and the S&P 500 index are:

```
##          lower   upper
## MSFT  -0.01116 0.01941
## SBUX  -0.00237 0.03168
## SP500 -0.00570 0.00908
```

Consider testing \(H_{0}:\mu_{i}=0\) at the 5% level. Here we see that \(\mu_{i}^{0}=0\) lies in all of the 95% confidence intervals and so we do not reject \(H_{0}:\mu_{i}=0\) at the 5% level for any asset. Each interval gives the values of \(\mu_{i}^{0}\) for which we cannot reject \(H_{0}:\mu_{i}=\mu_{i}^{0}\) at the 5% level. For example, for Microsoft we cannot reject the hypothesis (at the 5% level) that \(\mu_{MSFT}\) is as small as -0.011 or as large as 0.019.

Next, the approximate 95% confidence intervals for \(\rho_{ij}\) for the pairs \(\rho_{MSFT,SBUX}\), \(\rho_{MSFT,SP500}\), and \(\rho_{SBUX,SP500}\) are:

```
##            lower upper
## MSFT.SBUX  0.206 0.476
## MSFT.SP500 0.523 0.712
## SBUX.SP500 0.337 0.578
```

Consider testing \(H_{0}:\rho_{ij}=0.5\) for all pairs of assets. Here, we see that \(\rho_{ij}^{0}=0.5\) is not in the approximate 95% confidence intervals for \(\rho_{MSFT,SBUX}\) and \(\rho_{MSFT,SP500}\) but is in the approximate 95% confidence interval for \(\rho_{SBUX,SP500}\). Hence, we can reject \(H_{0}:\rho_{ij}=0.5\) at the 5% significance level for \(\rho_{MSFT,SBUX}\) and \(\rho_{MSFT,SP500}\) but not for \(\rho_{SBUX,SP500}\).
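The duality between the rule-of-thumb test and the approximate 95% confidence interval can be checked numerically. The sketch below uses the rounded estimates and standard errors reported in the earlier examples as inputs, so the interval endpoints agree with those printed above only up to rounding; the object names are illustrative.

```r
# Estimates and estimated standard errors reported in the earlier examples
est = c(mu.MSFT = 0.00413, mu.SBUX = 0.01466, mu.SP500 = 0.00169,
        rho.MSFT.SBUX = 0.341, rho.MSFT.SP500 = 0.617, rho.SBUX.SP500 = 0.457)
se = c(0.00764, 0.00851, 0.00370, 0.0674, 0.0472, 0.0603)
# Null values: mu = 0 for the means and rho = 0.5 for the correlations
theta0 = c(0, 0, 0, 0.5, 0.5, 0.5)
# Approximate 95% confidence intervals
lower = est - 2*se
upper = est + 2*se
# Decision from the interval and decision from the rule-of-thumb z-score
reject.ci = theta0 < lower | theta0 > upper
reject.z = abs((est - theta0)/se) > 2
data.frame(lower, upper, theta0, reject.ci, reject.z)
```

The two decision columns coincide row by row, which is the duality described in this sub-section.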

- Add SR info here

\(\blacksquare\)

^{51} This means that in an infinite number of hypothetical samples from the GWN model in which \(H_{0}\) is true only 5 percent of the samples produce an estimate \(\hat{\mu}_{i}\) that is more than \(1.96\) values of \(\mathrm{se}(\hat{\mu}_{i})\) away from \(\mu_{i}^{0}\).