6 The Gaussian White Noise Return Model
Updated: May 4, 2021
Copyright © Eric Zivot 2015, 2016
The first model of asset returns we consider is the very simple Gaussian white noise (GWN) model for asset returns . This model is motivated by the stylized facts for monthly asset returns discussed in the previous chapter. The GWN model assumes that an asset’s return (simple or continuously compounded) over time is independent and identically normally distributed. The model allows for the returns on different assets to be contemporaneously correlated but does not allow for any lead-lag cross correlations. The GWN model is widely used in finance. For example, it is used in risk analysis (e.g., for computing Value-at-Risk) for assets and portfolios, in mean-variance portfolio analysis, in the Capital Asset Pricing Model (CAPM), and in the Black-Scholes option pricing model. Although this model is very simple, it provides important intuition about the statistical behavior of asset returns and prices and serves as a benchmark against which more complicated models can be compared and evaluated. It allows us to discuss and develop several important econometric topics such as Monte Carlo simulation, estimation, bootstrapping, hypothesis testing, forecasting, and model evaluation that are discussed in later chapters.
The outline of this chapter is as follows. Section 6.1 reviews the assumptions of the GWN model and presents several equivalent specifications of the model. Application of the model to continuously compounded returns, simple returns, and portfolios is also discussed. Section 6.2 illustrates Monte Carlo simulation of the GWN model as a first-step reality check of the model.
The R packages used in this chapter are IntroCompFinR, mvtnorm, and PerformanceAnalytics. Make sure these packages are installed and loaded before running the R examples in the chapter.
6.1 GWN Model Assumptions
Let \(R_{it}\) denote the simple or continuously compounded (cc) return on asset \(i\) over the investment horizon between times \(t-1\) and \(t\) (e.g., monthly returns). We make the following assumptions regarding the probability distribution of \(R_{it}\) for \(i=1,\ldots,N\) assets for all times \(t\).
Assumption GWN
- Covariance stationarity and ergodicity: \(\{R_{i1},\ldots,R_{iT}\}=\{R_{it}\}_{t=1}^{T}\) is a covariance stationary and ergodic stochastic process with \(E[R_{it}]=\mu_{i},\) \(\mathrm{var}(R_{it})=\sigma_{i}^{2}\), \(\mathrm{cov}(R_{it},R_{jt})=\sigma_{ij}\) and \(\mathrm{cor}(R_{it},R_{jt})=\rho_{ij}\).
- Normality: \(R_{it}\sim N(\mu_{i},\sigma_{i}^{2})\) for all \(i\) and \(t\), and all joint distributions are normal.
- No serial correlation: \(\mathrm{cov}(R_{it},R_{js})=\mathrm{cor}(R_{it},R_{is})=0\) for \(t\neq s\) and \(i,j=1,\ldots,N\).
Assumptions GWN states that in every time period asset returns are jointly (multivariate) normally distributed, that the means and the variances of all asset returns, and all of the pairwise contemporaneous covariances and correlations between assets are constant over time. In addition, all of the asset returns are serially uncorrelated: \[ \mathrm{cor}(R_{it},R_{is})=\mathrm{cov}(R_{it},R_{is})=0\textrm{ for all }i\textrm{ and }t\neq s, \] and the returns on all possible pairs of assets \(i\) and \(j\) are cross lead-lag uncorrelated: \[ \mathrm{cor}(R_{it},R_{js})=\mathrm{cov}(R_{it},R_{js})=0\textrm{ for all }i\neq j\textrm{ and }t\neq s. \] In addition, under the normal distribution assumption lack of serial and cross lead-lag correlation implies time independence of returns. Clearly, these are very strong assumptions. However, they allow us to develop a straightforward probabilistic model for asset returns as well as statistical tools for estimating the parameters of the model, testing hypotheses about the parameter values and assumptions.
6.1.1 Regression model representation
A convenient mathematical representation or model of asset returns can be given based on Assumption GWN. This is the GWN regression model. For assets \(i=1,\ldots,N\) and time periods \(t=1,\ldots,T\), the GWN regression model is: \[\begin{align} R_{it} & =\mu_{i}+\varepsilon_{it},\tag{6.1}\\ \{\varepsilon_{it}\}_{t=1}^{T} & \sim\mathrm{GWN}(0,\sigma_{i}^{2}),\nonumber \\ \mathrm{cov}(\varepsilon_{it},\varepsilon_{js}) & =\left\{ \begin{array}{c} \sigma_{ij}\\ 0 \end{array}\right.\begin{array}{c} t=s\\ t\neq s \end{array}.\nonumber \end{align}\] The notation \(\varepsilon_{it}\sim\mathrm{GWN}(0,\sigma_{i}^{2})\) stipulates that the stochastic process \(\{\varepsilon_{it}\}_{t=1}^{T}\) is a GWN process with \(E[\varepsilon_{it}]=0\) and \(\mathrm{var}(\varepsilon_{it})=\sigma_{i}^{2}\). In addition, the random error term \(\varepsilon_{it}\) is independent of \(\varepsilon_{js}\) for all assets \(i\neq j\) and all time periods \(t\neq s\).
Using the basic properties of expectation, variance and covariance, we can derive the following properties of returns in the GWN model: \[\begin{align*} E[R_{it}] & =E[\mu_{i}+\varepsilon_{it}]=\mu_{i}+E[\varepsilon_{it}]=\mu_{i},\\ \mathrm{var}(R_{it}) & =\mathrm{var}(\mu_{i}+\varepsilon_{it})=\mathrm{var}(\varepsilon_{it})=\sigma_{i}^{2},\\ \mathrm{cov}(R_{it},R_{jt}) & =\mathrm{cov}(\mu_{i}+\varepsilon_{it},\mu_{j}+\varepsilon_{jt})=\mathrm{cov}(\varepsilon_{it},\varepsilon_{jt})=\sigma_{ij},\\ \mathrm{cov}(R_{it},R_{js}) & =\mathrm{cov}(\mu_{i}+\varepsilon_{it},\mu_{j}+\varepsilon_{js})=\mathrm{cov}(\varepsilon_{it},\varepsilon_{js})=0,~t\neq s. \end{align*}\] Given that covariances and variances of returns are constant over time implies that the correlations between returns over time are also constant: \[\begin{align*} \mathrm{cor}(R_{it},R_{jt}) & =\frac{\mathrm{cov}(R_{it},R_{jt})}{\sqrt{\mathrm{var}(R_{it})\mathrm{var}(R_{jt})}}=\frac{\sigma_{ij}}{\sigma_{i}\sigma_{j}}=\rho_{ij},\\ \mathrm{cor}(R_{it},R_{js}) & =\frac{\mathrm{cov}(R_{it},R_{js})}{\sqrt{\mathrm{var}(R_{it})\mathrm{var}(R_{js})}}=\frac{0}{\sigma_{i}\sigma_{j}}=0,~i\neq j,t\neq s. \end{align*}\] Finally, since \(\{\varepsilon_{it}\}_{t=1}^{T}\sim\mathrm{GWN}(0,\sigma_{i}^{2})\) it follows that \(\{R_{it}\}_{t=1}^{T}\sim iid~N(\mu_{i},\sigma_{i}^{2})\). Hence, the GWN regression model (6.1) for \(R_{it}\) is equivalent to the model implied by Assumption GWN.
6.1.1.1 Interpretation of the GWN regression model
The GWN model has a very simple form and is identical to the measurement error model in the statistics literature.29 In words, the model states that each asset return is equal to a constant \(\mu_{i}\) (the expected return) plus an i.i.d. normally distributed random variable \(\varepsilon_{it}\) with mean zero and constant variance. The random variable \(\varepsilon_{it}\) can be interpreted as representing the unexpected news concerning the value of the asset that arrives between time \(t-1\) and time \(t.\) To see this, (6.1) implies that: \[ \varepsilon_{it}=R_{it}-\mu_{i}=R_{it}-E[R_{it}], \] so that \(\varepsilon_{it}\) is defined as the deviation of the random return from its expected value. If the news between times \(t-1\) and \(t\) is good, then the realized value of \(\varepsilon_{it}\) is positive and the observed return is above its expected value \(\mu_{i}.\) If the news is bad, then \(\varepsilon_{it}\) is negative and the observed return is less than expected. The assumption \(E[\varepsilon_{it}]=0\) means that news, on average, is neutral; neither good nor bad. The assumption that \(\mathrm{sd}(\varepsilon_{it})=\sigma_{i}\) can be interpreted as saying that the volatility, or typical magnitude, of news arrival is constant over time. The random news variable affecting asset \(i\), \(\varepsilon_{it}\), is allowed to be contemporaneously correlated with the random news variable affecting asset \(j,\) \(\varepsilon_{jt}\), to capture the idea that news about one asset may spill over and affect another asset. For example, if asset \(i\) is Microsoft stock and asset \(j\) is Apple Computer stock, then one interpretation of news in this context is general news about the computer industry and technology. Good news should lead to positive values of both \(\varepsilon_{it}\) and \(\varepsilon_{jt}.\) Hence these variables will be positively correlated due to a positive reaction to a common news component. Finally, the news on asset \(j\) at time \(s\) is unrelated to the news on asset \(i\) at time \(t\) for all times \(t\neq s\). For example, this means that the news for Apple in January is not related to the news for Microsoft in February.
6.1.2 Location-Scale model representation
Sometimes it is convenient to re-express the regression form of the GWN model (6.1) in location-scale form: \[\begin{align} R_{it} & =\mu_{i}+\varepsilon_{it}=\mu_{i}+\sigma_{i}\cdot Z_{it}\tag{6.2}\\ \{Z_{it}\}_{t=1}^{T} & \sim\mathrm{GWN}(0,1),\nonumber \end{align}\] where we use the decomposition \(\varepsilon_{it}=\sigma_{i}\cdot Z_{it}\). In this form, the random news shock is the \(iid\) standard normal random variable \(Z_{it}\) scaled by the “news” volatility \(\sigma_{i}\).
This form is particularly convenient for Value-at-Risk calculations because the \(\alpha\times100\%\) quantile of the return distribution has the simple form: \[ q_{\alpha}^{R_{i}}=\mu_{i}+\sigma_{i}\times q_{\alpha}^{Z}, \] where \(q_{\alpha}^{Z}\) is the \(\alpha\times100\%\) quantile of the standard normal random distribution. Let \(W_{0}\) be the initial amout of wealth to be invested from time \(t-1\) to \(t\). If \(R_{it}\) is the simple return then, \[ \mathrm{VaR}_{\alpha}=W_{0}\times q_{\alpha}^{R_{i}} = W_{0}\times \left( \mu_{i}+\sigma_{i}\times q_{\alpha}^{Z} \right), \] whereas if \(R_{it}\) is the continuously compounded return then, \[ \mathrm{VaR}_{\alpha}=W_{0}\times\left(e^{q_{\alpha}^{R_{i}}}-1\right) = W_{0}\times\left(e^{\mu_{i}+\sigma_{i}\times q_{\alpha}^{Z}}-1\right). \]
6.1.3 The GWN model in matrix notation
Define the \(N\times1\) vectors \(\mathbf{R}_{t}=(R_{1t},\ldots,R_{Nt})^{\prime}\), \(\mu=(\mu_{1},\ldots,\mu_{N})^{\prime}\), \(\varepsilon_{t}=(\varepsilon_{1t},\ldots,\varepsilon_{Nt})^{\prime}\) and the \(N\times N\) symmetric covariance matrix: \[ \mathrm{var}(\varepsilon_{t})=\Sigma=\left(\begin{array}{cccc} \sigma_{1}^{2} & \sigma_{12} & \cdots & \sigma_{1N}\\ \sigma_{12} & \sigma_{2}^{2} & \cdots & \sigma_{2N}\\ \vdots & \vdots & \ddots & \vdots\\ \sigma_{1N} & \sigma_{2N} & \cdots & \sigma_{N}^{2} \end{array}\right). \] Then the regression form of the GWN model in matrix notation is: \[\begin{align} \mathbf{R}_{t}=\mu+\varepsilon_{t},\tag{6.3}\\ \varepsilon_{t}\sim\mathrm{iid}~N(\mathbf{0},\Sigma),\nonumber \end{align}\] which implies that \(\mathbf{R}_{t}\sim iid~N(\mu,\Sigma)\).
The location-scale form of the GWN model in matrix notation makes use of the matrix square root factorization \(\Sigma=\Sigma^{1/2}\Sigma^{1/2\prime}\) where \(\Sigma^{1/2}\) is the lower-triangular matrix square root (usually the Cholesky factorization). Then (6.3) can be rewritten as: \[\begin{align} \mathbf{R}_{t}=\mu+\Sigma^{1/2}\mathbf{Z}_{t},\tag{6.4}\\ \mathbf{Z}_{t}\sim \mathrm{GWN}(\mathbf{0},\mathbf{I}_{N}),\nonumber \end{align}\] where \(\mathbf{I}_{N}\) denotes the \(N\)-dimensional identity matrix.
6.1.4 The GWN model for continuously compounded returns
The GWN model is often used to describe continuously compounded (cc) returns defined as \(R_{it}=\ln(P_{it}/P_{it-1})\) where \(P_{it}\) is the price of asset \(i\) at time \(t\). An advantage of the GWN model for cc returns is that the model aggregates to any time horizon because multi-period cc returns are additive. This is particularly convenient for investment risk analysis. The GWN model for cc returns also gives rise to the random walk model for the logarithm of asset prices. The normal distribution assumption of the GWN model for cc returns implies that single-period simple returns are log-normally distributed.
A disadvantage of the GWN model for cc returns is that the model has some limitations for the analysis of portfolios because the cc return on a portfolio of assets is not a weighted average of the cc returns on the individual assets. As a result, for portfolio analysis the GWN model is typically applied to simple returns. However, if the investment horizon is short (e.g. daily, weekly or monthly) then simple returns are typically close to zero on average and there is little difference between simple and cc returns. Hence, the time aggregation results of the GWN model for cc returns can be applied to simple returns and the portfolio results of the GWN model for simple returns can be applied to cc returns.
6.1.4.1 Time Aggregation and the GWN Model
The GWN model for cc returns has the following nice aggregation property with respect to the interpretation of \(\varepsilon_{it}\) as news. For illustration purposes, suppose that \(t\) represents months so that \(R_{it}\) is the cc monthly return on asset \(i\). Now, instead of the monthly return, suppose we are interested in the annual cc return \(R_{it}=R_{it}(12)\). Since multi-period cc returns are additive, \(R_{it}(12)\) is the sum of \(12\) monthly cc returns: \[ R_{it}=R_{it}(12)=\sum_{k=0}^{11}R_{it-k}=R_{it}+R_{it-1}+\cdots+R_{it-11}. \] Using the GWN regression model (6.1) for the monthly return \(R_{it}\), we may express the annual return \(R_{it}(12)\) as: \[ R_{it}(12)=\sum_{t=0}^{11}(\mu_{i}+\varepsilon_{it})=12\cdot\mu_{i}+\sum_{t=0}^{11}\varepsilon_{it}=\mu_{i}(12)+\varepsilon_{it}(12), \] where \(\mu_{i}(12)=12\cdot\mu_{i}\) is the annual expected return on asset \(i\), and \(\varepsilon_{it}(12)=\sum_{k=0}^{11}\varepsilon_{it-k}\) is the annual random news component. The annual expected return, \(\mu_{i}(12)\), is simply \(12\) times the monthly expected return, \(\mu_{i}\). The annual random news component, \(\varepsilon_{it}(12)\), is the accumulation of news over the year. As a result, the variance of the annual news component, \(\sigma_{i}(12)^{2}\), is \(12\) times the variance of the monthly news component: \[\begin{align*} \mathrm{var}(\varepsilon_{it}(12)) & =\mathrm{var}\left(\sum_{k=0}^{11}\varepsilon_{it-k}\right)\\ & =\sum_{k=0}^{11}\mathrm{var}(\varepsilon_{it-k})\textrm{ since }\varepsilon_{it}\textrm{ is uncorrelated over time}\\ & =\sum_{k=0}^{11}\sigma_{i}^{2}\textrm{ since }\mathrm{var}(\varepsilon_{it})\textrm{ is constant over time}\\ & =12\cdot\sigma_{i}^{2}=\sigma_{i}^{2}(12). \end{align*}\] It follows that the standard deviation of the annual news is equal to \(\sqrt{12}\) times the standard deviation of monthly news: \[ \mathrm{sd}(\varepsilon_{it}(12))=\sqrt{12}\times\mathrm{sd}(\varepsilon_{it})=\sqrt{12}\times\sigma_{i}. \] Similarly, due to the additivity of covariances, the covariance between \(\varepsilon_{it}(12)\) and \(\varepsilon_{jt}(12)\) is 12 times the monthly covariance: \[\begin{align*} \mathrm{cov}(\varepsilon_{it}(12),\varepsilon_{jt}(12)) & =\mathrm{cov}\left(\sum_{k=0}^{11}\varepsilon_{it-k},\sum_{k=0}^{11}\varepsilon_{jt-k}\right)\\ & =\sum_{k=0}^{11}\mathrm{cov}(\varepsilon_{it-k},\varepsilon_{jt-k})\textrm{ since }\varepsilon_{it}\textrm{ and }\varepsilon_{jt}\textrm{ are uncorrelated over time}\\ & =\sum_{k=0}^{11}\sigma_{ij}\textrm{ since }\mathrm{cov}(\varepsilon_{it},\varepsilon_{jt})\textrm{ is constant over time}\\ & =12\cdot\sigma_{ij}=\sigma_{ij}(12). \end{align*}\] The above results imply that the correlation between the annual errors \(\varepsilon_{it}(12)\) and \(\varepsilon_{jt}(12)\) is the same as the correlation between the monthly errors \(\varepsilon_{it}\) and \(\varepsilon_{jt}\): \[\begin{align*} \mathrm{cor}(\varepsilon_{it}(12),\varepsilon_{jt}(12)) & =\frac{\mathrm{cov}(\varepsilon_{it}(12),\varepsilon_{jt}(12))}{\sqrt{\mathrm{var}(\varepsilon_{it}(12))\cdot\mathrm{var}(\varepsilon_{jt}(12))}}\\ & =\frac{12\cdot\sigma_{ij}}{\sqrt{12\sigma_{i}^{2}\cdot12\sigma_{j}^{2}}}\\ & =\frac{\sigma_{ij}}{\sigma_{i}\sigma_{j}}=\rho_{ij}=\mathrm{cor}(\varepsilon_{it},\varepsilon_{jt}). \end{align*}\] The above results generalize to aggregating returns to arbitrary time horizons. Let \(R_{it}\) denote the cc return on asset \(i\) between times \(t-1\) and \(t\), where \(t\) represents the general investment horizon, and let \(R_{it}(k)=\sum_{j=0}^{k-1}R_{it-j}\) denote the \(k\)-period cc return. Then the GWN model for \(R_{it}(k)\) has the form:
\[\begin{align} R_{it}(k)=\mu_{i}(k)+\varepsilon_{it}(k), \tag{6.5} \\ \varepsilon_{it}(k)\sim \mathrm{GWN}(0,\sigma_{i}^{2}(k)) \nonumber, \end{align}\]
where \(\mu_{i}(k)=k\times\mu_{i}\) is the \(k\)-period expected return, \(\varepsilon_{it}(k)=\sum_{j=0}^{k-1}\varepsilon_{it-j}\) is the \(k\)-period error term, and \(\sigma_{i}^{2}(k)=k\times\sigma_{i}^{2}\) is the \(k\)-period variance. The \(k\)-period volatility follows the square-root-of-time rule: \(\sigma_{i}(k)=\sqrt{k}\times\sigma_{i}.\) The \(k\)-period covariance between asset \(i\) and asset \(j\) is \(\sigma_{ij}(k) = k\times \sigma_{ij}\), and the \(k\)-period correlation is \(\rho_{ij}(k) = \rho_{ij}\). This aggregation result is exact for cc returns but it is often used as an approximation for simple returns.
6.1.4.2 The random walk model of asset prices
The GWN model for cc returns (6.1) gives rise to the so-called random walk (RW) model for the logarithm of asset prices. To see this, recall that the continuously compounded return, \(R_{it},\) is defined from asset prices via \(R_{it}=\ln\left(\frac{P_{it}}{P_{it-1}}\right)=\ln(P_{it})-\ln(P_{it-1})\). Letting \(p_{it}=\ln(P_{it})\) and using the representation of \(R_{it}\) in the GWN model (6.1), we can express the log-price as: \[\begin{equation} p_{it}=p_{it-1}+\mu_{i}+\varepsilon_{it}.\tag{6.6} \end{equation}\] The representation in (6.6) is known as the RW model for log-prices.30. It is a representation of the GWN model in terms of log-prices.
In the RW model (6.6), \(\mu_{i}\) represents the expected change in the log-price (cc return) between months \(t-1\) and \(t\), and \(\varepsilon_{it}\) represents the unexpected change in the log-price. That is, \[\begin{align*} E[\Delta p_{it}] & =E[R_{it}]=\mu_{i},\\ \varepsilon_{it} & =\Delta p_{it}-E[\Delta p_{it}]. \end{align*}\] where \(\Delta p_{it}=p_{it}-p_{it-1}\). Further, in the RW model, the unexpected changes in the log-price, \(\{\varepsilon_{it}\}\), are uncorrelated over time (\(\mathrm{cov}(\varepsilon_{it},\varepsilon_{is})=0\) for \(t\neq s)\) so that future changes in the log-price cannot be predicted from past changes in the log-price.31
The RW model gives the following interpretation for the evolution of log prices. Let \(p_{i0}\) denote the initial log price of asset \(i\). The RW model says that the log-price at time \(t=1\) is: \[ p_{i1}=p_{i0}+\mu_{i}+\varepsilon_{i1}, \] where \(\varepsilon_{i1}\) is the value of random news that arrives between times \(0\) and \(1.\) The expected log-price at time \(t=1\) is: \[ E[p_{i1}]=p_{i0}+\mu_{i}+E[\varepsilon_{i1}]=p_{i0}+\mu_{i}, \] which is the initial price plus the expected return between times 0 and 1. Similarly, by recursive substitution the log-price at time \(t=2\) is: \[\begin{align*} p_{i2} & =p_{i1}+\mu_{i}+\varepsilon_{i2}\\ & =p_{i0}+\mu_{i}+\mu_{i}+\varepsilon_{i1}+\varepsilon_{i2}\\ & =p_{i0}+2\cdot\mu_{i}+\sum_{t=1}^{2}\varepsilon_{it} \\ & =p_{i0}+2\cdot\mu_{i}+\varepsilon_{it}(2), \end{align*}\] which is equal to the initial log-price, \(p_{i0},\) plus the two period expected return, \(2\cdot\mu_{i}\), plus the accumulated random news over the two periods, \(\sum_{t=1}^{2}\varepsilon_{it} = \varepsilon_{it}(2)\). By repeated recursive substitution, the log price at time \(t=T\) is, \[ p_{iT}=p_{i0}+T\cdot\mu_{i}+\sum_{t=1}^{T}\varepsilon_{it} = p_{i0}+T\cdot\mu_{i}+\varepsilon_{it}(T). \] At time \(t=0,\) the expected log-price at time \(t=T\) is, \[ E[p_{iT}]=p_{i0}+T\cdot\mu_{i}, \] which is the initial price plus the expected growth in prices over \(T\) periods. The actual price, \(p_{iT},\) deviates from the expected price by the accumulated random news: \[ p_{iT}-E[p_{iT}]=\sum_{t=1}^{T}\varepsilon_{it} = \varepsilon_{it}(T). \] At time \(t=0,\) the variance of the log-price at time \(T\) is, \[ \mathrm{var}(p_{iT})=\mathrm{var}\left(\sum_{t=1}^{T}\varepsilon_{it}\right)=T\cdot\sigma_{i}^{2} \] Hence, the RW model implies that the stochastic process of log-prices \(\{p_{it}\}\) is non-stationary because the variance of \(p_{it}\) increases with \(t.\) Finally, because \(\varepsilon_{it}\sim \mathrm{}(0,\sigma_{i}^{2})\) it follows that (conditional on \(p_{i0})\) \(p_{iT}\sim N(p_{i0}+T\mu_{i},T\sigma_{i}^{2})\).
The term random walk was originally used to describe the unpredictable movements of a drunken sailor staggering down the street. The sailor starts at an initial position, \(p_{0}\), outside the bar. The sailor generally moves in the direction described by \(\mu\) but randomly deviates from this direction after each step \(t\) by an amount equal to \(\varepsilon_{t}\). After \(T\) steps the sailor ends up at position \(p_{T}=p_{0}+\mu\cdot T+\sum_{t=1}^{T}\varepsilon_{t}\). The sailor is expected to be at location \(\mu T\), but where he actually ends up depends on the accumulation of the random changes in direction \(\sum_{t=1}^{T}\varepsilon_{t}\). Because \(\mathrm{var}(p_{T})=\sigma^{2}T\), the uncertainty about where the sailor will be increases with each step.
The RW model for log-prices implies the following model for prices: \[ P_{it}=e^{p_{it}}=P_{i0}e^{\mu_{i}\cdot t+\sum_{s=1}^{t}\varepsilon_{is}}=P_{i0}e^{\mu_{i}t}e^{\sum_{s=1}^{t}\varepsilon_{is}}, \] where \(p_{it}=p_{i0}+\mu_{i}t+\) \(\sum_{s=1}^{t}\varepsilon_{s}.\) Since \(P_{it}=e^{p_{it}}\), \(P_{it} > 0\). The term \(e^{\mu_{i}t}\) represents the expected exponential growth rate in prices between times \(0\) and time \(t,\) and the term \(e^{\sum_{s=1}^{t}\varepsilon_{is}}\) represents the unexpected exponential growth in prices. Here, conditional on \(P_{i0},\) \(P_{it}\) is log-normally distributed because \(p_{it}=\ln P_{it}\) is normally distributed.
6.1.5 GWN model for simple returns
For simple returns, defined as \(R_{it}=(P_{it}-P_{it-1})/P_{it-1}\), the GWN model is often used for the analysis of portfolios as discussed in chapters 11 - 15. The reason is that the simple return on a portfolio of \(N\) assets is weighted average of the simple returns on the individual assets. Hence, the GWN model for simple returns extends naturally to portfolios of assets.
6.1.5.1 GWN Model and Portfolios
Consider the GWN model in matrix form (6.3) for the \(N\times1\) vector of simple returns \(\mathbf{R}_{t}=(R_{1t},\ldots,R_{Nt})^{\prime}\). For a vector of portfolio weights \(\mathbf{x}=(x_{1},\ldots,x_{N})\) such that \(\mathbf{x}^{\prime}\mathbf{1}=\sum_{i=1}^{N}x_{i}=1,\) the simple return on the portfolio is: \[ R_{pt}=\mathbf{x}^{\prime}\mathbf{R}_{t}=\sum_{i=1}^{N}x_{i}R_{it}. \] Substituting in (6.1) gives the GWN model for the portfolio returns: \[\begin{equation} R_{pt}=\mathbf{x}^{\prime}\left(\mu+\varepsilon_{t}\right)=\mathbf{x}^{\prime}\mu + \mathbf{x}^{\prime}\varepsilon_{t}=\mu_{p}+\varepsilon_{pt}\tag{6.7} \end{equation}\] where \(\mu_{p}=\mathbf{x}^{\prime}\mu=\sum_{i=1}^{N}x_{i}\mu_{i}\) is the portfolio expected return, and \(\varepsilon_{pt}=\mathbf{x}^{\prime}\varepsilon_{t}=\sum_{i=1}^{N}x_{i}\varepsilon_{it}\) is the portfolio error. The variance of \(R_{pt}\) is given by: \[ \mathrm{var}(R_{pt})=\mathrm{var}(\mathbf{x}^{\prime}\mathbf{R}_{t})=\mathbf{x}^{\prime}\Sigma\mathbf{ x}=\sigma_{p}^{2}. \] Therefore, the distribution of portfolio returns is normal, \[ R_{pt}\sim N(\mu_{p},\sigma_{p}^{2}). \] This result is exact for simple returns but is often used as an approximation for cc returns.
6.1.5.2 GWN Model for Multi-Period Simple Returns
The GWN model for single period simple returns does not extend exactly to multi-period simple returns because multi-period simple returns are not additive. Recall, the \(k\)-period simple return has a multiplicative relationship to single period returns: \[\begin{align*} R_{t}(k) & =(1+R_{t})(1+R_{t-1})\times\cdots\times(1+R_{t-k+1})-1\\ & =R_{t}+R_{t-1}+\cdots+R_{t-k+1}\\ & +R_{t}R_{t-1}+R_{t}R_{t-2}+\cdots+R_{t-k+2}R_{t-k+1}. \end{align*}\] Even though single period returns are normally distributed in the GWN model, multi-period returns are not normally distributed because the product of two normally distributed random variables (e.g., \(R_{t}R_{t-1}\)) is not normally distributed. Hence, the GWN model does not exactly generalize to multi-period simple returns. However, if single period returns are small then all of the cross products of returns are approximately zero (\(R_{t}R_{t-1}\approx\cdots\approx R_{t-k+2}R_{t-k+1}\approx0\)) and, \[\begin{align*} R_{t}(k) & \approx R_{t}+R_{t-1}+\cdots+R_{t-k+1}\\ & \approx\mu(k)+\varepsilon_{t}(k). \end{align*}\] where \(\mu(k)=k\mu\) and \(\varepsilon_{t}(k)=\sum_{j=0}^{k-1}\varepsilon_{t-j}\). Hence, the GWN model is approximately true for multi-period simple returns when single period simple returns are not too big.
Some exact returns can be derived for the mean and variance of multi-period simple returns. For simplicity, let \(k=2\) so that, \[ R_{t}(2)=(1+R_{t})(1+R_{t-1})-1=R_{t}+R_{t-1}+R_{t}R_{t-1}. \] Substituting in (6.1) then gives, \[\begin{align*} R_{t}(2) & =\left(\mu+\varepsilon_{t}\right)+(\mu+\varepsilon_{t-1})+\left(\mu+\varepsilon_{t}\right)(\mu+\varepsilon_{t-1})\\ & =2\mu+\varepsilon_{t}+\varepsilon_{t-1}+\mu^{2}+\mu\varepsilon_{t}+\mu\varepsilon_{t-1}+\varepsilon_{t}\varepsilon_{t-1}\\ & =2\mu+\mu^{2}+\varepsilon_{t}(1+\mu)+\varepsilon_{t-1}(1+\mu)+\varepsilon_{t}\varepsilon_{t-1}. \end{align*}\] The result for the expected return is easy, \[\begin{align*} E[R_{t}(2)] & =2\mu+\mu^{2}+(1+\mu)E[\varepsilon_{t}]+(1+\mu)E[\varepsilon_{t-1}]+E[\varepsilon_{t}\varepsilon_{t-1}]\\ & =2\mu+\mu^{2}=(1+\mu)^{2}-1, \end{align*}\] The result uses the independence of \(\varepsilon_{t}\) and \(\varepsilon_{t-1}\) to get \(E[\varepsilon_{t}\varepsilon_{t-1}]=E[\varepsilon_{t}]E[\varepsilon_{t-1}]=0.\) The result for the variance, however, is more work: \[\begin{align*} \mathrm{var}(R_{t}(2)) & =\mathrm{var}(\varepsilon_{t}(1+\mu)+\varepsilon_{t-1}(1+\mu)+\varepsilon_{t}\varepsilon_{t-1})\\ & =(1+\mu)^{2}\mathrm{var}(\varepsilon_{t})+(1+\mu)^{2}\mathrm{var}(\varepsilon_{t-1})+\mathrm{var}(\varepsilon_{t}\varepsilon_{t-1})\\ & +2(1+\mu)^{2}\mathrm{cov}(\varepsilon_{t},\varepsilon_{t-1})+2(1+\mu)\mathrm{cov}(\varepsilon_{t},\varepsilon_{t}\varepsilon_{t-1})\\ & +2(1+\mu)\mathrm{cov}(\varepsilon_{t-1},\varepsilon_{t}\varepsilon_{t-1}) \end{align*}\] Now, \(\mathrm{var}(\varepsilon_{t})=\mathrm{var}(\varepsilon_{t-1})=\sigma^{2}\) and \(\mathrm{cov}(\varepsilon_{t},\varepsilon_{t-1})=0.\) Next, note that, \[ \mathrm{var}(\varepsilon_{t}\varepsilon_{t-1})=E[\varepsilon_{t}^{2}\varepsilon_{t-1}^{2}]-\left(E[\varepsilon_{t}\varepsilon_{t-1}]\right)^{2}=E[\varepsilon_{t}^{2}]E[\varepsilon_{t-1}^{2}]-\left(E[\varepsilon_{t}]E[\varepsilon_{t-1}]\right)^{2}=2\sigma^{2}. \] Finally, \[\begin{align*} \mathrm{cov}(\varepsilon_{t},\varepsilon_{t}\varepsilon_{t-1}) & =E[\varepsilon_{t}(\varepsilon_{t}\varepsilon_{t-1})]-E[\varepsilon_{t}]E[\varepsilon_{t}\varepsilon_{t-1}]\\ & =E[\varepsilon_{t}^{2}]E[\varepsilon_{t-1}]-E[\varepsilon_{t}]E[\varepsilon_{t}]E[\varepsilon_{t-1}]\\ & =0. \end{align*}\] Then, \[\begin{align*} \mathrm{var}(R_{t}(2)) & =(1+\mu)^{2}\sigma^{2}+(1+\mu)^{2}\sigma^{2}+2\sigma^{2}\\ & =2\sigma^{2}[(1+\mu)^{2}+1]. \end{align*}\] If \(\mu\) is close to zero then \(E[R_{t}(2)]\approx2\mu\) and \(\mathrm{var}(R_{t}(2))\approx2\sigma^{2}\) and so the square-root-of-time rule holds approximately.
6.1.5.3 Implied model for prices
Since \(R_{it}=(P_{it}-P_{it-1})/P_{it-1}\), we can write \[ P_{it} = P_{it-1}(1 + R_{it}) \] Starting at \(t=1\) and assuming \(P_{i0} > 0\) is fixed, by recursive substitution we get \[ P_{it} = P_{i0}(1+R_{i1})(1+R_{i2})\ldots (1+R_{it-1}) \] Prices will be positive provided all returns are greater than \(-1\). This model for prices is not a RW model but behaves similarly to the RW model if all simple returns are close to zero.
6.2 Monte Carlo Simulation of the GWN Model
A simple technique that can be used to understand the probabilistic behavior of a model involves using computer simulation methods to create pseudo data from the model. The process of creating such pseudo data is called Monte Carlo simulation.32 Monte Carlo simulation of a model can be used as a first step reality check of the model. If simulated data from the model do not look like the data that the model is supposed to describe, then serious doubt is cast on the model. However, if simulated data look reasonably close to the actual data then the first step reality check is passed. Ideally, one should consider many simulated samples from the model because it is possible for a given simulated sample to look strange simply because of an unusual set of random numbers.
Monte Carlo simulations can be used to reveal properties of a model that are hard to derive analytically.
Monte Carlo simulation can also be used to create “what if?” type scenarios for a model. Different scenarios typically correspond with different model parameter values.
Finally, Monte Carlo simulation can be used to study properties of statistics computed from the pseudo data from the model. For example, Monte Carlo simulation can be used to illustrate the concepts of estimator bias and confidence interval coverage probabilities (see Chapter 9).
To illustrate the use of Monte Carlo simulation, consider creating pseudo return data from the GWN model (6.1) for a single asset. The steps to create a Monte Carlo simulation from the GWN model are:
Monte Carlo Steps: Single Asset
- Fix values for the GWN model parameters \(\mu\) and \(\sigma\).
- Determine the number of simulated values, \(T\), to create.
- Use a computer random number generator to simulate \(T\) \(iid\) values of \(\varepsilon_{t}\) from a \(N(0,\sigma^{2})\) distribution. Denote these simulated values as \(\tilde{\varepsilon}_{1},\ldots,\tilde{\varepsilon}_{T}\).
- Create the simulated return data \(\tilde{R}_{t}=\mu+\tilde{\varepsilon}_{t}\) for \(t=1,\ldots,T\).
To motivate plausible values for \(\mu\) and \(\sigma\) in the simulation,
Figure 6.1 shows the monthly continuously
compounded (cc) returns on Microsoft stock over the period January 1998
through May 2012. The data is the same as that used in Chapter 5
and is constructed from the IntroCompFinR "xts"
object msftDailyPrices
as follows:
data(msftDailyPrices)
msftPrices = to.monthly(msftDailyPrices, OHLC=FALSE)
smpl = "1998-01::2012-05"
msftPrices = msftPrices[smpl]
msftRetC = na.omit(Return.calculate(msftPrices, method="log"))
mu.msftRetC = mean(msftRetC)
sigma.msftRetC = sd(msftRetC)
c(mu.msftRetC, sigma.msftRetC)
## [1] 0.00413 0.10021
The parameter \(\mu=E[R_{t}]\) in the GWN model is the expected monthly return, and \(\sigma\) represents the typical size of a deviation about \(\mu\). In Figure 6.1, the returns seem to fluctuate up and down about a central value near 0 and the typical size of a return deviation about 0 is roughly 0.10, or 10% (see dashed lines in Figure 6.1). The sample mean turns out to be \(\hat{\mu}=0.004\) (0.4%) and the sample standard deviation is \(\hat{\sigma}=0.100\) (10%). Figure 6.2 shows three distribution summaries (histogram, boxplot and normal qq-plot) and the SACF computed from the sample returns. The returns look to have slightly fatter tails than the normal distribution and show little evidence of linear time dependence (autocorrelation).
\(\blacksquare\)
To mimic the monthly return data on Microsoft in the Monte Carlo simulation, the values \(\mu=0.004\) and \(\sigma=0.10\) are used as the model’s true parameters and \(T=172\) is the number of simulated values (sample size of actual data). Let \(\{\tilde{\varepsilon}_{1},\ldots,\tilde{\varepsilon}_{172}\}\) denote the \(172\) simulated values of the random news variable \(\varepsilon_{t}\sim\mathrm{GWN}(0,(0.10)^{2})\). The simulated returns are then computed using:33 \[\begin{equation} \tilde{R}_{t}=0.004+\tilde{\varepsilon}_{t},~t=1,\ldots,172\tag{6.8} \end{equation}\] To create the simulated returns from (6.8) use:
mu = 0.004
sd.e = 0.10
n.obs = length(msftRetC)
set.seed(111)
sim.e = rnorm(n.obs, mean=0, sd=sd.e)
sim.ret = mu + sim.e
sim.ret = xts(sim.ret, index(msftRetC))
The simulated returns \(\{\tilde{R}_{t}\}_{t=1}^{172}\) (with the same time index as the Microsoft returns) are shown in Figure 6.3. The simulated return data fluctuate randomly about \(\mu=0.004\), and the typical size of the fluctuation is approximately equal to \(\sigma=0.10\). The simulated return data look somewhat like the actual monthly return data for Microsoft. The main difference is that the return volatility for Microsoft appears to have decreased in the latter part of the sample whereas the simulated data has constant volatility over the entire sample. Figure 6.4 shows the distribution summaries (histogram, boxplot and normal qq-plot) and the SACF for the simulated returns. The simulated returns are normally distributed and show thinner tails than the actual returns. The simulated returns also show no evidence of linear time dependence (autocorrelation).
\(\blacksquare\)
The RW model for log-price based on the GWN model (6.8) calibrated to Microsoft log prices is: \[ p_{t}=2.592+0.004\cdot t+\sum_{j=1}^{t}\varepsilon_{j},\varepsilon_{j}\sim\mathrm{GWN}(0,(0.10)^{2}). \] where \(p_{0}=2.586=\ln(13.27)\) is the log of Microsoft Price at the end of January 1998. Given \(p_{t}\), the price level is determined using \(P_{t} = e^{p_{t}}\). A Monte Carlo simulation of this RW model with drift can be created in R using:
p0 = 2.58
sim.p = p0 + mu*seq(n.obs) + cumsum(sim.e)
sim.p = c(p0, sim.p)
sim.P = exp(sim.p)
sim.p = xts(sim.p, index(msftPrices))
sim.P = xts(sim.P, index(msftPrices))
Figure 6.5 shows the simulated log prices and prices. The top panel shows the simulated log price, \(\tilde{p}_{t}\), and the bottom panel shows the simulated price levels \(\tilde{P}_{t}=e^{\tilde{p}_{t}}\) For comparison, Figure 6.6 shows the actual log prices and price levels for Microsoft stock. Notice the similar behavior of the simulated random walk data and the actual data.
\(\blacksquare\)
The previous examples show one Monte Carlo simulation from the GWN model for cc returns. The simulated returns were compared to the actual returns to do the first stage reality check. Here, the simulated returns looked similar to actual returns so the GWN model passed the first stage reality check. However, keep in mind that the simulated returns represent only one random draw of size \(T\) from the GWN model for cc returns. It is possible that you can, by chance, get a strange set of simulated returns which would lead you to reject the first stage reality check even if the GWN model is a good model for actual returns. To guard against this possibility, you should always create many Monte Carlo simulations from the proposed model and use the typical behavior from the simulations to do the first step reality check.
To create ten simulated samples of size \(T=132\) from the GWN model for Microsoft cc returns use:
sim.e = matrix(0, n.obs, 10)
set.seed(111)
for (i in 1:10) {
sim.e[,i] = rnorm(n.obs, mean=0, sd=sd.e)
}
sim.ret = mu + sim.e
sim.ret = xts(sim.ret, index(msftRetC))
colnames(sim.ret) = paste("sim", 1:10, sep=".")
The ten simulations are shown in Figure 6.7.
Each panel shows an alternative reality for returns over the period February 1998 through May 2012. Each one of these alternative realities is equally likely to happen, and the typical behavior of the simulated returns is similar to actual returns.
Figure 6.8 shows the same ten simulations from the GWN model in terms of prices. Notice the large variability in prices at the end of the sample.
6.2.1 Simulating returns on more than one asset
Creating a Monte Carlo simulation of more than one return from the GWN model requires simulating observations from a multivariate normal distribution. This follows from the matrix representation of the GWN model given in (6.3). The steps required to create a multivariate Monte Carlo simulation are:
Monte Carlo Steps: Multiple Assets
- Fix values for the \(N\times 1\) mean vector \(\mu\) and the \(N\times N\) covariance matrix \(\Sigma\).
- Determine the number of simulated values, \(T,\) to create.
- Use a computer random number generator to simulate \(T\) \(iid\) values of the \(N\times 1\) random vector \(\varepsilon_{t}\) from the multivariate normal distribution \(N(\mathbf{0},\Sigma)\). Denote these simulated vectors as \(\tilde{\varepsilon}_{1},\ldots,\tilde{\varepsilon}_{T}\).
- Create the \(N\times 1\) simulated return vector \(\mathbf{\tilde{R}}_{t}=\mu+\tilde{\varepsilon}_{t}\) for \(t=1,\ldots,T\).
- Repeat steps 1-4 to create multiple simulations (alternative realities).
To motivate the parameters for a multivariate simulation of the GWN model, consider the monthly cc returns for Microsoft, Starbucks and the S&P 500 index over the period January 1998 through May 2012 illustrated in Figures 6.9 and 6.10. The data is assembled using the R commands:
data(sp500DailyPrices, sbuxDailyPrices)
sp500Prices = to.monthly(sp500DailyPrices, OHLC=FALSE)
sbuxPrices = to.monthly(sbuxDailyPrices, OHLC=FALSE)
sp500Prices = sp500Prices[smpl]
sbuxPrices = sbuxPrices[smpl]
sp500RetC = na.omit(Return.calculate(sp500Prices, method="log"))
sbuxRetC = na.omit(Return.calculate(sbuxPrices, method="log"))
gwnRetC = merge(msftRetC, sbuxRetC, sp500RetC)
colnames(gwnRetC) = c("MSFT", "SBUX", "SP500")
The multivariate sample descriptive statistics (mean vector, standard deviation vector, covariance matrix and correlation matrix) are:
## MSFT SBUX SP500
## mean.vals 0.00413 0.0147 0.00169
## sd.vals 0.10021 0.1116 0.04847
## MSFT SBUX SP500
## MSFT 0.01004 0.00381 0.00300
## SBUX 0.00381 0.01246 0.00248
## SP500 0.00300 0.00248 0.00235
## MSFT SBUX SP500
## MSFT 1.000 0.341 0.617
## SBUX 0.341 1.000 0.457
## SP500 0.617 0.457 1.000
All returns fluctuate around mean values close to zero. The volatilities of Microsoft and Starbucks are similar with typical magnitudes around 0.10, or 10%. The volatility of the S&P 500 index is considerably smaller at about 0.05, or 5%. The pairwise scatterplots show that all returns are positively related. The pairs (MSFT, SP500) and (SBUX, SP500) are the most correlated with sample correlation values around 0.5. The pair (MSFT, SBUX) has a moderate positive correlation around 0.3.
\(\blacksquare\)
Simulating values from the multivariate GWN model (6.3)
requires simulating multivariate normal random variables. In R, this
can be done using the function rmvnorm()
from the package
mvtnorm. The function rmvnorm()
requires a vector
of mean values and a covariance matrix. Define:
\[
\mathbf{R}_{t}=\left(\begin{array}{c}
R_{\textrm{msft},t}\\
R_{\textrm{sbux},t}\\
R_{\textrm{sp500},t}
\end{array}\right),~\mu=\left(\begin{array}{c}
\mu_{\textrm{msft},t}\\
\mu_{\textrm{sbux},t}\\
\mu_{\textrm{sp500},t}
\end{array}\right),\Sigma=\left(\begin{array}{ccc}
\sigma_{\textrm{msft}}^{2} & \sigma_{\textrm{msft},\textrm{sbux}} & \sigma_{\textrm{msft},\textrm{sp500}}\\
\sigma_{\textrm{msft},\textrm{sbux}} & \sigma_{\textrm{sbux}}^{2} & \sigma_{\textrm{sbux},\textrm{sp500}}\\
\sigma_{\textrm{msft},\textrm{sp500}} & \sigma_{\textrm{sbux},\textrm{sp500}} & \sigma_{\textrm{sp500}}^{2}
\end{array}\right)
\]
The parameters \(\mu\) and \(\Sigma\) of the multivariate
GWN model are set equal to the sample mean vector \(\mu\)
and sample covariance matrix \(\Sigma\):
To create a Monte Carlo simulation from the GWN model calibrated to the month continuously returns on Microsoft, Starbucks and the S&P 500 index use:34
set.seed(123)
e.sim = rmvnorm(n.obs, sigma=covMat)
returns.sim = muVec + t(e.sim)
returns.sim = t(returns.sim)
colnames(returns.sim) = colnames(gwnRetC)
returns.sim = xts(returns.sim, index(gwnRetC))
The simulated returns are shown in Figures 6.11 and 6.12. They look similar to the actual returns shown in Figures 6.9 and 6.10. The actual returns show periods of high and low volatility that the simulated returns do not. The sample statistics from the simulated returns, however, are close to the sample statistics of the actual data:
mean.vals = apply(returns.sim, 2, mean)
sd.vals = apply(returns.sim, 2, sd)
rbind(mean.vals, sd.vals)
## MSFT SBUX SP500
## mean.vals 0.00671 0.0138 0.00508
## sd.vals 0.09509 0.1051 0.04601
## MSFT SBUX SP500
## MSFT 0.00904 0.00353 0.00246
## SBUX 0.00353 0.01105 0.00194
## SP500 0.00246 0.00194 0.00212
## MSFT SBUX SP500
## MSFT 1.000 0.354 0.563
## SBUX 0.354 1.000 0.402
## SP500 0.563 0.402 1.000
\(\blacksquare\)
TBD
\(\blacksquare\)
6.3 Scenario Analysis
TBD
6.4 Further Reading: The GWN Return Model
The location-scale form of the GWN model (6.2) is commonly used as a starting point for the distributional analysis of returns. In this chapter, and throughout most of the book, we use the standard normal distribution for the standardized news shock. However, many other standardized distributions that are additive (e.g., skew-normal, Student’s t, skewed-t, etc.) can also be used as illustrated in (Jondeau, Poon, and Rockinger 2007), (Pfaff 2013), (Carmona 2014), and (Ruppert and Matteson 2015).
[] gives as comprehensive non-technical description of the RW model for stock prices.
6.5 Problems: The GWN Return Mode
Exercise 6.1 Consider the GWN model for cc returns:
\[ r_{t}=\mu+\epsilon_{t},\,\epsilon_{t}\sim\mathrm{GWN}(0,\sigma^{2}). \]
It implies that the log price follows a random walk with drift:
\[ \ln P_{t}=\ln P_{t-1}+r_{t}=\ln P_{0}+\mu t+\sum_{s=1}^{t}\epsilon_{s}. \]
Show that \(E[\ln P_{t}]\) and \(\mathrm{var}[\ln P_{t}]\) depend on \(t\) so that \(\ln P_{t}\) is non-stationary.
- Daily adjusted closing prices on Amazon stock and the S&P 500 index are in the IntroCompFinR
"xts"
objectsamznDailyPrices
andsp500DailyPrices
, respectively. Using these prices, create monthly returns over the period January 2009, through January 2014. - Create a time plot showing the monthly returns on the two assets. Do the monthly returns from the two assets look like realizations from a covariance stationary stochastic process? Why or why not?
- Compare and contrast the return characteristics of the two assets. In addition, comment on any common features, if any, of the two return series.
- Using the IntroCompFinR function
fourPanelPlot()
, create graphical descriptive statistics for the monthly returns. Also, create the scatterplot between the two return series. Do the returns look normally distributed? Is there any evidence for linear time dependence? Do the return appear to contemporaneously correlated? - Compute sample descriptive statistics (mean, standard deviation, skewness, kurtosis, and correlation between Amazon and the S&P 500 index).
- In the GWN model for simple returns prices do not follow a random walk but instead evolve over time according to \[ P_{t} = P_{t-1}(1 + R_{t}) \]
Exercise 6.3 Load packages and set options:
suppressPackageStartupMessages(library(IntroCompFinR))
suppressPackageStartupMessages(library(corrplot))
suppressPackageStartupMessages(library(PerformanceAnalytics))
suppressPackageStartupMessages(library(xts))
options(digits = 3)
Sys.setenv(TZ="UTC")
Loading data and computing returns
Load the daily price data from IntroCompFinR, and create monthly returns over the period Jan 1998 through Dec 2014:
data(amznDailyPrices, baDailyPrices, costDailyPrices, jwnDailyPrices, sbuxDailyPrices)
fiveStocks = merge(amznDailyPrices, baDailyPrices, costDailyPrices, jwnDailyPrices, sbuxDailyPrices)
fiveStocks = to.monthly(fiveStocks, OHLC=FALSE)
## Warning in to.period(x, "months", indexAt = indexAt, name =
## name, ...): missing values removed from data
Next, let’s compute monthly continuously compounded returns using the PerformanceAnalytics function Return.Calculate()
## AMZN BA COST JWN SBUX
## Feb 1998 0.2661 0.1332 0.1193 0.1220 0.0793
## Mar 1998 0.1049 -0.0401 0.0880 0.1062 0.1343
## Apr 1998 0.0704 -0.0404 0.0458 0.0259 0.0610
We removed the missing January return using the function na.omit()
.
Part I: GWN Model Estimation
Consider the GWN Model for cc returns
\[\begin{align} R_{it} & = \mu_i + \epsilon_{it}, t=1,\cdots,T \\ \epsilon_{it} & \sim \text{iid } N(0, \sigma_{i}^{2}) \\ \mathrm{cov}(R_{it}, R_{jt}) & = \sigma_{i,j} \\ \mathrm{cov}(R_{it}, R_{js}) & = 0 \text{ for } s \ne t \end{align}\]
where \(R_{it}\) denotes the cc return on asset \(i\) (\(i=\mathrm{AMZN}, \cdots, \mathrm{SBUX}\)).
Using sample descriptive statistics, give estimates for the model parameters \(\mu_i, \sigma_{i}^{2}, \sigma_i, \sigma_{i,j}, \rho_{i,j}\).
For each estimate of the above parameters (except \(\sigma_{i,j}\)) compute the estimated standard error. That is, compute \(\widehat{\mathrm{SE}}(\hat{\mu}_{i})\), \(\widehat{\mathrm{SE}}(\hat{\sigma}_{i}^{2})\), \(\widehat{\mathrm{SE}}(\hat{\sigma}_{i})\), and \(\widehat{\mathrm{SE}}(\hat{\rho}_{ij})\). Briefly comment on the precision of the estimates. We will compare the bootstrap SE values to these values.
Part II: Bootstrapping the GWN Model Estimates
For each estimate of the above parameters (except \(\sigma_{i,j}\)) compute the estimated standard error using the bootstrap with 999 bootstrap replications. That is, compute \(\widehat{\mathrm{SE}}_{boot}(\hat{\mu}_{i})\), \(\widehat{\mathrm{SE}}_{boot}(\hat{\sigma}_{i}^{2})\), \(\widehat{\mathrm{SE}}_{boot}(\hat{\sigma}_{i})\), and \(\widehat{\mathrm{SE}}_{boot}(\hat{\rho}_{ij})\). Compare the bootstrap standard errors to the analytic standard errors you computed above.
Plot the histogram and qq-plot of the bootstrap distributions you computed from the previous question. Do the bootstrap distributions look normal?
For each asset, compute estimates of the \(5\%\) value-at-risk based on an initial \(\$100,000\) investment. Use the bootstrap to compute values as well as \(95\%\) confidence intervals. Briefly comment on the accuracy of the 5% VaR estimates.
6.6 Solutions to Selected Problems
Exercise 6.3
Load packages and set options
library(IntroCompFinR)
library(corrplot)
library(PerformanceAnalytics)
library(xts)
options(digits = 3)
Sys.setenv(TZ="UTC")
Loading data and computing returns:
Load the daily price data from IntroCompFinR, and create monthly returns over the period Jan 1998 through Dec 2014:
data(amznDailyPrices, baDailyPrices, costDailyPrices, jwnDailyPrices, sbuxDailyPrices)
fiveStocks = merge(amznDailyPrices, baDailyPrices, costDailyPrices, jwnDailyPrices, sbuxDailyPrices)
fiveStocks = to.monthly(fiveStocks, OHLC=FALSE)
## Warning in to.period(x, "months", indexAt = indexAt, name =
## name, ...): missing values removed from data
Let’s look at the data:
## AMZN BA COST JWN SBUX
## Jan 1998 4.92 34.0 17.4 9.4 4.24
## Feb 1998 6.42 38.9 19.6 10.6 4.59
## Mar 1998 7.13 37.4 21.4 11.8 5.25
## AMZN BA COST JWN SBUX
## Oct 2014 305 123 128 72 75.0
## Nov 2014 339 134 137 76 80.9
## Dec 2014 310 129 137 79 81.8
Next, let’s compute monthly continuously compounded returns using the PerformanceAnalytics function Return.Calculate()
## AMZN BA COST JWN SBUX
## Feb 1998 0.2661 0.1332 0.1193 0.1220 0.0793
## Mar 1998 0.1049 -0.0401 0.0880 0.1062 0.1343
## Apr 1998 0.0704 -0.0404 0.0458 0.0259 0.0610
We removed the missing January return using the function na.omit()
.
Part I: GWN Model Estimation
Consider the GWN Model for cc returns
\[\begin{align} R_{it} & = \mu_i + \epsilon_{it}, t=1,\cdots,T \\ \epsilon_{it} & \sim \text{iid } N(0, \sigma_{i}^{2}) \\ \mathrm{cov}(R_{it}, R_{jt}) & = \sigma_{i,j} \\ \mathrm{cov}(R_{it}, R_{js}) & = 0 \text{ for } s \ne t \end{align}\]
where \(R_{it}\) denotes the cc return on asset \(i\) (\(i=\mathrm{AMZN}, \cdots, \mathrm{SBUX}\)).
- Using sample descriptive statistics, give estimates for the model parameters \(\mu_i, \sigma_{i}^{2}, \sigma_i, \sigma_{i,j}, \rho_{i,j}\).
Use the R functions mean
, var
, sd
, cov
, and rho
to compute GWN model estimates:
muhat.vals = apply(fiveStocksRet, 2, mean)
sigma2hat.vals = apply(fiveStocksRet, 2, var)
sigmahat.vals = apply(fiveStocksRet, 2, sd)
cov.mat = var(fiveStocksRet)
cor.mat = cor(fiveStocksRet)
covhat.vals = cov.mat[lower.tri(cov.mat)]
rhohat.vals = cor.mat[lower.tri(cor.mat)]
names(covhat.vals) <- names(rhohat.vals) <-
c("AMZN,BA","AMZN,COST","AMZN,JWN", "AMZN,SBUX",
"BA,COST", "BA,JWN", "BA,SBUX",
"COST, JWN", "COST,SBUX",
"JWN, SBUX")
The estimates of \(\mu_i, \sigma_{i}^{2}, \sigma_i\) are
## muhat.vals sigma2hat.vals sigmahat.vals
## AMZN 0.02042 0.02814 0.1678
## BA 0.00657 0.00764 0.0874
## COST 0.01017 0.00584 0.0764
## JWN 0.01049 0.01298 0.1140
## SBUX 0.01458 0.01111 0.1054
Let’s plot the risk-return trade-off graph
plot(sigmahat.vals, muhat.vals, ylim=c(0, 0.03), xlim=c(0, 0.20), ylab=expression(mu[p]),
xlab=expression(sigma[p]), pch=16, col="blue", cex=2, cex.lab=1.5)
text(sigmahat.vals, muhat.vals, labels=colnames(fiveStocksRet), pos=4, cex = 1.5)
Here, we see that AMZN has the hightest mean return and volatility. The other stocks have similar mean and volatilities. BA has the lowest mean and COST has the lowest volatility.
The estimates of \(\sigma_{i,j}, \rho_{i,j}\) are
## AMZN,BA AMZN,COST AMZN,JWN AMZN,SBUX BA,COST BA,JWN
## covhat.vals 0.00208 0.00404 0.00714 0.00546 0.00118 0.00319
## rhohat.vals 0.14194 0.31517 0.37357 0.30862 0.17714 0.32071
## BA,SBUX COST, JWN COST,SBUX JWN, SBUX
## covhat.vals 0.00237 0.00355 0.00271 0.00468
## rhohat.vals 0.25792 0.40724 0.33634 0.38954
All stocks are positively correlated. The highest correlated stocks are COST and JWN (Nordstrom), with correlation 0.41, and the lowest correlated stocks are AMZN and BA, with correlation 0.14. You can visualize the correlation matrix using the corrplot function corrplot.mixed()
:
- For each estimate of the above parameters (except \(\sigma_{i,j}\)) compute the estimated standard error. That is, compute \(\widehat{\mathrm{SE}}(\hat{\mu}_{i})\), \(\widehat{\mathrm{SE}}(\hat{\sigma}_{i}^{2})\), \(\widehat{\mathrm{SE}}(\hat{\sigma}_{i})\), and \(\widehat{\mathrm{SE}}(\hat{\rho}_{ij})\). Briefly comment on the precision of the estimates. Hint: the formulas for these standard errors were given in class, and are given in the lecture notes on the constant expected return model.
The SE formulas are
\[\widehat{\mathrm{SE}}(\hat{\mu}_{i})=\frac{\hat{\sigma}}{\sqrt{T}}, \widehat{\mathrm{SE}}(\hat{\sigma}_{i}^{2}) =\frac{\hat{\sigma}^2}{\sqrt{T/2}}, \widehat{\mathrm{SE}}(\hat{\sigma}_{i}) = \frac{\hat{\sigma}}{\sqrt{2T}},\widehat{\mathrm{SE}}(\hat{\rho}_{ij})=\frac{1-\hat{\rho}_{i,j}^2}{\sqrt{T}} \] The R code to compute theses estimates is
nobs = nrow(fiveStocksRet)
se.muhat = sigmahat.vals/sqrt(nobs)
se.sigma2hat = sigma2hat.vals/sqrt(nobs/2)
se.sigmahat = sigmahat.vals/sqrt(2*nobs)
se.rhohat = (1-rhohat.vals^2)/sqrt(nobs)
To interpret the SE values, it is useful to compare them to the estimates. For \(\hat{\mu}\) we have
## muhat.vals se.muhat
## AMZN 0.02042 0.01177
## BA 0.00657 0.00613
## COST 0.01017 0.00536
## JWN 0.01049 0.00800
## SBUX 0.01458 0.00740
The mean values are not estimated very precisely as the SE values are almost as big as the mean estimates. BA and COST have the smallest SE values and AMZN and JWN have the largest values.
For \(\hat{\sigma}^2\) and \(\hat{\sigma}\) we have we have
## sigma2hat.vals se.sigma2hat sigmahat.vals se.sigmahat
## AMZN 0.02814 0.002793 0.1678 0.00833
## BA 0.00764 0.000758 0.0874 0.00434
## COST 0.00584 0.000580 0.0764 0.00379
## JWN 0.01298 0.001289 0.1140 0.00566
## SBUX 0.01111 0.001102 0.1054 0.00523
Here, we see that \(\hat{\sigma}^2\) and \(\hat{\sigma}\) are estimated more precisely as their SE values are much smaller than the estimates. Hence, we estimate volatility better than the mean.
For \(\hat{\rho}_{ij}\) we have
## rhohat.vals se.rhohat
## AMZN,BA 0.142 0.0688
## AMZN,COST 0.315 0.0632
## AMZN,JWN 0.374 0.0604
## AMZN,SBUX 0.309 0.0635
## BA,COST 0.177 0.0680
## BA,JWN 0.321 0.0630
## BA,SBUX 0.258 0.0655
## COST, JWN 0.407 0.0585
## COST,SBUX 0.336 0.0622
## JWN, SBUX 0.390 0.0595
Here, the correlations are estimated reasonably well especially when the correlation is closer to 1.
- For each parameter \(\mu_i, \sigma_{i}^{2}, \sigma_i, \rho_{i,j}\) compute 95% and 99% confidence intervals. Briefly comment on the width of these intervals.
Approximate \(95\%\) and \(99\%\) confidence intervals for a parameter \(\theta\) have the form
\[ \hat{\theta} \pm 2 \times \widehat{\mathrm{SE}}(\hat{\theta}), \hat{\theta} \pm 3 \times \widehat{\mathrm{SE}}(\hat{\theta}) \]
The intervals for \(\mu_i\) are
mu.lower.95 = muhat.vals - 2*se.muhat
mu.upper.95 = muhat.vals + 2*se.muhat
mu.lower.99 = muhat.vals - 3*se.muhat
mu.upper.99 = muhat.vals + 3*se.muhat
cbind(mu.lower.95,mu.upper.95, mu.lower.99,mu.upper.99)
## mu.lower.95 mu.upper.95 mu.lower.99 mu.upper.99
## AMZN -0.003133 0.0440 -0.01491 0.0557
## BA -0.005696 0.0188 -0.01183 0.0250
## COST -0.000554 0.0209 -0.00592 0.0263
## JWN -0.005507 0.0265 -0.01350 0.0345
## SBUX -0.000216 0.0294 -0.00761 0.0368
The \(95\%\) and \(99\%\) CIs for \(\mu\) are wide containing both positive and negative values. This reveals high unGWNtainty as the economic interpretation of a negative mean and positive mean are very different.
The intervals for \(\sigma^2_i\) and \(\sigma_i\) are
sigma2.lower.95 = sigma2hat.vals - 2*se.sigma2hat
sigma2.upper.95 = sigma2hat.vals + 2*se.sigma2hat
sigma2.lower.99 = sigma2hat.vals - 3*se.sigma2hat
sigma2.upper.99 = sigma2hat.vals + 3*se.sigma2hat
cbind(sigma2.lower.95,sigma2.upper.95,sigma2.lower.99,sigma2.upper.99)
## sigma2.lower.95 sigma2.upper.95 sigma2.lower.99 sigma2.upper.99
## AMZN 0.02256 0.03373 0.01976 0.03652
## BA 0.00612 0.00915 0.00536 0.00991
## COST 0.00468 0.00700 0.00410 0.00758
## JWN 0.01041 0.01556 0.00912 0.01685
## SBUX 0.00890 0.01331 0.00780 0.01441
sigma.lower.95 = sigmahat.vals - 2*se.sigmahat
sigma.upper.95 = sigmahat.vals + 2*se.sigmahat
sigma.lower.99 = sigmahat.vals - 3*se.sigmahat
sigma.upper.99 = sigmahat.vals + 3*se.sigmahat
cbind(sigma.lower.95,sigma.upper.95,sigma.lower.99,sigma.upper.99)
## sigma.lower.95 sigma.upper.95 sigma.lower.99 sigma.upper.99
## AMZN 0.1511 0.1844 0.1428 0.1927
## BA 0.0787 0.0961 0.0744 0.1004
## COST 0.0688 0.0840 0.0650 0.0878
## JWN 0.1026 0.1253 0.0970 0.1309
## SBUX 0.0949 0.1158 0.0897 0.1211
The CIs for \(95\%\) and \(99\%\) are not too wide indicating that the volatilities are estimated reasonably precisely. Looking at the \(95\%\) CIs we see that AMZN clearly has the largest volatility. The CIs for BA and CST are very similar and the CIs for JWN and SBUX are also very similar.
The intervals for \(\rho^2_{ij}\) are
rho.lower.95 = rhohat.vals - 2*se.rhohat
rho.upper.95 = rhohat.vals + 2*se.rhohat
rho.lower.99 = rhohat.vals - 3*se.rhohat
rho.upper.99 = rhohat.vals + 3*se.rhohat
cbind(rho.lower.95,rho.upper.95,rho.lower.99,rho.upper.99)
## rho.lower.95 rho.upper.95 rho.lower.99 rho.upper.99
## AMZN,BA 0.0044 0.279 -0.0644 0.348
## AMZN,COST 0.1887 0.442 0.1255 0.505
## AMZN,JWN 0.2528 0.494 0.1924 0.555
## AMZN,SBUX 0.1816 0.436 0.1181 0.499
## BA,COST 0.0412 0.313 -0.0268 0.381
## BA,JWN 0.1948 0.447 0.1318 0.510
## BA,SBUX 0.1269 0.389 0.0614 0.454
## COST, JWN 0.2901 0.524 0.2316 0.583
## COST,SBUX 0.2118 0.461 0.1496 0.523
## JWN, SBUX 0.2705 0.509 0.2109 0.568
The CIs for the asset pairs except (AMZN, BA) and (BA,COST) are not too wide and indicate small to moderate correlations. The \(95\%\) CIs for (AMZN, BA) and (BA,COST) have lower bounds that are close to zero and the \(99\%\) intervals have negative lower bounds!
- Using the estimated values of \(\mu_i\) and \(\sigma_i\) for each stock, compute the normal distribution 1% and 5% monthly value-at-Risk (VaR) based on an initial $100,000 investment. Compare these values with the historical VaR values computed from the empirical quantiles. Hint: remember that we are using continuously compounded returns.
With cc returns, normal \(\mathrm{VaR}_{\alpha}\) is computed using \(\mathrm{VaR}_{\alpha} = W_0 \times (e^{\hat{q}^{R}_{\alpha}}-1)\) where \(\hat{q}^R_{\alpha} = \hat{\mu} + \hat{\sigma} \times q^Z_{\alpha}\). To compute VaR we first write a simple function to vectorize the calculation for a group of assets:
Value.at.Risk = function(x,p=0.05,w=100000) {
x = as.matrix(x)
q = apply(x, 2, mean) + apply(x, 2, sd)*qnorm(p)
VaR = (exp(q) - 1)*w
VaR
}
Then, \(\mathrm{VaR}_{.05}\) is
## AMZN BA COST JWN SBUX
## -22549 -12817 -10909 -16217 -14680
Here, AMZN produces the largest loss with \(5\%\) probability ($22,549) and COST produces the lowest loss ($10,909). The \(\mathrm{VaR}_{.01}\) values are
## AMZN BA COST JWN SBUX
## -30916 -17857 -15430 -22477 -20593
With 1% probability (1 in 100 months), AMZN could lose $30,916 over a month and COST could lose $15,430.
Part II: Monte Carlo Simulation in the GWN Model
Using the technique of Monte Carlo simulation, create \(1000\) simulated data sets of size \(T = 203\) from the GWN model using as true parameters the estimated parameters for AMZN: \(\hat{\mu} = 0.0204\), \(\hat{\sigma}^2 = 0.0281\), and \(\hat{\sigma} = 0.168\). Use set.seed(123)
to initialize the random number generator. For each of the \(1000\) data sets, compute \(\hat{\mu}\), \(\hat{\sigma^2}\) and \(\hat{\sigma}\) using the R functions mean()
and var()
and sd()
.
mu = 0.0204
sigma = 0.168
n.obs = 203
set.seed(123)
n.sim = 1000
sim.means = rep(0,n.sim)
sim.vars = rep(0,n.sim)
sim.sds = rep(0,n.sim)
for (sim in 1:n.sim) {
sim.ret = rnorm(n.obs,mean=mu,sd=sigma)
sim.means[sim] = mean(sim.ret)
sim.vars[sim] = var(sim.ret)
sim.sds[sim] = sqrt(sim.vars[sim])
}
- Create histograms for the 1000 values of \(\hat{\mu}\), \(\hat{\sigma}^2\), and \(\hat{\sigma}\) . Are the centers of these histograms close to the true values \(\hat{\mu} = 0.0204\), \(\hat{\sigma}^2 = 0.0281\), and \(\hat{\sigma} = 0.168\)? Do the distributions look normal?
par(mfrow=c(2,2))
hist(sim.means,xlab="mu hat", col="slateblue1")
abline(v=mu, col="white", lwd=2)
hist(sim.vars,xlab="sigma2 hat", col="slateblue1")
abline(v=sigma^2, col="white", lwd=2)
hist(sim.sds,xlab="sigma hat", col="slateblue1")
abline(v=sigma, col="white", lwd=2)
par(mfrow=c(1,1))
The centers of the histograms are very close to the true values (white lines in the histograms) indicating that the estimates are unbiased. Also, the histogram shapes look normal (symmetric, bell-shaped).
- Compute the average and standard deviation of \(\hat{\mu}\), \(\hat{\sigma}^2\), and \(\hat{\sigma}\) across the \(1000\) simulated samples. How close are the Monte Carlo averages of \(\hat{\mu}\), \(\hat{\sigma}^2\), and \(\hat{\sigma}\) to their true values? How close are the Monte Carlo standard deviations to the analytic standard error estimates of \(\hat{\mu}\), \(\hat{\sigma}^2\), and \(\hat{\sigma}\) computed from the actual data from Part I question 2 above?
Compare average of \(\hat{\mu}\) to true value:
## [1] 0.0204 0.0209
## [1] 0.000477
Almost the same - unbiased estimate.
Compare average of \(\hat{\sigma}^2\) to true value:
## [1] 0.0282 0.0283
## [1] 8.93e-05
Almost the same - unbiased estimate.
Compare average of \(\hat{\sigma}\) to true value:
## [1] 0.168 0.168
## [1] 5.38e-05
Almost the same - unbiased estimate.
Now, compare the SDs from the histograms with the estimated SEs from the sample.
## AMZN
## 0.0118 0.0118
Spot on!
## AMZN
## 0.00279 0.00284
Almost the same
## AMZN
## 0.00833 0.00845
Almost the same!
References
Carmona, R. 2014. Statistical Analysis of Financial Data in R, Second Edition. New York: Springer.
Jondeau, E., S.-H. Poon, and M. Rockinger. 2007. Financial Modeling Under Non-Gaussian Distributions. New York: Springer.
Pfaff, B. 2013. Financial Risk Modelling and Portfolio Optimization with R. Chichester, UK.: Wiley.
Ruppert, D., and D. S. Matteson. 2015. Statistics and Data Analysis for Financial Engineering with R Examples. New York: Springer.
In the measurement error model, \(r_{it}\) represents the \(t^{th}\) measurement of some physical quantity \(\mu_{i}\) and \(\varepsilon_{it}\) represents the random measurement error associated with the measurement device. The value \(\sigma_{i}\) represents the typical size of a measurement error.↩︎
The model (6.6) is technically a random walk with drift \(\mu_{i}\) A pure random walk has zero drift (\(\mu_{i}=0\))↩︎
The notion that future changes in asset prices cannot be predicted from past changes in asset prices is often referred to as the weak form of the efficient markets hypothesis.↩︎
Monte Carlo refers to the famous city in Monaco where gambling is legal.↩︎
Alternatively, the returns can be simulated directly by simulating observations from a normal distribution with mean \(0.0\) and standard deviation 0.10.↩︎
rmvnorm
returns a matrix of dimensionn.obs
by 3. The simulated news shocks for each asset are in the columns ofe.sim
. Transposinge.sim
allows the mean vector to be added row-wise. Transposing again creates a column vector of simulated returns.↩︎