# 4 Covariance Estimation

## 4.1 ARCH and GARCH

This small exercise illustrates how to perform maximum likelihood estimation in `R` using the simple example of ARCH and GARCH models. I advise you to first write the code for the basic specification on your own; afterwards, the exercises will help you to get familiar with well-established packages in `R` that provide the same (and many additional, more sophisticated) methods to estimate conditional volatility models.

### 4.1.1 Exercises

- As a benchmark dataset, download prices and compute daily returns for all stocks that are part of the Dow Jones 30 index (`ticker <- "DOW"`). Decompose the time series into a predictable part and the residual, \(r_t = E(r_t|\mathcal{F}_{t-1}) + \varepsilon_t\). The easiest approach is to demean the time series by simply subtracting the sample mean, but in principle more sophisticated methods can be used.
- For each of the 30 tickers, illustrate the rolling window standard deviation based on some suitable estimation window length.
- Write a small function that computes the maximum likelihood estimator (MLE) of an ARCH(\(p\)) model.
- Write a second function that implements MLE estimation for a GARCH(\(p,q\)) model.
- What is the unconditional estimated variance of the ARCH(\(1\)) and the GARCH(\(1, 1\)) model for each ticker?
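For the last bullet point it may help to recall the standard textbook formulas (not derived in this section): provided the stationarity conditions hold, the unconditional variances are

\[\sigma^2_{\text{ARCH}(1)} = \frac{\omega}{1 - \alpha_1}, \qquad \sigma^2_{\text{GARCH}(1,1)} = \frac{\omega}{1 - \alpha_1 - \beta_1},\]

where \(\omega\) denotes the intercept and \(\alpha_1 < 1\) respectively \(\alpha_1 + \beta_1 < 1\) is required.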

Next, familiarize yourself with the `rugarch` package to perform more sophisticated volatility modeling. Here you can find a great example of how to unleash the flexibility of `rmgarch`.

Use either your own code or the `rugarch` package to fit a GARCH and an ARCH model for each of the 30 time series and create 1-day ahead volatility forecasts with one year as the initial estimation window. Compare the forecasts to a 1-day ahead volatility forecast based on the sample standard deviation (often called the random walk model). Illustrate the forecast performance against the squared return of the next day and compute the mean squared prediction error of each model. What can you conclude about the predictive performance?

Compute and illustrate the model-implied Value-at-Risk, defined as the lowest return your model expects with a probability of less than 5 %. Formally, the VaR is defined as \(\text{VaR}_{\alpha}(X) = -\inf\big\{x\in \mathbb{R} : F_X(x)>\alpha\big\} = F_{-X}^{-1}(1-\alpha)\), where \(X\) is the return distribution. Illustrate stock returns which fall below the estimated Value-at-Risk. What can you conclude?

### 4.1.2 Solutions

```
library(tidyverse)
library(tidyquant)
```

As usual, I use the `tidyquant` package to download price data for the 30 stocks and to compute (adjusted) log returns.

```
ticker <- tq_index("DOW")
prices <- tq_get(ticker %>%
  filter(ticker != "DOW"), get = "stock.prices") # This is a strange bug/error in tidyquant, the ticker DOW is not part of the index DOW.
returns <- prices %>%
  group_by(symbol) %>%
  mutate(log_price = log(adjusted),
         return = 100 * (log_price - lag(log_price))) %>%
  select(ticker = symbol, date, return) %>%
  na.omit() %>%
  mutate(return = return - mean(return))
```

I use the `slider` package for convenient rolling window computations. For detailed documentation, consult [this homepage](https://github.com/DavisVaughan/slider).

```
library(slider)
rolling_sd <- returns %>%
  group_by(ticker) %>%
  mutate(rolling_sd = slide_dbl(return, sd,
                                .before = 100)) # 100 day estimation window
rolling_sd %>%
  ggplot(aes(x = date, y = rolling_sd, color = ticker)) +
  geom_line() +
  labs(x = "", y = "Standard deviation (rolling window)") +
  theme(legend.position = "bottom")
```

The figure illustrates at least two relevant features: i) Although rather persistent, volatilities change over time and ii) volatilities tend to co-move.

Next I write a short, very simplified script which performs maximum likelihood estimation of the parameters of an ARCH(\(p\)) model in `R`.

```
ret <- returns %>%
  filter(ticker == "AAPL") %>%
  pull(return) # One example time series

# Compute ARCH and GARCH
logL <- function(params, p){
  eps_sqr_lagged <- cbind(ret, 1)
  for(i in 1:p){
    eps_sqr_lagged <- cbind(eps_sqr_lagged, lag(ret, i)^2)
  }
  eps_sqr_lagged <- na.omit(eps_sqr_lagged)
  sigma <- eps_sqr_lagged[,-1] %*% params
  0.5 * sum(log(sigma)) + 0.5 * sum(eps_sqr_lagged[,1]^2/sigma)
}
```

The function `logL` computes the (log) likelihood of an ARCH specification with \(p\) lags. Note that it returns the *negative* log likelihood because most optimization procedures in `R` are designed to search for minima rather than maxima.
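Concretely, under conditional normality the function minimizes (up to an additive constant)

\[-\log L(\theta) = \frac{1}{2}\sum_{t} \log \sigma_t^2 + \frac{1}{2}\sum_{t} \frac{\varepsilon_t^2}{\sigma_t^2}, \qquad \sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2,\]

which is exactly the expression returned in the last line of `logL`.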

The next few lines show how to estimate the model with `optim` and `p = 2` lags. Note my (almost) arbitrary choice for the initial parameter vector. Also note that the function `logL` and the optimization procedure do not enforce the regularity conditions which would ensure a stationary volatility process.
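If you want to check the stationarity condition \(\sum_{i=1}^p \alpha_i < 1\) ex post, a minimal sketch (using illustrative parameter values, not your own fitted ones) could look like this:

```
# Check the ARCH stationarity condition sum(alpha_i) < 1
# (illustrative values: intercept omega followed by two alpha coefficients)
params <- c(2.07, 0.18, 0.19)
sum(params[-1]) < 1 # drop the intercept, then sum the alphas
```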

```
p <- 2
initial_params <- c(sd(ret), rep(1, p))
fit_manual <- optim(par = initial_params,
                    fn = logL,
                    hessian = TRUE,
                    method = "L-BFGS-B",
                    lower = 0,
                    p = p)
fitted_params <- fit_manual$par
se <- sqrt(diag(solve(fit_manual$hessian)))

cbind(fitted_params, se) %>% knitr::kable(digits = 2)
```

fitted_params | se |
---|---|
2.07 | 0.10 |
0.18 | 0.03 |
0.19 | 0.03 |
```
# Run the code below to compare the results with tseries package
# library(tseries)
# summary(garch(ret,c(0,p)))
```

We can implement GARCH estimation with very few adjustments.

```
# GARCH
garch_logL <- function(params, p, q, return_only_loglik = TRUE){
  eps_sqr_lagged <- cbind(ret, 1)
  for(i in 1:p){
    eps_sqr_lagged <- cbind(eps_sqr_lagged, lag(ret, i)^2)
  }
  sigma.sqrd <- rep(sd(ret)^2, nrow(eps_sqr_lagged))
  for(t in (1 + max(p, q)):nrow(eps_sqr_lagged)){
    sigma.sqrd[t] <- params[1:(1+p)] %*% eps_sqr_lagged[t,-1] +
      params[(2+p):length(params)] %*% sigma.sqrd[(t-1):(t-q)]
  }
  sigma.sqrd <- sigma.sqrd[-(1:(max(p, q)))]
  if(return_only_loglik){
    0.5 * sum(log(sigma.sqrd)) + 0.5 * sum(eps_sqr_lagged[(1 + max(p, q)):nrow(eps_sqr_lagged),1]^2/sigma.sqrd)
  } else {
    return(sigma.sqrd)
  }
}
p <- 1 # Lag structure
q <- 1
fit_garch_manual <- optim(par = rep(0.01, p + q + 1),
                          fn = garch_logL,
                          hessian = TRUE,
                          method = "L-BFGS-B",
                          lower = 0,
                          p = p,
                          q = q)
fitted_garch_params <- fit_garch_manual$par
```

Next I plot the time series of estimated \(\sigma_t\) based on the output. Note that in order to keep the required code clean and simple, my function `garch_logL` returns the fitted variances with the option `return_only_loglik = FALSE`.

```
tibble(vola = garch_logL(fitted_garch_params, p, q, FALSE)) %>%
ggplot(aes(x = 1:length(vola), y = sqrt(vola))) +
geom_line() +
labs(x = "", y = "GARCH(1,1) volatility")
```
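With the fitted parameters one can also answer the exercise on the unconditional variance, which for a GARCH(1,1) is \(\sigma^2 = \omega/(1-\alpha_1-\beta_1)\). A minimal sketch with illustrative (not the fitted) parameter values:

```
# Unconditional GARCH(1,1) variance, sigma^2 = omega / (1 - alpha - beta)
# (illustrative parameter values, assumed for this sketch)
omega <- 0.05
alpha <- 0.10
beta <- 0.85
omega / (1 - alpha - beta) # equals 1 for these values
```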

Next, the typical `tidyverse` approach: Once we have implemented the workflow for one asset, we can use `mutate` and `map()` to perform the same computation in a tidy manner for all assets in our sample. Recall that the unconditional variance is defined in the lecture slides.

```
# Estimate an ARCH(p) model for each ticker
uncond_var <- returns %>%
  arrange(ticker, date) %>%
  nest(data = c(date, return)) %>%
  mutate(arch = map(data, function(.x){
    logL <- function(params, p){
      eps_sqr_lagged <- cbind(ret, 1)
      for(i in 1:p){
        eps_sqr_lagged <- cbind(eps_sqr_lagged, lag(ret, i)^2)
      }
      eps_sqr_lagged <- na.omit(eps_sqr_lagged)
      sigma <- eps_sqr_lagged[,-1] %*% params
      0.5 * sum(log(sigma)) + 0.5 * sum(eps_sqr_lagged[,1]^2/sigma)
    }
    ret <- .x %>% pull(return)
    p <- 2
    initial_params <- c(sd(ret), rep(1, p))
    fit_manual <- optim(par = initial_params,
                        fn = logL,
                        hessian = TRUE,
                        method = "L-BFGS-B",
                        lower = 0,
                        p = p)
    fitted_params <- fit_manual$par
    se <- sqrt(diag(solve(fit_manual$hessian)))
    return(tibble(param = c("intercept", paste0("alpha ", 1:p)), value = fitted_params, se = se))
  })) %>%
  select(-data) %>%
  unnest(arch) %>%
  group_by(ticker) %>%
  summarise(uncond_var = dplyr::first(value) / (1 - sum(value) + dplyr::first(value)))

uncond_var %>%
  ggplot() +
  geom_bar(aes(x = reorder(ticker, uncond_var),
               y = uncond_var), stat = "identity") +
  coord_flip() +
  labs(x = "", y = "Unconditional variance")
```

Advanced question: Make sure you understand what happens if you do **not** define the function `logL` within the `mutate` call but instead define it once and for all outside the loop. **Hint:** Calling `logL` once before running the loop stores the underlying sample as part of the function object and will thus not provide the estimated parameters of your subset of data, but instead only of the sample which was present in the working environment when you defined the function. This is sometimes a tricky concept in `R`; consult Hadley Wickham’s book Advanced R for more information.
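A minimal sketch of this scoping behaviour (a toy example, unrelated to the estimation code):

```
# f captures y from the environment where f was DEFINED (here: global),
# not from the environment where f is called
y <- 1
f <- function() y
g <- function(){
  y <- 2 # local to g, invisible to f
  f()
}
g() # returns 1, not 2
```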

Some more code for (in-sample) estimation of a GARCH model for multiple assets is provided below. For out-of-sample computations, consult the Section on multivariate models.

First, I specify the model (a standard GARCH(1,1)). The lines afterwards use the function `ugarchfit` to fit an individual GARCH model for each ticker, compute the rolling window standard deviation, and extract \(\hat\sigma_t^2\). Note that these are in-sample volatilities in the sense that the entire time series is used to fit the GARCH model. In most applications, however, this is absolutely sufficient.

```
library(rugarch)
model.spec <- ugarchspec(variance.model = list(model = 'sGARCH', garchOrder = c(1, 1)))

model.fit <- returns %>%
  mutate(Rolling = slide_dbl(return, sd,
                             .before = 250)) %>%
  nest(data = c(date, Rolling, return)) %>%
  mutate(garch = map(data, function(x) sigma(ugarchfit(spec = model.spec,
                                                       data = x %>% pull(return),
                                                       solver = "hybrid")) %>%
                       as_tibble())) %>%
  mutate(full_data = map2(data, garch, cbind)) %>%
  unnest(full_data) %>%
  select(-data, -garch) %>%
  rename("Garch" = V1)
```

Next, I provide some illustration of the estimated paths of volatilities for the two estimation procedures.

```
model.fit %>%
  pivot_longer(c(Rolling, Garch)) %>%
  ggplot(aes(x = date, y = value, color = ticker)) +
  geom_line() +
  facet_wrap(~name, scales = "free_y") +
  theme(legend.position = "bottom")
```

For an illustration of the value at risk computation, consult the lecture slides.
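Just to fix ideas: under a normal return model with mean zero, the 5 % Value-at-Risk implied by a volatility forecast \(\hat\sigma_{t+1}\) can be sketched as follows (the value of `sigma_hat` is a made-up placeholder, not an estimate from the data above):

```
# 5% VaR implied by a N(0, sigma_hat^2) return model
sigma_hat <- 1.5 # hypothetical 1-day ahead volatility forecast (in percent)
alpha <- 0.05
VaR <- -qnorm(alpha, mean = 0, sd = sigma_hat)
VaR # loss threshold exceeded with 5% probability
```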

## 4.2 Ledoit-Wolf shrinkage estimation

A severe practical issue with the sample variance-covariance matrix in large dimensions (\(N \gg T\)) is that \(\hat\Sigma\) is singular. Ledoit and Wolf proposed a series of biased estimators of the variance-covariance matrix \(\Sigma\) which overcome this problem. As a result, it is often advised to perform Ledoit-Wolf type shrinkage on the variance-covariance matrix before proceeding with portfolio optimization.
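The basic idea of the linear shrinkage estimator can be summarized as

\[\hat\Sigma_{\text{shrink}} = \hat\delta\, F + (1 - \hat\delta)\, S,\]

where \(S\) is the sample covariance matrix, \(F\) the (constant-correlation) shrinkage target and \(\hat\delta \in [0, 1]\) the estimated shrinkage intensity.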

### 4.2.1 Exercises

- Read the paper *Honey, I shrunk the sample covariance matrix* (provided in Absalon) and try to implement the feasible linear shrinkage estimator. If in doubt, consult Michael Wolf's homepage; you will find useful `R` and `matlab` code there. It may also be advisable to take a look at the documentation and source code of the package RiskPortfolios, in particular the function `covEstimation()`.
- Write a simulation that generates iid distributed returns (you can use the random number generator `rnorm()`) for some dimensions \(T \times N\) and compute the minimum-variance portfolio weights based on the sample variance-covariance matrix and Ledoit-Wolf shrinkage. What is the mean squared deviation from the optimal portfolio (\(1/N\))? What can you conclude?

### 4.2.2 Solutions

The assignment mainly requires you to understand the original paper in depth - it is arguably a hard task to translate the derivations in the paper into working code. Below you find some sample code which you can use for future assignments.

```
compute_ledoit_wolf <- function(x) {
  # Computes the Ledoit-Wolf shrinkage covariance estimator as proposed in
  # Ledoit, Wolf (2004) (Honey, I shrunk the sample covariance matrix.)
  # x is a (t x n) matrix of returns
  t <- nrow(x)
  n <- ncol(x)
  x <- apply(x, 2, function(x) if (is.numeric(x)) # demean x
    x - mean(x) else x)
  sample <- (1/t) * (t(x) %*% x)
  var <- diag(sample)
  sqrtvar <- sqrt(var)
  rBar <- (sum(sum(sample/(sqrtvar %*% t(sqrtvar)))) - n)/(n * (n - 1))
  prior <- rBar * sqrtvar %*% t(sqrtvar)
  diag(prior) <- var
  y <- x^2
  phiMat <- t(y) %*% y/t - 2 * (t(x) %*% x) * sample/t + sample^2
  phi <- sum(phiMat)

  repmat = function(X, m, n) {
    X <- as.matrix(X)
    mx = dim(X)[1]
    nx = dim(X)[2]
    matrix(t(matrix(X, mx, nx * n)), mx * m, nx * n, byrow = T)
  }
  term1 <- (t(x^3) %*% x)/t
  help <- t(x) %*% x/t
  helpDiag <- diag(help)
  term2 <- repmat(helpDiag, 1, n) * sample
  term3 <- help * repmat(var, 1, n)
  term4 <- repmat(var, 1, n) * sample
  thetaMat <- term1 - term2 - term3 + term4
  diag(thetaMat) <- 0
  rho <- sum(diag(phiMat)) + rBar * sum(sum(((1/sqrtvar) %*% t(sqrtvar)) * thetaMat))

  gamma <- sum(diag(t(sample - prior) %*% (sample - prior)))
  kappa <- (phi - rho)/gamma
  shrinkage <- max(0, min(1, kappa/t))
  if (is.nan(shrinkage))
    shrinkage <- 1
  sigma <- shrinkage * prior + (1 - shrinkage) * sample
  return(sigma)
}
```

The simulation can be conducted as below. The individual functions simulate an iid normal “returns” matrix for given dimensions `T` and `N`. Next, optimal weights are computed based on either the sample variance-covariance matrix or the Ledoit-Wolf shrinkage equivalent. **Note:** if you want to generate results that can be replicated, make sure to use the function `set.seed()`. It takes as input any integer and makes sure that whoever runs this code afterwards retrieves exactly the same random numbers within the simulation. Otherwise (or if you change “2021” to an arbitrary other value), there will be some variation in the results because of the random sampling of returns.
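A tiny sketch of what `set.seed()` guarantees:

```
# The same seed always yields the same sequence of random draws
set.seed(2021)
a <- rnorm(3)
set.seed(2021)
b <- rnorm(3)
identical(a, b) # TRUE
```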

```
set.seed(2021)
generate_returns <- function(T, N) matrix(rnorm(T * N), nc = N)
compute_w_lw <- function(mat){
  w <- solve(compute_ledoit_wolf(mat)) %*% rep(1, ncol(mat))
  return(w/sum(w))
}
compute_w_sample <- function(mat){
  w <- solve(cov(mat)) %*% rep(1, ncol(mat))
  return(w/sum(w))
}
eval_weight <- function(w) length(w)^2 * sum((w - 1/length(w))^2)
```

Note that I evaluate weights based on the squared deviations from the naive portfolio (which is the minimum variance portfolio in the case of iid returns). The scaling `length(w)^2` is just there to penalize larger dimensions harder.
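A quick sanity check of the metric (by construction it is zero at the naive portfolio and positive otherwise):

```
# Squared deviation from the naive (1/N) portfolio, scaled by N^2
eval_weight <- function(w) length(w)^2 * sum((w - 1/length(w))^2)
eval_weight(rep(1/10, 10)) # 0: the naive portfolio has no deviation
eval_weight(c(0.5, 0.5, 0, 0)) # > 0 for any other allocation
```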

```
simulation <- function(T = 100, N = 70){
  tmp <- matrix(NA, nc = 2, nr = 100)
  for(i in 1:100){
    mat <- generate_returns(T, N)
    w_lw <- compute_w_lw(mat)
    if(N < T) w_sample <- compute_w_sample(mat)
    if(N >= T) w_sample <- rep(NA, N)
    tmp[i, 1] <- eval_weight(w_lw)
    tmp[i, 2] <- eval_weight(w_sample)
  }
  tibble(model = c("Shrinkage", "Sample"), error = colMeans(tmp), N = N, T = T)
}
result <- bind_rows(simulation(100, 60),
                    simulation(100, 70),
                    simulation(100, 80),
                    simulation(100, 90),
                    simulation(100, 100),
                    simulation(100, 150))
result %>% pivot_wider(names_from = model, values_from = error) %>%
  knitr::kable(digits = 2, "pipe")
```

N | T | Shrinkage | Sample |
---|---|---|---|
60 | 100 | 1.31 | 90.40 |
70 | 100 | 1.54 | 174.05 |
80 | 100 | 1.70 | 319.29 |
90 | 100 | 1.92 | 934.57 |
100 | 100 | 2.19 | NA |
150 | 100 | 3.19 | NA |

The results show a couple of interesting findings: First, the sample variance-covariance matrix breaks down if \(N > T\) - there is no unique minimum variance portfolio anymore in that case. On the contrary, Ledoit-Wolf type shrinkage retains positive definiteness of the estimator even if \(N > T\). Further, the (scaled) mean squared error of the portfolio weights relative to the theoretically optimal naive portfolio is far smaller for the shrinkage version. Although the shrinkage estimator is biased, its variance is reduced substantially and it thus provides more robust portfolio weights.
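The breakdown for \(N \geq T\) is easy to verify directly: the sample covariance matrix of \(T\) demeaned observations has rank at most \(T - 1\), so `solve()` must fail. A minimal sketch:

```
# With T = 5 observations of N = 10 assets the sample covariance is singular
set.seed(1)
mat <- matrix(rnorm(5 * 10), ncol = 10)
S <- cov(mat)
qr(S)$rank # at most T - 1 = 4, far below N = 10
```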

## 4.3 Multivariate dynamic volatility modeling

This exercise is designed to provide the necessary tools for i) high-dimensional volatility modeling with `rmgarch` and ii) proper volatility backtesting procedures.

In my personal opinion, while being the de-facto standard for high-dimensional dynamic volatility estimation, the `rmgarch` package documentation is next to impossible to decipher. I took some effort to provide running code for rather standard procedures, but if you encounter a more elegant solution, please let me know! First, make sure you install the `rmgarch` package and familiarize yourself with the documentation. Some code snippets are provided in the paper Boudt et al (2019), available in Absalon.

### 4.3.1 Exercises

- Use the daily Dow Jones 30 index constituents returns as a baseline return time series. Familiarize yourself with the documentation of the `rmgarch` package for multivariate volatility modeling. In particular, write a short script which estimates a DCC model based on the return time series.
- Illustrate the estimated volatilities as well as some of the estimated correlation time series.
- Write a script which computes rolling window 1-step ahead forecasts of \(\Sigma_{t+1}\) based on the DCC GARCH model. For each iteration, store the corresponding minimum variance portfolio weights and the portfolio’s out-of-sample return.

### 4.3.2 Solutions

To get started, I prepare the time series of returns as a \((T \times N)\) matrix.

```
#install.packages("rmgarch")
library(rmgarch)
library(xts)
returns <- returns %>%
  pivot_wider(names_from = ticker, values_from = return)
returns.xts <- xts(returns %>% select(-date), order.by = returns$date)

N <- ncol(returns.xts)
```

First, `rmgarch` is the big (multivariate) brother of the `rugarch` package. Recall that for multivariate models we are often left with specifying the univariate volatility dynamics. `rmgarch` has a rather intuitive interface and (in my opinion) way too many options. In fact, `rmgarch` allows you to specify all kinds of different GARCH model specifications, mean dynamics and also distributional assumptions.

```
spec <- ugarchspec(variance.model = list(model = 'sGARCH',
                                         garchOrder = c(1, 1)))
```

Next we replicate the specification for each of our \(N\) assets. `rmgarch` has the appealing `multispec` function for that purpose. In principle it is possible to specify individual dynamics for each asset and to provide a list with different `ugarchspec` objects. In what follows we simply impose the same structure for each ticker.

```
# Replicate specification for each univariate asset
mspec <- multispec(replicate(N, spec))

# Specify DCC
mspec <- dccspec(mspec,
                 model = "DCC",
                 distribution = "mvnorm")
mspec
```

```
##
## *------------------------------*
## * DCC GARCH Spec *
## *------------------------------*
## Model : DCC(1,1)
## Estimation : 2-step
## Distribution : mvnorm
## No. Parameters : 582
## No. Series : 29
```

The function `dccspec` imposes the multivariate structure. Consult Boudt et al (2019) for many more alternative specifications.
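For reference, the DCC(1,1) correlation dynamics that this specification imposes can be written as

\[Q_t = (1 - a - b)\,\bar{Q} + a\, \varepsilon_{t-1}\varepsilon_{t-1}' + b\, Q_{t-1}, \qquad R_t = \operatorname{diag}(Q_t)^{-1/2}\, Q_t\, \operatorname{diag}(Q_t)^{-1/2},\]

where \(\varepsilon_t\) denotes the vector of standardized residuals and \(\bar{Q}\) their unconditional covariance matrix.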

Next we fit the object to the return time series. Note that you can make use of the fact that the computations can be parallelized very easily (however, DCC is fast enough for this dataset so you can also opt out of using a cluster).

```
# Generate cluster for parallelization (you may have to change the number of nodes)
library(parallel)
cl <- makeCluster(8)
out_of_sample_periods <- 500

fit <- dccfit(mspec,
              out.sample = out_of_sample_periods,
              data = returns.xts,
              cluster = cl,
              fit.control = list(eval.se = FALSE,
                                 stationarity = TRUE))
# Alternative in case you do not want to use parallel computing
# fit <- dccfit(mspec,
#               out.sample = 500,
#               data = returns.xts,
#               solver = c("hybrid", "lbfgs"),
#               fit.control = list(eval.se = FALSE,
#                                  stationarity = TRUE))
```

After considerable computing time, the object `fit` contains all estimated parameters of the DCC GARCH model. The model has 617 parameters. Note that with the additional controls we can enforce stationarity conditions for the estimated parameters and (as we only care about forecasting in this exercise) refrain from estimating standard errors.

Sometimes it may be advisable to fit the univariate volatility dynamics first and then estimate the conditional correlation model. This may be an advantage if the univariate models are more involved. Also, if you want to keep the univariate models constant but evaluate the effect of different conditional correlation models, the following approach may save some computation time.

```
mspec <- multispec(replicate(N, spec)) # Define the univariate models first
fit <- multifit(mspec, returns.xts) # Fit each model individually

my_dccspec <- dccspec(mspec,
                      model = "DCC",
                      distribution = "mvnorm") # Define the multivariate model
my_dccfit <- dccfit(my_dccspec,
                    data = returns.xts,
                    fit = fit) # Compute the DCC model based on the pre-estimated individual GARCH models
```

Next, I create a plot of all estimated time series of volatilities. Note that transforming `rmgarch` output into easy-to-work-with `tidyverse` structures is tedious but possible.

```
fitted_sigmas <- sigma(fit) %>%
  fortify() %>%
  as_tibble() %>%
  rename(date = Index) %>%
  pivot_longer(-date, names_to = "ticker", values_to = "sigma")

fitted_sigmas %>%
  ggplot(aes(x = date,
             y = sigma,
             color = ticker)) +
  geom_line() +
  labs(x = "", y = "Fitted sigma") +
  theme(legend.position = "bottom")
```

It is even more tedious to get hold of the estimated time series of \((N\times N)\) correlation matrices. The code snippet below does the job and provides an easily accessible way to illustrate the dynamic correlation coefficients. The figure below illustrates all correlations with ticker `AAPL`.

```
fitted_correlations <- apply(rcor(fit), 3, . %>% as_tibble(rownames = "ticker") %>%
                               pivot_longer(-ticker, names_to = "ticker.2")) %>%
  bind_rows(.id = "date") %>%
  filter(ticker != ticker.2)

fitted_correlations %>%
  filter(ticker.2 == "AAPL") %>%
  ggplot(aes(x = as.Date(date),
             y = value,
             color = ticker)) +
  geom_line() +
  theme(legend.position = "bottom") +
  labs(x = "", y = "Fitted correlations")
```

Next we can use the function `dccroll` to compute rolling window forecasts. In other words, we recursively reestimate the GARCH model and store the one-step ahead covariance forecasts (and conditional means, had we specified a dynamic model for the mean). Luckily, parallelization makes this process *much* faster. Note that it is important to stop the cluster after you finish the job, otherwise you may run into connection issues.

```
dccrolln <- dccroll(mspec,
                    data = returns.xts,
                    n.ahead = 1,
                    forecast.length = 500,
                    refit.every = 100,
                    refit.window = "recursive",
                    fit.control = list(scale = TRUE),
                    solver.control = list(trace = 1),
                    cluster = cl)
stopCluster(cl)
```

Final step: minimum variance portfolio weights based on the estimated parameters. Again, this is somewhat cumbersome; maybe there are more efficient ways to do this in `R`.

```
pred <- apply(rcov(dccrolln), 3, function(x){
  w <- solve(x %>% as.matrix()) %*% rep(1, N)
  return(w/sum(w))}) %>%
  t()

apply(pred * returns.xts %>% tail(500), 1, sum) %>%
  enframe() %>%
  summarise(sd = sqrt(250) * sd(value))
```

```
## # A tibble: 1 x 1
## sd
## <dbl>
## 1 21.7
```

## 4.4 Portfolio optimization under transaction costs

In this exercise we extend the simple portfolio analysis substantially and bring the simulation closer to a realistic framework. We will penalize turnover, evaluate the out-of-sample performance *after* transaction costs and introduce some robust optimization procedures in the spirit of the paper *Large-scale portfolio allocation under transaction costs and model uncertainty*, available in Absalon.

### 4.4.1 Exercises

- Download the file `returns.rds`. The file contains the return time series for 274 different tickers and 2404 trading days used for the paper Hautsch et al (2019).
- Write a function that computes optimal portfolio weights which adjust for \(L_2\) transaction costs *ex-ante*. Thus, the function should take into account the moment estimates \(\mu_t\) and \(\Sigma_t\), the previous portfolio weight \(\omega_{t^+} := \frac{\omega_{t} \odot \left(1 + r_t\right)}{1 + \omega_t'r_t}\), where \(\odot\) denotes element-wise multiplication, as well as the transaction cost parameter \(\beta\). Within this exercise we assume that transaction costs are quadratic and of the form \[\frac{\beta}{2}\left(\omega_{t+1} - \omega_{t^+}\right)'\left(\omega_{t+1} - \omega_{t^+}\right).\] You can consult Equation (7) in Hautsch et al (2019) for further information.
- Analyse how different transaction cost values \(\beta\) affect portfolio rebalancing. You can assume that the previous allocation was the naive portfolio. Show that for high values of \(\beta\) the initial portfolio becomes more relevant.
- Write a script that simulates the performance of 3 different strategies *before* and *after* adjusting for transaction costs for different values of \(\beta\): a (mean-variance) utility maximization (risk aversion \(\gamma = 4\)), a naive allocation that rebalances daily, and a (mean-variance) utility with turnover adjustment (risk aversion \(\gamma = 4\)). Assume that \(\beta = 1\). Evaluate the out-of-sample mean, standard deviation and Sharpe ratios. What do you conclude about the relevance of turnover penalization?

### 4.4.2 Solutions

The dataset is provided in Absalon and can be used without further adjustments.

```
returns <- read_rds("hautsch_voigt_returns.rds")
returns <- returns / 100
```

First I fix the relevant parameters for the evaluation, specifically the risk aversion \(\gamma\) and the number of assets \(N\).

```
ticker <- colnames(returns)
N <- length(ticker)
gamma <- 4
```

The function `optimal_tc_weight` generates optimal portfolio weights based on the parameters \(\mu\) and \(\Sigma\) and the previous allocation \(\omega_{t^+}\). Note (reassure yourself) that the function returns the mean-variance optimal portfolio if the transaction cost parameter \(\beta = 0\). In that case, the previous allocation vector becomes irrelevant.
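To see why the implementation works with adjusted moments, note that with quadratic transaction costs the investor solves

\[\omega_{t+1}^* = \arg\max_{\omega'\iota = 1} \; \omega'\mu - \frac{\beta}{2}\left(\omega - \omega_{t^+}\right)'\left(\omega - \omega_{t^+}\right) - \frac{\gamma}{2}\,\omega'\Sigma\,\omega,\]

which, after dropping terms that do not depend on \(\omega\), is a standard mean-variance problem with the adjusted moments

\[\mu^* = \mu + \beta\,\omega_{t^+}, \qquad \Sigma^* = \Sigma + \frac{\beta}{\gamma}\, I_N.\]

These are exactly `mu_proc` and `Sigma_proc` in the function below.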

```
optimal_tc_weight <- function(w_prev,
                              mu,
                              Sigma,
                              beta = 0,
                              gamma = 4){
  N <- ncol(Sigma)
  iota <- rep(1, N)
  Sigma_proc <- Sigma + beta / gamma * diag(N)
  mu_proc <- mu + beta * w_prev

  Sigma_inv <- solve(Sigma_proc)

  w_mvp <- Sigma_inv %*% iota
  w_mvp <- w_mvp / sum(w_mvp)
  w_opt <- w_mvp + 1/gamma * (Sigma_inv - w_mvp %*% t(iota) %*% Sigma_inv) %*% mu_proc
  return(w_opt)
}
```

Next I fix the parameters \(\mu\) and \(\Sigma\) for the short simulation study. I use Ledoit-Wolf shrinkage and assume that the (conditional) mean returns are 0.

```
# Illustrate effect of turnover penalization
Sigma <- compute_ledoit_wolf(returns)
mu <- 0 * colMeans(returns)

beta_concentration <- function(beta){
  return(sum((optimal_tc_weight(rep(1/N, N), mu, Sigma, beta = beta, gamma = 4) - rep(1/N, N))^2))
}
beta_effect <- tibble(beta = 20 * qexp((1:99)/100)) %>%
  mutate(concentration = map_dbl(beta, beta_concentration))

beta_effect %>%
  ggplot(aes(x = beta, y = concentration)) +
  geom_line() +
  scale_x_sqrt() +
  labs(x = "Transaction cost parameter",
       y = "Rebalancing")
```

The figure illustrates that rebalancing decreases substantially with increasing transaction costs. In the limit \(\beta\rightarrow\infty\) the *optimal* portfolio will simply remain at \(\omega_{t^+}\), and for \(\beta = 0\) the chosen allocation corresponds to the mean-variance efficient portfolio.

Finally, the out-of-sample backtest. As outlined in the paper Hautsch et al (2019), a rather high value for \(\beta\) is required to mimic meaningful transaction costs. Thus we set \(\beta = 1\) (which corresponds to 10,000 basis points). Similar to the simple analysis, the code below performs the following steps: for each day, we recompute the conditional moments of the return distribution and choose optimal weights. Then we compute the out-of-sample return, the turnover and the net return after adjusting for rebalancing costs.

```
# OOS experiment
window_length <- 500
periods <- nrow(returns) - window_length # total number of out-of-sample periods

beta <- 1 # Transaction costs
oos_values <- matrix(NA,
                     nrow = periods,
                     ncol = 3) # A matrix to collect all returns
colnames(oos_values) <- c("raw_return", "turnover", "net_return")
all_values <- list(oos_values,
                   oos_values,
                   oos_values) # we implement 3 strategies
w_prev_1 <- w_prev_2 <- w_prev_3 <- rep(1/N, N)

for(i in 1:periods){ # Rolling window
  # Extract information
  return_window <- returns[i : (i + window_length - 1), ] # the last X returns available up to date t

  # Sample moments
  Sigma <- cov(return_window)
  mu <- 0 * colMeans(return_window)

  # Optimal TC robust portfolio
  w_1 <- optimal_tc_weight(w_prev = w_prev_1, mu = mu, Sigma = Sigma, beta = beta, gamma = gamma)
  # Evaluation
  raw_return <- returns[i + window_length, ] %*% w_1
  turnover <- sum((as.vector(w_1) - as.vector(w_prev_1))^2)
  # Store realized returns
  net_return <- raw_return - beta * turnover
  all_values[[1]][i, ] <- c(raw_return, turnover, net_return)
  # Compute adjusted weights based on the weights and next period returns
  w_prev_1 <- w_1 * as.vector(1 + returns[i + window_length, ])
  w_prev_1 <- w_prev_1 / sum(as.vector(w_prev_1))

  # Naive portfolio
  w_2 <- rep(1/N, N)
  raw_return <- returns[i + window_length, ] %*% w_2
  turnover <- sum((as.vector(w_2) - as.vector(w_prev_2))^2)
  net_return <- raw_return - beta * turnover
  all_values[[2]][i, ] <- c(raw_return, turnover, net_return)
  w_prev_2 <- w_2 * as.vector(1 + returns[i + window_length, ])
  w_prev_2 <- w_prev_2 / sum(as.vector(w_prev_2))

  # Mean-variance efficient portfolio
  w_3 <- optimal_tc_weight(w_prev = w_prev_3, mu = mu, Sigma = Sigma, beta = 0, gamma = gamma)
  raw_return <- returns[i + window_length, ] %*% w_3
  turnover <- sum((as.vector(w_3) - as.vector(w_prev_3))^2)
  net_return <- raw_return - beta * turnover
  all_values[[3]][i, ] <- c(raw_return, turnover, net_return)
  w_prev_3 <- w_3 * as.vector(1 + returns[i + window_length, ])
  w_prev_3 <- w_prev_3 / sum(as.vector(w_prev_3))
}
```

The few final lines summarize the results. Note the effect of turnover penalization on the Sharpe ratio after adjusting for transaction costs.

```
all_values <- lapply(all_values, as_tibble) %>% bind_rows(.id = "strategy")

all_values %>%
  group_by(strategy) %>%
  summarise(Mean = 250 * 100 * mean(net_return),
            SD = sqrt(250) * 100 * sd(net_return),
            Sharpe = if_else(Mean > 0, Mean/SD, NA_real_),
            Turnover = 10000 * mean(turnover)) %>%
  mutate(strategy = case_when(strategy == 1 ~ "MV (TC)",
                              strategy == 2 ~ "Naive",
                              strategy == 3 ~ "MV")) %>%
  knitr::kable(digits = 4)
```

strategy | Mean | SD | Sharpe | Turnover |
---|---|---|---|---|
MV (TC) | 13.4877 | 11.1308 | 1.2117 | 0.0055 |
Naive | 13.3285 | 20.9976 | 0.6348 | 0.0100 |
MV | -80.6357 | 23.4857 | NA | 36.8183 |