Chapter 17 Variance Swaps

I was quite surprised to frequently receive questions about variance swaps during Zoom meetings with future practitioners eager to step into the industry. So let's add a chapter about these products by summarizing the fantastic article written by Allen, Einchcomb and Granger on this subject.

17.1 Introduction

Variance swaps are instruments which offer investors straightforward and direct exposure to the volatility of an underlying asset without the path-dependency issues associated with delta-hedged options. As their name indicates, they are swap contracts where the parties agree to exchange a pre-agreed variance level for the actual amount of variance realised over a period.

Buying a variance swap is like being long volatility at the strike level; if the market delivers more than implied by the strike of the swap, you are in profit, and if the market delivers less, you are in loss. However, variance swaps are convex in volatility: a long position profits more from an increase in volatility than it loses from a corresponding decrease. For this reason variance swaps normally trade above ATM volatility.

The directness of the exposure to volatility and the relative ease of replication through a static portfolio of options make variance swaps attractive instruments for investors and market-makers alike.

17.2 Liquidity

Bid/offer spreads have come in significantly over recent years. For indices, they are typically in the region of 0.5 vegas. For single-stocks, they are in the range of 1-2 vegas in Europe and 2-2.5 vegas in the US and Japan. Spreads are naturally higher in emerging markets although these too are becoming more liquid.

The most liquid variance swap maturities are generally from 3 months to around 2 years. The maturities generally coincide with the quarterly options expiry dates, meaning that they can be efficiently hedged with exchange-traded options of the same maturity. The VIX, VSTOXX and VDAX indices represent the theoretical prices of 1-month variance swaps on the SP500, SX5E and DAX indices respectively, and are calculated by the exchanges from listed option prices, interpolating to get 1-month maturity. These volatility indices are widely used as benchmark measures of equity market risk, even though they are only short-dated measures and are not directly tradable.

17.3 Uses of Variance Swaps

Increasingly, investors have come to view volatility itself as an asset class, one that can diversify investment returns or hedge unwelcome investment scenarios.

Some common uses of variance swaps:

Exploiting a volatility view: Ideal for taking a direct view on the volatility of an underlying without the path-dependency issues of a delta-hedged option.

Specific hedging purposes: They can be used for macro-hedging and for hedging specific volatility exposures, such as that resulting from structured products.

Rolling short variance: Short variance swaps can be used to capture the observed equity index volatility risk premium. Rolling short index variance is an attractive systematic volatility strategy from a risk-return perspective. With returns from short volatility trades somewhat uncorrelated with the underlying, these types of strategy work well as overlay strategies aimed at boosting alpha and diversifying returns.

Diversification: Volatility can be thought of as an asset class in its own right, and as such can act to diversify returns within a portfolio. The returns of a rolling short volatility index have many similarities to the returns of a bond index (regular periods of positive P&L punctuated by occasional large losses). Short variance swaps sometimes replace bonds within an efficiently allocated portfolio because of their relatively low correlation with the equity market.

Index variance spreads: They can be used to trade the spread of volatilities between two indices. Such trades can be thought of as ‘volatility-beta’ trades aiming to profit from a spread of volatilities widening as volatility increases. The idea is to get a long volatility exposure, but using a correlated index with lower beta to mitigate the carry. Be careful, as some indices have local 'regime changes' when their composition or the behaviour of their constituents changes. A typical example is the Nasdaq before and after the dot-com bubble.

Relative value single-stock volatility: Use volatility pairs, or cross-sectional regression volatility models to find rich/cheap single-stock volatilities.

Variance dispersion and correlation trading: Trading variance swaps on an index against variance swaps on its constituents provides exposure to equity correlation. Selling variance on an index and buying variance on its constituents has been a profitable strategy. This can be partly explained by the demand for protection at the index level. This is also another way to trade implied correlation.

Forward variance and volatility spikes: Long forward volatility can avoid potentially negative carry, at the cost of sliding down the term structure, and can be a useful way of positioning for volatility spikes.

The holder of a long variance swap position has to pay the spread of implied volatility over realised volatility every day he holds the swap. Since this spread is usually positive, he has a negative carry. A forward-starting instrument has no exposure to carry until the forward start date is reached.

By contrast, both spot and forward-starting variance swaps have exposure to slide. For an upward-sloping variance term structure, a long variance swap position loses out through time as it slides down the term structure.

Trading the variance term structure: Variance swaps can be used to trade the shape of the variance term structure. Using variance swaps, you could try to take advantage of a flattening of the volatility term structure or take advantage of a volatility term structure that appears overly convex for example.

Skew and convexity trades: Variance swaps are long skew and convexity. Trading variance against (delta-hedged) vanilla options provides interesting exposures to skew and/or convexity.

Cross asset class trades: Equity Volatility and credit spreads are correlated, both being measures of corporate risk. Variance swaps are useful instruments in debt/equity trades, either at the index or single name level.

17.4 Mechanics

Realized Volatility

We have already broached this subject in chapter 6 but, to my knowledge, variance swaps use a different volatility measure than the annualised standard deviation of daily log returns over a fixed period of time.

It is the root mean-squared measure that is used to define the payout of the variance swap. This RMS volatility measure is like a standard deviation but assuming a zero mean.

\(\boxed{\sigma^2 = \frac{252}{T} \sum_{i=1}^{T} \left[ \ln \left( \frac{S_i}{S_{i-1}} \right) \right]^2}\)


  • \(S_i\) is the underlying's price on day i
  • T is the number of days
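To make the zero-mean convention concrete, here is a minimal Python sketch of the RMS measure. The function names and the sample price path are my own illustration, not part of any term sheet; the 252-day annualisation factor is the one used in the formula above.

```python
import math

def realised_variance(prices, trading_days=252):
    """Annualised zero-mean (RMS) realised variance from daily closing prices."""
    log_returns = [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]
    n = len(log_returns)  # T in the formula: the number of daily returns
    return trading_days / n * sum(r * r for r in log_returns)

def realised_volatility(prices, trading_days=252):
    """Square root of the RMS variance, as a decimal (0.16 means 16%)."""
    return math.sqrt(realised_variance(prices, trading_days))

# Hypothetical path drifting up by exactly 1% (log) every day for 20 days.
# A standard deviation around the sample mean would report zero volatility
# here; the zero-mean measure does not, because the drift itself counts.
steady = [100.0 * math.exp(0.01 * i) for i in range(21)]
print(f"RMS volatility: {100 * realised_volatility(steady):.2f}%")
```

The steady-drift path is the simplest case where the two measures disagree, which is why a trending market still generates a payout on a variance swap.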

Why do we use variance and not volatility?

While volatility is more intuitive as it is measured in the same units as the underlying, variance is in some sense more fundamental as:

  • Variance is additive, whereas volatility is not.
  • Delta hedging of options seeks to capture realised variance, although exposure is complicated by path dependency issues.

The variance swap payout, by contrast, is based purely on realised variance.

17.5 The Contract

The strike of a variance swap is set at trade inception so that the swap initially has zero value according to the prevailing market conditions.

Obviously, the buyer of a variance swap is long volatility. He pays the strike and receives the realised variance at expiry.

By convention, variance swap strikes are quoted in terms of volatility, not variance.

17.5.1 P&L for a long variance swap

The P&L of a variance swap is non-linear (convex) with volatility, although of course it is linear in terms of variance.

\(\boxed{\text{P&L} = N_{VAR} * (\sigma^2 - K^2)}\)


  • K is the variance swap strike
  • \(\sigma^2\) is realised variance
  • \(N_{VAR}\) is the variance notional

17.5.2 Vega Notional / Variance Notional

The notional of a variance swap can be expressed either as a variance notional or a vega notional.

As shown in the equation above, the variance notional represents the P&L per point difference between the strike squared (implied variance) and the subsequent realised variance.

Since most market participants are used to thinking in terms of volatility, trade size is typically expressed in terms of vega notional. The vega notional represents the average P&L for a 1% change in volatility.

\(\boxed{ \text{Vega Notional} = \text{Variance Notional} * 2K}\)

\(\boxed{\text{P&L} = N_{Vega} * \left( \frac{\sigma^2 - K^2}{2K} \right)}\)

The variance swap payout, expressed in vega notional, is locally linear around the strike. In other words, when volatility remains close to the variance swap strike, the variance swap payout is similar to the payout of a linear volatility swap.

For a vega notional of €100k, a gain of €500k is expressed as a profit of 5 vegas (5 times the vega notional).
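The two notional conventions are easy to mix up, so here is a small sketch of the conversion and of the payoff expressed in vegas. The strike of 20 and the realised levels are hypothetical inputs chosen for illustration; volatilities are in volatility points (20 means 20%).

```python
def variance_notional(vega_notional, strike):
    """Variance notional implied by a vega notional: N_var = N_vega / (2K)."""
    return vega_notional / (2 * strike)

def long_pnl(vega_notional, strike, realised_vol):
    """P&L of a long variance swap: N_var * (realised^2 - K^2)."""
    return variance_notional(vega_notional, strike) * (realised_vol**2 - strike**2)

vega = 100_000   # EUR per volatility point, hypothetical trade size
strike = 20.0    # hypothetical strike, in volatility points

for realised in (19.0, 20.0, 21.0, 25.0):
    pnl = long_pnl(vega, strike, realised)
    print(f"realised {realised:4.1f}: P&L EUR {pnl:>10,.0f} = {pnl / vega:+.3f} vegas")
```

One point below the strike loses a little less than one vega and one point above gains a little more, which is the local linearity (and the mild convexity) described above.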

A short variance position typically profits modestly on most days, but loses heavily on large moves.

17.6 Convexity

As previously mentioned, variance swap payoffs are linear with variance, but convex with volatility.

The vega notional represents only the average P&L for a 1% change in volatility.

As it is convex in volatility, a long variance swap position will always profit more from an increase in volatility than it will lose for a corresponding decrease in volatility. This convexity is the reason that variance swaps strikes trade above ATM volatility.

Unsurprisingly, the convexity premium should depend on the expected variability of the realised volatility. The higher the variability of volatility, the more beneficial the convexity.

For a long position, the maximum vega loss of K/2 happens when realised volatility is zero.

For a short position, the loss is potentially unlimited unless the variance swap is capped. With the standard cap of 2.5 times the strike, the maximum vega loss would be 2.625K, i.e. \(\frac{(2.5K)^2 - K^2}{2K}\).

Example of variance swap convexity:

Let us take a 20-day short variance swap position with a strike of 16.5 and a vega notional of €100k. Let us assume that over the 20-day period, the realised volatility was 14%, i.e. 2.5 volatility points lower than the level of implied volatility sold. You might expect a P&L for this short position of 2.5 vegas, or €250k. In reality, the actual P&L is slightly less than this due to the convexity: about 2.31 vegas, or €231k, using the vega-notional P&L formula from section 17.5.2.
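The numbers in this example, together with the two worst-case losses quoted earlier in this section, can be checked with a few lines of Python. This is only a sketch: the helper name is mine, and the cap is modelled as a cap on realised volatility at a multiple of the strike, which is my reading of the standard 2.5x cap.

```python
def long_pnl_vegas(strike, realised_vol, cap_multiple=None):
    """Long variance swap P&L per unit of vega notional, in volatility points."""
    if cap_multiple is not None:
        # Assumption: the cap limits realised volatility to cap_multiple * strike
        realised_vol = min(realised_vol, cap_multiple * strike)
    return (realised_vol**2 - strike**2) / (2 * strike)

strike, vega_notional = 16.5, 100_000

# Short position, realised volatility 2.5 points below the strike
short_pnl = -long_pnl_vegas(strike, 14.0) * vega_notional
print(f"short P&L: EUR {short_pnl:,.0f} (linear guess: EUR 250,000)")

# Worst cases: realised volatility of zero for the long, capped blow-up for the short
worst_long = long_pnl_vegas(strike, 0.0)
worst_short = -long_pnl_vegas(strike, 1_000.0, cap_multiple=2.5)
print(f"max long loss:  {worst_long:.4f} vegas (K/2 = {strike / 2})")
print(f"max short loss: {worst_short:.4f} vegas (2.625K = {2.625 * strike})")
```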

17.7 Mark-to-Market

Marking to market of variance swaps is easy since variance is additive.

\(\boxed{V_T = \frac{t}{T} V_t + \frac{T-t}{T} V_{t,T}}\)

At an intermediate point in the lifetime of a variance swap t, the expected variance at maturity \(V_T\) is simply the time-weighted sum of the variance realised over the time elapsed \(V_t\) and the implied variance over the remaining time to maturity \(V_{t,T}\).

To compute the mark-to-market of a variance swap, you need:

  • The realised variance since the start of the swap
  • The implied variance from now until expiry
  • A discount factor between now and expiry

\(\boxed{ \text{Variance P&L at time t per unit of variance notional} = \frac{t}{T}\big(\sigma^2_{0,t} - K^2_{0,T}\big) + \frac{T-t}{T} \big(K^2_{t,T} - K^2_{0,T}\big)}\)

You can see from this formula that the total remaining exposure to variance decreases linearly with time.
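Here is a minimal sketch of the mark-to-market calculation. The trade details are hypothetical, and I apply the discount factor as a simple multiplier on the undiscounted value; that is my assumption about how the third input in the list above enters, since the boxed formula itself is undiscounted.

```python
def mtm_per_variance_notional(t, T, realised_var, strike_var,
                              remaining_implied_var, discount_factor=1.0):
    """Mark-to-market per unit of variance notional at time t.

    realised_var          : variance realised over [0, t]
    strike_var            : original strike squared, K_{0,T}^2
    remaining_implied_var : current strike squared for [t, T], K_{t,T}^2
    discount_factor       : assumed to scale the whole expected payoff
    """
    accrued = t / T * (realised_var - strike_var)
    remaining = (T - t) / T * (remaining_implied_var - strike_var)
    return discount_factor * (accrued + remaining)

# Hypothetical 1-year swap struck at 20, marked after 3 months:
# realised volatility so far 18, implied volatility for the remaining 9 months 22.
value = mtm_per_variance_notional(
    t=0.25, T=1.0,
    realised_var=18.0**2, strike_var=20.0**2, remaining_implied_var=22.0**2,
)
print(f"MtM: {value:.1f} variance points per unit of variance notional")
```

The accrued leg is locked in, while the weight on the implied leg shrinks as t approaches T.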

17.8 Forward Variance

The additive feature of variance can be used to calculate the fair strike of a forward-starting variance swap.

Suppose we know the strike for a short-maturity variance swap expiring at time t (\(K_t\)) and the strike for a longer maturity variance swap expiring at time T (\(K_T\)), then we can easily calculate the fair strike of the forward-starting variance swap (\(K_{t,T}\)) with a combination of:

  • Long \(\frac{T}{T-t}\) variance notional of spot variance maturity T
  • Short \(\frac{t}{T-t}\) variance notional of spot variance maturity t, but with payment delayed until maturity T.

\(\boxed{K^2_{t,T} = \frac{T}{T-t} K^2_T - \frac{t}{T-t} K^2_t}\)

Two observations can be made from this formula:

  • More variance is needed on the longer leg (tends to be less liquid) than the shorter leg.
  • The total notional of the two legs will be greater than the notional of the forward.
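As a quick illustration of the forward variance formula and of the two observations, here is a small Python sketch. The 6-month and 1-year strikes are made-up inputs, and the function names are mine.

```python
import math

def forward_variance_strike(t, T, k_t, k_T):
    """Fair strike (in volatility terms) of variance between t and T."""
    fwd_var = (T * k_T**2 - t * k_t**2) / (T - t)
    if fwd_var < 0:
        raise ValueError("inputs imply negative forward variance")
    return math.sqrt(fwd_var)

def replication_legs(t, T):
    """Variance notionals of the two spot legs per unit of forward notional."""
    return T / (T - t), t / (T - t)  # (long leg to T, short leg to t)

# Hypothetical term structure: 6-month strike 20, 1-year strike 22
k_fwd = forward_variance_strike(t=0.5, T=1.0, k_t=20.0, k_T=22.0)
long_leg, short_leg = replication_legs(0.5, 1.0)
print(f"6m-in-6m forward strike: {k_fwd:.2f}")
print(f"long {long_leg:.1f}x of the 1y swap, short {short_leg:.1f}x of the 6m swap")
```

In this example the forward needs twice its notional on the long leg and once on the short leg, so three units of spot variance are traded to build one unit of forward variance.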

Forward-starting variance swaps can be useful for taking a direct view on the future value of implied variance or on the future shape of the variance term structure curve.

17.9 Contract Specifications

Variance swap term sheets contain some interesting conditions that will be highlighted in this section.

Margins and collateral

Variance swaps are OTC products that are usually margined, with an initial amount to be posted as collateral (e.g. 3 times the vega notional). Further margin calls are made during the life of the trade as necessary.

Disrupted days

Disrupted days are not considered as observation dates on which the realised variance is calculated. This can have a significant impact and can work in both directions, as shown by the examples below:

Example of a disrupted day decreasing the realised volatility: a 5% loss is followed by a 6% gain over two consecutive trading days. If the first day is declared as disrupted, this will result in a single combined return of 0.7%.

Example of a disrupted day increasing the realised volatility: a 5% loss is followed by a 6% loss over two consecutive trading days. If the first day is declared as disrupted, this will result in a single combined return of -10.7%.
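Both examples can be reproduced in a few lines. The sketch below compounds the two daily returns into one and compares the contribution to the sum of squared log returns with and without the disrupted day; the helper names are mine.

```python
import math

def combined_return(r1, r2):
    """Single return observed when the first of two days is skipped."""
    return (1 + r1) * (1 + r2) - 1

def squared_log_returns(*simple_returns):
    """Contribution of the given daily returns to the realised variance sum."""
    return sum(math.log(1 + r) ** 2 for r in simple_returns)

for label, r1, r2 in (("loss then gain", -0.05, 0.06), ("two losses", -0.05, -0.06)):
    merged = combined_return(r1, r2)
    both_days = squared_log_returns(r1, r2)
    one_day = squared_log_returns(merged)
    print(f"{label}: combined return {merged:+.1%}, "
          f"sum of squares {both_days:.5f} -> {one_day:.5f}")
```

When the two moves offset each other the disruption erases most of the realised variance, and when they compound it roughly doubles it.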

Index reconstitution risk

Variance swaps on indices pay out on the returns on the index and not on the weighted returns of the basket of current constituents. As a consequence, they are exposed to reconstitution risk as the index may end up with exposure to a different set of stocks with different volatilities!

Dividend adjustments

Variance swaps on single names are typically adjusted for dividends, meaning that returns on ex-dividend dates are calculated after adjusting for the dividend.

Variance swaps on indices are typically not adjusted for dividends, as index dividends are more spread out and therefore have a small impact compared to the average daily move.


Variance swaps are usually sold with caps with a standard cap being 2.5 times the strike.


Typical sizes are €100k-200k vega notional for indices and €50k vega notional for single-stocks.

17.10 Observations from historical prices

  • Variance swap prices tend to follow high and low regimes in a similar manner to realised variance.

  • Longer-maturity variance swap levels vary less and react less to spikes. This is not very surprising as, for example, a sudden unexpected event is likely to dramatically increase short-term volatility but is less likely to cause the same level of high volatility over the next few years.

  • Single-stock variance swaps trade at higher levels than index variance swaps (diversification effect).

  • Variance Swaps tend to trade above levels of realised volatility for two reasons:

  1. Volatility risk premium due to risk aversion and hedging programmes --> ATM implied volatility > realised volatility.

  2. Convexity premium since variance swaps are convex in volatility --> variance swap strike > ATM implied volatility.

The theoretical price of variance swaps is calculated from prices of a replicating portfolio of options. With skew and skew convexity, the average volatilities will usually be above ATM implied volatility, making the variance swap more expensive. The price of a variance swap can be thought of as a function of the ATM volatility level and the slope of the skew. In practice, the contribution of the skew component means that variance swap strikes tend to trade at a similar level to OTM puts. Skew and convexity become more important factors at longer maturities as the probability of reaching more OTM strikes increases.

17.11 Variance Swaps and Option Implied Volatilities

Not surprisingly, variance swap strikes are well correlated with Black-Scholes implied volatilities derived from option prices since both can be interpreted as market estimates of future volatility.

These measures are similar but differ in that:

  • ATM IV reflects the market estimate of future volatility realised around the current level.

  • Variance swap strike represents the market estimate of variance, independent of future market level.

17.12 Pricing Rules of Thumb

Theoretically, it is necessary to have prices available for the entire strip of options to calculate the true price of a variance swap.

However, under some assumptions about the skew, we can reasonably approximate variance swaps prices.

Under Flat Skew

Since under flat skew all strikes trade at an identical implied volatility, the variance swap level will be the constant implied volatility level.

Under Linear Skew

If skew is assumed to be linear, at least for strikes relatively close to the money, then Derman's approximation can be used.

Derman's approximation assumes a linear put skew and a flat call skew. It is a function of three variables, namely:

  • ATM (forward) volatility
  • Slope of the skew
  • Maturity of the swap

\(\boxed{K_{0,T} = \sigma_{ATMF} * \sqrt{1 + 3T *skew^2}}\)

Where skew is generally taken to be the slope of 90/100 skew.

In practice, this approximation works best for:

  • Short-dated variance swap: as maturity increases, OTM strikes have greater effect on the variance swap price and the contribution of skew is therefore more important. The inability of this approximation to account for skew convexity can make it less accurate.

  • Index variance swap: single stocks tend to show more convexity, even at shorter dates.

More often than not, this approximation tends to underestimate the variance swap price.
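Derman's approximation is simple enough to sketch in a few lines. Note the unit convention, which is my assumption and should be checked against your own skew quotes: the skew slope is expressed as a change in volatility (in decimals) per unit of relative strike, so a 90/100 skew of 2 volatility points corresponds to a slope of 0.02 / 0.10 = 0.2. The inputs are hypothetical.

```python
import math

def derman_strike(atmf_vol, skew, maturity):
    """Variance swap strike under a linear put skew and a flat call skew.

    atmf_vol : ATM-forward implied volatility, decimal
    skew     : skew slope, decimal volatility per unit of relative strike (assumed)
    maturity : swap maturity in years
    """
    return atmf_vol * math.sqrt(1 + 3 * maturity * skew**2)

atmf = 0.20
slope = 0.02 / 0.10  # hypothetical 90/100 skew of 2 volatility points

for maturity in (0.25, 1.0, 2.0):
    k = derman_strike(atmf, slope, maturity)
    print(f"T = {maturity:4.2f}y: strike {100 * k:.2f} vs ATMF {100 * atmf:.2f}")

# With a flat skew the approximation collapses to the ATMF volatility
print(f"flat skew: {100 * derman_strike(atmf, 0.0, 1.0):.2f}")
```

The premium over ATMF grows with maturity for a fixed slope, consistent with skew mattering more at longer dates.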

Under Log-Linear Skew

In reality, volatility skew is not linear across all option strikes and more accurate approximations can be used.

Assuming that the skew curve is log-linear of the form:

\(\sigma_K = \sigma_{ATMF} - \beta \; ln\big(\frac{K}{F}\big)\)


  • F is the forward price
  • K is the strike
  • \(\beta\) is the slope of the skew


\(\boxed{K_{0,T} = \sqrt{\sigma^2_{ATMF} + \beta \sigma^3_{ATMF} T + \frac{\beta^2}{4} \left( 12\sigma^2_{ATMF}T + 5\sigma^4_{ATMF} T^2\right)}}\)

Gatheral's formula

Gatheral expresses the variance swap strike as an integral of the implied volatilities across the entire range of strikes.

This formula characterises the skew curve in terms of the BS d2 parameter, which measures the ‘moneyness’ of the associated OTM option.

This leads to potentially powerful methods of variance swap approximation: by fitting a quadratic, or higher order polynomial to the skew surface parameterised in terms of d2, it is then possible to directly calculate a theoretical variance swap price from this parameterisation.


If the skew curve is quadratic in the variable \(z = d_2\), \(σ^2(z) = σ_0^2 + αz + βz^2\), then the theoretical variance swap strike satisfies \(K_{0,T}^2 = σ_0^2 + β\) (equivalently, in total-variance terms, \(K_{0,T}^2 T = σ_0^2T + βT\)).

  • in ‘d2-space’ the linear component of the skew, α, has no effect on the variance swap price.

  • base level of volatility \(σ_0\) affects the swap strike.

  • the convexity β affects the swap strike.

17.13 Drivers of Variance Swaps levels

Variance swap levels and option implied volatilities move in close step with each other. They are driven mostly by the same factors, although variance swaps are also driven by the shape of the skew curve.

Historical realised volatility

It is one of the most important drivers of variance swap levels at least at short maturities since the correlation between the 1-month VS and the realised volatility over the previous month is about 0.91. As maturity increases, this correlation between variance swap levels and realised volatility naturally decreases.


Risk-aversion expresses itself in the demand for protection in the form of put options. The more risk-aversion, the higher the skew and the skew convexity, and the more expensive the variance swap prices.

Market level

Volatility itself is directional, at least over short time frames, with volatility tending to increase if market sells off.

Additional info

Variance swap flows can have a feedback effect on the market as they account for an important part of the traded vega demand. Their hedging by market makers can therefore influence implied volatilities and skew.

17.14 Are variance swaps a good predictor of future volatility?

In theory, variance swap strike represents the market prediction of realised variance over the term of the swap.

In practice, this is complicated by supply/demand factors, especially at longer maturities where levels are driven by structured product flows. Also, at shorter maturities, risk-aversion can bias the variance swap strike to be above the fair expectation of future realised variance.

Anyway, at short dates, implied variance tends to do a relatively good job of forecasting future realised variance. Backtesting shows that it does a better job than previous realised variance.

Given historical data, a simple model of future realised volatility as 0.9 times the variance swap strike provides the best fit with the data.

17.15 Is Variance Swap Convexity fairly priced?

How can we quantify the volatility of volatility priced in the variance swap price?

Well, we can calculate the standard error of the future volatility as estimated by the variance swap strike. This standard error gives information about the historical variability of future realised variance around the estimate provided by 0.9 times the variance swap strike. Backtesting shows that this standard error is about 6%. In other words, the standard deviation of implied variance minus future realised variance is 6%.

We can also calculate an implied variability of volatility by considering spread between short-dated ATM implied volatilities and the variance swaps strikes. This spread for the SX5E suggests an average implied volatility of volatility of approximately 7%. This is fairly close to the 6% average error of the variance swap in predicting volatility.

This 6% vol of vol for the SX5E is a long-term average. Like volatility, it changes over time, with a historical minimum of 2% and a historical maximum of 10%.

Also, there is a strong correlation between volatility and the spread between variance swap levels and ATM volatility. This makes sense if we believe that volatility of volatility is correlated with volatility.

17.16 Variance Term Structure

The shape and potential movement of the implied variance curve is important in determining the most promising parts of the curve to buy or sell variance.

Variance Swap term structures:

  • are usually upwards sloping
  • have tendency to flatten following increases in volatility
  • generally steeper than ATM vol curves as there is an increasing effect of skew at longer maturities.
  • can be thought of representing the mean-reverting nature of volatility.
  • Short end is most sensitive to prevailing levels of realised volatility.
  • Long end is more anchored to some LT estimate of average volatility.
  • Long end is also driven by structured product flows and tends to be more susceptible to supply/demand dynamics.

Another important point is the effect of the mark-to-market P&L:

  • A short-dated variance swap is principally exposed to realised variance --> gamma.
  • A long-dated variance swap takes on significant exposure to changes in implied variance before expiry --> vega.

An investor has just bought a 5y variance swap. His principal exposure is the 5y implied variance, which is driven by factors not necessarily correlated with current realised variance. His P&L can be quite unpredictable.

Typical movements of term structures can be explained in part by the "root-time" rule. For a "normal" move in volatility, the change in implied variance at a given point on the curve will be proportional to \(\frac{1}{\sqrt{T}}\).

  • Flattens normal upward-sloping term structure as volatility increases.
  • Steepens normal upward-sloping term structure as volatility decreases.
  • Changes will be most visible at shorter end of the curve.

Example: if 1y variance increases by 1% --> 3m variance increases by 2% and 4y variances increases by 0.5%.
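The root-time rule is a one-liner, so here is a tiny sketch that reproduces the example. The function name and the idea of scaling from a reference maturity are my own framing of the rule.

```python
import math

def root_time_shift(ref_shift, ref_maturity, maturity):
    """Shift at `maturity` implied by a shift at `ref_maturity`, scaling as 1/sqrt(T)."""
    return ref_shift * math.sqrt(ref_maturity / maturity)

# A 1-point move at the 1-year point, propagated along the curve
for label, maturity in (("3m", 0.25), ("6m", 0.5), ("1y", 1.0), ("2y", 2.0), ("4y", 4.0)):
    shift = root_time_shift(1.0, 1.0, maturity)
    print(f"{label}: {shift:+.2f}")
```

Because the short end moves more than the long end in both directions, an upward-sloping curve flattens on the way up and steepens on the way down.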

17.17 From Options to Variance Swaps

To create a portfolio of options with constant exposure to variance, you need to ensure that \(\$\Gamma\) is constant for moves both in S and t.

The cost of this portfolio represents the price of exposure to realised variance.

When considering how to construct such an exposure, there are three possible approaches to take:

1. Use a single vanilla option, but buy/sell additional amounts of it to keep \(\$\Gamma\) constant over time.

Advantages: use only a single option strike

Disadvantages:

  • option needs to be dynamically traded.
  • position could end up with enormous amounts of option as gamma decreases.

2. Re-strike the option to maintain a constant gamma.

Start with an ATM option and on each re-hedging step, sell/hold option and buy new ATM to achieve constant gamma.

Better than the first approach but still requires dynamic trading of options.

3. Construct a portfolio of options so that dollar Gamma (\(\$ \Gamma\)) remains constant over both moves in S and t.

Advantages:

  • no dynamic trading of options, although dynamic delta-hedging is still required.
  • to some extent independent of the volatility process driving the underlying.

Disadvantage: it requires a strip of options across a continuum of strikes.

In theory, the third approach is used.

In practice, it is more like a combination of the 2nd and 3rd approach.

The third approach is to some extent independent of the volatility process driving the underlying.

The first two approaches would require continually calculating the gamma over the course of the trade, and this gamma will be highly (and dangerously) dependent on the assumed IV (and volatility process) at that time.

So what kind of portfolio is needed to achieve a constant dollar gamma across strikes?

Let us have a look at a few graphics to answer this question.

Fig: 16.1 : Peak dollar gamma of an option increases linearly with underlying


Fig. 16.1:

  • peak dollar gamma increases linearly with strike.
  • contribution of low-strike options is small compared to high-strike options.
  • we need to increase weights of low-strike options and decrease weights of high-strike options.

Fig: 16.2 : Peak dollar gamma of options divided by strike is constant

Fig 16.2:

  • Naively, it may be thought that weighting by \(\frac{1}{K}\) will achieve constant dollar gamma.
  • It has the property that each option in the portfolio has an equal peak dollar gamma.
  • However, dollar gammas of higher strike options spread out more.

Fig: 16.3 : Weighting options as the inverse of the strike squared gives constant dollar-gamma

Fig. 16.3:

  • summing \(\frac{1}{K}\) - weighted options across all strikes leads to dollar gamma exposure increasing linearly with S.
  • weighting each option by \(\frac{1}{K^2}\) will achieve constant dollar gamma.
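The three figures can be checked numerically. The sketch below sums the Black-Scholes dollar gamma (taken here as \(\Gamma S^2\), without the 1/2 factor) of a dense strip of options under the two weighting schemes. The flat 20% volatility, 3-month maturity, zero rate and strike grid are all assumptions chosen for illustration.

```python
import math

def bs_gamma(spot, strike, vol, maturity, rate=0.0):
    """Black-Scholes gamma (identical for calls and puts)."""
    sd = vol * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol**2) * maturity) / sd
    pdf = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return pdf / (spot * sd)

def portfolio_dollar_gamma(spot, weight, vol=0.20, maturity=0.25,
                           k_min=1.0, k_max=1000.0, dk=0.5):
    """Dollar gamma of a strip of options with weight(K) * dK units at each strike."""
    n_steps = int((k_max - k_min) / dk)
    total = 0.0
    for i in range(n_steps):
        k = k_min + (i + 0.5) * dk  # midpoint of each strike bucket
        total += weight(k) * bs_gamma(spot, k, vol, maturity) * spot**2 * dk
    return total

for spot in (80.0, 100.0, 120.0):
    inv_k = portfolio_dollar_gamma(spot, lambda k: 1.0 / k)
    inv_k2 = portfolio_dollar_gamma(spot, lambda k: 1.0 / k**2)
    print(f"S = {spot:5.1f}: 1/K weights -> {inv_k:7.2f}, 1/K^2 weights -> {inv_k2:.4f}")
```

Under these assumptions the 1/K strip has a dollar gamma that grows in line with the spot, while the 1/K² strip stays flat as the spot moves.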

Why does a portfolio of options weighted by 1/\(K^2\) give a constant exposure to variance?

Answering this question requires a bit of math but we will stick to the basics as always.

Delta-neutral P&L = \(\frac{1}{2} \Gamma S^2 \big( \big(\frac{\Delta S}{S}\big)^2 - \sigma ^2 \Delta t\big)\) --> the dollar-gamma factor \(\Gamma S^2\), which varies with S and t, is the only thing that prevents a direct exposure to variance.

We have to find a portfolio with gamma proportional to \(\frac{1}{S^2}\) to have a constant exposure. Mathematically speaking, we have to find a portfolio whose second derivative with respect to the underlying is proportional to \(\frac{1}{S^2}\).

The negative natural log of S represents such a payoff: \(\frac{\partial^2(-\ln S)}{\partial S^2} = \frac{1}{S^2}\)

Unfortunately, log contracts are not traded in the market.

But we can replicate such a contract with vanilla options using an infinite sum of calls and puts across the continuum of strikes, each weighted by the inverse square of strike.

Integrating the value of this portfolio at expiry demonstrates that the non-linear part of the payoff is a negative log contract.

\(K_{VAR}^2 = \frac{2e^{rT}}{T} \bigg[ \int_0^{F_0} \frac{P_0(K)}{K^2}dK + \int_{F_0}^{\infty} \frac{C_0(K)}{K^2}dK \bigg]\)
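A useful sanity check on the replication integral is that, with a flat smile, it should give back the flat volatility. The sketch below evaluates the integral numerically in its standard form, with a 2/T prefactor, using Black prices on the forward. Everything here is an illustrative assumption: the forward of 100, the 2% rate, the strike range, the midpoint integration and the hypothetical put skew used for comparison.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_price(forward, strike, vol, maturity, rate, is_call):
    """Discounted Black price of a European option on the forward."""
    sd = vol * math.sqrt(maturity)
    d1 = (math.log(forward / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    df = math.exp(-rate * maturity)
    if is_call:
        return df * (forward * norm_cdf(d1) - strike * norm_cdf(d2))
    return df * (strike * norm_cdf(-d2) - forward * norm_cdf(-d1))

def replication_strike(forward, maturity, rate, vol_of_strike,
                       k_min, k_max, n_steps=4000):
    """Variance swap strike from OTM puts below the forward and OTM calls above."""
    def integrate(lo, hi, is_call):
        dk = (hi - lo) / n_steps
        total = 0.0
        for i in range(n_steps):
            k = lo + (i + 0.5) * dk  # midpoint rule
            price = black_price(forward, k, vol_of_strike(k), maturity, rate, is_call)
            total += price / k**2 * dk
        return total

    puts = integrate(k_min, forward, is_call=False)
    calls = integrate(forward, k_max, is_call=True)
    return math.sqrt(2.0 * math.exp(rate * maturity) / maturity * (puts + calls))

def put_skew_vol(strike, forward=100.0):
    """Hypothetical smile: +5 vol points per 10 strike points below the forward, capped at 35%."""
    return min(0.35, 0.20 + 0.005 * max(0.0, forward - strike))

flat = replication_strike(100.0, 0.25, 0.02, lambda k: 0.20, 20.0, 400.0)
skewed = replication_strike(100.0, 0.25, 0.02, put_skew_vol, 20.0, 400.0)
print(f"flat 20% smile: strike {100 * flat:.2f}")
print(f"with put skew:  strike {100 * skewed:.2f}")
```

In practice only a discrete set of listed strikes is available, so the integral becomes a sum and the truncation of the wings matters much more than it does in this idealised strip.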

The strike-squared, \(K_{VAR}^2\), can be thought of as the (future-valued) inverse strike-squared weighted sum of the time values of the options portfolio.

To summarise:

  • A portfolio of calls and puts, weighted as 1/strike-squared, has constant dollar-gamma;
  • Delta-hedging this portfolio provides constant exposure to the difference between implied and realised variance regardless of where the volatility is delivered;
  • Hence the P&L from delta-hedging this portfolio is proportional to difference between realised and implied variance.

A variance swap can therefore be created by replicating a log contract with options which are then delta-hedged.

17.18 Sensitivity to Skew and Convexity

Skew is commonly thought of as an important component of variance swap prices, with put skews seen as having a much greater impact on prices than call skews.

Although, in practice, this is a useful framework for thinking about how variance swap prices behave, it is not theoretically correct as shown with the below example.

Case 1:

  • 3-month ATM IV @ 20%
  • Linear put skew of 5 volatility points per 10 strike points, with put IVs capped @ 35% and all OTM call volatilities flat at the level of the ATM IV.

The theoretical 3-month variance strike can then be calculated to be 23.05.

Case 2: Situation where the skew is a mirror image of Case 1.

  • 3-month ATM IV @ 20%
  • All OTM put volatilities are flat @ 20%, but call IVs increase linearly by 5 volatility points per 10 strike points as they become more OTM, capped @ 35%.

In this case the theoretical variance swap price is virtually identical at 23.15.

So the exposure to the skew curve is symmetrical.

Why does this symmetry exist?

Since the \(\frac{1}{K^2}\) replicating portfolio has a much higher weighting of puts, it would be natural to assume that their associated implied volatilities should have a proportionally greater effect on the variance swap strike.

Also, in practice, variance swaps can effectively be priced and hedged with ATM volatility plus a contribution from the skew.

But, as we have seen, the exposure to the skew curve is symmetrical. That is, the contribution to the variance swap price of the volatility of an OTM call is exactly the same as from an OTM put with same (risk-neutral) probability, N(d2), of ending ITM. The OTM puts have a greater weighting in the replicating portfolio, simply because their dollar gamma is lower and so they must be scaled up in order to provide constant dollar gamma across the range of strikes.

So what exactly determines the contribution of volatilities across the skew curve to the variance swap price?

Clearly a very OTM option should make a relatively small contribution to the variance swap price. It then seems sensible that a variance swap represents a kind of weighted average of volatilities across the skew curve, with the closer-to-the-money volatilities higher weighted. In fact, this is exactly the case, with the average being taken over the variances rather than the volatilities, and the weighting function simply being the risk-neutral probability density function, N′(d2), that the corresponding OTM option ends up ITM.

If we define the variable z to be the standard Black-Scholes parameter d2: \(d_2 = \frac{\text{ln} \left(\frac{S}{K}\right) + \left(r \: - \: \frac{\sigma^2}{2}\right)T}{\sigma \sqrt T}\) then it can be shown that \(K^2_{VAR} = \int_{-\infty}^{\infty}N'(z) \sigma^2_{BS} dz\)

The N′(d2) term is the probability density function for the underlying at expiry, T.

That is, the cumulative distribution N(d2) gives the (risk-neutral) probability that the underlying will be trading above the strike corresponding to z at time T. Thus, the parameter z simply represents the ‘moneyness’ of the corresponding OTM option. This means that the variance swap price is a weighted sum of squared option implied volatilities weighted by the probability that the (OTM) option will end ITM.

Furthermore, if the skew curve is quadratic in the variable z (the moneyness of the option), of the form \(\sigma^2_{BS}(z) = \sigma^2_0 + \alpha z + \beta z^2\), then substituting into the integral above and integrating gives \(K^2_{VAR} = \sigma^2_0 + \beta\), since \(N'(z)\) integrates to 1, \(z N'(z)\) to 0 and \(z^2 N'(z)\) to 1. That is, in ‘d2-space’ the variance swap price does not depend on the linear component of the skew \(\alpha\), but only on the base level of volatility \(\sigma_0\) and the convexity parameter \(\beta\).
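The role of each coefficient can be checked numerically. The sketch below integrates a quadratic skew in z against the standard normal density with a simple trapezoidal rule. The parameter values and function names are illustrative assumptions, not taken from the original article.

```python
import math

def normal_pdf(z: float) -> float:
    """Standard normal density N'(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def variance_strike_squared(sigma0_sq, alpha, beta, z_max=10.0, n=20_000):
    """Trapezoidal approximation of the integral of N'(z) * sigma_BS^2(z) over [-z_max, z_max]."""
    h = 2.0 * z_max / n
    total = 0.0
    for i in range(n + 1):
        z = -z_max + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * normal_pdf(z) * (sigma0_sq + alpha * z + beta * z * z)
    return total * h

# Hypothetical skew parameters (variances in decimal terms, e.g. 0.04 = 20% vol squared).
sigma0_sq, beta = 0.04, 0.002
no_tilt = variance_strike_squared(sigma0_sq, 0.0, beta)
steep_tilt = variance_strike_squared(sigma0_sq, -0.01, beta)

# Both integrals should agree with sigma0_sq + beta, whatever the linear term alpha.
print(no_tilt, steep_tilt, sigma0_sq + beta)
```

The linear term drops out because z is weighted symmetrically around zero, while the z² term picks up the unit variance of the standard normal.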

This explains why very different (linear) skews with similar convexities give almost identical variance swap strikes. The only reason there is any difference at all between the variance swap strikes in our example is that, in d2-space, the convexity in case 2 is slightly greater: the maximum volatility is at a strike of 130, which is slightly less OTM than the 70 strike where the maximum volatility is reached in case 1.

This also helps to explain why the convexity has a greater effect on longer maturity variance.

As maturity increases, the probability of far OTM (in relative strike terms) options ending ITM increases. Therefore, the relative weight of e.g. the 160-strike call will increase with maturity.

At shorter maturities perhaps only the 80-120 portion of the skew surface significantly affects the variance swap price, and this part of the skew is relatively linear (at least for indices).

At 5 years, strikes out to say 50-180 become relevant to the pricing and the convexity of these strikes can be much higher.

In practice, skew is most often thought about in terms of relative strikes, rather than moneyness – but the point remains: the contribution of a point on the skew curve to the variance swap price depends on the (risk-neutral) probability N(d2) of the associated OTM option ending ITM.

Therefore, ATM volatility will provide the greatest contribution to variance swap prices – particularly for short maturities. Both high put skews, and high call skews (where OTM calls have higher volatilities than ATM) will increase variance swap prices.

In fact, there is some kind of feedback effect here because the contribution of each volatility is determined by the probability of an option being ITM, but, for example high put skews will increase the probability of OTM puts ending ITM so both the volatility and the weighting factor will increase.

To sum up:

  • Given a flat skew, variance will price at the same level as ATM volatility.
  • Positive convexity will always act to increase variance swap strikes.
  • With a negatively convex skew – OTM volatilities are (on average) less than ATM volatility – it is theoretically possible that variance could price below ATM volatility.

17.19 Greeks

In this section we consider the Greeks for variance swaps, giving information about the sensitivity of variance swaps to various market variables. We work directly by differentiating the mark-to-market value of the variance swap contract.

The value of a variance swap, per unit vega-notional, at time t is given by:

\(P_t = \frac{1}{2K_0}\big[ \sigma^2_{Expected,t} - K^2_0\big]\)

Note that intra-day there is a term representing the square of the move which will act to give the variance swap delta on an intra-day basis:

\(\sigma^2_{Expected,t} = \frac{t-1}{T} \sigma^2_{0,t-1} + \frac{252}{T} \big[ ln\big( \frac{S_t}{S_{t-1}} \big)\big]^2 + \frac{T-t}{T} K^2_{t,T}\)

where the three terms are written in chronological order:

  • The first term is the realised variance accrued from inception to day t-1.
  • The second term is the realised variance based on the current daily return at the valuation time on day t.
  • The last term is the implied variance, expressed as the square of the strike of a variance swap on day t expiring at T.

The Greeks of the variance swap can then be calculated by differentiating \(P_t\).
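As a minimal sketch, the mark-to-market formula can be written as two small functions. It assumes t and T are counted in trading days, volatilities are expressed as decimals (0.20 for 20%), and the function and argument names are hypothetical.

```python
import math

def expected_variance(t, T, realised_var_prev, s_t, s_prev, k_implied):
    """sigma^2_Expected,t as the sum of its three chronological terms.

    realised_var_prev: annualised realised variance from inception to day t-1
    k_implied:         strike K_{t,T} of a variance swap from day t to T (in vol terms)
    """
    accrued = (t - 1) / T * realised_var_prev             # locked-in realised variance
    today = 252.0 / T * math.log(s_t / s_prev) ** 2       # today's squared log return
    remaining = (T - t) / T * k_implied ** 2              # implied variance still to come
    return accrued + today + remaining

def swap_value(t, T, realised_var_prev, s_t, s_prev, k_implied, k0):
    """Value P_t per unit of vega notional."""
    sigma_sq = expected_variance(t, T, realised_var_prev, s_t, s_prev, k_implied)
    return (sigma_sq - k0 ** 2) / (2.0 * k0)

# Hypothetical position: 1-year swap struck at 20%, halfway through its life,
# 22% realised so far, no move yet today, 21% implied for the remainder.
value = swap_value(t=126, T=252, realised_var_prev=0.22 ** 2,
                   s_t=100.0, s_prev=100.0, k_implied=0.21, k0=0.20)
print(value)
```

The result is in volatility units per unit of vega notional, so multiplying by the vega notional (and by 100 if it is quoted per volatility point) gives a currency amount.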

17.19.1 Gamma

Gamma comes only from the exposure to realised volatility on each day:

\(\boxed{\Gamma = \frac{\partial ^2 P}{\partial S^2_t} = \frac{252}{K_0T} \bigg[ \frac{1}{S_t^2} \bigg]}\)

Since the dollar gamma is achieved by scaling the gamma by the spot squared, this gives a constant dollar gamma as expected.
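A finite-difference check makes the constant dollar gamma visible. The sketch below bumps the spot in the intra-day term of \(P_t\) and evaluates the second derivative at the previous close, assuming (as in the formula above) that the variance strike does not depend on spot. Differentiating exactly gives an extra factor \(1 - \ln(S_t / S_{t-1})\), which equals 1 at the previous close and stays close to 1 for typical daily moves. Names and parameter values are illustrative.

```python
import math

def intraday_value(s_t, s_prev, k0, T):
    """Spot-dependent part of P_t: (1 / 2K_0) * (252 / T) * ln(S_t / S_{t-1})^2."""
    return 252.0 / (2.0 * k0 * T) * math.log(s_t / s_prev) ** 2

def gamma_fd(s_t, s_prev, k0, T, rel_bump=1e-4):
    """Central finite-difference estimate of the second derivative with respect to S_t."""
    h = rel_bump * s_t
    up = intraday_value(s_t + h, s_prev, k0, T)
    mid = intraday_value(s_t, s_prev, k0, T)
    down = intraday_value(s_t - h, s_prev, k0, T)
    return (up - 2.0 * mid + down) / (h * h)

k0, T = 0.20, 252                      # hypothetical strike (decimal) and maturity (days)
closed_form = 252.0 / (k0 * T)         # dollar gamma implied by the boxed formula
dollar_gammas = {
    spot: gamma_fd(spot, spot, k0, T) * spot ** 2   # evaluated at S_t = S_{t-1}
    for spot in (80.0, 100.0, 125.0)
}
print(dollar_gammas, closed_form)
```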

17.19.2 Theta

\(\boxed{\theta = \frac{\partial P}{\partial t} = \frac{-K^2_{t,T}}{2TK_0}}\)

In particular, if the variance strike does not change, theta remains constant. Note that, since T is measured in days, this value represents a daily theta.

You simply have to multiply by 252 to annualize it:

\(\boxed{\theta = \frac{\partial P}{\partial t} = 252 \frac{-K^2_{t,T}}{2TK_0}}\)

which can then be shown to satisfy the formula \(\theta = -\frac{1}{2} \Gamma S^2 \sigma^2\) taking \(\sigma^2\) as the implied variance from the variance swap strike.
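As a quick check, substituting the gamma from the previous subsection and taking \(\sigma^2 = K^2_{t,T}\) gives:

\[ -\frac{1}{2} \Gamma S_t^2 \sigma^2 = -\frac{1}{2} \cdot \frac{252}{K_0 T} \frac{1}{S_t^2} \cdot S_t^2 \cdot K^2_{t,T} = -252 \frac{K^2_{t,T}}{2TK_0} \]

which is the annualised theta.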

17.19.3 Vega

We can calculate exposure to volatility in terms of sensitivity to changes in:

  1. ATM volatility
  2. Variance strike
  3. Implied variance (strike squared)

1. To compute the vega in terms of sensitivity to ATM volatility, we must make some assumptions about how the variance swap strike relates to ATM volatility.

If we assume the Derman approximation: \(K^2_{t,T} = \sigma^2_{ATMF} * \big[1 + 3(T-t) skew^2\big]\), then:

\(\boxed{\frac{\partial P}{\partial \sigma_{ATMF}} = \frac{1}{2K_0} \frac{T-t}{T} \frac{\partial \bigg( \sigma^2_{ATMF} \big[ 1 + 3(T-t) skew^2 \big] \bigg)}{\partial \sigma_{ATMF}} = \frac{\sigma_{ATMF}}{K_0} \frac{T-t}{T} \big[ 1 + 3(T-t) skew^2 \big]}\)

This approximation also allows the calculation of sensitivity to the skew or the skew squared:

\(\boxed{\frac{\partial P}{\partial skew} = \frac{3}{K_0} \frac{(T-t)^2}{T} \sigma^2_{ATMF} skew}\)

Increasing skew will increase the value of the variance swap, and will do so by more if there is more time remaining until expiry.

2. / 3. Computing sensitivities to the strike (or strike squared) is more straightforward and needs no such assumptions about the skew surface:

\(\boxed{\frac{\partial P}{\partial K_{t,T}} = \frac{T-t}{T} \frac{K_{t,T}}{K_0}}\)

\(\boxed{\frac{\partial P}{\partial K^2_{t,T}} = \frac{1}{2K_0} \frac{T-t}{T}}\)

These all tell us that the exposure to implied variance (or volatility) decreases with time as the accrued realised volatility is locked in to the P&L.
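The closed-form sensitivities under the Derman approximation can be compared against finite differences of the implied-variance leg of \(P_t\). This is only a sketch: it assumes t and T are measured in years (so that the \(3(T-t)skew^2\) term is annualised), and the skew value, strike and function names are hypothetical.

```python
def derman_strike_sq(sigma_atmf, skew, tau):
    """Derman approximation: K^2 = sigma_ATMF^2 * (1 + 3 * tau * skew^2)."""
    return sigma_atmf ** 2 * (1.0 + 3.0 * tau * skew ** 2)

def implied_leg(sigma_atmf, skew, t, T, k0):
    """Implied-variance part of P_t: (1 / 2K_0) * (T - t) / T * K^2_{t,T}."""
    return (T - t) / T * derman_strike_sq(sigma_atmf, skew, T - t) / (2.0 * k0)

def vega_atmf(sigma_atmf, skew, t, T, k0):
    """Closed-form sensitivity to ATMF volatility."""
    return sigma_atmf / k0 * (T - t) / T * (1.0 + 3.0 * (T - t) * skew ** 2)

def vega_skew(sigma_atmf, skew, t, T, k0):
    """Closed-form sensitivity to the skew."""
    return 3.0 / k0 * (T - t) ** 2 / T * sigma_atmf ** 2 * skew

def central_diff(f, x, h=1e-6):
    """Central finite-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

sigma, skew, t, T, k0 = 0.20, -0.5, 0.25, 1.0, 0.22   # hypothetical inputs
fd_sigma = central_diff(lambda s: implied_leg(s, skew, t, T, k0), sigma)
fd_skew = central_diff(lambda k: implied_leg(sigma, k, t, T, k0), skew)

print(vega_atmf(sigma, skew, t, T, k0), fd_sigma)
print(vega_skew(sigma, skew, t, T, k0), fd_skew)
```

Both sensitivities carry the factor \(\frac{T-t}{T}\), so rerunning with a larger t shows the exposure to implied volatility shrinking as the swap ages.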

17.19.4 Delta

Firstly, assuming that the variance strike K has no sensitivity to the underlying, the variance swap can be seen to take on delta only intra-day:

\(\boxed{\Delta = \frac{\partial P}{\partial S_t} = \frac{1}{2K_0} \frac{252}{T} \bigg[ 2 \ln\bigg( \frac{S_t}{S_{t-1}}\bigg)\frac{1}{S_t} \bigg]}\)

This represents the replication of the log contract which will have to be done at the end of the day to capture that day’s realised variance.

If the variance strike itself has a dependency on the underlying (implied variance is directional) then the variance swap acquires other sources of delta, in addition to the intra-day delta.

For example, using Derman’s approximation, calculating delta (only on the close to avoid the intra-day delta discussed above) gives:

\[ \boxed{\begin{aligned} \Delta = \frac{\partial P}{\partial S_t} &= \frac{T-t}{T} \frac{1}{2K_0} \frac{\partial K^2_{t,T}}{\partial S_t} \\ &= \frac{T-t}{T} \frac{1}{2K_0} \frac{\partial \bigg(\sigma^2_{ATMF} \big( 1 + 3(T-t)skew^2\big) \bigg)}{\partial S_t} \\ &= \frac{T-t}{T} \frac{1}{2K_0} 2\sigma_{ATMF}\frac{\partial \sigma_{ATMF}}{\partial S_t}\big( 1 + 3(T-t)skew^2\big) \\ &= \frac{T-t}{T} \frac{1}{2K_0} \frac{2K^2_{t,T}}{\sigma_{ATMF}} \frac{\partial \sigma_{ATMF}}{\partial S_t} \\ &= \frac{T-t}{T} \frac{1}{2K_0} \frac{2K^2_{t,T}}{\sigma_{ATMF}}skew \end{aligned}} \]

Thus with a ‘normally shaped’ negatively sloping skew, the delta of the variance swap will be negative, at least if the skew curve is sticky with strike.

This fits with the intuition that IV will tend to go down as the underlying rallies – which is exactly what a (negative) linear skew represents.

17.19.5 Convexity

As explained before, it is really convexity and not skew which acts to increase the variance swap strike above ATM volatility. Clearly then, rising convexity will increase variance swap strikes as the ‘average’ implied volatility used in Gatheral’s formula goes up.

17.20 Setting up a replicating portfolio

We previously saw that a variance swap could be statically replicated by a portfolio of OTM options, weighted according to \(\frac{1}{K^2}\).

How do we weight this portfolio to achieve a specific variance-notional or vega-notional exposure?

If we wish to replicate a variance swap for €100K of vega notional, how many of the calls and puts do we need in the replicating portfolio?

Let \(\Pi\) be the forward price of the portfolio P of OTM options: \(\Pi = e^{rT} \bigg[ \int_0^{F_0} \frac{P_0(K)}{K^2}dK + \int_{F_0}^{\infty} \frac{C_0(K)}{K^2}dK \bigg]\)

This portfolio, combined with the dynamic futures position, will pay \(\frac{T}{2} \sigma_R^2\) at expiry.

Scaling this portfolio by 2/T will therefore produce a payout equal to the realised variance \(\sigma_R^2\) with volatility expressed as a decimal, so we have to multiply by \(100^2\) to be consistent with the standard variance swap strikes, which are quoted in volatility points.

The cost of this portfolio represents the fair cost of future realised variance.

\(K_{VAR}^2 = \frac{2 *100^2*\Pi}{T}\)

To be long \(N_{VAR}\) of variance notional, you must buy \(\frac{2 *100^2*N_{VAR}}{T}\) lots of portfolio P, which is equivalent to buying \(\frac{2 *100^2*N_{VAR}}{TK^2}\) of each OTM K-strike option.

The total cost of this portfolio will then be \(N_{VAR} * K^2_{VAR}\). The value of this portfolio and its associated delta hedging at expiry will be \(N_{VAR} * \sigma_R^2\), giving the desired overall P&L of \(N_{VAR} * (\sigma_R^2 - K^2_{VAR})\).

At inception, if the cut-off between puts and calls is taken to be the current forward price of the underlying, no net delta-hedge is needed.

However, during the lifetime of the variance swap, at the end of each day's close the notional value of the delta hedge is adjusted so that it equals: \(\frac{2*100^2 * N_{VAR}}{T} \bigg( \frac{F_0 - F_t}{F_0} \bigg)\)

where \(F_0\) and \(F_t\) are respectively the original forward price at time 0 and the current forward price at time t.

To achieve a specific vega notional, \(N_{VAR} = \frac{N_{Vega}}{2K_{VAR}}\), we must trade \(\frac{100^2*N_{Vega}}{T*K_{VAR}}\) lots of the portfolio P.

Since \(K_{VAR} = 100 * \sqrt{\frac{2\Pi}{T}}\), the required amount of the portfolio is \(\frac{100*N_{Vega}}{\sqrt{2 \Pi T}}\) or equivalently \(\frac{100*N_{Vega}}{K^2 \sqrt{2 \Pi T}}\) of each OTM option.
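The sizing rules above can be collected into a few helper functions. This is a sketch under stated assumptions: T is in years, the forward portfolio price \(\Pi\) is taken as a given input, and the helper names and example numbers are hypothetical.

```python
import math

def variance_strike(portfolio_fwd_price, T):
    """K_VAR in volatility points: 100 * sqrt(2 * Pi / T)."""
    return 100.0 * math.sqrt(2.0 * portfolio_fwd_price / T)

def lots_for_variance_notional(n_var, T):
    """Lots of the 1/K^2 portfolio needed for a variance notional N_VAR."""
    return 2.0 * 100.0 ** 2 * n_var / T

def lots_for_vega_notional(n_vega, portfolio_fwd_price, T):
    """Lots of the 1/K^2 portfolio needed for a vega notional N_Vega."""
    return 100.0 * n_vega / math.sqrt(2.0 * portfolio_fwd_price * T)

def options_at_strike(lots, strike):
    """Quantity of the OTM option with the given strike (continuous-strike weighting)."""
    return lots / strike ** 2

# Hypothetical example: Pi = 0.02 for a 1-year maturity, 100K of vega notional.
pi_fwd, T, n_vega = 0.02, 1.0, 100_000.0
k_var = variance_strike(pi_fwd, T)
n_var = n_vega / (2.0 * k_var)                      # vega notional -> variance notional
lots_via_var = lots_for_variance_notional(n_var, T)
lots_via_vega = lots_for_vega_notional(n_vega, pi_fwd, T)
print(k_var, n_var, lots_via_var, lots_via_vega)
```

Going through the variance notional or directly through the vega notional should give the same number of lots, which is a useful sanity check when wiring these formulas into a pricing sheet.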

In practice, it is obviously not possible to trade a continuum of option strikes, so we will briefly see how to approximately construct a variance swap from a tradable strip of OTM options. For simplicity, we assume the strikes are equally spaced.

To get a variance notional exposure \(N_{VAR}\), we will need \(\frac{2*100^2*\Delta_K*N_{VAR}}{TK^2}\) of each option, where \(\Delta_K\) is the spacing between consecutive strikes.

Be careful with the contract size of the option when using that formula. If one option represents 10 underlying shares, then you will need to divide by 10 the amount of options you would otherwise require.

Also note that this kind of approximation will tend to slightly overvalue the true theoretical variance swap strike due to convexity issues.
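To see how a discrete strip behaves, the sketch below prices an equally spaced strip of OTM options with a flat volatility, weights each option by \(\frac{\Delta_K}{K^2}\) and converts the result into a variance swap strike. With a flat skew the continuous-strike answer is the flat volatility itself, which gives a benchmark for the discretisation error. The forward (Black) pricing formula, the strike range and all names are assumptions chosen for illustration.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def forward_otm_price(forward, strike, vol, T):
    """Undiscounted Black price of the OTM option: put below the forward, call otherwise."""
    sd = vol * math.sqrt(T)
    d1 = (math.log(forward / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    if strike < forward:
        return strike * norm_cdf(-d2) - forward * norm_cdf(-d1)
    return forward * norm_cdf(d1) - strike * norm_cdf(d2)

def strip_variance_strike(forward, vol, T, strikes, dk):
    """K_VAR (in vol points) from a strip of options weighted by dk / K^2."""
    pi_fwd = sum(dk / k ** 2 * forward_otm_price(forward, k, vol, T) for k in strikes)
    return 100.0 * math.sqrt(2.0 * pi_fwd / T)

# Flat 20% volatility, 1-year maturity, forward at 100 (all hypothetical).
fine = strip_variance_strike(100.0, 0.20, 1.0, range(20, 401), 1.0)
coarse = strip_variance_strike(100.0, 0.20, 1.0, range(20, 401, 10), 10.0)
print(fine, coarse)
```

Comparing the finely and coarsely spaced strips gives a feel for how much of the pricing error comes from the strike spacing rather than from truncating the wings.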

17.21 Replicating and Hedging in Practice

In theory, a variance swap can be statically hedged with a portfolio of OTM European options, weighted according to the inverse squares of their strikes. If option prices are available across the entire range of strikes, the variance swap is then easy to value.

In practice, traded strikes are not continuous, although for major liquid indices they are closely spaced. A more serious limitation is the lack of liquidity in OTM strikes, especially for puts, as these provide a relatively large component of the variance swap price in the presence of steep put skews.

When replicating the variance swap:

  • The long futures position is used to create a pay-out which is equivalent to a long log contract plus realised variance.

  • The long options/short forward position is used to create a short log contract and pay the fixed strike.

Supposing the market falls significantly, the delta-hedge will be long the log contract, while the options should counteract this by being short the log contract. However, if not enough downside puts were used, the options portfolio will not fully reflect the short log-contract exposure needed and hence the overall hedge will lose money.

This lack of liquidity at the wings has led to the development of conditional variance swaps which can remove exposure to volatility once the underlying moves into areas where vanilla options are illiquid. In practice, market-makers will not attempt to hedge with the entire strip of options but typically will use only two or three – including one close to the money and one or more OTM, but liquid, puts.

Alternatively they could approach the replicating by hedging the vega with an OTM put whose implied volatility coincides with the variance swap strike – close to the money for 1-month maturity, 95% for 3-6 months, 90% for 1-year, 85% for 2-3 years etc. In this case they would also look to buy back the wings/convexity separately.

One problem with this kind of approach is that the partial hedge is no longer static, and must be dynamically managed. For example if the market sells off towards the strike, the market maker will have to trade further OTM puts to ensure that their exposure to volatility, in the form of dollar-gamma, remains constant.

This makes the actual variance swap replication more akin to a combination of both alternatives. Here the constant dollar gamma would be maintained by a combination of holding a portfolio which has roughly constant dollar gamma if the underlying does not move too much, and re-hedging by trading more options if the underlying does move significantly.

17.22 Effects of Variance Swaps Hedging

Market-makers who trade variance swaps may hedge their positions by replicating the opposite variance swap position through the replicating options portfolio. This replicating portfolio then needs to be delta hedged.

The effects of delta-hedging this portfolio are different to that from normal delta-hedged options for two principal reasons:

First, the actions of delta-hedging the options could potentially act to the disadvantage of the counterparty’s position.

If the market-maker has sold options to a counterparty, his delta hedging will have the effect of increasing volatility in the underlying: magnifying both up-moves and down-moves.

Similarly, if the market-maker has bought options from a counterparty, his delta hedging will have the effect of suppressing volatility in the underlying, potentially to the advantage of the counterparty who is short the option.

The situation with variance swaps is different.

Suppose that the market is such that market participants have generally sold index variance swaps to market-makers. Note that no exchange of options has taken place here – the parties have just taken opposite sides in a contract for difference. Suppose that the volatility sellers do not hedge their variance swaps (they have sold the variance swaps specifically for the direct volatility exposure they offer). But assume that market-makers hedge their long volatility exposure. A market-maker who is long the variance swap can offset the risk by shorting the replicating portfolio of options, and delta-hedging. They will therefore be short gamma in the options market.

Fig. 16.4: Flows in the market as a result of a market-maker buying a variance swap and replicating it in the market

As described, a delta-hedger who is short options will act to increase volatility in the underlying – buying as it rallies, and selling as it sells-off. However, the action of these market makers hedging their short options will not necessarily act to increase volatility in the underlying, as the counterparties they have sold options to may be counteracting this effect by themselves hedging their long volatility positions.

Second, since variance swap contracts typically measure close-close realised volatility, the options must be delta-hedged on the close only to capture this.

The important difference between the two groups of hedgers is that the variance swap market-makers who are short options, must hedge only on the close to capture the close-close realised variance specified in the variance swap contract. In contrast, the hedgers who are long the options, will generally be free to choose when to delta-hedge, as they attempt to capture the true volatility of the underlying process.

Therefore, the overall effect of hedging these variance swaps need not have the effect of increasing overall market volatility, although it may if the long options positions are not being hedged. However, the important point is that the hedging of long variance swap positions may act to increase close-to-close volatility, with option hedges on the close having the potential effect of magnifying daily moves.

17.23 Why not Volatility Swaps?

As pointed out, the exposure of delta-hedged options to volatility, after accounting for the gamma, is actually an exposure to the difference between implied and realised volatility squared. In this sense, a variance swap mirrors a kind of ideal delta-hedged option whose gamma remains constant. Furthermore, variance swaps are relatively easy to replicate. Once the replicating portfolio of options has been put in place, only delta-hedging is required. No further buying or selling of options is necessary.

All this explains why variance swaps are attractive instruments to trade, but still does not explain why volatility swaps are not also frequently traded.

The main theoretical difficulty with volatility swaps is that they cannot be statically replicated through options. A replicating portfolio must dynamically trade options, and make relatively strong assumptions about the underlying volatility process – in particular about the volatility of volatility. This makes any replication process model dependent, and therefore much more prone to errors than the theoretically robust variance swap replication.

In fact the convention of quoting variance swap notionals in vega, rather than variance, amounts can be seen as an attempt to treat variance swaps like volatility swaps. The vega notional represents the average P&L for a 1 vega change in volatility, but with the convexity meaning that longs will profit by more if volatility increases, and lose by less if volatility decreases. Thus for small changes in volatility, where the effect of the variance swap convexity is relatively limited, variance swaps (measured in vega notional) locally approximate volatility swaps.
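The convexity is easy to see with numbers. The sketch below compares the P&L of a variance swap, sized through \(N_{VAR} = \frac{N_{Vega}}{2K}\), with a volatility swap of the same vega notional. The strike, notional and realised levels are hypothetical, and volatilities are in volatility points.

```python
def variance_swap_pnl(n_vega, realised_vol, strike):
    """Long variance swap P&L: N_VAR * (sigma^2 - K^2) with N_VAR = N_Vega / (2K)."""
    n_var = n_vega / (2.0 * strike)
    return n_var * (realised_vol ** 2 - strike ** 2)

def volatility_swap_pnl(n_vega, realised_vol, strike):
    """Long volatility swap P&L: N_Vega * (sigma - K)."""
    return n_vega * (realised_vol - strike)

n_vega, strike = 100_000.0, 20.0
for realised in (15.0, 19.0, 21.0, 25.0):
    print(realised,
          variance_swap_pnl(n_vega, realised, strike),
          volatility_swap_pnl(n_vega, realised, strike))
```

For a one-point move either side of the strike the two payoffs are close, while for a five-point move the variance swap gains noticeably more on the upside and loses less on the downside.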

Volatility swaps can then be thought of as variance swaps without the convexity. The discount of the volatility swaps to variance swaps should therefore reflect the value of this convexity, which in turn is determined by the volatility of volatility. Seen in this light, the ability to calculate a fair price for volatility swaps, requires not only a price for the volatility (variance), but also a price for the volatility of volatility – i.e. the means of valuing options on volatility.

Since it is variance which arises naturally from delta-hedging options, in terms of volatility products, variance should be thought of as the true underlying. In this framework, volatility swaps are naturally thought of as derivatives of variance – paying the square-root of the variance swap contract.

In fact we could dynamically trade a long variance swap (buying more as volatility decreases and selling as volatility increases) to hedge out this convexity bias.

Similar to delta-hedging an option the P&L made from the resulting buy low – sell high strategy (for variance) will lead to a P&L based on the volatility of volatility: the larger this volatility-of-volatility, the bigger the discount of the dynamically replicated volatility swap to the variance swap.

However, besides issues arising from the transaction costs of dynamically trading variance swaps, estimating this volatility-of-volatility contribution from the dynamic hedging of the variance swap is problematic and model-dependent, making the volatility swap contracts difficult to price.

17.24 Third Generation Products

Those products include gamma swaps, corridor swaps and conditional variance swaps.

17.24.1 Gamma Swaps

Gamma swaps are very similar to variance swaps but maintain a dollar gamma linear with spot, rather than a constant dollar gamma, leading to a lower skew exposure and a slightly lower strike compared to standard variance swaps.

17.24.2 Corridor Variance Swaps

The original corridor variance swaps only accrue variance within a pre-specified range, meaning a long would make a maximal loss when the underlying fails to trade within the range.

17.24.3 Conditional Variance Swaps

Conditional variance swaps have been the most popular of these products, allowing investors to take exposure to realised variance contingent upon the underlying instrument trading within a pre-specified range. Outside this range no P&L accrues. Conditional variance swaps can be useful for hedging complex volatility exposures, taking a view on the volatility levels encapsulated in the skew or buying/selling variance at more attractive levels given a view on the underlying.

As with variance swaps, the sign of the P&L for a conditional variance swap is controlled by the difference between the volatility realised (in the range) over the lifetime of the swap and the pre-agreed fixed strike level. In the case of the conditional, P&L is accrued only when the underlying is within a pre-specified range. In addition, the magnitude of the final P&L is scaled by the proportion of time the underlying has spent in the pre-specified range. If the underlying trades entirely outside the range, the P&L will be zero, so the maximal loss for a long will be when the underlying trades within the range but with very low volatility.
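A rough sketch of this payoff logic is given below. The exact contract conventions are assumptions made for illustration only: a day counts as "in range" when the previous close lies inside the range, realised variance is annualised over the in-range days without subtracting a mean, and the final P&L is scaled by the fraction of days spent in range. Actual term sheets may differ on each of these points, and the function name is hypothetical.

```python
import math

def conditional_variance_pnl(closes, lower, upper, strike, n_var):
    """P&L of a long conditional variance swap under the assumed conventions.

    closes: list of daily closing prices, including the initial fixing
    strike: conditional variance strike in volatility points
    n_var:  variance notional
    """
    total_days = len(closes) - 1
    squared_returns = []
    for prev, curr in zip(closes[:-1], closes[1:]):
        if lower <= prev <= upper:                 # assumed in-range test on previous close
            squared_returns.append(math.log(curr / prev) ** 2)
    days_in_range = len(squared_returns)
    if days_in_range == 0:
        return 0.0                                 # never in range: nothing accrues
    realised_var = 100.0 ** 2 * 252.0 / days_in_range * sum(squared_returns)
    return n_var * days_in_range / total_days * (realised_var - strike ** 2)

# Underlying pinned inside the range with no volatility: worst case for the long.
quiet_in_range = conditional_variance_pnl([100.0] * 11, 90.0, 110.0, 20.0, 1.0)
# Underlying always outside the range: P&L is zero.
always_outside = conditional_variance_pnl([120.0, 125.0, 118.0, 130.0], 90.0, 110.0, 20.0, 1.0)
print(quiet_in_range, always_outside)
```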

Whilst investors are free to specify the range associated with a conditional variance swap, the two principal types are up- and down-conditional variance swaps (up-variance and down-variance). Up-variance accrues realised volatility only when the underlying is above a pre-specified level (i.e. no upper barrier), while down-variance is accrued only when the underlying is below the specified barrier (i.e. no lower barrier).

In the presence of a positive put-skew, down-variance will normally price above up-variance for close to ATM barrier levels.

Conditional variance swaps can be useful for expressing views on volatility contingent on market level.

For example investors seeking crash protection may purchase conditional down-variance, which only becomes activated in the event of a market sell-off. That is, if the market stays above the down-barrier, P&L is zero.

Conversely investors who believe volatility will be realised on the upside can buy conditional up-variance swaps, which are usually cheaper than a standard variance swap for which a significant amount of the premium is used to fund the downside skew exposure.

Conditional variance swaps have typically been traded on index underlyings. More recently, single-stock conditionals have gained liquidity, allowing investors to trade the volatility of a stock, contingent on stock price.