Chapter 7 Black Scholes

“Time is money.” -Benjamin Franklin






Motivation




This chapter will explore another practical application of Stochastic Processes, specifically in a financial context. Even readers that don’t have interest in quantitative finance may find the derivation and result of Black Scholes interesting.




7.1 Black Scholes Model


The Black Scholes Model, most often just referred to as ‘Black Scholes,’ is one of the most widely used mathematical applications of stochastic processes (and, of course, other fields in Statistics!). It was derived by Fischer Black and Myron Scholes to model a key phenomenon in the financial world.

We’re going to need a little bit of background on some financial concepts before diving in to Black Scholes. You’re probably familiar with what a ‘stock’ is: basically a share of ownership in a company that you can buy and sell on some market (naturally, these shares of ownership usually do better when the company is doing well!). Apple stock, which is traded with the ‘ticker’ (essentially an ID for the stock) of ‘AAPL,’ is a common example. We can easily get historical values for the stock price of AAPL with the quantmod package:

library(quantmod)

# saves an object called AAPL
getSymbols("AAPL")
## [1] "AAPL"
# view results
head(AAPL)
##            AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted
## 2007-01-03  3.081786  3.092143 2.925000   2.992857  1238319600      2.573566
## 2007-01-04  3.001786  3.069643 2.993571   3.059286   847260400      2.630688
## 2007-01-05  3.063214  3.078571 3.014286   3.037500   834741600      2.611954
## 2007-01-08  3.070000  3.090357 3.045714   3.052500   797106800      2.624853
## 2007-01-09  3.087500  3.320714 3.041071   3.306071  3349298400      2.842900
## 2007-01-10  3.383929  3.492857 3.337500   3.464286  2952880000      2.978950


We won’t get into what all of these columns mean; for now, we can focus on AAPL.Close, which is simply how much one share of Apple cost when the market closed each day. Let’s visualize the price over time:

data <- data.table(date = index(AAPL),
                   price = as.numeric(AAPL$AAPL.Close))

ggplot(data, aes(x = date, y = price)) +
  geom_line() +
  xlab("") + ylab("price") + ggtitle("Apple Stock Price") +
  theme_bw()


We’re, familiar, then, with the concept of stocks; we’re going to introduce the concept of Call Options. Here’s a video introduction:

Click here to watch this video in your browser



Don’t worry; if videos aren’t your thing, we’ll work through call options in long form. Let’s start with an example of a simple stock (that we simulate; simply for illustration purposes, we will start at $20$20 and simulate from a Normal distribution with a mean of .1.1):

# replicate
set.seed(0) 

# generate data
S <- round(cumsum(c(20, rnorm(99, mean = .1))), 2)

# visualize
data <- data.table(t = 1:length(S),
                   S = S)

ggplot(data, aes(x = t, y = S)) +
    geom_line() +
    theme_bw() +
    xlab("") + ylab("Price") +
    ggtitle("Stock Price over time")


Not too bad; after 100100 days, the stock price has risen to $31.17$31.17. Now let’s define a simple call option, which is a contract that gives you the option to purchase a stock for a specified amount on a specified date. One example would look like this:


Contract A: Call option struck at $35$35 expiring on day 125


This simply means that, if you own the call option, you have the option to buy the stock for $35$35 (this is called the strike price or just the strike) on day 125 (this is called the maturity or expiry; some options allow you to buy the stock at different points in time, including any time up to the maturity, but we will only be dealing with options that allow you to buy the stock at the maturity).


This sounds kind of strange; why on Earth would we want the option to buy something? Well, let’s say that you expect the stock to be worth $40$40 on day 125. You could then, as stated in the contract, ‘exercise’ your option and buy the stock for $35$35 (note that this is below the current market price of $40$40, you only get the right to buy it at a lower price because you own the option) and turn around and sell it to for the current market price of $40$40 for a quick $5$5 profit (you could also hold onto the stock, of course, after you bought it). If things didn’t work out and the stock price fell to $27$27 on day 125, then you just simply wouldn’t exercise your option and it would ‘expire’ (you wouldn’t buy the stock for the $35$35 stated on the option contract, since you could just buy it for $27$27, or the normal market price).

Therefore, intuitively, the call option gives you the option to buy the stock in the case that the stock price goes higher than strike (and thus make a profit). If the stock ends up above the strike price - and thus we exercise our option - we say that the option is in the money. Naturally, if the stock ends below the strike place, we say it is out of the money and we don’t exercise our option (if the stock ends exactly at the strike price, we say that it is at the money).

The key question is: how much would you pay for Contract A?


At first, you’re probably juggling a bunch of initial questions: what is the probability that the stock ends up ‘in the money’ (above the strike price so that we can exercise the option), so that the option isn’t worthless? If the stock does end up in the money, how much will it be in the money?

For starters, to get some intuition on valuing options, let’s compare Contract A to a couple of other call options (all derived on the same stock from above; call options are called derivatives for exactly this reason, as their value is derived from some underlying financial asset - here, the stock).


Contract A: Call option struck at $35$35 expiring on day 125

Contract B: Call option struck at $40$40 expiring on day 125

Contract C: Call option struck at $35$35 expiring on day 110


Let’s think about the relative prices of these contracts. First, which contract is more valuable (should be more expensive): Contract A or B?

The two contracts are identical except for their strike; Contract A has a strike of $35$35 and Contract B has a strike of $40$40. Which option would you pay more for?

Since because, simply put, $40$40 is more than $35$35, the stock has to go higher for Contract B to be ‘in the money’; it is thus harder for us to exercise Contract B and profit off of it. Further, remember, that if we are exercising Contract B (i.e., the stock price is above $40$40), we could also be exercising Contract A and would make $5$5 more! Therefore, Contract A should be more expensive than Contract B (we will see how much more expensive soon).


How about Contract A and Contract C? The only difference here is the expiry; Contract C expires sooner than Contract A (they have the same strike). Which one would you pay more for?

This one is a bit more tricky. The relevant question is straightforward: is it good to have a longer expiry in an option?

This isn’t immediately obvious. In this example, the stock is at around $30$30 at time t=100t=100. For Contract C, the stock would have to get to the strike of $35$35 (and thus be in the money) by time t=110t=110; for Contract B, the stock has until t=125t=125 to get there. However, although it has more time to move upwards in price (and therefore be in the money), it also has more time to move downwards. One could envision a path where the stock price was worth $38$38 at time t=110t=110, thus making Contract B in the money, and then falls to $32$32 by time t=125t=125, making Contract A out of the money. Ultimately, more time could be good or bad.

The answer - which we will see over and over in this chapter - lies in the fact that there is unlimited upside compared to limited downside. In the absolute worst case scenario, the call option is worth zero (the stock is below the strike at expiry). However, theoretically, the stock price could go up infinitely, and thus the price of the call option could also go up infinitely (if you aren’t satisfied with the intuition of infinity, substitute in that the stock could instead just go very, very high and thus the option value could go very, very high). Because we have enormous, essentially unlimited upside and capped downside, more time is good. We want to let our stock bounce around for as long as possible, because, while our losses have a limit, there is a chance that we could win really big.


Now that we have a handle on some of the major factors that make call options more or less valuable, let’s turn to the major topic of this chapter.




7.2 Deriving Black Scholes


The Black Scholes Model is, simply put, a way to value (i.e., put a price on) the options that we discussed above. By using a few assumptions, the model can easily spit out prices for options given a few input parameters (some of which we have already seen, including the current stock price, time until expiry and strike price). Black Scholes is a relevant study here because it makes liberal use of stochastic processes and other statistical tools in general. We can work through the derivation now, adapted from Stephen Blyth’s Introduction to Quantitative Finance.




7.2.1 Log-Normal


We can start the derivation by learning about a new distribution: the Log-Normal distribution. This distribution is so named because, for example, if YY is Log-Normal, then the log (natural log, that is) of YY is Normal. That is, if YY is Log-Normal with parameters μ,σ2μ,σ2, then ln(Y)N(μ,σ2)ln(Y)N(μ,σ2), or XN(μ,σ2)XN(μ,σ2) where Y=eXY=eX (or ln(Y)=Xln(Y)=X).

Let’s try to get comfortable with the distribution by calculating the mean and variance. For the mean, we will start with a standard Normal ZN(0,1)ZN(0,1). By the definition above, eZeZ is Log-Normal (the log of eZeZ is ZZ, which is Normal). Let’s use LOTUS to calculate the expectation of eZeZ (just the function ezez multiplied by the PDF of a standard Normal f(x)f(x) and integrated over the entire support):

E(eZ)=ezf(z)dzE(eZ)=ezf(z)dz =12πezez22dz=12πezez22dz 12πez+z22dz12πez+z22dz

We thus have zz22zz22 in the exponent. We can try completing the square by adding and subtracting 1212:

zz22=zz22+1212zz22=zz22+1212

We can multiply and divide zz by 22 to get a common denominator:

=(2z2z2212)+12=(2z2z2212)+12

=(z2+2z12)+12=(z2+2z12)+12

Adjusting signs:

=(z22z+12)+12

And finally completing the square:

=(z1)22+12

We can go ahead and plug this back into our numerator in the LOTUS calculation:

12πe(z1)22+12dz Factoring out the e12:

=e12(12πe(z1)22dz) Note that the integral (with the constant 12π) is now just the PDF of a N(1,1) random variable, integrated over the support! That is, it simply goes to 1. We are thus just left with:

E(eZ)=e12



Excellent; we have found the mean of eZ. Now let’s turn towards finding the mean of eX, where XN(μ,σ2). We can start by remembering that XσZ+μ (for a quick check, remember that a linear combination of Normals is Normal, so σZ+μ is Normal, and then calculate the mean and variance of σZ+μ to show that you get μ and σ2) and plugging this in:

E(eX)=E(eμ+σZ)=eμE(eσZ)

Let’s use a similar approach to calculate E(eσZ); LOTUS gives us:

E(eσZ)=12πeσzez22dz

Again, we can look at the exponent and complete the square:

σzz22

=(σzz22σ22)+σ22

=(2σz2z22σ22)+σ22

=(z22σz+σ22)+σ22

=(zσ2)+σ22 Plugging back in:

E(eσZ)=12πe(zσ2)+σ22dz =eσ22(12πe(zσ2)dz) Again, we can use the fact that the integral (with the 12π) is just the integral of the PDF of a N(σ,1) random variable, which integrates to 1. We are left with:

E(eX)=eμE(eσZ)=eμ+σ22

Voila! We have calculated the mean of eX, which is Log-Normal. Here’s a video walkthrough of this process:

Click here to watch this video in your browser


Let’s now turn to the variance; we have actually already done most of the leg work. Again, let XN(μ,σ2) so that Y=eX is Log-Normal by definition. We can look to calculate the variance:

Var(eX)=E((eX)2)E(eX)2

We know that E(eX)=eμ+σ22 from the first part, and we know that E((eX)2)=E(e2X). Note that 2XN(2μ,4σ2) (again, a linear combination of a Normal is Normal, so a linear combination of X is Normal; it should be straightforward to show that 2X has a mean of 2μ and a variance of 4σ2), so we can just use our formula for the mean of a Log-Normal (just plugging in 2μ for the mean and 4σ2 for the variance):

Var(eX)=e2μ+4σ22(eμ+σ22)2

=e2μ+2σ2e2μ+σ2

Factoring out:

Var(eX)=e2μ+σ2(eσ21)


Excellent! To recap, we have shown that, if Y is Log-Normal with parameters μ,σ2, we have:

E(Y)=eμ+σ22 Var(Y)=e2μ+σ2(eσ21)



For a video explanation, see below:

Click here to watch this video in your browser



The Log-Normal is useful in finance because the support is non-negative (this makes sense, because the Log-Normal is just the log of something, and the log of something is non-negative). Therefore, for modeling stocks, which cannot have a negative price, the Log-Normal can be more useful than the Normal (who’s support is the entire real line).

We can easily simulate the Log-Normal in R like we would with other distributions (rlnorm is the function that generates random values). Note that the mean and standard deviation parameters we input are equivalent to our set-up above; that is, if we input 2 as the mean in rlnorm, we are saying that this is a random variable that, if logged, is Normal with mean 2. Let’s generate some examples to explore this relation (and test our results for the mean and variance!).

# replicate
set.seed(0)


# initialize parameters
n <- 1000
mu <- 3
sigma <- .7

# generate data
X <- rnorm(n, mu, sigma)
X_log <- rlnorm(n, mu, sigma)

# visualize
ggplot() +
  geom_histogram(aes(X_log)) +
  theme_bw() + ggtitle("Log-Normal")

# visualize log of Log-Normal
ggplot() +
  geom_histogram(aes(log(X_log)), fill = "red", alpha = 1 / 2) +
  geom_histogram(aes(X), fill = "blue", alpha = 1 / 2) +
  theme_bw() + ggtitle("Log of Log-Normal and Normal") +
  xlab("value")

# check results
mean(X_log); exp(mu + sigma ^ 2 / 2)
## [1] 25.56843
## [1] 25.66171
var(X_log); exp(2 * mu + sigma ^ 2) * (exp(sigma ^ 2) - 1)
## [1] 408.2131
## [1] 416.395


Excellent! It looks like, as we discussed, the Log-Normal has a non-negative support (first chart), the log of the Log-Normal seems to match the Normal with the same parameters (second chart with two histograms) and the results that we found for the mean and variance are close to what we observed empirically!




7.2.2 Risk-Neutral Paradigm


Let’s now get back to the financial part of this derivation. We will define the price of a stock at time t as St (note that this is a stochastic process!). Imagine that, in a single period (i.e., from t to t+1), the stock either goes to St(1+u) or St(1+d). The letter u simply stands for ‘up’ and d simply stands for ‘down,’ so we have:

u>d St(1+u)>St(1+d)

Note that u doesn’t necessarily have to be positive (or d be negative), it just has to be greater than d.

Anyways, one potential example would be u=.01 and d=.01. Therefore, for each step, the stock either goes up 1% (we have St(1+.01)) or down by 1% (we have St(1.01))


Now we can introduce the concept of the risk free rate, which we will denote as r. This is the rate that we can lend at for each period.

If this is confusing, let’s think of a simple example where r=.02. This means that the risk free rate is 2%. If you lent $100 at time t=1 at the risk free rate, then you would get back (1+.02)100=102 dollars at time t=2 (over one period). Over n periods, you would get back (1+.02)n100 dollars.

In our current example with u and d as the up and down moves of our stock, we must have:

d<r<u

To show this, we are going to have to introduce the No-Arbitrage Principle. As this is not a book on quantitative finance, we won’t delve too deeply into the mathematics behind this principle, just the definition.

For our purposes, we can define arbitrage as riskless profit above the risk free rate. Let’s see a simple example. Imagine that the risk free rate is r=.02, so that in one period you earn, without any risk, 2%. Now imagine a stock with u=.04 and d=.03; that is, the stock either goes up 3% or up 4%. If this were the case, one could borrow 100 dollars at the risk free rate of .02 and invest that 100 dollars in the stock. After one period, we owe $102 (we borrowed $100 at the risk free rate of .02) but, thankfully, the stock has either gone up to $103 or $104, so we can sell the stock, pay back our loan and still profit $1 or $2.

Therefore, this investment strategy locked in a guaranteed profit of one or two dollars (this strategy is riskless because there is no chance of losing money). Still, it’s not clear that the investor makes more than the risk free rate here; isn’t he just making $1 or $2, which is still less than or equal to the $2 he would make from the risk free rate?

This is where the idea of leverage comes in, or borrowing extra money to amplify the outcomes of the investment strategy. Remember, in the above example, the investor isn’t putting up his own money; he is borrowing the 100 dollars and then investing that 100 dollars in the stock. Therefore, for 0 dollars put up, the investor is making a guaranteed return . What’s more, the investor could borrow and thus invest $100,000 in this strategy, driving returns up to $100 or $200, still for $0 actually put up. Theoretically, there’s no real upper limit to what the investor could borrow and invest in this strategy, and, if all of his returns are based of zero dollars invested, then his ‘rate of return’ is way better than the risk free rate (instead of just making (1+r) on the money he invested, he is making tons of money on zero dollars actually invested).

If this is confusing, check out this video on arbitrage:

Click here to watch this video in your browser


In real life, in situations like this, the extra demand for the stock would drive the stock price up and thus correct any arbitrage (savvy investors would ‘arbitrage-out’ any mispricings). For our purposes, we will be satisfied with the result that no arbitrage is allowed in the construction of our scenarios.



Above we saw the case where r<d<u. What about the case where d<u<r, or the risk free rate is higher than both u and d? It’s harder to see what an investor would do in this case, but it essentially the opposite of the previous example. Instead of buying or ‘going long’ the stock, the investor shorts the stock (borrows the stock, sells it for cash and promises to buy it back the next period). Let’s say the stock is worth $100; the investor borrows the stock, sells it for $100 cash and invests that cash in the risk free rate. In the next period, the investor has 100(1+r) (the $100 cash times what that cash earned from the risk free rate), and buys back the stock (remember, he borrowed it!), either at 100(1+u) dollars or 100(1+d) dollars. Since d<u<r, we have that 100(1+d)<100(1+u)<100(1+r), and thus the investor makes some positive profit. Again, the investor put no cash of his own down up front and, again, the investor could potentially lever up this investment massively; therefore, the investor is doing much better than the risk free rate with this arbitrage strategy.



Therefore, we must have that d<r<u, which of course means (1+d)<(1+r)<(1+u). The first of our two arbitraging investors above would lose money in a down move, since he would have to pay back the 1+r loan, which is more money than the 1+d price of the stock he owns. The second investor would lose money in an up move, since he would have to buy back the 1+u priced stock, which is more than the 1+r he got from investing in the risk free rate. That is, both strategies now have risk of losing money; they are no longer just making riskless profit.



Let’s finally turn to what the risk-neutral probability means in this context. Simply put, if we define p as the risk neutral probability, then p is the probability of the stock taking the ‘up move’ and 1p is the probability of the stock taking the ‘down move’ such that the expectation of the value of the stock at time t+1 equals St(1+r). More formally:

E(St+1|St)=pSt(1+u)+(1p)St(1+d)=(1+r)St

We can then solve for p. Dividing by St, we have:

p(1+u)+(1p)(1+d)=(1+r)

=p(1+u)p(1+d)+(1+d)=(1+r)

=p+uppdp+1+d=1+r

=p(ud)+d+1=1+r

=p(ud)+d=r =p=rdud

So, if the probability of an ‘up move’ is rdud (and the probability of a ‘down move’ is 1rdud), then the expectation of the price stock in the next period is equal to how much cash you would make if you just lent the dollar price of the stock St at the risk free rate for one period.


The risk-netural paradigm can be tricky; here is a video to solidify the concept.

Click here to watch this video in your browser




7.2.3 Symmetric Stock


We’ve constructed our stock St that either has an ‘up move’ to (1+u)St or a ‘down move’ to (1+d)St after one period. In this section, let’s imagine that the stock has an up move with probability 1/2 and a down move with probability 1/2 (we learned about the risk neutral probability in the previous section; this 1/2 is not necessarily the risk neutral probability, and we will return to the more general risk neutral probability soon).

Let’s define the ‘log return’ at time t+1 as simply:

λt+1=log(St+1St)

Or, simply put, the log of the stock price in period t+1 divided by the stock price in period t.

Let’s imagine that we are at time t, and thus St is known; from above, we have that St+1 is a random variable that takes on the value (1+u)St with probability 1/2 and takes on the value (1+d)St with probability 1/2.

This means that λt+1 is a random variable as well. In the case of an up move, we have that:

λt+1=log(St(1+u)St)=log(1+u)

And, similarly, in the case of a down move, we have:

λt+1=log(St(1+d)St)=log(1+d)

Therefore, λt+1 is just a random variable that takes on the value 1+u with probability 1/2 and the value 1+d with probability 1/2.


Let’s now think about the stock across many periods. Let each period have length ΔT (so that the difference between periods in time is ΔT and that there be N total periods). Define T as the total amount of time that passes, so that we have T=NΔT (the total amount of time that passes is the length of one interval times the total number of intervals).

Now let’s define YT as the ‘log return’ of the stock up to period T. We have:

YT=log(STS0)

Where ST is the price of the stock at time T and S0 is the initial price of the stock. Since YT is a function of the random variable ST (we take S0 as given), we know that YT is a random variable; for now, let us say that E(YT)=μT and Var(YT)=σ2T for some μ and σ2 (we will come back to this later, but let us take these definitions as given for now).

Because of the properties of the log, we can expand out YT as a summation:

YT=log(S(ΔT)S0)+log(S(2ΔT)S(ΔT))+...+log(SN(ΔT)S(N1)(ΔT)) Because, remember, the periods of this process are at times 0,ΔT,2ΔT,...,NΔT.

Note that the values in this summation are just the log returns, or λ values, that we defined earlier, so we can plug these in:

YT=λ1+λ2+...+λN

Because, remember, we have N total periods (each with a length of ΔT).


Recall that we defined the mean of Y as μT and the variance as σ2T. We can use this (and the fact that, by symmetry, the λ random variables must have the same distribution; the process by which the stock moves is the same in each period) to find the expectation of each λ value.

E(YT)=μT=E(λ1+λ2+...+λN)

μT=NE(λ1)

E(λ1)=μTN

And, because each λ term has the same distribution (and using the fact that TN=ΔT), we have:

E(λi)=μΔT

For i=1,2,...,N.


Similarly, we can find the variance:

Var(YT)=Tσ2=Var(λ1+λ2+...+λN) =Tσ2=NVar(λ1)

Var(λi)=σ2ΔT

For i=1,2,...,N.


We now have the mean and variance of each λ term. Further, we still know that each λ is a random variable that only takes on two values, each with probability 1/2. Above, we said that these values are log(1+u) and log(1+d). However, using the expectation and variance of each λ term, we can solve for what these values are in terms of μ, ΔT and σ2. Specifically, we have that:

λi=μΔT+σΔT

In an up move, and…

λi=μΔTσΔT

In a down move.


We can quickly test that these values are legitimate: a random variable that takes on μΔT+σΔT with probability 1/2 and μΔTσΔT with probability 1/2 will clearly have a mean of μΔT (multiply each value by 1/2 and sum them; the σΔT term cancels out and we are left with μΔT) and a variance of σ2ΔT (the average distance from the mean μΔT for the two values is just σΔT, and we square this to get back to the variance). Therefore, we have:

log(1+u)=μΔT+σΔT log(1+d)=μΔTσΔT


Let’s use this result to expand our formula for, say, SNΔT. Remember that in an up move we have:

SNΔT=(1+u)S(N1)ΔT

Taking the log, we get:

log(SNΔT)=log(S(N1)ΔT)+log(1+u)

Plugging in μΔT+σΔT for 1+u, we get:

log(SNΔT)=log(S(N1)ΔT)+μΔT+σΔT

In an up move. We will now use some lazy notation:

log(SNΔT)=log(S(N1)ΔT)+μΔT±σΔT

Where our lazy notation, ±, simply means ‘plus σΔT with probability 1/2 or minus σΔT with probability 1/2’ (i.e., an up or down move, each with probability 1/2). We know have a recursive equation and could plug in for log(S(N1)ΔT) in terms of log(S(N2)ΔT):

log(SNΔT)=log(S(N2)ΔT)+μΔT±σΔT+μΔT±σΔT And so on, until we recurse N times and get all the way down to S0 (remember that NΔT=T; we just plugged that in on the subscript on the left side).

log(ST)=log(S0)+μT+σTNi=Ni=1ϵi

Where each ϵi is i.i.d. and equals +1 with probability 1/2 and 1 with probability 1/2.

Let’s take a look at this result: the μT makes sense, since if we add up μΔT a total of N times we get μT. The σTN can be a bit trickier, but remember that if all of the ϵi random variables crystallize to 1, the sum of this term will equal:

NσTN=σTN=σΔTN2=NσΔT

Or just N times the original σΔT term that we added.



Let’s now let N go to infinity while T stays fixed; that is, the number of periods goes to infinity as the total amount of time elapsed stays fixed. Naturally, this means that the intervals get very very small! What happens, then, to our σTNi=Ni=1ϵi term? Well, let’s just look at 1Ni=Ni=1ϵi for now. This is the sum of many random variables (remember, each ϵi is i.i.d. and has probability 1/2 of equaling 1 and probability 1/2 of equaling 1) so, by the Central Limit Theorem, it is Normal! What are the parameters of this Normal distribution? Well, we have:

E(1Ni=Ni=1ϵi)=1Ni=Ni=1E(ϵi)=0

Since the expectation of each ϵi is just 0 (the LOTUS calculation would be 11/211/2=0). How about the variance? Well, we have:

Var(1Ni=Ni=1ϵi)=1Ni=Ni=1Var(ϵi)=NN=1

We know that Var(ϵi)=1 for all i because ϵi has a mean of 0 and thus is, on average, 1 away from the mean (remember, ϵi only takes on values 1 and 1), which means the variance is just 1. Therefore, we have the happy fact that:

1Ni=Ni=1ϵiN(0,1)

Plugging this in to our formula for ST above:

log(ST)=log(S0)+μT+σZT

Where Z is the standard Normal N(0,1).


Wait a moment…the right hand side of this equation, log(S0)+μT+σZT, is a bunch of constants log(S0), μ, T and σ are all constants) and a Normal random variable. We know that this is thus a Normal random variable, and thus log(ST) is Normal. Finally, since the log of ST is Normal, we know (from earlier) that ST is Log Normal!

Excellent! We have shown ST, the stock price at time T, to be Log-Normally distributed. This case was the simple symmetric case where the probability of an up move was equal to the probability of a down move (both 1/2).

Before moving to the next section where we work with the risk neutral probability that we introduced earlier, here’s a video to help walk through these concepts:

Click here to watch this video in your browser




7.2.4 Stock in Risk Neutral Case


We showed above that (under a variety of assumptions) a stock in the symmetric case (that goes up with probability 1/2 and down with probability 1/2) is Log-Normal. Instead of 1/2, now, let’s use a stock with probability p, that risk-neutral probability that we solved for earlier, of having an up move (and probability 1p of having a down move). Recall that (with u as the up move and d as the down move, as usual), this risk-neutral probability is given by:

p=rdud

We can thus use the same structure from above:

log(ST)=log(S0)+μT+σTNi=Ni=1ϵi

Except, of course, in this case, each ϵi is not ±1 with equal probabilities (1/2 probability of +1 and 1/2 probability of 1) but equal to +1 with probability p and equal to 1 with probability 1p.

We’re going to find the expectation and variance of the ϵ terms again, but let’s first try to get our risk neutral probability p=rdud in terms of the μ, σ, ΔT notation. Recall our result from earlier:

log(1+u)=μΔT+σΔT log(1+d)=μΔTσΔT

Let’s start with log(1+u). We can take each side to the power of e:

1+u=eμΔT+σΔT

Recall that the Taylor Series expansion for ex is 1+x+x22!+x33!+... We use this on the right side of the equation above:

1+u=1+μΔT+σΔT+σ2ΔT2

Note that we dropped all of the terms that take ΔT to a higher power (essentially everything past σ2ΔT2 in the expansion). We won’t go too deeply into the mathematical structure behind this decision so the explanation will be a bit hand-wavy; suffice it to say that, since ΔT is a small increment (and will get smaller when we allow N to grow while keeping T fixed, as we saw earlier) and ΔT raised to a power greater than 1 will be negligible.


Anyways, it is easy to solve for u from here (just subtract 1 from both sides):

u=μΔT+σΔT+σ2ΔT2

Similarly, we have:

d=μΔTσΔT+σ2ΔT2

That is, the same calculation as u, but with σΔT instead of σΔT. Let’s now plug this in to our definition of p:

p=rΔTdud Notice the slight adjustment here; we have rΔT instead of simply r from the original solution. That is because, in the original solution, we just assumed the time increment to be 1, and r is the rate of interest that you earn over time period 1, so we used r for our formula. By this point, we have specified ΔT as the increment, and thus rΔT is the rate of interest we earn over the increment (not r). Moving forward, then, we plug in our solutions for u and d:

p=rΔT(μΔTσΔT+σ2ΔT2)μΔT+σΔT+σ2ΔT2μΔT+σΔTσ2ΔT2

=rΔTμΔT+σΔTσ2ΔT22σΔT

Separating terms:

=rΔTμΔTσ2ΔT22σΔT+σΔT2σΔT

ΔT(rμσ22)2σΔT+12

p=ΔT2(rμσ22σ)+12


Excellent! We now have our probability that ϵi=1 in the risk neutral case; the probability of ϵi=1, of course, is just 1p. We can thus calculate the expectation of ϵi:

E(ϵi)=1p1(1p) =p1+p =2p1 =ΔT(rμσ22σ)+11 E(ϵi)=ΔT(rμσ22σ)

The variance is actually easier to find. We have:

Var(ϵi)=E(ϵ2i)E(ϵi)2

Since ϵ2i=1 (we know that ϵi either takes on +1 or 1, and both of these values squared are just 1) and we know E(ϵi)2, we are left with:

Var(ϵi)=1ΔT(rμσ22σ)2

Excellent! Let us now return to our equation for log(ST) from the previous section:

log(ST)=log(S0)+μT+σTNi=Ni=1ϵi Let’s look at the sum 1Ni=Ni=1ϵi and let N go to . Naturally, like we saw earlier, as 1Ni=Ni=1ϵi is a (large) sum of random variables, it has a Normal distribution by the Central Limit Theorem. Let’s find the expectation and variance of this Normal distribution:

E(1Ni=Ni=1ϵi)=1Ni=Ni=1E(ϵi)

Plugging in the expectation of each ϵi term we solved for above (and multiplying by N, the number of times that we sum it):

=NΔT(rμσ22σ)N

=NΔT(rμσ22σ)

And finally, plugging in T=NΔT:

=T(rμσ22σ) And now for the variance:

Var(1Ni=Ni=1ϵi)=1Ni=Ni=1Var(ϵi)

Plugging in the variance (and multiplying by N, as above):

1NN(1ΔT(rμσ22σ)2)

In this case, since N, the number of periods, is going to , the increment between each period ΔT goes to zero. Therefore, the ΔT(ruσ22σ)2 term here will go to zero, and we are just left with:

Var(1Ni=Ni=1ϵi)=NN=1

Astute readers may note that in the previous case, the term with ΔT did not go to zero: it was multiplied by N, which was simultaneously going to . Remember that NΔT=T, and that T does not change even as N goes to and ΔT goes to 0 (the rapidly growing N and rapidly shrinking ΔT offset each other!). We apologize if this explanation isn’t sufficient; we are not going to dive too deeply into the asymptotic math here.


Anyways, we have:

1Ni=Ni=1ϵiN(T(rμσ22σ),1)

Which has the same distribution as Z+T(rμσ22σ), where Z is the standard Normal N(0,1). We can therefore plug in to our equation for log(ST):

log(ST)=log(S0)+μT+σT(Z+T(rμσ22σ))

We can simplify the right hand side a bit:

=log(S0)+μT+σTZ+σT(rμσ22σ) log(ST)=log(S0)+σTZ+T(rσ22) Or, if we wanted to get rid of the logs, we just raise everything to the power of e:

ST=S0eσTZ+T(rσ22)


Voila! This may not seem like much, but we are actually right at the end of our long algebraic march (and about to embark on a calculus march!). The right hand side of this equation is a combination of constants and Z, a standard Normal random variable, which means the left hand side log(St) is also Normal. Further, since the log of ST is Normal, we have that ST is Log-Normal! Specifically, we have (it should be straightforward to take the expectation and variance of the right hand side above: given S0, everything is constant except for Z, which mas mean 0 and variance 1):

log(ST)N(log(S0)+T(rσ22),Tσ2)

Remember that S0 is a given but we can write, when conditioning on a more general St instead of S0 where t<T, we have:

log(ST)|StN(log(St)+(Tt)(rσ22),(Tt)σ2) Or, of course, we could say that ST is Log-Normal with the parameters above (just log(St)+(Tt)(rσ22) and (Tt)σ2)) because the log of ST is Normal with those parameters.

Also, notice that we plugged in Tt where we had T in the previous equation because the T term is the only term that changes with time. When we go from S0 to ST, we have that T time has passed; from St to ST, we have instead that Tt time has passed, so we just plug in Tt for T.


This might not seem like much, but note that we have found a distribution for the stock price at time T, which is pretty incredible! The parameters of this distribution only depend upon known constants: the stock price at some known period, the time elapsed, the risk free rate, and the ‘volatility’ σ2. Note that this ‘volatility’ is considered given for each stock (naturally, it’s an input into the model). Of course, there are many ways to estimate the true volatility of a stock, but these methods lend themselves more to a textbook on quantitative finance (recall that we set σ2 and μ as key parameters early on in the derivation; now we see that the σ2, or volatility, is a given for each stock, and that μ does not even appear in the final distribution for log(ST)|St). For our purposes, we will take the volatility as given.

This has certainly been an intense derivation; check out this video to help drill these concepts down.

Click here to watch this video in your browser



Let’s now turn to the question that prompted this algebra onslaught: how do we price a call option? Let’s define the following contract at time t:


Call option struck at K expiring at time T


For some (Log-Normal) stock St. Remember, we exercise the option if ST, the stock price at the expiry time T, is larger than K. If we have ST>K, then we buy the stock for K (this is exactly what the option gives us) and sell it for ST, making a profit of STK.

Now that we have a distribution for ST, we can simply calculate the expectation of the value of the option (in a risk neutral) case, which will be the fair price. The expected value of the option is simply:

E(STK|St)+(1+r)Tt

Where the + sign simply means ‘minimum of zero’; if STK, then we don’t exercise the option, and the option is just worth zero, not a negative value. Also note the (1+r)Tt term in the denominator; this is the rate of return that we would get for investing from time t to time T1 (so just 1+r compounded Tt times, which is also the amount you would have if you invested 1 dollar at the rate of r for Tt periods), so we need to adjust our option price by this factor. This is called getting present value of future cash; for example, if we know we will receive $10 tomorrow, but we can also lend at the risk free rate of .02, then we could just invest 101.02 dollars, which is less than 10 dollars, today to receive $10 tomorrow. That is, money in the future is worth less because we can earn a risk free interest rate in the meantime (assuming that r is positive, which is not always the case but we will hold as an assumption here).

Further, in this case, instead of just assuming that investments compound each discrete period, we are going to assume that they are compounded continuously: that is, instead of just receiving the interest each discrete period and re-investing, we assume that we receive the interest continuously and re-invest it continuously. In the limit, we move from (1+r)Tt to er(Tt) as our ‘compound’ factor (and thus multiply by the reciprocal of er(Tt), just like we did with (1+r)Tt, to get the future dollars back in today’s dollar terms); don’t worry if this part isn’t immediately clear for you; this section of the discussion is more suited for a finance book! We thus re-write:

er(Tt)E(STK|St)+

And we can express this using LOTUS:

er(Tt)E(STK|St)+=er(Tt)K(STK)f(sT)dsT

Note that the lower limit of the integral is K; as we said above, if ST is less than K, the option is just worth zero.

We can use a little trick here: let’s let XT=log(ST), so that eXT=ST. Why would we make this change? Well, we know that XT (the log of ST, conditioned on St) has this distribution:

log(ST)|StN(log(St)+(Tt)(rσ22),(Tt)σ2)

And we are more comfortable working with the density of a Normal distribution than a Log-Normal distribution. Plugging in:

er(Tt)E(STK|St)+=er(Tt)log(K)(exK)f(x)dx Note that we switched the lower bound of the integral to log(K) from K; if the lower bound for ST is K, then the lower bound for XT=log(ST)=log(K). Plugging in the density f(x) (remember that XTN(log(St)+(Tt)(rσ22),(Tt)σ2)), we have:

=er(Tt)σ2π(Tt)log(K)(exK)e(x(log(St)+(Tt)(rσ22)))22σ2(Tt)dx

Let’s use γ=log(St)+(Tt)(rσ22) to clean things up a bit:

=er(Tt)σ2π(Tt)log(K)(exK)e(xγ)22σ2(Tt)dx

It will be easiest if we split this integral up into two pieces (i.e., the ex piece and the K piece). Let’s start with the ex piece:

=er(Tt)σ2π(Tt)log(K)exe(xγ)22σ2(Tt)dx

=er(Tt)σ2π(Tt)log(K)ex(xγ)22σ2(Tt)dx Let’s take a look at that exponent in the integral We are going to make the claim that:

x(xγ)22σ2(Tt)=γ+12σ2(Tt)(x(γ+σ2(Tt)))22σ2

Bear with us; the right hand side of this equation will be far easier to work with. Let’s busy ourselves proving this equality. Expanding the squares:

xx22γx+γ22σ2=μ+12σ2(Tt)x22γx2xσ2(Tt)+γ2+2γσ2(Tt)+σ4(Tt)22σ2

We can see that the x22σ2(Tt), 2γx2σ2(Tt) and μ22σ2(Tt) terms all cancel out:

x=γ+12σ2(Tt)2xσ2(Tt)+2γσ2(Tt)+σ4(Tt)22σ2(Tt) x=γ+12σ2(Tt)+xγ12σ2(Tt) x=x

Excellent! We have shown that the above equality holds, so we can plug the original right hand side into our integral:

=er(Tt)σ2π(Tt)log(K)eγ+12σ2(Tt)(x(γ+σ2(Tt)))22σ2(Tt)dx

Filtering out the constants:

=er(Tt)eγ+12σ2(Tt)σ2π(Tt)log(K)e(x(γ+σ2(Tt)))22σ2(Tt)dx Hold on a moment…that e(x(γ+σ2(Tt)))22σ2(Tt) term, combined with the 1σ2π(Tt) term, looks like the density of a N(γ+σ2(Tt),σ2(Tt)) random variable! We can thus use the standard Normal CDF Φ to solve this problem (remember, if we subtract the mean from a Normal and divide by the standard deviation, we get back to a standard Normal):

er(Tt)eγ+12σ2(Tt)(Φ()Φ(log(K)(γ+σ2(Tt))σTt)) We know that Φ()=1 (100% of the density in a Normal is below ) and 1Φ(a)=Φ(a) (both are asking for the density beyond a distance from the mean on one side), so we have:

er(Tt)eγ+12σ2(Tt)(1Φ(log(K)(γ+σ2(Tt))σTt))

er(Tt)eγ+12σ2(Tt)Φ(γ+σ2(Tt)log(K)σTt) Let’s now plug back in for γ; recall that γ=log(St)+(Tt)(rσ22):

er(Tt)elog(St)+(Tt)(rσ22)+12σ2(Tt)Φ(log(St)+(Tt)(rσ22)+σ2(Tt)log(K)σTt) Combining terms (remember that log(a)log(b)=log(ab)):

elog(St)Φ(log(StK)+(Tt)(r+σ22)σTt) And remember that elog(a)=a (we are using the natural log unless otherwise specified), so we have:

StΦ(log(StK)+(Tt)(r+σ22)σTt)

Wow! What a nicely reduced solution. Remember, this is the solution for the first part of the integral:

=er(Tt)σ2π(Tt)log(K)exe(xγ)22σ2(Tt)dx

And now we can solve the K part:

=er(Tt)σ2π(Tt)log(K)Ke(xγ)22σ2(Tt)dx

The K is a constant so it comes out:

=Ker(Tt)σ2π(Tt)log(K)e(xγ)22σ2(Tt)dx Wait a moment; again, with the 1σ2π(Tt) term, this looks like the PDF of a N(γ,σ2(Tt)) random variable. We can use a similar approach to before, then, with the standard Normal CDF Φ:

=Ker(Tt)(Φ()Φ(log(K)γσ(Tt)))

Again using Φ()=1 and 1Φ(a)=Φ(a):

=Ker(Tt)Φ(γlog(K)σ(Tt))

Plugging in γ=log(St)+(Tt)(rσ22):

=Ker(Tt)Φ(log(St)+(Tt)(rσ22)log(K)σ(Tt)) Which finally gives us:

=Ker(Tt)Φ(log(StK)+(Tt)(rσ22)σ(Tt)) This is the second part of the integral, which means, putting it all together gives out Black Scholes Model for the price of a call:

er(Tt)E(STK|St)+= StΦ(log(StK)+(Tt)(r+σ22)σTt)Ker(Tt)Φ(log(StK)+(Tt)(rσ22)σ(Tt))

This is generally depicted as:

er(Tt)E(STK|St)+=StΦ(d1)Ker(Tt)Φ(d2)

Where d1=log(StK)+(Tt)(r+σ22)σTt and d2=d1σTt.

For a discussion of this integral, check out this video:

Click here to watch this video in your browser




Wow! After all of the assumptions, set-up and algebra / calculus slog, we have finally arrived at the answer to our original query: a model for the price of a call option. Note that the inputs to this model are all known at time t: we have St, the stock price at time t, K, the strike price of the option, Tt, or the time until expiry, r, or the risk free rate, and σ2 which, as we’ve discussed, is taken to be a known volatility for each stock. One important thing to note is that the time Tt is by default given in years (a standard assumption in financial space is that we are assuming parameters on an annualized basis), so, if we wanted an expiry of 100 days, we would simply enter 100/365 as an input to the model (100 days divided by 365 days in a year gives 100 days in year units).

Let’s move to the next section where we try to gather some intuition by seeing Black Scholes in action.




7.3 Exploring Black Scholes


Black Scholes is difficult to solve, but very easy to code up! Let’s define a function that returns the price of an option for some stock price S, some risk free rate r, some time until expiry time, some volatility sigma and some strike price k.

FUN_blackscholes <- function(S, r, time, sigma, k) {
  
  d1 <- (log(S / K) + (r + sigma ^ 2 / 2) * time) / (sigma * sqrt(time))
  d2 <- d1 - sigma * sqrt(time)
  return(S * pnorm(d1) - exp(-r * time) * K * pnorm(d2))
}


Let’s use this function to actually calculate call prices; we can also use our intuition to try to see how the prices will change with varying parameters.

We can start with the stock price, which is S in our code. Intuitively, a higher stock price means a higher price for an option, all else equal; this is because, the higher the stock price, the closer the stock is to being in the money (above the strike).

Mathematically, St seems to increase the stock price as well. As we let St go to , d1 and d2 both go to , which means Φ(d1) and Φ(d2) both go to 1 and we are left with:

Ster(Tt)K

Which makes sense, because as St gets really big, we will almost certainly exercise, so the value is just St minus the strike (where the strike price K discounted to today; note that the stock price isn’t discounted to today because we we have the value of the stock St in today’s dollars). Let’s plot the price of the option for a variety of different stock prices:

# initialize parameters
S <- seq(from = 0, to = 150, by = 5)
K <- 80
r <- .01
sigma <- 2
time <- 100 / 365

# generate prices
price <- sapply(S, function(x) FUN_blackscholes(S = x, r = r,
                                       time = time, sigma = sigma, k = k))
# visualize
data <- data.table(S = S,
                   price = price)

ggplot(data, aes(x = S, y = price)) +
  geom_point() + 
  theme_bw() + 
  xlab("Stock Price") + ylab("Option Price") + 
  ggtitle("Option Price for varying stock price (K = 80)") +
  geom_hline(yintercept = K, linetype = "dashed") + geom_hline(yintercept = 0)


There are a couple of interesting things to note from the above chart. First, naturally, as we predicted, the option price increases with the stock price. Second, note that for very low values of the stock, the option is worth essentially zero; intuitively, this is because if the stock price is low enough, there is basically no chance that it ends up in the money and thus we exercise (so the option expires worthless). Finally, note that the option price is $88.08 when the stock price is $150 (try tail(data, 1) to see this value). This value seems high. Even if the option expired today, one would buy the stock for the strike price of 80 dollars, sell it for the market price of 150 dollars and make 70 dollars; why would the option, then, be worth $18.08 more than this $70 profit? The idea here is that (as we saw at the beginning of this chapter), because there is plenty of time left before expiry (100 days) and there is more upside than downside (the stock can go very high, and the increase in value for the option is unbounded, but even if the stock falls to zero, the option can’t be worth anything less than zero dollars; the downside is bounded), the option is valued more than what we would make if the option expired right now.

It is also important to note the slope of this line; it starts out as convex (second derivative is positive) but looks more linear for large values of S (this would continue if we had even larger values of S). This is because, once the stock price gets sufficiently above the strike price, the probability of exercising the option is very high (i.e., if the stock price is 150 dollars and the strike price is 80 dollars, we are almost certainly going to exercise because we don’t expect the stock price to fall all the way back to below 80 in the time remaining). Therefore, each additional dollar in the stock price S doesn’t really change the probability of exercising (changes in the probability of exercising create the convexity for lower values of S along the curve), and thus each dollar increase in stock price corresponds to the same increase in option price.


Let’s continue in this thread but instead vary the time to expiry for the option:

# initialize parameters
S <- 90
K <- 80
r <- .01
sigma <- 2
time <- seq(from = 1, to = 365, by = 5) / 365

# generate prices
price <- sapply(time, function(x) FUN_blackscholes(S = S, r = r,
                                       time = x, sigma = sigma, k = k))
# visualize
data <- data.table(time = time,
                   price = price)

ggplot(data, aes(x = time, y = price)) +
  geom_point() + 
  theme_bw() + 
  xlab("Time Until Expiry (years)") + ylab("Option Price") + 
  ggtitle("Option Price for varying time to expiry (K = 80, S = 90)") +
  geom_hline(yintercept = 0) + geom_hline(yintercept = S)


First, we can note the earliest time point here: 1 day until expiry. The option price in this case is $10.58 (try head(data, 1) to see this value), which is very close to the stock price of 90 dollars minus the strike price of 80 dollars. This makes sense; we are only one day away from expiry, so the stock doesn’t have time to move that much, and thus won’t be very far from where it sits now at 90 dollars. If it expires at 90 dollars, we make 10 dollars, because we exercise the option, buy the stock for 80 dollars and sell it for 90 dollars. Note that the option price is slightly more than 10 dollars (58 cents more, to be exact), which is for the same reason discussed above: even though we don’t expect the stock to move that much, if it does move a lot, there is more upside than downside (option value upside theoretically unbounded while downside is capped).

In general, notice the upward sloping trend. This means that, all else equal, an option that has a longer time until expiry is more valuable. This makes sense: like we’ve been talking about, options have unlimited upside but limited downside, so the more time that the stock has to move around, the higher value the option should be! Ultimately, uncertainty is good for call options, and more time until expiry introduces more uncertainty.

In financial terms, practitioners call this ‘time’ value theta. Since, all else equal, lower time values mean lower option prices, we will often observe option prices fall each day (the time until expiry reduces each day, since the expiry stays fixed); this is called theta decay.


Let’s now vary the volatility term (remember, this is taken as a given input for each stock):

# initialize parameters
S <- 90
K <- 80
r <- .01
sigma <- seq(from = .01, to = 20, by = .5)
time <- 50 / 365

# generate prices
price <- sapply(sigma, function(x) FUN_blackscholes(S = S, r = r,
                                       time = time, sigma = x, k = k))
# visualize
data <- data.table(sigma = sigma,
                   price = price)

ggplot(data, aes(x = sigma, y = price)) +
  geom_point() + 
  theme_bw() + 
  xlab("Stock Volatility") + ylab("Option Price") + 
  ggtitle("Option Price for varying volatility (K = 80, S = 90)") +
  geom_hline(yintercept = 0) + geom_hline(yintercept = S)


Similar to how we saw that more time until expiry increases option value, more stock volatility also increases option value. The reasoning here is the same: more uncertainty is good for option value, and more volatility in the stock is, by definition, more uncertainty.

Note that, for low values of sigma, the value of the option is very close to 10 dollars, which again is the stock price of 80 dollars minus the strike price of 70 dollars. Again, this is because, if a stock has very low volatility, we don’t expect it to move much between now and expiry; by the time we get to expiry, then, if the stock has indeed not moved much from 90 dollars, the option is worth 90 dollars minus the 80 dollar strike price (again, the option today is worth slightly more because of the unlimited upside vs. the capped downside).

Finally, note that when sigma gets very large, the option price begins to approach the stock price of 90 dollars. This is a bit tricky. If the volatility (sigma value) for the stock is extremely large, and we know that the stock has a lower bound of zero, then there is a high probability of getting a very large value for the stock at expiry (the stock will move around a ton, and has a lower bound of zero). If the stock has a very, very large value, then, at expiry, the value of the option is essentially just the value of the stock (of course, the value will be the stock price minus the strike price, but if the stock price is extremely large, the strike price is just a small fraction of the stock price). Therefore, if at expiry the option value is just equal to the stock value, then we should just price the option today at the same price as the stock. That is, if at expiry the option will just be worth the same value as the stock, then the price of the option now should just be the price of the stock now (they are equivalent!).


Finally, let’s vary the last parameter, the risk free rate:

# initialize parameters
S <- 90
K <- 80
r <- seq(from = 0, to = 5, by = .1)
sigma <- .1
time <- 100 / 365

# generate prices
price <- sapply(r, function(x) FUN_blackscholes(S = S, r = x,
                                       time = time, sigma = sigma, k = k))
# visualize
data <- data.table(r = r,
                   price = price)

ggplot(data, aes(x = r, y = price)) +
  geom_point() + 
  theme_bw() + 
  xlab("Risk Free Rate") + ylab("Option Price") + 
  ggtitle("Option Price for varying risk free rate (K = 80, S = 90)") +
  geom_hline(yintercept = 0)


This relationship is tricky: why does option price increase with the risk free rate? Here, we can recall that if we do end up exercising the option, we have to pay the strike price K to buy the option. Imagine, now, how low the present value of the strike price K is if the risk free rate is really high. Alternatively, note that, if the risk free rate is extremely high, one could invest a small amount (say, a dollar) today and earn back a much higher amount (say, 80 dollars) by the time of expiry (of course, this is mostly theoretical; risk free rates would have to be extremely high to achieve this growth, almost certainly far higher than anything we will ever see in the real world; still it helps to get our intuition around it!). Therefore, the strike price that we will have to pay in the future (if the stock ends up above the strike price) is essentially worth nothing today (if the risk free rate is sufficiently high); that means that, as the risk free rate rises, we can buy the stock for less money (the value of K in present dollars gets lower and lower as the risk free rate rises). Buying the stock for less money (in present dollars) is good, of course, which means that the option price rises with the risk free rate!

We work through the intuition behind Black-Scholes in this video:

Click here to watch this video in your browser