## 16.5 Exercises

The following exercises further explore the interplay between simulations and analytic solutions by examining three types of problems:

- other Bayesian situations
- other versions of the Monty Hall problem
- related probability puzzles

### 16.5.1 More Bayesian situations

The *mammography problem* shown in Section 16.2 is a special case of a more general class of *Bayesian situations*, which require the computation of an inverse conditional probability \(p(C|T)\) when given a prior probability \(p(C)\) and two conditional probabilities \(p(T|C)\) and \(p(T|\neg C)\).

In 2021, an intriguing version of this problem is the following: The list of rapid SARS-CoV-2 self tests of the BfArM contains over 450 tests.
Each test is characterized by its *sensitivity* \(p({T+}|infected)\) and *specificity* \(p({T-}|not\ infected)\).
The current local *prevalence* of SARS-CoV-2 infections (as of 2021-05-20) is 100 out of 100,000 people.

Download the table of all tests and compute their mean

*sensitivity*and*specificity*values.Compute two conditional probabilities:

- What is the average probability of being
*infected*upon a*positive*test result (i.e., \(PPV = p(infected|{T+})\)? - What is the average probability of
*not*being infected upon a*negative*test result (i.e., \(NPV = p(not\ infected|{T-})\)?

- What is the average probability of being
How would the values of 2. change, if the local

*prevalence*of SARS-CoV-2 infections rose to 500 out of 100,000 people?Conduct a simulation by enumeration to compute both values for a population of 10,000 people.

**Bonus:**Try visualizing your results (in any way, i.e., not necessarily by using the**riskyr**package).

#### Solution

The following table summarizes the mean *sensitivity* and *specificity* values of all tests:

n | sens_mean | sens_min | sens_max | sens_sd | spec_mean | spec_min | spec_max | spec_sd |
---|---|---|---|---|---|---|---|---|

455 | 95.2191 | 80.2 | 100 | 3.111103 | 99.28365 | 96.7 | 100 | 0.6847694 |

The results of 2. and 3. could be computed by the Bayesian formula (see in Section 16.2).
The following visualizations result from creating **riskyr** scenarios.

- Solutions for a prevalence of 100 out of 100,000 people:

- Solutions for a prevalence of 500 out of 100,000 people:

### 16.5.2 Solving Bayesian situations by sampling

In Section 16.2, we solved so-called Bayesian situations (and the *mammography problem*) by enumerating cases. However, we can also use a sampling approach to address and solve such problems.

Use a sampling approach for solving the

*mammography problem*(discussed in Section 16.2).Adopt either approach (i.e., either enumeration and sampling) for solving the

*cab problem*(originally introduced by Kahneman & Tversky, 1972a; and discussed by, e.g., Bar-Hillel, 1980):

A cab was involved in a hit-and-run accident at night.

Two cab companies, the Green and the Blue, operate in the city.

You are given the following data:

1. 85% of the cabs in the city are Green and 15% are Blue.

2. A witness identified the cab as a Blue cab.

The court tested his ability to identify cabs under the appropriate visibility conditions. When presented with a sample of cabs (half of which were Blue and half of which were Green) the witness correctly identified each color in 80% of the cases and erred in 20% of the cases.What is the probability that the cab involved in the accident was Blue rather than Green?

**Hint:** Despite their superficial differences, the structure of the *cab problem* is almost identical to the *mammography problem* (e.g., compare Figures 4 and 9 of Neth, Gradwohl, et al., 2021).

- Discuss the ways in which the enumeration and sampling approaches differ from each other and from deriving an analytic solution. What are the demands of each method and the key trade-offs between them?

### 16.5.3 Monty Hall reloaded

In Section 16.3.1, we introduced the notorious brain teaser known as the *Monty Hall dilemma* (Savant, 1990; MHD, popularized by Selvin, 1975) and showed how it can be solved by a simulation that randomly samples environmental states (i.e., the setup of games) and actions (e.g., the host’s and player’s choices).

#### An erroneous MHD simulation

A feature of simulations is that they are only as good as the thinking that created them. And just like statements or scientific theories, they are not immune to criticism, and can be wrong and misleading. In the following, we illustrate this potential pitfall of simulations by creating an enumeration that yields an erroneous result.

So stay alert — and try to spot any errors we make! Importantly, the error is not in the method or particular type of simulation, but in our way of thinking about the problem.

- Setup

Without considering any specific constraints, the original problem description of the MHD describes a 3x3x3 grid of cases:

- 3 car locations
- 3 initial player picks
- 3 doors that Monty could open

The following tibble `grid_MH`

creates all 27 cases that are theoretically possible:

```
# Setup of options:
<- 1:3
car <- 1:3
player <- 1:3
open
# Table/space of all 27 options:
<- expand_grid(car, player, open)
grid_MH # names(grid_MH) <- c("car", "player", "open")
<- tibble::as_tibble(grid_MH) %>%
grid_MH arrange(car, player)
grid_MH#> # A tibble: 27 x 3
#> car player open
#> <int> <int> <int>
#> 1 1 1 1
#> 2 1 1 2
#> 3 1 1 3
#> 4 1 2 1
#> 5 1 2 2
#> 6 1 2 3
#> 7 1 3 1
#> 8 1 3 2
#> 9 1 3 3
#> 10 2 1 1
#> # … with 17 more rows
```

- Adding constraints

However, not all these cases are possible, as Monty cannot open just any of the three doors. More specifically, the problem description imposes two key constraints on Monty’s action:

Condition 1: Monty cannot open the door actually containing the

`car`

.Condition 2: Monty cannot open the door initially chosen by the

`player`

.

This allows us to eliminate some cases, or describe which of the 27~theoretical cases are actually `possible`

in this problem:

```
$possible <- TRUE # initialize
grid_MH
$possible[grid_MH$open == grid_MH$car] <- FALSE # condition 1
grid_MH$possible[grid_MH$open == grid_MH$player] <- FALSE # condition 2
grid_MH
grid_MH#> # A tibble: 27 x 4
#> car player open possible
#> <int> <int> <int> <lgl>
#> 1 1 1 1 FALSE
#> 2 1 1 2 TRUE
#> 3 1 1 3 TRUE
#> 4 1 2 1 FALSE
#> 5 1 2 2 FALSE
#> 6 1 2 3 TRUE
#> 7 1 3 1 FALSE
#> 8 1 3 2 TRUE
#> 9 1 3 3 FALSE
#> 10 2 1 1 FALSE
#> # … with 17 more rows
```

- Eliminating impossible cases

Next, we remove the impossible cases and consider only possible ones (i.e., `possible == TRUE`

):

```
<- grid_MH[grid_MH$possible == TRUE, ]
grid_MH_1
grid_MH_1#> # A tibble: 12 x 4
#> car player open possible
#> <int> <int> <int> <lgl>
#> 1 1 1 2 TRUE
#> 2 1 1 3 TRUE
#> 3 1 2 3 TRUE
#> 4 1 3 2 TRUE
#> 5 2 1 3 TRUE
#> 6 2 2 1 TRUE
#> 7 2 2 3 TRUE
#> 8 2 3 1 TRUE
#> 9 3 1 2 TRUE
#> 10 3 2 1 TRUE
#> 11 3 3 1 TRUE
#> 12 3 3 2 TRUE
```

Note that we used **base** R functions to add a new variable (`possible`

), determine its values (by logical indexing) and then filter its rows (by logical indexing) to leave only possible cases in `grid_MH_1`

.

The following tidyverse solution replaces the last two steps by a simple **dplyr** pipe (and would have worked without the additional `possible`

variable):

```
<- grid_MH %>%
grid_MH_1 filter(open != car, open != player) # condition 1 and 2
grid_MH_1#> # A tibble: 12 x 4
#> car player open possible
#> <int> <int> <int> <lgl>
#> 1 1 1 2 TRUE
#> 2 1 1 3 TRUE
#> 3 1 2 3 TRUE
#> 4 1 3 2 TRUE
#> 5 2 1 3 TRUE
#> 6 2 2 1 TRUE
#> 7 2 2 3 TRUE
#> 8 2 3 1 TRUE
#> 9 3 1 2 TRUE
#> 10 3 2 1 TRUE
#> 11 3 3 1 TRUE
#> 12 3 3 2 TRUE
```

**ToDo:** Visualize the 12 possible cases in a 3x3x3 cube.

- Determine wins by strategy

Finally, we can determine the number of wins, based on the player’s strategy (i.e., *staying* with the original choice, or *switching* to the other remaining door):

Win by staying:

*Staying wins*whenever the location of the`car`

matches the initial choice of the`player`

.Win by switching: As we are only left with

`possible`

cases,*switching wins*whenever staying fails to win.

```
$win_stay <- FALSE # initialize
grid_MH_1$win_switch <- FALSE
grid_MH_1
$win_stay[grid_MH_1$car == grid_MH_1$player] <- TRUE # win by staying
grid_MH_1$win_switch[grid_MH_1$win_stay == FALSE] <- TRUE # win by switching
grid_MH_1
grid_MH_1#> # A tibble: 12 x 6
#> car player open possible win_stay win_switch
#> <int> <int> <int> <lgl> <lgl> <lgl>
#> 1 1 1 2 TRUE TRUE FALSE
#> 2 1 1 3 TRUE TRUE FALSE
#> 3 1 2 3 TRUE FALSE TRUE
#> 4 1 3 2 TRUE FALSE TRUE
#> 5 2 1 3 TRUE FALSE TRUE
#> 6 2 2 1 TRUE TRUE FALSE
#> 7 2 2 3 TRUE TRUE FALSE
#> 8 2 3 1 TRUE FALSE TRUE
#> 9 3 1 2 TRUE FALSE TRUE
#> 10 3 2 1 TRUE FALSE TRUE
#> 11 3 3 1 TRUE TRUE FALSE
#> 12 3 3 2 TRUE TRUE FALSE
```

Of course we could have used **dplyr** `mutate()`

commands, rather than **base** R indexing to initialize and define these `win_stay()`

and `win_switch`

variables. (We will show a tidyverse solution below.)

- Count wins by strategy

To compare both strategies, we can now count how often each strategy wins:

```
# Result*(erroneous):
<- sum(grid_MH_1$win_stay)) # win by staying
(n_win_stay #> [1] 6
<- sum(grid_MH_1$win_switch)) # win by switching
(n_win_switch #> [1] 6
```

- Interpret solution

This analysis suggests that the chances are 50:50 (or rather 6:6) — which is wrong!

But where is the error? And can we fix this simulation and reach the right solution by enumerating cases?

#### Analysis: What went wrong?

The previous simulation illustrates a crucial problem with enumerating cases:
The cases noted in the reduced `grid_MH_1`

are not equiprobable (i.e., expected to appear with an equal probability).
To see this, let’s take another look at our reduced `grid_MH_1`

:

```
# Re-inspect grid_MH_1:
%>% filter(car == 1) # car behind door 1:
grid_MH_1 #> # A tibble: 4 x 6
#> car player open possible win_stay win_switch
#> <int> <int> <int> <lgl> <lgl> <lgl>
#> 1 1 1 2 TRUE TRUE FALSE
#> 2 1 1 3 TRUE TRUE FALSE
#> 3 1 2 3 TRUE FALSE TRUE
#> 4 1 3 2 TRUE FALSE TRUE
%>% filter(player == 1) # player picks door 1:
grid_MH_1 #> # A tibble: 4 x 6
#> car player open possible win_stay win_switch
#> <int> <int> <int> <lgl> <lgl> <lgl>
#> 1 1 1 2 TRUE TRUE FALSE
#> 2 1 1 3 TRUE TRUE FALSE
#> 3 2 1 3 TRUE FALSE TRUE
#> 4 3 1 2 TRUE FALSE TRUE
```

Although the `car`

is located equally often overall behind each door (i.e, in 4 out of 12 cases) and the `player`

initially chooses each door with an equal chance (i.e, in 4 out of 12 cases), the combinations of `car`

and `player`

show an odd regularity that is not motivated by the original problem. Specifically, the three possible cases in which `car == player`

appear *twice* in the table, but really describe the *same* original configuration of the game.
Given the problem description, each case for which `car != player`

should be as likely as each case for which `car == player`

.
However, our analysis *doubled* those cases in which Monty had two possible options for opening a door (i.e., the cases of `car == player`

). Put another way, the set of cases in `grid_MH_1`

assumes that the `player`

— by some prescient miracle — is twice as likely to choose the door with the car than to choose a different door. As staying is the successful strategy in this case, this analysis overestimates the overall success rate of *staying* with the initial choice.

By selecting cases that were `possible`

, we implicitly *conditionalized* the number of cases on the number of Monty’s options for opening a door. This is essentially the *same* error that people intuitively make when assuming there is a 50:50 chance *after* Monty has opened a door. Somehwat ironically, this error was obscured further by disguising it as a seemingly sophisticated simulation!

The lesson to learn here is not that simulations are bad, but that they must not be trusted blindly.
As simulations explicate assumptions and the consequences of mechanisms that we cannot solve analytically, the results from a simulation are only valid if its assumptions are sound and its mechanism is implemented in a faithful manner.
Crucially, the error here was not caused by our method (i.e, using a particular type of simulation), but in an inappropriate conceptualization of the problem (assuming that all `possible`

cases were equiprobable).
So how could we create a correct simulation by enumeration?

#### Correction 1

Actually, a correct solution is even simpler than our initial one.

In the following alternative, we only consider possible cases prior to Monty’s actions:

```
# space of only 9 options:
<- expand_grid(car, player)
grid_MH_2 # names(grid_MH_2) <- c("car", "player")
# grid_MH_2 <- tibble::as_tibble(grid_MH_2) %>% arrange(car)
grid_MH_2#> # A tibble: 9 x 2
#> car player
#> <int> <int>
#> 1 1 1
#> 2 1 2
#> 3 1 3
#> 4 2 1
#> 5 2 2
#> 6 2 3
#> 7 3 1
#> 8 3 2
#> 9 3 3
```

Importantly, given the problem description, these 9 cases actually are equi-probable! And since Monty can always open some door (as at least one of the non-chosen doors does not contain the car), we do not even need to consider his action to determine our chances of winning by staying or switching. In fact, we can count all cases of wins, based on strategy, exactly as before — independently of Monty’s actions:

```
$win_stay <- FALSE # initialize
grid_MH_2$win_switch <- FALSE # initialize
grid_MH_2
$win_stay[grid_MH_2$car == grid_MH_2$player] <- TRUE # from 1
grid_MH_2$win_switch[grid_MH_2$win_stay == FALSE] <- TRUE # from 2
grid_MH_2
grid_MH_2#> # A tibble: 9 x 4
#> car player win_stay win_switch
#> <int> <int> <lgl> <lgl>
#> 1 1 1 TRUE FALSE
#> 2 1 2 FALSE TRUE
#> 3 1 3 FALSE TRUE
#> 4 2 1 FALSE TRUE
#> 5 2 2 TRUE FALSE
#> 6 2 3 FALSE TRUE
#> 7 3 1 FALSE TRUE
#> 8 3 2 FALSE TRUE
#> 9 3 3 TRUE FALSE
# Result:
<- sum(grid_MH_2$win_stay)) # win by staying
(n_win_stay #> [1] 3
<- sum(grid_MH_2$win_switch)) # win by switching
(n_win_switch #> [1] 6
```

This yields the correct solution: \(p(win\ by\ staying) = 3/9 = 1/3\), but \(p(win\ by\ switching) = 6/9 = 2/3\).

In case this result seems odd or counter-intuitive:
Contrast the original problem with an alternative one in which Monty opens a door *prior to* the player’s initial choice (see *Monty first* below). In the *Monty first* case, the player really faces a 50:50 chance (and will often prefer to stick with his or her initial choice, which requires an additional psychological explanation). What is different in the original version, in which the player chooses first? The difference is that — when opening a door — Monty Hall always needs to avoid the car. But in the original problem, the host also needs to conditionalize his action on the player’s initial choice. Thus, Monty Hall adds information to the problem. Whereas the chance of winning by sticking with the initial choice remains fixed at \(p(win\ by\ staying) = 3/9 = 1/3\), the chance of winning by switching to the other door is twice as high: \(p(win\ by\ switching) = 6/9 = 2/3\).

#### A tidyverse variant

The `expand.grid()`

functions above (and its **tidyr** variant `expand_grid()`

) create all combinations of its constituent vectors with repetitions.
This generated many impossible cases, which we eliminated by imposing the constraints of the problem.

However, by only taking into account the location of the `car`

, our setup above ignored that there are two goats (e.g., `g_1`

and `g_2`

) and two distinct arrangements of them, given a specific `car`

location:

```
<- 1:3
car <- 1:3
g_1 <- 1:3
g_2
# All 27 combinations:
<- expand_grid(car, g_1, g_2))
(t_all #> # A tibble: 27 x 3
#> car g_1 g_2
#> <int> <int> <int>
#> 1 1 1 1
#> 2 1 1 2
#> 3 1 1 3
#> 4 1 2 1
#> 5 1 2 2
#> 6 1 2 3
#> 7 1 3 1
#> 8 1 3 2
#> 9 1 3 3
#> 10 2 1 1
#> # … with 17 more rows
# Eliminate impossible cases:
%>%
t_all filter((car != g_1), (car != g_2), (g_1 != g_2))
#> # A tibble: 6 x 3
#> car g_1 g_2
#> <int> <int> <int>
#> 1 1 2 3
#> 2 1 3 2
#> 3 2 1 3
#> 4 2 3 1
#> 5 3 1 2
#> 6 3 2 1
```

Importatnly, this shows that there exist 6 equiprobable setups, rather than just 3 possible car locations: Each car location corresponds to two possible arrangements of the goats.

This raises the question:

- Can we solve the problem by taking into account the goat locations?

Let’s repeating the above simulation with a more complicated starting grid:

```
# Table/space of all 27 x 9 options:
<- expand_grid(car, g_1, g_2, player, open)
grid_MH_3
# Eliminate impossible cases:
<- grid_MH_3 %>%
grid_MH_3 filter((car != g_1), (car != g_2), (g_1 != g_2)) %>% # impossible setups
filter(open != car, open != player) # condition 1 and 2
```

Note that this leaves 24 cases, rather than the 12 cases of `grid_MH_1`

above.

This time, we determine the wins by each strategy by using **dplyr** `mutate()`

commands:

```
<- grid_MH_3 %>%
grid_MH_3 # mutate(win_stay = FALSE, # initialize
# win_switch = FALSE) %>%
mutate(win_stay = (car == player), # determine wins
win_switch = (win_stay == FALSE))
```

This yields the following results:

```
<- sum(grid_MH_3$win_stay)) # win by staying
(n_win_stay #> [1] 12
<- sum(grid_MH_3$win_switch)) # win by switching
(n_win_switch #> [1] 12
```

which — as *staying* and *switching* appear to be equally successful — is wrong again!

Why? Because we made exactly the same error as above: By conditionalized our (larger) number of cases based on Monty’s options for opening doors, we ended up with cases that are no longer equiprobable.

Before we correct the error, note that we could have combined all steps of this simulation in a single pipe:

```
# S1: Simulation in 1 pipe:
expand_grid(car, g_1, g_2, player, open) %>%
filter((car != g_1), (car != g_2), (g_1 != g_2)) %>% # eliminate impossible cases
filter(open != car, open != player) %>% # impose conditions 1 and 2
mutate(win_stay = (car == player), # determine wins
win_switch = (win_stay == FALSE)) %>%
summarise(stay_wins = sum(win_stay), # summarise results
switch_wins = sum(win_switch))
#> # A tibble: 1 x 2
#> stay_wins switch_wins
#> <int> <int>
#> 1 12 12
```

Chunking all into one pipe allows us to easily change our simulation. Here, we can easily correct our error by simply ignoring (or commenting out) all of Monty’s actions of opening doors:

```
# S2: Simulation ignoring all of Monty's actions:
expand_grid(car, g_1, g_2, player, open) %>%
filter((car != g_1), (car != g_2), (g_1 != g_2)) %>% # eliminate impossible cases
# filter(open != car, open != player) %>% # DO NOT impose conditions 1 and 2
mutate(win_stay = car == player, # determine wins
win_switch = win_stay == FALSE) %>%
summarise(stay_wins = sum(win_stay), # summarise results
switch_wins = sum(win_switch))
#> # A tibble: 1 x 2
#> stay_wins switch_wins
#> <int> <int>
#> 1 18 36
```

The simplicity of constructing a simulation pipe also allows us to test other variants. For instance, we could ask ourselves:

Would only using the condition that Monty cannot open the door with the

`car`

(but ignoring that Monty cannot open the door initially chosen by the`player`

) yield the correct solution?Would only using the condition that Monty cannot open the door chosen by the

`player`

(but ignoring that Monty cannot open the door with the`car`

) yield the correct solution?

We can easily check this by separating condition 1 and 2 (and only using one of them):

```
# S3a: Impose only condition 1:
expand_grid(car, g_1, g_2, player, open) %>%
filter((car != g_1), (car != g_2), (g_1 != g_2)) %>% # eliminate impossible cases
filter(open != car) %>% # DO impose condition 1
# filter(open != player) %>% # BUT NOT condition 2
mutate(win_stay = car == player, # determine wins
win_switch = win_stay == FALSE) %>%
summarise(stay_wins = sum(win_stay), # summarise results
switch_wins = sum(win_switch))
#> # A tibble: 1 x 2
#> stay_wins switch_wins
#> <int> <int>
#> 1 12 24
# S3b: Impose only condition 2:
expand_grid(car, g_1, g_2, player, open) %>%
filter((car != g_1), (car != g_2), (g_1 != g_2)) %>% # eliminate impossible cases
# filter(open != car) %>% # DO NOT impose condition 1
filter(open != player) %>% # BUT condition 2
mutate(win_stay = car == player, # determine wins
win_switch = win_stay == FALSE) %>%
summarise(stay_wins = sum(win_stay), # summarise results
switch_wins = sum(win_switch))
#> # A tibble: 1 x 2
#> stay_wins switch_wins
#> <int> <int>
#> 1 12 24
```

The answer in both cases is yes — we obtain a correct solution.
However, both of these last simulations contain cases that could not occur in any actual game:
Monty Hall would either `open`

doors chosen by the `player`

(in S3a)
or `open`

doors containing the `car`

(in S3b).
Thus, these simulations would yield correct solutions, but still not provide a faithful representation of the problem:

- In the MHD, simulations that ignore all of Monty’s actions or only impose one of two conditions on his actions for opening a door yield the correct result (i.e., always
*switching*is twice as likely to win the car as*staying*). However, using*both*of these conditions to eliminate possible cases (i.e., taking into account what he would actually do on any particular game) yields an intuitively plausible, but incorrect solution.

Overall, these explorations enable an important insight:

- There are always multiple simulations possible. Not only are there many different ways of getting a wrong solution, but also different ways of obtaining a correct solution. Obtaining a correct result is no guarantee that a simulation (i.e., the process of using a particular method or strategy) was correct as well.

This is a fundamental point, that may seem quite trivial when illustrating it with a simpler problem:
Stating that the sum of \(1+1\) is 2 does not guarantee that we actually know how to add numbers.
What if we had memorized the answer? Does this mean that memorized answers are always correct?
What if we actually multiplied both addends (i.e., \(1\cdot1\)) and then doubled the result (i.e., \(2\cdot 1\)) to obtain their sum? This seems absurd, but would yield the correct answer in this particular case (but wrong answers for \(1+2\)).
Simulations often suffer from similar complications: They typically recruit complex machinery to answer quite basic questions. If we do not understand how they work, they can easily lead to erroneous results or to correct results for the wrong reasons. Thus, to productively use and enable trust in simulations, it is essential that they are transparent (i.e., we also understand *how* they work).

#### Combinations vs. permutations

An alternative conceptualization of the problem setup could start with the three objects, but rather than expanding all possible combinations and then eliminating cases, we could only enumerate actually possible cases by arranging them (i.e., all their permutations without repetitions).

Start with the three objects:

```
<- c("car", "goat 1", "goat 2")
objects
# Function for all permutations:
# See <https://blog.ephorie.de/learning-r-permutations-and-combinations-with-base-r>
# for explanation of recursion!
<- function(v) {
perm <- length(v)
n if (n == 1) v
else {
<- NULL
X for (i in 1:n) X <- rbind(X, cbind(v[i], perm(v[-i])))
X
}
}
# 6 possible arrangements:
perm(objects)
#> [,1] [,2] [,3]
#> [1,] "car" "goat 1" "goat 2"
#> [2,] "car" "goat 2" "goat 1"
#> [3,] "goat 1" "car" "goat 2"
#> [4,] "goat 1" "goat 2" "car"
#> [5,] "goat 2" "car" "goat 1"
#> [6,] "goat 2" "goat 1" "car"
```

Note: See the blog post on Learning R Permutations and Combinations at Learning Machines for the distinction between combinations and permutations.

This shows that there are only 6 initial setups of the objects.

#### Variations of the MHD

Several variants of the MHD can help to render its premises and solution more clear. Thus, use what you have learned above to adapt the simulation in the following ways:

**More doors**: Extend your simulation to a version of the problem with*100 doors*. After the player picks an initial door, Monty opens 98 of the other doors (all revealing goats) before you need to decide whether to stay or switch. How does this affect the winning probabilities for staying vs. switching?

**Hint:** As it is impractical to represent a game’s setup with 100 columns, this version requires a more abstract representation of the environment and/or Monty Hall’s actions. (For instance, the setup could be expressed by merely noting the car’s location…)

**Monty’s mind**: The classic version of the problem crucially assumes that Monty*knows*the door with the prize and curates the choices of the available doors correspondingly. Determine the player’s winning probabilities for staying vs. switching for the following alternative versions of the game:*Monty first*: The host first opens a door to reveal a goat, before the player selects one of the closed doors as her initial choice. The player then is offered an option to switch. Should she do so?*Ignorant Monty*: The host does not know the car’s location and opens a random door to reveal either a goat or the car. Should the player switch whenever a goat is revealed?*Angelic Monty*: The host offers the option to switch only when the player has chosen incorrectly. What and how successful is the player’s best strategy?*Monty from Hell*: The host offers the option to switch only when the player’s initial choice is the winning door. What and how successful is the player’s best strategy?

#### Additional variations

Many additional versions of the MHD exist.
For instance, the *three prisoners problem* (see below),
Bertrand’s box paradox, and the
boy or girl paradox)
(see Bar-Hillel & Falk, 1982 for psychological examinations.)

### 16.5.4 The sock drawer puzzle

The puzzles from the classic book *Fifty challenging problems in probability* (by Mosteller, 1965) can be solved analytically (Figure 16.5 shows the book’s cover). However, if our mastery of the probability calculus is a bit rusty, we can also find solutions by simulation.

#### The problem

- Use a simulation for solving
*The sock drawer*puzzle (by Mosteller, 1965, p. 1):

- Use an adapted simulation for solving the following variation of the puzzle:

A drawer contains red socks and black socks.

When two socks are drawn at random, the probability that both are of thesame coloris \(\frac{1}{2}\).

(c) How small can the number of socks in the drawer be?

(d) What are the first solutions with an odd number of red socks and an even number of black socks?

Note that the question in (c) is identical to (a), but the second sentence of the introduction has slightly changed: Rather than asking for a pair of red socks, we now require that the probability for both socks of a pair having the *same color* is \(\frac{1}{2}\).

#### 1. Solution to (a)

The following preliminary thoughts help to explicate the problem and discover some constraints:

The puzzle demands that a pair of socks is drawn. Thus, socks in the drawer are drawn at random (e.g., in the dark) and without replacement (i.e., we cannot draw the same sock twice).

Given that there are only two colors of socks (red

`r`

, vs. black,`b`

), drawing a pair of socks can generally yield four possible outcomes:`bb`

,`br`

,`rb`

, and`rr`

. The events`bb`

and`rr`

can only occur if there are at least two socks of the corresponding color in the drawer.For the probability of the target event

`rr`

to be \(\frac{1}{2} = 50\%\), there must be more red socks than black socks in the drawer. Otherwise, the number of events with one or two black socks would outnumber our target event`bb`

.For the target event of

`rr`

to occur, there need to be at least two red socks (i.e., \(|r| \geq 2\)).The simplest conceivable composition of the drawer is

`rrb`

.

A sampling solution:

- Drawing without replacement in R: Use the
`sample()`

function

```
# For a given drawer:
<- c("r", "r", "b")
drawer
# drawing once:
sample(x = drawer, size = 2, replace = FALSE)
#> [1] "b" "r"
# drawing N times:
<- 10000
N <- 0 # count target events "two red"
tr
# loop:
for (i in 1:N){
<- sample(x = drawer, size = 2, replace = FALSE)
s
if (sum(s == "r") == 2){ # criterion: 2 red socks
<- tr + 1
tr
}
}
# probability of tr:
/N
tr#> [1] 0.3362
```

Target event needs to count the number of red socks (i.e., number of times of drawing the element `"r"`

):
Test in R: `sum(s == "r")`

must equal 2.

Create a function that abstracts (or encapsulates) some steps:

```
<- function(drawer, N = 1000){
draw_socks
# initialize:
<- NA
prob <- 0 # count target events "two red"
tr
# loop:
for (i in 1:N){
<- sample(x = drawer, size = 2, replace = FALSE)
s
if (sum(s == "r") == 2){ # criterion: 2 red socks
<- tr + 1
tr
}
}
# probability of tr:
<- tr/N
prob
return(prob)
}
```

Notes:

In this function, the size of each draw (

`size = 2`

), our event of interest (`sum(s == "r")`

), and the frequency of this event (`2`

) were hard-encoded into the function. If we ever needed more flexibility, they could be provided as additional arguments.The function returns a

*probability*(i.e., a number between 0 and 1). The accuracy of its value depends on the number of samples`N`

. Higher samples will yield more stable estimates of this value.

Some initial plausibility checks:

```
# Check:
draw_socks(c("r", "g", "b"))
#> [1] 0
draw_socks(c("b", "b", "r"))
#> [1] 0
draw_socks(c("b", "r"))
#> [1] 0
draw_socks(c("r", "r"))
#> [1] 1
## Note errors for:
# draw_socks(c("r"))
# draw_socks(c("b"))
```

Note that varying the value of `N`

varies the variability of our estimate:

```
<- c("r", "r", "b")
drawer
draw_socks(drawer, 1) # 10^0
draw_socks(drawer, 10) # 10^1
draw_socks(drawer, 100) # 10^2
draw_socks(drawer, 1000) # 10^3
draw_socks(drawer, 10^6) # beware of computation times
```

Note that the range of values theoretically possible can vary between 0 and 1.
However, whereas it actually varies between those extremes for `N = 1`

, the chance of repeatedly obtaining such an extreme outcome decreases with increasing values of `N`

.
Thus, our estimates get more stable as `N`

increases.

Note also that there is a considerable trade-off between accuracy and speed (due to using a loop for sampling events). Using values beyond `N = 10^6`

takes considerably longer, without changing our estimates much.
Thus, we face a classic dilemma between being “quick and noisy” vs. being “slow and accurate.”

What is the lowest value of `N`

that yields a reasonable accuracy?
This raises the question: What is “reasonable?” Actually, this is one of the most important questions in many contexts of data analysis. In the present context, our target probability is specified as the fraction \(\frac{1}{2}\). Does this mean that we would only consider a probability value of exactly \(0.5\) as *accurate*, but any nearby values (e.g., \(0.495\) or \(.503\)) as *incaccurate*?
Realistically, we want some reasonable stability of the 2nd decimal digit (i.e., the percentage value):

```
round(draw_socks(drawer, 10^4), 2) # still varies on the 2nd digit
round(draw_socks(drawer, 10^5), 2) # hardly varies any more
```

Our tests show that a default value of `N = 10^5`

is good enough for (or “satisfies”) our present purposes.

Now we can “solve” the problem for specific drawer combinations. We will initially use a pretty coarse estimate (quick and noisy) and increase accuracy if we reach a promising range:

```
draw_socks(c("r", "r", "b"), 10^3)
#> [1] 0.312
draw_socks(c("r", "r", "b", "b"), 10^3)
#> [1] 0.155
draw_socks(c("r", "r", "r", "b"), 10^3)
#> [1] 0.492
```

The last one seems pretty close. Let’s zoom in by increasing our accuracy:

`draw_socks(c("r", "r", "r", "b"), 10^5) # varies around target value of 0.50`

#### 1. Solution to (b)

Part (b) imposes an additional constraint on the drawer composition: The number of black socks must be even (which excludes our first solution).

Automating the drawer composition by making it a function of two parameters:

```
# Two numeric parameters:
<- 2 # number of red socks
n_r <- 1 # number of black socks
n_b
# Function to make drawer (in random order):
<- function(n_r, n_b){
make_drawer
<- c(rep("r", n_r), rep("b", n_b)) # ordered drawer
drawer
sample(drawer, size = n_r + n_b) # a random permutation
}
# Checks:
make_drawer(4, 0)
#> [1] "r" "r" "r" "r"
make_drawer(3, 1)
#> [1] "r" "b" "r" "r"
make_drawer(2, 2)
#> [1] "r" "r" "b" "b"
```

Note that the `sample()`

function in `make_drawer()`

only creates a random permutation of the drawer elements.
If we trust that the `sample()`

function in `draw_socks()`

draws randomly, we could omit permuting the drawer contents.

Combine both function to create a drawer and estimate the probability for `N`

draws from it in one function:

```
<- function(n_r, n_b, N){
estimate_p
<- make_drawer(n_r, n_b)
drawer
draw_socks(drawer = drawer, N = N)
}
# Checks:
estimate_p(1, 1, 10)
#> [1] 0
estimate_p(2, 0, 10)
#> [1] 1
estimate_p(2, 2, 100)
#> [1] 0.15
estimate_p(3, 1, 1000)
#> [1] 0.516
```

Prepare a grid search for a range of specific parameter combinations:

```
# Parameters:
<- 2:20 # number of red socks
n_r <- 1:20 # number of black socks
n_b
<- expand.grid(n_r = n_r, n_b = n_b)
grid dim(grid)
#> [1] 380 2
```

Importantly, when conducting simulations based on sampling, we should always keep an eye on the computational demands of our simulation.

In the present case, we are considering 380 parameter combinations.
If we want to estimate the probability of our target event on the basis of `N`

draws, the number of samples needed overall varies as a function of `N`

. For instance, if the simulation for `N = 1,000`

takes several seconds, the same simulation for `N = 10,000`

will take minutes, and aiming for `N = 1,000,000`

may take hours or days.

To anticipate and avoid computational explosions, some pragmatic considerations are in order. Two simple approaches for addressing this problem include:

Narrow down the set of candidates considered based on

*a piori*constraints.Identify plausible candidates and increase the accuracy of the analysis only for those candidates.

Using what we know about the problem allows us to narrow down the range of parameters considered in two ways:

```
# Exclude undesired cases:
<- dplyr::filter(grid, n_b %% 2 == 0) # require even values of n_b
grid dim(grid)
#> [1] 190 2
# Exclude impossible cases:
<- dplyr::filter(grid, n_r > n_b) # require more red than blue socks
grid dim(grid)
#> [1] 90 2
```

The even-number constraint reduced the number of candidate pairs of parameters in `grid`

by 50% and the second step shaved off another 50% of cases to consider. This still leaves us with 90 parameter combinations to consider.

We can now compute the probability for each drawer combination in `grid`

.
However, to still use our computational resouces wisely, the second parameter to play with is the number of random draws `N`

in each simulation. To use our limited computing resources wisely, we adopt a multi-step procedure that incrementally narrows down the set of candidates considered:

- For speed’s sake, we start with a coarse estimate (i.e., estimate \(p_{e}\) for a relatively low value of
`N = 1000`

). - Narrow down the set of candidate parameters (by their distance to the target probability of \(p_{t} = \frac{1}{2}\)).
- Investigate only the set of plausible candiates (e.g., \(.40 \leq p_{e} \leq .60\)) more accurately (with a higher value of
`N`

). - Repeat.

The following code pipes narrow down the set of candidate parameters in three steps:

```
# Determine probability for each combination in grid:
# 1. Coarse search for candidates:
<- grid %>%
grid2 mutate(N = 10^3,
prob = purrr::pmap_dbl(list(n_r, n_b, N), estimate_p),
dist = abs(prob - .500)) %>%
filter(dist <= .10)
grid2
# 2. Zoom into the subset of plausible candidates:
<- grid2 %>%
grid3 select(n_r, n_b) %>%
mutate(N = 10^4,
prob = purrr::pmap_dbl(list(n_r, n_b, N), estimate_p),
dist = abs(prob - .500)) %>%
filter(dist < .02)
grid3
# 3. Check the remaining candidates with greater accuracy:
<- grid3 %>%
grid4 select(n_r, n_b) %>%
mutate(N = 10^6,
prob = purrr::pmap_dbl(list(n_r, n_b, N), estimate_p),
dist = abs(prob - .500)) %>%
filter(dist < .005)
grid4
```

Note some additional pragmatic considerations: The goal of our step-wise procedure is to identify plausible candidates and investigate those more thoroughly. Importantly, there are two types of errors:

Missing solutions occur when we exclude parameters that actually happen to be a true solution,

False positives occur when we accept a candidate that actually turns out to be false.

There is a trade-off between both errors: The more liberal (vs. restrictive) we are in accepting candidates, the less (vs. more) likely a miss will occur. Importantly, any miss (i.e., excluded parameters) will never appear again, whereas false positives can still be identified later. Thus, we need to protect ourselves against misses, rather than false positives. This is why the first two steps are more liberal in accepting candidates. This costs us extra computation time, but raises our confidence that we ultimately arrive at the right solutions.

This incremental procedure leaves us with only a few (i.e., typically one or two) results in `grid4`

.
To further increase our confidence in these results, we could repeat the procedure several times and/or increase the demands on the maximal distance `dist`

from our target probability \(p_{t} = \frac{1}{2}\)).
Ultimately, what we are willing to accept as an answer within a sampling approach always depends on what we consider to be close enough.

#### Possible alternatives and improvements

Use a

`while`

loop that increases parameter values (for`n_r`

and`n_b`

) until the computed probability falls within a threshold around the desired value.Use hill climing heuristics to detect trends in computed values (to exclude going further into the wrong direction).

Use numerical optimization methods to find optimal parameter combinations (within constraints).

#### 2. Solution to (c) and (d)

The following preliminary thoughts may help to solve the modified puzzle:

If the probability for both socks being of the

*same*color is \(p_{same} = \frac{1}{2}\), the the complementary probability that both socks are of different color is \(p_{diff} = \frac{1}{2}\) as well.As the problem asks for socks of the “same color,” any answer will be symmetrical in some sense: If \(m\) red and \(n\) black socks provide a solution for one color (e.g., red), then \(n\) red and \(m\) black socks provide a solution for the other color (e.g., black). This can be used to constrain a grid search in a simulation.

This symmetry may motivate the hypotheses that we need a symmetrical or balanced drawer combination and that the drawer containing four socks \(rrbb\) (i.e., two red and two black socks) provides a first solution. However, when enumerating all combinations for drawing a pair of socks out of \(r_{1}r_{2}b_{1}b_{2}\), we quickly see that the possible pairs with two different colors outnumber the two same-color pairs. Thus, our naive hypotheses are misguided and we may actually need an imbalanced drawer to obtain a 50:50 chance for drawing two socks of the same color.

Asking that a random pair of socks must be of the same color is less restrictive than asking for two red socks. To implement this less constrained version, we need to adjust our criterion for counting a successful draw in

`draw_socks()`

.

#### Analytical solutions

Here are some hints for using R to solve the puzzle in an analytical fashion (i.e., using the mathematics provided by probability theory):

For the original puzzle (a) and (b):

The probability for randomly drawing a red pair is

\(P(rr) = \binom{r}{2} / \binom{r+b}{2}\)

Implementation in R with the `choose()`

function:

```
# Function:
<- function(n_r, n_b){
prob_rr
# Frequencies:
<- choose(n_r, 2)
n_rr <- choose(n_r+n_b, 2)
n_all
# Probability:
<- n_rr/n_all
p_rr
return(p_rr)
}
# Check:
prob_rr( 1, 1)
prob_rr( 2, 2)
prob_rr( 3, 1) # 1st solution
prob_rr(10, 4) # close candidate
prob_rr(15, 6) # 2nd solution
prob_rr(20, 8) # close candidate
```

For the puzzle variation (c) and (d):

The probability for both socks in a random pair being of the same color is:

\(P(sc) = (\binom{r}{2} + \binom{b}{2}) / \binom{r+b}{2}\)

Implementation in R:

```
# Function:
<- function(n_r, n_b){
prob_sc
# Frequencies:
<- choose(n_r, 2)
n_rr <- choose(n_b, 2)
n_bb
<- choose(n_r+n_b, 2)
n_all
# Probability:
<- (n_rr + n_bb)/n_all
p_sc
return(p_sc)
}
# Check:
prob_sc( 1, 1)
prob_sc( 2, 2)
prob_sc( 5, 3) # 1st solution
prob_rr(10, 4) # close candidate
prob_rr(15, 6) # 2nd solution
prob_rr(20, 8) # close candidate
```

Given these analytic (and fast) formulas, we can conduct complete grid searches for a much larger grid:

```
# Grid search (for much larger grid):
# Parameters:
<- 1:1000 # number of red socks
n_r <- 1:500 # number of black socks
n_b
<- expand.grid(n_r = n_r, n_b = n_b)
grid dim(grid) # Note the number of cases (rows)!
# (a)+(b) Drawing two red socks:
<- grid %>%
grid_rr mutate(p_rr = purrr::pmap_dbl(list(n_r, n_b), prob_rr),
dist = abs(p_rr - .500)) %>%
filter(dist < .000001) # OR:
# filter(dplyr::near(dist, 0))
grid_rr
# (c) Both socks having the same color:
<- grid %>%
grid_sc mutate(p_sc = purrr::pmap_dbl(list(n_r, n_b), prob_sc),
dist = abs(p_sc - .500)) %>%
filter(n_r %% 2 == 1, n_b %% 2 == 0, dist < .000001) # OR:
# filter(dplyr::near(dist, 0))
grid_sc
```

### 16.5.5 Related probability puzzles

Solve the following puzzles (also from Mosteller, 1965) by conducting similar simulations:

- The
*Successive wins*puzzle (i.e., Puzzle 2 by Mosteller, 1965):

*The flippant juror*puzzle (i.e., Puzzle 3 by Mosteller, 1965):

*The prisoner’s dilemma*puzzle (i.e., Puzzle 13 by Mosteller, 1965):

**Hint:**
Note the structural similarity of *The prisoner’s dilemma* puzzle to the *Monty Hall* dilemma.

### 16.5.6 Coin tossing reloaded

This exercise involves one of the most famous phenomena in the psychology of human judgment and decision making.
The question *How do people perceive random events?* is the topic of a heated debate and countless publications in the most prestigious journals.

#### The (mis-)perception of randomness?

A popular task presented in this context is the following:

Which of the following sequences is more likely when repeatedly tossing a fair coin (with two possible outcomes: heads H or tails T):

- (a) H H H H
- (b) H T H T
- (c) H H H T

Most people intuitively think that regularity is positively correlated to rarity.
As sequences (a) and (b) appear as pretty regular, (c) seems more likely than (b), and both seem more likely than (a).
However, the intended solution of those who typically ask the question (e.g., Kahneman & Tversky, 1972b) is:
“All three sequences are equally likely.”
When contrasted with this solution, the intuitive answer appears to illustrate a *bias* in the human perception of random events.

The question whether human perception of chance or randomness is biased or adaptive, is a controversial issue in psychology. Its answer depends — as always — on what exact assumptions we make about the question or its premises. Given appropriate assumptions, seemingly adaptive or biased distortions can even emerge as two sides of the same coin. In this exercise, we will develop two sampling models to argue for both positions and explicate how their premises and conclusions differ.

#### The standard argument

The standard recipe for demonstrating a *bias* in human cognition proceeds as follows:

*Task*: Present some seemingly simple puzzle.*Data*: Collect “evidence” to show that people get it wrong.*Solution*: Present some normative model that seems to solve the puzzle.*Conclude*: People are biased.*Explanation*: Make up some evolutionary just-so story to “explain” why people are biased.

This recipe is widely used (e.g., see Wikipedia’s list of cognitive biases) and — if we manage to ignore the logical flaw that many “biases” come in two opposite varieties — highly influential and successful (in terms of publishing papers or winning Nobel prizes). Steps 1–4 are necessary for scoring an effect in an audience or publishing a paper (even though the “evidence” only adds some optional spice, as anyone looking at the puzzle will intuitively know how people will respond). The “explanation” in Step 5 is optional, but helps for a more serious impression or scientific flavor.

The people primarily interested in demonstrating (or “coining”) some bias usually stop here. But rather than just showing a curious effect, the more interesting questions are:

*Why*do people seem to get this wrong?- What are the reasons that humans have evolved in this particular direction?

It’s always easy to answer this question concerning the *adaptive* nature of the phenomenon by an evolutionary “just so” story (e.g., one that involves 2 systems, one of which fails to monitor the other)^{15}. However, if we think more carefully about the issue, it often turns out that there are good reasons for a seemingly distortive mechanism. These reasons typically have to do with systematic structures in our environment and the adaptive nature of cognition picking up these regularities.

The normative model used in Step 3 of the recipe typically appears as some mathematical formula (e.g., Bayes’ theorem) or some other form of probability theory. That’s also where R and simulations by sampling can come into play. The standard way of demonstrating the correct answer to queries about the probability of a sequence is the combinatorial tree representing all sequences of a given length. For instance, the blog post The Solution to my Viral Coin Tossing Poll (of the Learning Machines blog, on 2021-04-28) provides the following R code (adapted to our simpler version of the problem):

```
# Events/patterns of interest:
<- c("H", "H", "H", "H")
a <- c("H", "T", "H", "T")
b <- c("H", "H", "H", "T")
c
# Generate all possible cases (of length k=4):
<- 4
k <- expand.grid(rep(list(c("H", "T")), k))
all_cases
# As matrix:
<- as.matrix(all_cases)
mx # t(mx) # transposes mx
# Count:
sum(colSums(t(mx) == a) == k) # how many rows of mx equal a?
#> [1] 1
sum(colSums(t(mx) == b) == k) # how many rows of mx equal b?
#> [1] 1
sum(colSums(t(mx) == c) == k) # how many rows of mx equal c?
#> [1] 1
```

**Interpretation:**
This code provides a solution to the problem by first enumerating all possible cases and then counting how often each pattern of interest occurs in them.
The key step here consists in using the `expand.grid()`

function to generate all possible sequences of events `H`

and `T`

with a length of \(k=4\) (i.e., the combinatorial tree). This grid is then turned into a matrix (a step that could be skipped).
The `sum()`

statements count the number of cases in which the columns of the transposed matrix exactly match our sequences of interest (`a`

, `b`

, or `c`

). However, if one understands what `expand.grid()`

does, this counting step is also not needed (but may be helpful when considering longer sequences).

Overall, the code appears to demonstrate that — given all possible series with `H`

and `T`

of length 4 — each sequence of interest occurs exactly once.
Hence, it is concluded that all three sequences of interest are equally likely (i.e., occurring with a probability of \(P(a) = P(b) = P(c) = \frac{1}{16}\)).
The silent premise made here is: A fair coin is thrown exactly \(k=4\) times, but the probability of each possible outcome corresponds to the long-term outcomes if this process was repeated many (or an infinite number of) times.

Your tasks are to conduct two simulations with slightly different setups:

The code above enumerated the number of possible coin sequences of length \(k=4\). Alternatively, we can simulate an “experiment” that tosses a fair coin many times (e.g., observe 1 million tosses) and then count the actual occurrence of each event (

`a`

,`b`

, or`c`

) in this sequence. What would we conclude from this procedure/measure?A critic claims that people never experience infinite random events (like 1 million tosses of a coin). Instead, they experience many shorter sequences (e.g., 50,000 sequences of length 20). What matters for the human experience of a pattern’s frequency is how many of these sequences contain the event in question. This can be quantified by the average

*wait time*for each event or the probability of*non-occurrence*of each event in the set of all sequences. What would we conclude from this procedure/measure?

Conduct a simulation for each of these points and compare their results.

#### Argument by Simulation 1

**Strategy:** Sample a very long sequence of events and count number of occurrences:

Generate one very long series of (e.g., 1 million) random coin tosses

Optional: Convert the series and the events of interest into strings (to facilitate pattern matching).

Iterate through the sequence and ask: How often does each sequence occur?

We first use `sample()`

to simulate 1 million tosses of a fair coin:

```
<- sample(c("H", "T"), size = 10^6, replace = TRUE, prob = c(.5, .5))
tosses # tosses
```

To faciliate finding our patterns of interest in `tosses`

, we convert all into strings:

```
# Convert to strings:
<- paste(tosses, collapse = "")
toss_s # toss_s
nchar(toss_s)
#> [1] 1000000
# Events (as strings):
<- paste(a, collapse = ""))
(a_s #> [1] "HHHH"
<- paste(b, collapse = ""))
(b_s #> [1] "HTHT"
<- paste(c, collapse = ""))
(c_s #> [1] "HHHT"
```

As a quick check of pattern matching (using regular expressions) reveals that the corresponding counts provided by standard functions would be wrong, as they would fail to count overlapping matches:

```
gregexpr(pattern = "HH", text = "HHHHH")
::str_locate_all(string = "HHHH", pattern = "HH") stringr
```

Thus, we write an iterative function `count_occurrences()`

that moves a window of a pattern size `n_p`

over the entire width of a text string `n_t`

and counts any match it encounters:

```
# Count number of substring occurrences (including overlapping ones):
<- function(pattern, text){
count_occurrences
<- 0 # initialize
count
<- nchar(pattern) # width of pattern
n_p <- nchar(text) # width of text
n_t
for (i in (1:(n_t - n_p + 1))){
# Current candidate:
<- substr(text, start = i, stop = (i + n_p - 1))
candidate
if (candidate == pattern){ # pattern is found:
<- count + 1 # increment counter
count
}
# for end.
}
return(count)
}
# Check:
count_occurrences("XX", "X_X")
#> [1] 0
count_occurrences("XX", "XX_X")
#> [1] 1
count_occurrences("XX", "XX_XX")
#> [1] 2
count_occurrences("XX", "XXX")
#> [1] 2
count_occurrences("XX", "XXXXX")
#> [1] 4
```

Our checks verify that `count_occurrences()`

also counts overlapping patterns.
We can use this function to count the number of times each pattern of interest occurs in our tosses (both as strings):

```
# Counts:
count_occurrences(a_s, toss_s) # counts for a
#> [1] 62793
count_occurrences(b_s, toss_s) # b
#> [1] 62302
count_occurrences(c_s, toss_s) # c
#> [1] 62569
```

**Interpretation:**
The counts indicate that three events appear about equally often in our sample (with minor differences being due to random fluctuations).
Hence, we conclude that the three patterns of events/sequences are equally likely.

#### Counter-argument by Simulation 2

**Strategy:** Sample a large number of shorter sequences and count the number/probability of non-occurrence for each event of interest:

Assume that people naturally encounter

*many finite*sequences (e.g., 50,000 sequences of length 20), rather than*one infinite*one (Hahn & Warren, 2009).What matters for our natural experience of patterns is the average

*wait time*for an event (Gardner, 1988, p. 61ff.). As sequences consisting entirely of H or of T tend to cluster together (as HHH can only occur after HH, etc.), the average wait time for an event of HHHH is longer than the average wait time for HTHT.The average wait time corresponds to the

*probability of non-occurrence*of an event in a sequence (of finite length).

We now use `sample()`

to simulate a much shorter sequence of 20 tosses of a fair coin (as a string), but collect 50,000 such sequences in a vector `toss_s`

(representing each toss as a string of characters):

```
# Tossing a fair coin 10 times (as string):
<- 20
n_toss <- sample(c("H", "T"), size = n_toss, replace = TRUE, prob = c(.5, .5))
toss <- paste(toss, collapse = ""))
(toss_s #> [1] "HTHTTHTTHTHTHTHHHTHH"
# Sample many finite sequences:
<- 10^6/n_toss
n_seq <- rep(NA, n_seq)
toss_s
for (i in 1:n_seq){
<- sample(c("H", "T"), size = n_toss, replace = TRUE, prob = c(.5, .5))
toss
<- paste(toss, collapse = "") # as string
toss_s[i]
}
head(toss_s)
#> [1] "HHHTTTHTTTHHHTHHTHTT" "HTHTHHHHTTTTTTTTTTTH" "HTTHHTTTTTHHTTTHTHHH"
#> [4] "HHHTTHTTTHTTTTTTTTTH" "TTTTTTHTTHHHTHTHTHHT" "HTHTTTTTTTHTHTTHHTHT"
```

Note that the total number of coin tosses considered here is the same as in Simulation 1.
If we wanted, we could also have recycled our data (from `toss_s`

above) and split it into 5^{4} sequences of length 20 each. If we trust our random generator, sampling new coin tosses here should not make a difference.

At this point, we *could* be tempted to count the occurrence of each pattern of interest in our shorter strings:

```
## Red herring step:
# Count number of occurrences of each pattern:
# Counts (using base::sapply functions):
sum(sapply(X = toss_s, FUN = count_occurrences, pattern = a_s))
#> [1] 52969
sum(sapply(X = toss_s, FUN = count_occurrences, pattern = b_s))
#> [1] 53000
sum(sapply(X = toss_s, FUN = count_occurrences, pattern = c_s))
#> [1] 53111
# Using purrr::map functions:
# sum(purrr::map_dbl(toss_s, count_occurrences, pattern = a_s))
# sum(purrr::map_dbl(toss_s, count_occurrences, pattern = b_s))
# sum(purrr::map_dbl(toss_s, count_occurrences, pattern = c_s))
```

Note that these counts are considerably lower than those above. This is due to the fact that a very long sequence was cut into many pieces, so that the first and last elements of each sequence can no longer match a pattern of length \(k=4\) (in both directions). However, as we are only interested in comparing the relative counts for each pattern, our conclusion would not change.

**Interpretation:**
It appears that the three events of interest still occur equally often.
Hence, we would conclude that the three patterns of events/sequences are equally likely.

*However*, rather than counting the total number of appearances of a pattern,
we wanted to consider the *probability of non-occurrence* of a pattern in a finite sequence:

```
## Intended step/strategy:
# Consider average wait times (Gardner, 1988, p. 61f.) or
# p(non-occurrence) in a string.
# Ask how often did something NOT occur:
# Counts of non-occurrence (using sapply):
sum(sapply(X = tosses, FUN = count_occurrences, pattern = a_s) == 0)
#> [1] 1000000
sum(sapply(X = tosses, FUN = count_occurrences, pattern = b_s) == 0)
#> [1] 1000000
sum(sapply(X = tosses, FUN = count_occurrences, pattern = c_s) == 0)
#> [1] 1000000
# Using purrr::map functions:
# sum(purrr::map_dbl(tosses, count_occurrences, pattern = a_s) == 0)
# sum(purrr::map_dbl(tosses, count_occurrences, pattern = b_s) == 0)
# sum(purrr::map_dbl(tosses, count_occurrences, pattern = c_s) == 0)
```

These counts show some curious differences — and their magnitude seems too large to be due to random fluctuations:
As a higher count of non-occurrence corresponds to a lower frequency, the frequency ranks of our three events of interest appear as `a < b < c`

(i.e., they point in the intuitive direction).

To investigate this in more detail, we count the frequency of *all* possible events in `mx`

(i.e., all possible toss outcomes of length \(k=4\)) in our set of coin `toss_s`

:

```
# Analysis: Compute p-non-occurrence for ALL possible outcomes in mx:
# Initialize:
<- nrow(mx))
(nr_events #> [1] 16
<- rep(NA, nr_events)
n_no_occurrence <- rep(NA, nr_events)
cur_event
# Note: Event/outcomes in row 10 of mx:
# paste(mx[10, ], collapse = "")
# Loop through all possible events of mx:
# (Note: This may take a few seconds/minutes!)
for (i in 1:nr_events){
<- paste(mx[i, ], collapse = "")
cur_event[i]
<- sum(sapply(X = toss_s, FUN = count_occurrences,
n_no_occurrence[i] pattern = cur_event[i]) == 0)
}
```

Given that this loop cycles 16 times through 5^{4} sequences of length 20, it may take a few seconds or minutes.
After its evaluation, the vector `n_no_occurrence`

contains the number of non-occurrences of every pattern in all sequences.
Thus, `n_no_occurrence`

can easily be translated into the probability of non-occurrence for every pattern:

```
# Probability (of non-occurrence):
<- n_no_occurrence/n_seq)
(p_no_occurrence #> [1] 0.52446 0.25442 0.30896 0.25392 0.30624 0.35842 0.30418 0.25526 0.25416
#> [10] 0.30794 0.35920 0.31068 0.25406 0.31004 0.25660 0.52616
```

Figure 16.10 illustrates the probabilities of non-occurrence for each possible pattern of 4 coin tosses in finite strings of length 20. The dashed vertical line indicates the symmetry axis due to there being two binary outcomes. As any deviations from symmetry would indicate random fluctuations, the degree of the curve’s symmetry allows judging the robustness of our simulated data.

**Interpretation:**
Thus, the chance that (a) HHHH does not occur is higher than that (b) HTHT does not occur, and both are higher than the probability that (c) HHHT does not occur.
Expressed in terms of frequency:
Finite sequences in which the sub-sequences (c) HHHT or (b) HTHT occur (one or more times) actually *are more frequent* than sequences in which (a) HHHH occurs — exactly as humans would intuitively predict.
Thus, rather than illustrating a cognitive bias, this suggests that our intuition accurately reflects the finite way in which we naturally experience random sequences (see Hahn & Warren, 2009, for the details of this argument).

#### Conclusion

Our two simulations seem to support conflicting conclusions: The human perception of randomness either appears systematically biased or well-adapted to the actual likelihood of the relevant events. This may partly explain why there are huge theoretical debates regarding the adaptive or biased nature of human perception of random events. Importantly, our simulations also reveal the reasons for the conflicting interpretations: We first counted the number of pattern matches in a very long (near-infinite) sequence of coin tosses, before counting the number of patterns not occurring in shorter (finite) sequences of coin tosses. Which simulation we should trust depends on additional assumptions that determine which count makes more sense.

When evaluating the scientific debate with a sober mind, we see that there is no paradox or logical contradiction involved. Instead, the two positions in this debate adopt different perspectives and start from different premises regarding the data and measures that must be considered when evaluating the likelihood of random events. Overall, this shows that carefully conducting formal simulations helps explicating hidden premises and explains seemingly contradictory conclusions.

### References

*Acta Psychologica*,

*44*(3), 211–233. https://doi.org/10.1016/0001-6918(80)90046-3

*Cognition*,

*11*(2), 109–122. https://doi.org/10.1016/0010-0277(82)90021-X

*Time travel and other mathematical bewilderments*. W.H. Freeman.

*Psychological Review*,

*116*(2), 454–461. https://doi.org/10.1037/a0015241

*ORI Research Monographs*,

*1*(12), 430–454.

*Cognitive Psychology*,

*3*(3), 430–454. https://doi.org/10.1016/0010-0285(72)90016-3

*Fifty challenging problems in probability with solutions*. Addison-Wesley.

*Frontiers in Psychology*,

*11*, 567817. https://doi.org/10.3389/fpsyg.2020.567817

*Parade Magazine*,

*(September 9)*, p. 15.

*American Statistician*,

*29*, 67. https://doi.org/10.1080/00031305.1975.10479121

In my view, so-called

*dual systems*perspectives are serious rivals to Freud’s theory of 3 agents/instances (id, ego, super-ego) for the title of the most successful theory in psychology that is totally unproductive. The key problem of both theories is that they can always explain everything — including the opposite of everything.↩︎