# Chapter 9 Pseudo-counterfactual analysis

The word “assumption” has been used, and perhaps overused, throughout the sections on the conservative, parsimonious and intermediate solutions of all kinds. The time is ripe to discuss yet another assumption, perhaps the biggest myth of all in terms of scope.

The information presented so far, especially in the previous section on filtering remainders to produce intermediate solutions, is based on the implicit assumption that once the remainders are established, the classical Quine-McCluskey algorithm steps in to find the prime implicants.

Let us recall how the simplifying assumptions are understood: a subset of the remainders that are actually used in the minimization process, those that play an active role in producing the prime implicants. This is in line with the philosophy of the QMC algorithm, which compares all possible pairs of configurations (those observed with a positive outcome, plus the included remainders) to identify those differing by a single literal, and iteratively minimizes them until nothing can be minimized further.

Except for a very small number of developers, the vast majority of QCA users do not have a very clear idea of how the programming of the QMC algorithm really works; for them, it is just a black box that outputs the expected results.

On the other hand, researchers do expect the remainders to actually be *used* in the minimization process. Much to everyone’s surprise, this is not how prime implicants are produced in the current version of package **QCA**. The final solutions are exactly the same, but they have almost nothing to do with the remainders, be they good, tenable, untenable, easy, difficult, incoherent, implausible or anything else.

It is easy to imagine this is an almost perplexing statement, one that raises potential mistrust in the package (after all, it does not follow the classical, exact algorithm put forward by Quine and McCluskey), or at the very least two immediate questions:

- How does package **QCA** find the correct solutions, if no remainders are actually used?
- If the solutions are correct (and they are), what is then left of the entire theoretical corpus around the remainders? Are all of those publications beside the point?

The short answer is: no. All publications around the remainders remain valid; it is only the procedure to derive the various categories of remainders that is different. While the current methodology focuses on which remainders are *included* in the minimization (following the classical QMC algorithm), package **QCA** concentrates, quite the opposite, on the configurations that are *excluded* from the minimization process.

The end result is exactly the same, much like a simple addition such as \(P + N + R = S\), where the parameters \(P\) and \(N\) are always known (the positive and the negative output configurations). If \(P = 4\) and \(N = 5\), the equation becomes \(4 + 5 + R = S\). By filtering the remainders, suppose three of them are identified, so \(R = 3\); then \(S\) can be calculated as 12. But the reverse is also true: knowing the end result \(S = 12\), we can derive \(R = 3\).

This highly simplified example is only a conceptualization, of course, but it is an accurate description of what happens in package **QCA**: it can calculate the end result \(S\) based on the first two parameters \(P\) and \(N\), and reverse-engineer the calculation of parameter \(R\) by using the result \(S\).

From an algebraic point of view this looks like a magician pulling a rabbit out of an empty hat, for an end result involving an unknown quantity should itself be unknown. But by exploiting some very precise regularities of the minimization process, this has been proven possible. Contrary to common perception, the remainders are never actually *used* in the minimization; they are only derived after the solutions have been generated.

## 9.1 eQMC

How exactly this is possible was first described by Dușa (2007), and later formalized by Dușa and Thiem (2015). The idea is very simple and has been implemented in package **QCA** since version 0.6-5, back in 2007, based on two absolutely basic observations about the prime implicants:

- they are always supersets of the initial positive output configurations
- they are never supersets of the negative output configurations

The quest was to find all supersets of the positive output configurations that, at the same time, were not supersets of the negative output configurations, and it was a simple matter of applying three functions:

- a function `findPrimes()`, later renamed to `findSupersets()`
- the base function `setdiff()`, to obtain the set difference: all elements in the first set that are not present in the second
- a function to remove the redundant implicants, that is, supersets that had supersets of their own (meaning they were implicants but not *prime* implicants)

Finding all supersets of a given set of positive output configurations builds on previous findings from Dușa (2010), where two other helpful ideas were introduced:

- using the implicant matrix, which for binary crisp sets is the same as the so-called \(3^k\) matrix (where \(k\) is the number of causal conditions)
- instead of searching through all columns of the implicant matrix, it is simpler to use just its row numbers, so the information from an entire matrix can be compressed into a single vector.
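The row-number idea is easy to sketch in base R. The helper below is hypothetical (not the package’s internal code): it maps a configuration coded in the implicant matrix — 0 for a minimized condition, 1 for absence, 2 for presence — to its row number in the \(3^k\) matrix, by reading the values as base-3 digits:

```r
# Hypothetical helper, not the package internals: the row number of a
# configuration in the 3^k implicant matrix, reading its values
# (0 = minimized, 1 = absent, 2 = present) as base-3 digits
rowno <- function(v) {
  k <- length(v)
  1 + sum(v * 3^((k - 1):0))
}

rowno(c(0, 1, 0, 0))  # 10, the row of the expression ~B
rowno(c(1, 0, 0, 0))  # 28, the row of the expression ~A
```

With four conditions, an entire set of implicants can thus be stored as a single vector of numbers between 1 and \(3^4 = 81\).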

All this information might seem too abstract and unrelated to the Boolean minimization process, but a simple example will reveal the underlying idea.

A hypothetical dataset (HD) will be used for both the classical QMC algorithm and this new procedure. First, suppose we have empirical evidence for all possible configurations of causal conditions (no limited diversity), amounting to a (c)ompletely specified truth table:

```
HD <- cbind(createMatrix(rep(2, 4)), rep(c(1, 0), c(12, 4)))
colnames(HD) <- c(LETTERS[1:4], "Y")
ttHDc <- truthTable(HD, outcome = "Y")
ttHDc
```

```
OUT: output value
n: number of cases in configuration
incl: sufficiency inclusion score
PRI: proportional reduction in inconsistency
A B C D OUT n incl PRI
1 0 0 0 0 1 1 1.000 1.000
2 0 0 0 1 1 1 1.000 1.000
3 0 0 1 0 1 1 1.000 1.000
4 0 0 1 1 1 1 1.000 1.000
5 0 1 0 0 1 1 1.000 1.000
6 0 1 0 1 1 1 1.000 1.000
7 0 1 1 0 1 1 1.000 1.000
8 0 1 1 1 1 1 1.000 1.000
9 1 0 0 0 1 1 1.000 1.000
10 1 0 0 1 1 1 1.000 1.000
11 1 0 1 0 1 1 1.000 1.000
12 1 0 1 1 1 1 1.000 1.000
13 1 1 0 0 0 1 0.000 0.000
14 1 1 0 1 0 1 0.000 0.000
15 1 1 1 0 0 1 0.000 0.000
16 1 1 1 1 0 1 0.000 0.000
```

This is not only a hypothetical truth table but also an unrealistic one, with 12 out of the 16 possible configurations having a positive output and the remaining 4 a negative one. It is presented here for demonstration purposes only, but it nevertheless has a lot of potential to show how the QMC algorithm really works.

The classical QMC minimization generates two prime implicants \(\sim\)A and \(\sim\)B:

`[1] "~A" "~B"`

In a real situation, it would be close to impossible to find empirical evidence for all 16 possible configurations, and the truth table would display the observed positive and negative configurations, as well as the remainders, marked with a question mark “`?`” in the OUTput column. Completely specifying the truth table just simulates the process of including the remainders.

But suppose only the first four configurations have a positive output, leading to an (i)ncompletely specified and more realistic truth table:

```
OUT: output value
n: number of cases in configuration
incl: sufficiency inclusion score
PRI: proportional reduction in inconsistency
A B C D OUT n incl PRI
1 0 0 0 0 1 1 1.000 1.000
2 0 0 0 1 1 1 1.000 1.000
3 0 0 1 0 1 1 1.000 1.000
4 0 0 1 1 1 1 1.000 1.000
5 0 1 0 0 ? 0 - -
6 0 1 0 1 ? 0 - -
7 0 1 1 0 ? 0 - -
8 0 1 1 1 ? 0 - -
9 1 0 0 0 ? 0 - -
10 1 0 0 1 ? 0 - -
11 1 0 1 0 ? 0 - -
12 1 0 1 1 ? 0 - -
13 1 1 0 0 0 1 0.000 0.000
14 1 1 0 1 0 1 0.000 0.000
15 1 1 1 0 0 1 0.000 0.000
16 1 1 1 1 0 1 0.000 0.000
```

Minimizing this truth table (this time including the remainders) leads to the same two prime implicants \(\sim\)A and \(\sim\)B:

`[1] "~A" "~B"`

The same prime implicants are obtained with the eQMC method, using the `findSupersets()` function applied to the first four rows and the first four columns of the HD data (the same as the first four configurations of the truth table), to yield the following superset row numbers from the \(3^k\) matrix:

```
[1] 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 28 29 30 31 32
[23] 33 34 35 36 37 38 39 40 41 42 43 44 45
```

There are of course more rows than the 16 from the truth table, because in this example the implicant matrix has \(3^4 = 81\) rows. The \(3^k\) matrix is signaled by using the vector `rep(3, 4)`, where the number 3 literally implies \(3^4\), and the configurations from `ttHDc` are raised into the implicant matrix by adding 1 to their values, using the value 2 for the presence and the value 1 for the absence of a causal condition.

The function `findSupersets()` is next applied to the last four rows of the `HD` matrix, to find all supersets of the negative output configurations:

```
[1] 2 3 4 5 6 7 8 9 19 20 21 22 23 24 25 26 27 55 56 57 58 59
[23] 60 61 62 63 73 74 75 76 77 78 79 80 81
```

The next step is to find all supersets from the positive output configurations, that are *not* supersets of the negative output configurations:

```
[1] 10 11 12 13 14 15 16 17 18 28 29 30 31 32 33 34 35 36 37 38 39 40
[23] 41 42 43 44 45
```

These are all supersets of the positive output configurations, but most of them are redundant because there are bigger supersets in this vector. The final step of the procedure is to remove all redundant supersets, to obtain the minimal, non-redundant prime implicants. It involves two functions called `removeRedundants()` and `getRow()`, which are internal, not designed for the user level (therefore undocumented), but it is easy to see what they do:

`[1] 10 28`

These are the row numbers in the \(3^k\) matrix corresponding to the \(\sim\)B and \(\sim\)A prime implicants, recalling that a value of 0 signals a minimized condition, a value of 1 signals the absence of a condition, and a value of 2 signals the presence of a condition:

```
[,1] [,2] [,3] [,4]
[1,] 0 1 0 0
[2,] 1 0 0 0
```

The entire procedure is encapsulated in the **e**QMC minimization method:

`[1] "~A" "~B"`

The resulting prime implicants are exactly the same, despite using a very different procedure, which brought a great deal of speed compared with the previous one while consuming a lot less memory (for 15 causal conditions, about 15 MB compared to 1.4 GB for the classical QMC). It also made it possible to explore up to 18 conditions at once; the three additional conditions mean \(3^3 = 27\) times more complex situations than before.

Past that threshold, the eQMC algorithm reaches its limits. The culprit is not so much memory consumption, although at an extremely high number of causal conditions the generated vectors of superset row numbers can potentially grow larger than the available memory. More problematic, past 19 causal conditions the row numbers cannot even be represented on 32-bit computers, which have an upper representation limit of \(2^{31}\): still higher than \(3^{19}\), but not enough for \(3^{20}\), the size of the implicant matrix for 20 causal conditions.

The more serious issue is the calculation time, especially at the final step to remove the redundant implicants: for millions or tens of millions of implicants, even this “quick” process can take a lot more time than expected.

As shown, the input of the classical QMC procedure consists of 12 positive output configurations, either having a completely specified truth table or including the 8 remainders that are treated the same as the empirically observed positive output configurations. In the eQMC procedure, sifting through the supersets of the 4 positive and 4 negative output configurations generates the very same prime implicants, without even touching the “included” remainders.

One might argue this is not entirely accurate, because the remainders are still somehow involved in this alternative procedure, among the redundant superset implicants. This would be a valid counter-argument, if not for a third minimization method called “CCubes”, which finds the prime implicants not by searching through supersets but by using consistency cubes.

## 9.2 Consistency Cubes

The hypothetical dataset in the previous section is highly straightforward, with both \(\sim\)A and \(\sim\)B showing perfect consistency with the presence of the outcome. For the next procedure, suppose we have the following simulated truth table, with three positive and three negative output configurations:

```
tt <- matrix(c(1,0,0,0,1,1, 0,0,1,0,0,1,
               1,0,0,1,1,1, 0,1,1,0,1,1,
               1,1,1,0,0,0), nrow = 6)
dimnames(tt) <- list(c(11, 2, 6, 3, 12, 16), c("A", "B", "C", "D", "OUT"))
tt
```

```
A B C D OUT
11 1 0 1 0 1
2 0 0 0 1 1
6 0 1 0 1 1
3 0 0 1 0 0
12 1 0 1 1 0
16 1 1 1 1 0
```

Similar to the eQMC method, the input for this procedure does not involve any remainders but only the positive and the negative output configurations. The following definition is necessary to properly describe how consistency cubes work:

Definition 9.1: A prime implicant is the simplest possible, non-redundant, fully consistent superset of any positive output configuration.

In this definition, fully consistent means it is not a superset of an observed negative output configuration. The simplest possible superset is a single condition (either its presence or its absence). Figure 9.1 below is a graphical representation of the truth table `tt`, from which it can be observed that none of the conditions is fully consistent in its presence. There is only one condition (C) that is fully consistent in its *absence*, but it does not cover all observed positive configurations, so the search continues at the next level of complexity.

At the next level of complexity, superset expressions are searched for in all combinations of 2 conditions out of 4, and within each combination, in all possible pairs of their respective levels. If none are found at this level, the search continues in all combinations of 3 conditions out of 4, until everything is exhausted.
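A base-R sketch of the level `c = 1` check (repeating the truth table for self-containment; the helper name is hypothetical): a single condition at a given level is fully consistent if it appears in some positive configuration and in no negative one:

```r
# The simulated truth table from above, repeated for self-containment
tt <- matrix(c(1,0,0,0,1,1, 0,0,1,0,0,1, 1,0,0,1,1,1,
               0,1,1,0,1,1, 1,1,1,0,0,0), nrow = 6)
dimnames(tt) <- list(c(11, 2, 6, 3, 12, 16), c("A", "B", "C", "D", "OUT"))

pos <- tt[tt[, "OUT"] == 1, 1:4, drop = FALSE]
neg <- tt[tt[, "OUT"] == 0, 1:4, drop = FALSE]

# Hypothetical helper: a single condition at a given level (1 = presence,
# 0 = absence) is fully consistent if it occurs in some positive
# configuration and in no negative configuration
consistent1 <- function(cond, level) {
  any(pos[, cond] == level) && !any(neg[, cond] == level)
}

consistent1("C", 0)  # TRUE:  ~C is fully consistent
consistent1("C", 1)  # FALSE: C is present in negative configurations
```

Only \(\sim\)C passes the test, and since it does not cover the positive configuration on row 11 (where C is present), the search must continue with pairs of conditions.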

Since binary crisp conditions are just a subset of the general multi-value conditions, a complete search space \(S\) containing all these possibilities can be calculated as the sum of all combinations of `c` selected conditions out of `k`, times the product of the respective numbers of levels \(l_s\) for each such selected combination:

\[\begin{equation} S_{MV} = \sum_{c = 1}^{k}\binom{k}{c}\prod_{s = 1}^{c} l_s \tag{9.1} \end{equation}\]

The search space for binary crisp sets (CS) involves just two levels (0 and 1) for each condition, and the equation above results in exactly \(3^k - 1\) possible expressions, describing what Ragin (2000, 132–36) calls “groupings”. These are all possible configurations from the \(3^k\) implicant matrix specific to binary crisp sets:

\[\begin{equation} S_{CS} = \sum_{c = 1}^{k}\binom{k}{c}\prod_{s = 1}^{c} 2 = \sum_{c = 1}^{k}\binom{k}{c}2^c = 3^k - 1 \tag{9.2} \end{equation}\]

Using a similar technique, Ragin lists all possible expressions combining presence and absence of causal conditions in table 5.4 (p.134). His search technique is very similar to this bottom-up approach (starting from the simplest expressions), using what he calls the “containment rule” to eliminate more complex, redundant expressions.

The efficiency of the bottom-up approach (from the simplest to the most complex expressions) can be measured by the performance of the CCubes algorithm, which combines the properties of equation (9.1) with the requirements of Definition 9.1. These are the key ingredients that make the CCubes algorithm very fast.

At the next level of complexity, `c = 2`, figure 9.2 displays all possible combinations of presence and absence for the conditions A and D, out of which two pairs are consistent: \(\sim\)A\(\cdot\)D and A\(\cdot{\sim}\)D are simultaneously associated with an observed positive output (above the separator), while at the same time not associated with an observed negative output (below the separator).

Both sides of the cube have to be lit below the separator in order to prove inconsistency (as for the fourth implicant, A\(\cdot\)D); otherwise, in the absence of complete negative cubes, they are consistent with the positive ones.

### Search space

Exploring the entire search space is a time consuming operation. To make sure the results are exhaustive, all possible combinations of conditions must be verified, along with all their combinations of levels. There are 4 pairs of such combinations of levels in figure 9.2, but there are 6 combinations of 2 conditions out of 4, each with 4 pairs of levels, consuming 48 computer cycles (six times four times two comparisons).

To tame this exponential computational problem, the search space is partitioned into layers of complexity, starting with the least complex (`c = 1`) up to a maximum of `k` causal conditions. A second strategy to eliminate redundant search becomes evident by noticing that not all combinations of levels need to be checked, as the relevant ones are already found in the truth table configurations, thus solving another usual problem (memory consumption) in such search strategies.

In figure 9.2 there are four combinations of levels (presence and absence) for the two conditions A and D, but only two of them are relevant, and they are already found in the observed configurations (the other two combinations of levels will certainly prove void, as they are not observed).

```
A D OUT
11 1 0 1
2 0 1 1
6 0 1 1
3 0 0 0
12 1 1 0
16 1 1 0
```

By eliminating the void search, the search space is shortened to only 6 rows times two columns (12 computer cycles instead of 48). The identified prime implicants are the conjunction A\(\cdot{\sim}\)D (the same as the binary combination “`10`” on the first line, truth table row number 11) and the conjunction \(\sim\)A\(\cdot\)D (the same as the binary combination “`01`” on the next two lines, truth table row numbers 2 and 6), as they are found in the observed positive output configurations and not in the negative ones.

A third strategy to shorten the search is made possible by the mathematical regularities from the previous two implementations, in this example transforming the binary representations of the truth table configurations into their decimal equivalents:

```
AD OUT
11 2 1
2 1 1
6 1 1
3 0 0
12 3 0
16 3 0
```

While the exhaustive approach consumed 48 cycles, the same results can be obtained by searching through only these 6 decimal numbers: 2, 1, 1, 0, 3, 3 (the first three corresponding to the observed positive configurations, where `OUT = 1`). Since the positive numbers are not found among the negative truth table configurations (numbers 0, 3, 3), both conjunctions are identified as prime implicants.
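A base-R sketch of this decimal shortcut (repeating the truth table for self-containment; the helper name is hypothetical):

```r
# The simulated truth table from above, repeated for self-containment
tt <- matrix(c(1,0,0,0,1,1, 0,0,1,0,0,1, 1,0,0,1,1,1,
               0,1,1,0,1,1, 1,1,1,0,0,0), nrow = 6)
dimnames(tt) <- list(c(11, 2, 6, 3, 12, 16), c("A", "B", "C", "D", "OUT"))

pos <- tt[tt[, "OUT"] == 1, , drop = FALSE]
neg <- tt[tt[, "OUT"] == 0, , drop = FALSE]

# Hypothetical helper: the decimal fingerprint of a subset of columns,
# reading their binary values as one number
dec <- function(m, cols) {
  drop(m[, cols, drop = FALSE] %*% 2^((length(cols) - 1):0))
}

dpos <- dec(pos, c("A", "D"))  # 2 1 1
dneg <- dec(neg, c("A", "D"))  # 0 3 3

# fingerprints found among the positives but not among the negatives
setdiff(dpos, dneg)  # 2 ("10", i.e. A*~D) and 1 ("01", i.e. ~A*D)
```

The decimal 2 is the binary “10” (A present, D absent) and 1 is “01”, recovering the two prime implicants A\(\cdot{\sim}\)D and \(\sim\)A\(\cdot\)D.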

This bottom-up approach is superior to the previous QMC and eQMC methods in terms of efficiency, and generates the same prime implicants:

```
2 6 11
~C x x -
~A*B - x -
~A*D x x -
A*~D - - x
```

All prime implicants from figures 9.1 and 9.2 are found: \(\sim\)C, \(\sim\)A\(\cdot\)D and A\(\cdot{\sim}\)D. The fourth one, \(\sim\)A\(\cdot\)B, is actually redundant, because it does not contribute to solving the prime implicants chart (explaining all positive output configurations).

This is the search strategy corresponding to the parsimonious solution, and contrary to the common (mis)perception, the pseudo-counterfactual analysis does not actually *use* any remainders to generate prime implicants and yet the final solutions are the same as those from the classical QMC algorithm. It may not seem highly relevant at this point, but it will prove important in the next chapter.

## 9.3 Include vs. exclude

Both the Standard Analysis (SA) and the Enhanced Standard Analysis (ESA) insist on which counterfactuals should be *included* in the minimization process, together with the observed positive output configurations. By default, all remainders are included, unless some of them are specifically filtered out: incoherent counterfactuals, untenable assumptions, contradictory simplifying assumptions etc.

The usual conceptualization of the minimization process follows the classical QMC procedure, with a set of configurations composed of the positive output ones plus whatever is left of the remainders after the filtering. In the classical procedure, what is left out (the negative output configurations and the filtered counterfactuals) plays no role whatsoever in the minimization process. Only what is preserved contributes to a minimal solution, where the word “minimal” should be interpreted as a function of how many remainders are left after filtering out some of them.

While QMC ignores what is left out, the current default algorithm in package **QCA** does exactly the opposite: it ignores the “included” remainders and instead concentrates on the contrast between the positive output configurations (on one hand) and the set composed of the negative output configurations plus what is filtered out from the remainders (on the other).

In the classical QMC algorithm, the remainders which are not filtered out and survive this process are treated as if they had a positive output value, based on the assumption that, should these remainders ever be observed, they would contribute to producing the phenomenon of interest. This is a very intuitive conceptual image of the minimization process, leaving the filtered-out remainders out of the picture.

But the same results are obtained by ignoring the surviving remainders and instead concentrating on those which are filtered out. In this alternative process a different assumption is made: should these (filtered out) remainders ever be observed, they would *not* lead to producing the phenomenon of interest.

The counterfactual analysis assumes the surviving remainders would produce the output, whereas the pseudo-counterfactual analysis assumes the filtered out remainders would *not* produce the output, should they be empirically observed. These are, of course, the two sides of the same coin: QMC focuses on the *included* remainders, while eQMC and the faster CCubes concentrate on the *excluded* remainders, with the same end result.

The reason for this alternative approach is speed and memory. The classical QMC approach reaches its limits at about 11-12 input causal conditions (consuming a lot of memory in the process), while CCubes can easily deal with up to 30 causal conditions with no additional memory needs.

Building on the previous example with the simulated truth table, there are 6 observed configurations having the row numbers: 11, 2, 6, 3, 12, and 16. Since there are 4 binary crisp conditions, the remainders are found on row numbers:

` [1] 1 4 5 7 8 9 10 13 14 15`

Suppose the remainders on row numbers 7 and below should be excluded from the minimization process, which means the classical QMC algorithm would include the rest of the surviving remainders 8, 9, 10, 13, 14 and 15, which are assumed to take a positive output value of 1.

```
toinclude <- c(8, 9, 10, 13, 14, 15)
tt2 <- rbind(tt, cbind(getRow(toinclude, noflevels = rep(2, 4)), 1))
rownames(tt2) <- c(rownames(tt), toinclude)
```

Using a cascading combination of the functions `rbind()`, `cbind()` and `getRow()`, the code above simulates the data becoming the input for the classical QMC algorithm. It is very important to understand this is a simulated truth table, because in a normal one the column OUT would be coded with a question mark `"?"` for the included remainders, not with a value of 1 as here.

```
2 6 8 9 10 11 13 14 15
A*~C - - - x x - x x -
A*~D - - - x - x x - x
~C*D x x - - x - - x -
~A*B*D - x x - - - - - -
```

The net effect of hardcoding the output value with 1 (instead of `"?"`) for the remainders is the generation of a larger PI chart, where the correct one would only display the columns “2”, “6” and “11” (the observed positive configurations). It can be seen that only the prime implicants A\(\cdot{\sim}\)D and \(\sim\)C\(\cdot\)D cover these three columns, while the other two PI lines do not count in the solution. The morale that captures the essence of this example is that **manually hardcoding the output column is not a very good idea**.
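Solving a PI chart amounts to finding a minimal set of rows that covers the relevant columns. A brute-force base-R sketch over the three observed columns (values transcribed from the chart above) shows why only A\(\cdot{\sim}\)D and \(\sim\)C\(\cdot\)D count:

```r
# The PI chart rows from above, restricted to the three observed positive
# configurations (columns "2", "6" and "11"); TRUE marks an "x" cell
chart <- rbind(
  "A*~C"   = c(FALSE, FALSE, FALSE),
  "A*~D"   = c(FALSE, FALSE, TRUE),
  "~C*D"   = c(TRUE,  TRUE,  FALSE),
  "~A*B*D" = c(FALSE, TRUE,  FALSE)
)
colnames(chart) <- c("2", "6", "11")

# brute-force search for a smallest set of rows covering every column
covers <- function(rows) all(colSums(chart[rows, , drop = FALSE]) > 0)
best <- NULL
for (i in 1:(2^nrow(chart) - 1)) {
  rows <- which(as.logical(intToBits(i))[1:nrow(chart)])
  if (covers(rows) && (is.null(best) || length(rows) < length(best))) {
    best <- rows
  }
}
rownames(chart)[best]  # "A*~D" "~C*D"
```

Column “2” is covered only by \(\sim\)C\(\cdot\)D and column “11” only by A\(\cdot{\sim}\)D, so the minimal cover necessarily contains both, and nothing else.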

The correct PI chart is obtained by using the initial truth table `tt`, specifically excluding the hypothetical untenable remainders 1, 4, 5 and 7:

```
minimize(truthTable(tt, outcome = "OUT", exclude = c(1, 4, 5, 7)),
         include = "?", method = "CCubes")$PIchart
```

```
2 6 11
A*~D - - x
~C*D x x -
~A*B*D - x -
```

### References

Dușa, Adrian. 2010. “A Mathematical Approach to the Boolean Minimization Problem.” *Quality & Quantity* 44: 99–113. https://doi.org/10.1007/s11135-008-9183-x.

Dușa, Adrian, and Alrik Thiem. 2015. “Enhancing the Minimization of Boolean and Multivalue Output Functions with eQMC.” *Journal of Mathematical Sociology* 39: 92–108. https://doi.org/10.1080/0022250X.2014.897949.

Ragin, Charles C. 2000. *Fuzzy-Set Social Science*. Chicago; London: University of Chicago Press.