Chapter 10 QCA extensions

10.1 Temporal QCA

Ever since the QCA method was developed, it has been a promising methodological candidate for studying complex phenomena. The empirical data are subjected to a theoretical model consisting of a series of causal conditions and one outcome, which may partially explain why QCA is sometimes confused with a regression model.

While regressions rely on a similarly simple diagram (independent variables having a direct effect on the dependent variable), in reality the phenomena being studied have a much more complex structure. Some causal factors naturally happen before others, thus determining (directly or indirectly) their structure and their possible effects on the final outcome.

The temporality aspect of QCA has been extensively discussed before (De Meur, Rihoux, and Yamasaki 2009; C. Schneider and Wagemann 2012; Hak, Jaspers, and Dul 2013; Marx, Rihoux, and Ragin 2014), but few of the identified solutions have been converted into concrete research instruments to be combined with the current QCA software. While many of these solutions are tempting to discuss, this section will cover only a smaller subset which can be used in conjunction with R and wherever possible with the package QCA.

Caren and Panofsky (2005) set out to tackle this important aspect of social research and proposed a new methodology called TQCA (an extension of the main QCA with a temporal dimension). As they eloquently put it:

Variables and cases, as Ragin deploys them in QCA, are frozen in time—they are treated neither as containing sequences of events nor as forces that cause changes to occur over time.

Much like the clarifying response from Ragin and Strand (2008), I share the opinion that such attention to the dimension of time is highly welcome in the methodological field. Since these two papers appeared, the concept of temporality has been a recurrent theme in the literature, in close connection with causality, which is treated in more depth in the next section.

Although Caren and Panofsky proposed an extension of the QCA methodology, actually modifying the minimization algorithm to incorporate this dimension, Ragin and Strand showed that no special modification is needed, because this was always possible with classical QCA. They pointed right back to the original book by Ragin (1987, 162):

Note that it is possible to include causal variables relevant to historical process in a truth table (such as “class mobilization preceded ethnic mobilization,” true or false?) and to analyze combinations of such dichotomies. This strategy would enhance the usefulness of Boolean techniques as aids to comparative historical interpretation.

The solution is therefore as simple as adding one extra column to the data (called EBA) to record, for each case, whether the causal condition E (support for elite allies) happens before the condition A (national union affiliation). In this version of the data, the conditions P and S are placed first and last, but do not carry temporal information.

     P  E  A  EBA  S  REC
1    1  1  1   1   1   1
2    1  1  1   1   1   1
3    1  1  1   0   1   1
4    1  1  1   1   0   1
5    1  1  1   1   0   1
6    1  1  0   -   1   1
7    1  0  1   -   1   0
8    1  0  1   -   1   0
9    1  0  0   -   1   0
10   1  0  0   -   1   0
11   1  0  0   -   1   0
12   1  0  0   -   0   0
13   0  1  1   0   1   1
14   0  1  0   -   0   0
15   0  0  0   -   0   0
16   0  0  0   -   0   0
17   0  0  0   -   0   0

The dash is a special sign, departing from the regular binary values 0 and 1, but it is recognized as such by package QCA. It accounts for the cases where it is impossible to assess whether the condition E really happens before the condition A, because one of them, or even both, do not happen.

For instance, the first case with a dash on the condition EBA is row 6, where condition E is present (value 1) but condition A is absent (value 0, it does not happen). It would be logically impossible to say that E happens before A, given that A does not happen at all.

Even more obviously, in the last cases where both E and A are absent, it would not make sense to say that E happens before A, just as it would be illogical to say that E does not happen before A does not happen. To gain any logical interpretation, the expression E before A needs both E and A to actually happen.
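The coding rule behind the EBA column can be sketched in a few lines of R, using hypothetical vectors of event times (the names timeE and timeA are purely illustrative):

```r
# Sketch of the "E before A" coding rule, with hypothetical event times;
# NA means the event does not happen at all.
timeE <- c(1990, 1985, NA)
timeA <- c(1995, 1980, 1992)

# "E before A" is only defined when both events actually happen
EBA <- ifelse(!is.na(timeE) & !is.na(timeA),
              as.integer(timeE < timeA), # 1 if E precedes A, 0 otherwise
              "-")                       # the "don't care" code
EBA # "1" "0" "-"
```

Note that mixing the integer codes with the character "-" coerces the whole vector to character, which is how the dash is stored in the RS data as well.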

With this kind of data, it is possible to perform a regular Boolean minimization and obtain the same results as those initially advanced by Caren and Panofsky, and further simplified by Ragin and Strand:

minimize(truthTable(RS, outcome = "REC"))

M1: P*E*S + P*E*A*EBA + E*A*~EBA*S <-> REC

This result faithfully replicates those presented by Ragin and Strand, with an expression such as P*E*A*EBA indicating a causal order: both E and A are part of this prime implicant, together with the simultaneous information EBA, that is, E happens before A. The third expression E*A*~EBA*S is also sufficient for the outcome REC (union recognition), but in this case the condition E does not happen before condition A.

As simple as it may seem, this approach suffers from the main disadvantage that it only works with the conservative solution type. It would be impossible to select any meaningful counterfactuals: not only are there configurations of conditions for which there is no empirical information, but there is also an added uncertainty related to the causal order within those counterfactuals.

Given this rather severe restriction, it is not surprising that an initially promising extension to incorporate the temporal dimension was not developed further. More than ten years have passed since the article by Caren and Panofsky appeared, and nothing else has been mentioned since.

Another underappreciated and almost forgotten paper was contributed by Hino (2009). Building on the work of Caren and Panofsky, as well as Ragin and Strand, he proposed several other techniques to introduce the temporal dimension, using various forms of time series: Pooled QCA, Fixed Effects QCA, and Time Differencing QCA. While in the previous TQCA technique it was the sequence of events that led to a specific outcome, Hino's approach (called TS/QCA) considers the changes of the events as causal factors that produce a certain outcome.

One major advantage of TS/QCA is related to the number of causal conditions, which remains equal to that of classical QCA (in the former TQCA, additional columns are created for each pair of causal order between two conditions). In TS/QCA, it is the number of dataset rows that multiplies, as a function of the number of cases and the number of temporal measurements: the same case can appear multiple times in the dataset, depending on how many measurement points it has.

In the Time Difference method, only two measurements are of interest for each case (the start point, in Hino's example 1980, and the end point, 1990). The difference between them is then trivially transformed into binary crisp sets, using a value of 1 for a positive difference (the end point value is higher than the starting point) and a value of 0 otherwise.
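Under this coding rule, the transformation is a one-liner in R; the measurement vectors below are purely illustrative:

```r
# Illustrative measurements for five hypothetical cases
start <- c(10, 25, 40, 33, 18) # e.g. values measured in 1980
end   <- c(15, 20, 45, 33, 30) # e.g. values measured in 1990

# 1 for a positive difference (end point higher than start point), 0 otherwise
change <- as.integer(end - start > 0)
change # 1 0 1 0 1
```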

minimize(truthTable(HC, outcome = "VOTE"))


Including the counterfactuals, this becomes:

minimize(truthTable(HC, outcome = "VOTE"), include = "?")

M1: CONV + PRES80 <-> VOTE

Arguably, Hino’s method could perhaps be improved by employing fuzzy sets when calibrating the difference between the start time and the end time. Instead of 1 for a positive difference and 0 otherwise, one could imagine fuzzy scores that take into account large differences (negative towards 0, positive towards 1), while very small differences would be close to the 0.5 crossover point. By examining the full range of differences, dedicated calibration thresholds could be employed, allowing the final calibrated conditions to have fuzzy, rather than crisp, values.
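As a minimal sketch of this idea, the function calibrate() from package QCA could be applied to the vector of differences; the thresholds below are purely illustrative and would have to be chosen after inspecting the actual range of differences:

```r
library(QCA)

# Hypothetical differences between the end point and the start point
diffs <- c(-8, -3, -0.5, 0.2, 4, 9)

# Large negative differences approach 0, large positive ones approach 1,
# and differences near zero stay close to the 0.5 crossover
fs <- calibrate(diffs, type = "fuzzy", thresholds = c(-5, 0, 5))
round(fs, 2)
```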

The Time Difference method is quite simple and can be used without resorting to any specially designed algorithm: it works with a normal QCA minimization process. Due to the recent theoretical developments, it can also benefit from the counterfactual analysis in the (Enhanced) Standard Analysis, using directional expectations etc.

However, Berg-Schlosser (2012, 210) points out that Hino’s method can be applied to metric variables only and, in addition, that it does not detect particular sequences of events, as the previous TQCA method did. I would personally not worry too much about the metric issue, as almost all causal conditions (especially in fuzzy-set QCA) are measured on a metric scale, one way or another.

The second issue is indeed something to be concerned about, and Berg-Schlosser mentions a possible solution of defining the condition itself as a causal sequence; something very similar was mentioned by De Meur, Rihoux, and Yamasaki (2009, 162), who propose integrating the temporal dimension into the conditions themselves and creating “dynamic” conditions.

For Ragin and Strand’s example, this would probably imply having the condition EBA indicate with a value of 1 that E happens before A, and 0 otherwise (including the situations where E does not happen, A does not happen, or both).

However, this approach still implies additional information about the causal order between two existing conditions, whereas a dynamic temporal condition should contain the information itself. Obviously, a different condition ABE would be needed to signal whether A happens before E.

This approach has the big advantage of eliminating the “don’t care” code; the data becomes more standard and simpler to minimize, and, as icing on the cake, it opens up the possibility to include remainders in temporal QCA. With the two (binary crisp) conditions EBA and ABE, the conditions A and E become redundant, and the solution changes to:

RS2 <- RS
# ABE is 1 where EBA is 0 (A happens before E), and 0 otherwise
RS2$ABE <- recode(RS$EBA, "0 = 1; else = 0")
# replace the "don't care" code with 0, keeping the other values unchanged
RS2$EBA <- recode(RS$EBA, "- = 0; else = copy")
# minimize using the columns P, EBA, ABE, S and the outcome REC
minimize(RS2[, c(1, 4, 7, 5, 6)], outcome = "REC")


This result is perfectly equivalent to the one obtained by minimizing the original dataset RS without the conditions A and E:

minimize(RS[, -c(2, 3)], outcome = "REC")

M1: P*EBA + ~EBA*S -> REC

This solution is a bit different from the original one presented by Ragin and Strand, but much of the essential information is preserved. For instance, the expression P*EBA is logically equivalent to the former P*E*A*EBA, and ~EBA*S is also part of the original solution. The expression P*E*S is no longer found, but it did not entail anything about the temporal order between E and A anyway.

The temporal order of events is almost always associated with causal analysis, therefore a full-fledged temporal QCA methodology has to be intrinsically related to causation. The paper by Mahoney, Kimball, and Koivu (2009) is a must-read for anyone wanting to understand both concepts in terms of set relations. They described a method called sequence elaboration, introducing all sorts of relations between a cause and an effect: necessary, sufficient, both necessary and sufficient, INUS and SUIN, combined with yet another battery of such relations involving an intervening cause (either antecedent or subsequent to the main cause, etc.)

These are all very interesting, especially since they employ set theory, which is the bread and butter of QCA. The next section deals with a methodology specifically tailored for causal analysis, similar to yet somewhat different from QCA.

10.2 Coincidence analysis: CNA

About two decades after the introduction of QCA by Ragin (1987), another interesting methodological approach was contributed by Baumgartner (2009), based on his PhD thesis defended a couple of years earlier, and named CNA - coincidence analysis.

While QCA analyzes one outcome using various configurations of causal conditions, CNA inspects all possible causal configurations, not only with respect to the outcome but also among the causal conditions themselves. It is more than a temporal QCA, although a causal context presumes that the cause must happen before the effect.

The idea of analyzing possible relations between conditions is not new, having been used for decades in SEM - structural equation models. In fact, the newer CNA terminology (Baumgartner and Thiem 2017, online first) speaks of exogenous and endogenous factors, which are customary in SEM models. It is the adaptation of causal models to QCA that is certainly interesting and constitutes a possible advancement of the QCA methodology.

Although claiming to be different from and even superior to QCA, as will be shown in this section, CNA is in fact a sub-variant of QCA, as acknowledged by its creator (Baumgartner 2013, 14):

“…instead of CNA one might just as well speak of case analysis—or even of causal-chain- QCA (ccQCA)…”

The terminology of CNA is a bit different: factors (and sometimes residuals) instead of conditions, effects instead of outcomes, coincidence lists instead of truth tables (although not identical), but the similarities are obvious.

At the time of its appearance, CNA was (and I believe still is) a promising QCA extension. More recently, Baumgartner and Thiem (2017, online first) claim that QCA is actually incorrect when employing the conservative and intermediate solutions, a good enough reason for a more thorough analysis of the CNA methodology, both on its own and by comparing its capabilities to those of its larger QCA sibling. As will be shown, CNA has its merits, but for truth table types of analyses it is essentially a sub-variant of QCA.

All sorts of myths are circulating that call for methodological clarification. For instance, the following initial affirmation from Baumgartner (2009, 72) is interesting:

“…QCA is designed to analyze causal structures featuring exactly one effect and a possibly complex configuration of mutually independent direct causes of that effect.”

In addition, on page 78:

“…Contrary to QCA, however, the data fed into CNA are not required to mark one factor as the effect or outcome.”

This essentially states that QCA as a technique can observe only a single outcome at a time, while CNA is free to observe all possible relations between all conditions, each one being considered in turn as an outcome. This is partially true, the QCA algorithm being indeed restricted to a single outcome, but it conveys the image of a more limited QCA technique because of this apparent limitation. In reality, there is nothing to prevent researchers from running several QCA analyses, varying the outcome among the conditions, to see which of the conditions form causal structures.

The only difference is the possibility of performing the entire CNA procedure in an automated fashion, but this is merely a software feature, not a methodological limitation. In fact, as will be shown later, version 3 of the QCA package offers a similar automated procedure that can reproduce the results from CNA. It should be noted, however, that such an automated procedure is mechanical and data driven, an important aspect that will be treated later in more depth.

Another circulating myth, probably more important in scope because it is supposed to show the alleged superiority of CNA, refers to the Quine-McCluskey algorithm and the use of remainders. Baumgartner (2013, 14) states that:

“…contrary to QCA, CNA does not minimize relationships of sufficiency and necessity by means of Quine-McCluskey optimization but based on its own custom-built minimization procedure.”

Also, from the abstract of Baumgartner (2015, 839):

“…in order to maximize parsimony, QCA—due to its reliance on Quine-McCluskey optimization (Q-M)—is often forced to introduce untenable simplifying assumptions. The paper ends by demonstrating that there is an alternative Boolean method for causal data analysis, viz. Coincidence Analysis (CNA), that replaces Q-M by a different optimization algorithm and, thereby, succeeds in consistently maximizing parsimony without reliance on untenable assumptions.”

All of these quotes imply that, different from QCA, the “custom-built” algorithm behind CNA does not rely on any remainders and thus emerges as a reliable alternative to QMC that is not “forced” to introduce untenable simplifying assumptions.

The apparent reliance on the classical QMC optimization was already disproven in chapter 9: the modern QCA algorithms do not necessarily rely explicitly on remainders and, more importantly, do not necessarily rely on the classical QMC algorithm. Instead, package QCA employs (since 2007) a pseudo-counterfactual analysis that produces exactly the same results as the classical QMC approach.

This myth should not be left without a further clarification relating to its second part, that CNA does not rely on any remainders in the process of maximizing parsimony. It is rather easy to compare the new QCA algorithms with the one described by CNA. Baumgartner (2009, 86) provides the following definition to find a coincidence (another terminological departure from the established concept of a prime implicant):

“SUF: A coincidence \(X_k\) of residuals is sufficient for \(Z_i\) iff the input list C contains at least one row featuring \(X_{k}Z_i\) and no row featuring \(X_{k}\bar{Z_i}\).”

This is, unsurprisingly, very similar to the Definition 9.1 of a prime implicant (i.e. a sufficient expression \(X_k\) for \(Z_i\)) introduced when describing the Consistency Cubes procedure:

“A prime implicant is the simplest possible, non-redundant, fully consistent superset of any positive output configuration.”

To reiterate the explanation, fully consistent also means it can never be a superset of an observed negative configuration (“no row featuring \(X_{k}\bar{Z_i}\)”). It has a full inclusion score equal to 1, in the set of supersets of the positive output configurations, and conversely an inclusion score of 0 in the set of supersets of the negative output configurations.
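For binary data, this definition amounts to a simple membership check; the function below is a hypothetical illustration of definition SUF, not part of package QCA:

```r
# Check whether a conjunction of conditions (all presumed present, coded 1)
# is fully consistent (sufficient) for an outcome, in a data frame of 0/1
# values: it must cover at least one positive row and no negative row.
is_sufficient <- function(data, conds, outcome) {
    covered <- rowSums(data[, conds, drop = FALSE] == 1) == length(conds)
    any(covered & data[[outcome]] == 1) &&   # at least one row featuring Xk*Zi
        !any(covered & data[[outcome]] == 0) # no row featuring Xk*~Zi
}
```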

This is of course valid for exact Boolean minimization on truth table based analyses, but it is not out of the question, and also in the spirit of fuzzy sets, to sometimes lower this inclusion score, such that an expression with a 0.95 consistency for sufficiency (therefore not full, but almost) could still be accepted as a prime implicant.

Chapter 9 demonstrates unequivocally that a pseudo-counterfactual analysis produces exactly the same parsimonious solutions as those produced via the classical Quine-McCluskey procedure, as if it included the remainders. Whether or not some methodologists agree, for truth table types of analyses CNA behaves like a pseudo-counterfactual analysis, therefore it is in fact implicitly using the remainders.

If the CNA and QCA algorithms are indeed similar and equivalent, they should reach exactly the same solutions. As indicated before, version 3 of package QCA offers a function called causalChain(), which is going to be compared with the function cna() from package cna version 2.0 (Ambuehl and Baumgartner 2017).

The specification of the function causalChain() is very basic:

causalChain(data, ordering = NULL, strict = FALSE, pi.cons = 0, pi.depth = 0,
            sol.cons = 0, sol.cov = 1, sol.depth = 0, ...)

These arguments ensure similar functionality with the function cna(), while the three dots ... argument can be used to pass various parameters to the function minimize(), which is used in the background by the wrapper function causalChain().

Two things need to be explained. First, package cna is only focused on parsimonious solution types; if not otherwise specified, the function causalChain() silently sets the argument include = "?" to arrive at this type of solution. Second, and related, there is another argument called all.sol in the function minimize() that allows (when activated) finding all possible solutions for a given dataset, irrespective of how many conjunctions they contain.

To be compliant with the QCA standard, the default is to leave this argument inactive, but since package cna explores the full solution space, the function causalChain() silently activates it (if not otherwise specified) to obtain the same solutions.
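For a single outcome, the procedure that causalChain() runs in the background can be sketched directly with minimize(); the call below is an illustration under these assumptions, using the d.educate data from package cna and the outcome E:

```r
library(QCA)
data(d.educate, package = "cna")

# Parsimonious solution(s) for one outcome, exploring the full solution
# space, as causalChain() silently does for every column in turn
minimize(truthTable(d.educate, outcome = "E"), include = "?", all.sol = TRUE)
```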

Having introduced all this background information, the stage is set to test the results produced by both packages. To demonstrate that package cna arrives at exactly the same solutions as those produced by the classical QMC procedure, the argument method is going to be provocatively set to the value "QMC" before being passed to the main minimization function. Identical results would count as evidence that CNA is a pseudo-counterfactual method.

As the examples from package cna are geared towards coincidence analysis, some of the examples from the help file of the function cna() are going to be used:

cna(d.educate, what = "a")
--- Coincidence Analysis (CNA) ---

Factors: U, D, L, G, E 

Atomic solution formulas:
Outcome E:
 solution        consistency coverage complexity inus
 L + G <-> E               1        1          2 TRUE
 U + D + G <-> E           1        1          3 TRUE

Outcome L:
 solution    consistency coverage complexity inus
 U + D <-> L           1        1          2 TRUE

The purpose of developing the function causalChain() is not to be a complete replacement for the function cna(), but rather to show that they produce equivalent results. For this reason, it will only be compared with the atomic solution formulas (asf, printed using the argument what = "a") that are part of the output produced by the function cna().

cc <- causalChain(d.educate, method = "QMC")

M1: U + D <-> L

M1: L + G <-> E
M2: U + D + G <-> E

As can be seen, the classical QMC algorithm produces exactly the same atomic (parsimonious) solutions as those produced by package cna. The output is not formatted identically, but the results are the same: two models produced for the outcome E and one produced for the outcome L. As already mentioned, the two models for outcome E are obtained by activating the argument all.sol, to find the disjunction U + D + G which is otherwise not minimal. Just like package cna, package QCA is also able to identify the full model space of a solution.

Although not printed on the screen, the resulting object contains the minimization results for all columns in the data d.educate.

[1] "U" "D" "L" "G" "E"
M1: ~D*L -> U

The minimization for the outcome U, despite being sufficient, is not revealed as necessary because its coverage for sufficiency (which is the same as the consistency for necessity) is less than the coverage cut-off specified by the argument sol.cov from the function minimize(), which defaults to 1. This is revealed by the single right arrow sign ->, whereas to be part of a causal chain, the atomic solutions have to be both sufficient and necessary, as displayed in the output for outcomes E and L with the double arrow sign <->.

To inspect the actual inclusion and coverage scores, each solution is an object of class "qca" which has a component called IC:

       inclS   PRI   covS   covU   (M1)   (M2)  
1   G  1.000  1.000  0.571  0.143  0.143  0.143 
2   U  1.000  1.000  0.571  0.000         0.143 
3   D  1.000  1.000  0.571  0.000         0.143 
4   L  1.000  1.000  0.857  0.000  0.429        
   M1  1.000  1.000  1.000 
   M2  1.000  1.000  1.000 

As both models for the outcome E have inclusion and coverage equal to 1, they are identified as necessary and sufficient, hence part of the causal chain model. The second example for the function cna() uses the well-known Krook (2010) data on the representation of women in Western democratic parliaments:

cna(d.women, what = "a")
--- Coincidence Analysis (CNA) ---

Factors: ES, QU, WS, WM, LP, WNP 

Atomic solution formulas:
Outcome WNP:

I have deliberately modified the original example command from the function's help file, to emphasize a certain aspect that I am going to refer to shortly. For the moment, let us inspect the result from the function causalChain():

causalChain(d.women, method = "QMC")

M1: WS + ~ES*LP + ES*WM + QU*LP <-> WNP
M2: WS + ES*WM + QU*LP + WM*LP <-> WNP

While the default settings of the function cna() do not reveal any atomic solution formulas, the function causalChain() accurately reflects the results published by Krook. The original example command from package cna contains an additional argument called maxstep, which takes three values:

cna(d.women, maxstep = c(3, 4, 9), what = "a")
--- Coincidence Analysis (CNA) ---

Factors: ES, QU, WS, WM, LP, WNP 

Atomic solution formulas:
Outcome WNP:
 solution                           consistency coverage complexity
 WS + ES*WM + es*LP + QU*LP <-> WNP           1        1          7
 WS + ES*WM + QU*LP + WM*LP <-> WNP           1        1          7

Using these settings, the function cna() is also able to reveal Krook’s necessary and sufficient conditions. From the help file of the function, we learn more about the argument maxstep, which has the form c(i, j, k), indicating that:

…the generated asf have maximally j disjuncts with maximally i conjuncts each and a total of maximally k factors (k is the maximal complexity).

This argument raises a question about how the CNA algorithm works. Naturally, users can experiment with various combinations of i, j and k, but the bottom line is that nobody knows for sure which set of minimal values produces a complete causal chain in a reasonable time, as increasing any of those values increases the time spent searching for a possible solution.

This is even more important as some of the values might yield a necessary and sufficient model for one particular outcome, but not for another. To make sure that such necessary and sufficient models are found (if they exist) for all outcomes, the values of the argument maxstep would have to be as large as needed, and this increases the search time. The explanation from the help file is revealing:

“As the combinatorial search space for asf is potentially too large to be exhaustively scanned in reasonable time, the argument maxstep allows for setting an upper bound for the complexity of the generated asf.”

By setting a default (low) upper bound for the complexity, neither minimality nor exhaustiveness is guaranteed. The search space is sometimes indeed very large, and it has to be reduced to find at least some solutions. But at other times it may happen that a minimal solution exists above the default bound, with many more complex disjunctions even above that.

In contrast, the function causalChain() need not be told of such bounds, as it performs an exhaustive search to find: a) all possible prime implicants, at all levels of complexity, and b) all possible solutions from a given prime implicants chart. Since the CCubes algorithm is exact and exhaustive, this is perfectly possible for classical QCA minimizations on truth tables.

It is true, however, that using a lower threshold for the solution consistency sidesteps a finite PI chart, and the search space potentially increases to infinity. In such situations, some upper bound is needed; if not otherwise specified, it is silently set to a default of five prime implicants per solution, an upper bound that can be modified via the argument sol.depth in the function minimize().

Continuing the comparison, package cna has other interesting features for causal analysis. By default, all factors from a dataset are rotated as outcomes, with all other columns as causal conditions. But this need not always be the case: in some situations, theory might stipulate which causal condition happens before which effect. In a way, this looks very similar to declaring a SEM diagram in a path analysis.

To achieve this, the function cna() provides an argument called ordering, which has to be specified as a list where each component may contain one or several factors on the same temporal level. The net effect is that temporally antecedent factors may never be used as outcomes of the temporally subsequent factors. The complementary argument called strict dictates whether factors from the same temporal level may be outcomes for each other.

mvcna(d.pban, ordering = list(c("C", "F", "T", "V"), "PB"),
      cov = 0.95, maxstep = c(6, 6, 10), what = "a")
--- Coincidence Analysis (CNA) ---

Causal ordering:
C, F, T, V < PB

Atomic solution formulas:
Outcome PB=1:
 solution                                         consistency
 C=1 + F=2 + C=0*F=1 + C=2*V=0 <-> PB=1                     1
 C=1 + F=2 + C=0*T=2 + C=2*V=0 <-> PB=1                     1
 C=1 + F=2 + C=2*F=0 + C=0*F=1 + F=1*V=0 <-> PB=1           1
 C=1 + F=2 + C=2*F=0 + C=0*T=2 + F=1*V=0 <-> PB=1           1
 C=1 + F=2 + C=0*F=1 + C=2*T=1 + T=2*V=0 <-> PB=1           1
 coverage complexity inus
    0.952          6 TRUE
    0.952          6 TRUE
    0.952          8 TRUE
    0.952          8 TRUE
    0.952          8 TRUE
 ... (total no. of formulas: 14)

One thing to notice about this command is that it uses a different function called mvcna(), a shortcut for calling the main function cna() with the argument type = "mv". The function needs to be told what kind of data is used, while package QCA detects it automatically. That is only a matter of convenience, but more importantly it points to a drawback of package cna, because it apparently cannot use a mix of different types of sets in the same data, as indicated in the help file of the function cna():

… data comprising both multi-value and fuzzy-set factors cannot be meaningfully modelled causally.

This is a major drawback that package QCA solved years ago (for multi-value sufficiency, see equation (6.2)), accepting any kind of data for QCA minimizations, and now for causal modeling as well.

Returning to the actual command, a coverage threshold of 0.95 is used, relaxing the very strict default value of 1 and generating 14 different models for the outcome PB. While a similar argument ordering can also be fed as a list to the function causalChain(), I took the liberty of improving the specification of the temporal order to a single string, as customary for many other functions in package QCA:

causalChain(d.pban, ordering = "C, F, T, V < PB", sol.cov = 0.95,
            method = "QMC")

M01: C[1] + F[2] + C[0]*F[1] + C[2]*V[0] <-> PB[1]
M02: C[1] + F[2] + C[0]*T[2] + C[2]*V[0] <-> PB[1]
M03: C[1] + F[2] + C[0]*F[1] + C[2]*F[0] + F[1]*V[0] <-> PB[1]
M04: C[1] + F[2] + C[0]*F[1] + C[2]*T[1] + T[2]*V[0] <-> PB[1]
M05: C[1] + F[2] + C[0]*F[1] + T[1]*V[0] + T[2]*V[0] <-> PB[1]
M06: C[1] + F[2] + C[0]*T[2] + C[2]*F[0] + F[1]*V[0] <-> PB[1]
M07: C[1] + F[2] + C[0]*T[2] + C[2]*T[1] + T[2]*V[0] <-> PB[1]
M08: C[1] + F[2] + C[0]*T[2] + T[1]*V[0] + T[2]*V[0] <-> PB[1]
M09: C[1] + F[2] + C[0]*F[1] + C[2]*F[0] + F[1]*T[1] + T[2]*V[0] <-> PB[1]
M10: C[1] + F[2] + C[0]*F[1] + C[2]*T[1] + F[0]*T[2] + F[1]*V[0] <-> PB[1]
M11: C[1] + F[2] + C[0]*F[1] + F[0]*T[2] + F[1]*V[0] + T[1]*V[0] <-> PB[1]
M12: C[1] + F[2] + C[0]*T[2] + C[2]*F[0] + F[1]*T[1] + T[2]*V[0] <-> PB[1]
M13: C[1] + F[2] + C[0]*T[2] + C[2]*T[1] + F[0]*T[2] + F[1]*V[0] <-> PB[1]
M14: C[1] + F[2] + C[0]*T[2] + F[0]*T[2] + F[1]*V[0] + T[1]*V[0] <-> PB[1]

Hopefully, it is by now evident that CNA produces the very same solutions as the classical Quine-McCluskey algorithm, even though it does not explicitly use any remainders. This leads to the conclusion that CNA implicitly uses remainders and therefore qualifies as a pseudo-counterfactual method.

There is, however, a certain feature of the CNA algorithm that makes it different from the classical Boolean minimization: it can search for solutions with lower levels of consistency (note the argument con instead of cov):

mvcna(d.pban, ordering = list(c("C", "F", "T", "V"), "PB"),
      con = .93, maxstep = c(6, 6, 10), what = "a")
--- Coincidence Analysis (CNA) ---

Causal ordering:
C, F, T, V < PB

Atomic solution formulas:
Outcome PB=1:
 solution                                     consistency coverage
 C=1 + F=2 + T=2 + C=2*T=1 <-> PB=1                 0.955        1
 C=1 + F=2 + T=2 + C=2*F=0 + F=1*T=1 <-> PB=1       0.955        1
 complexity inus
          5 TRUE
          7 TRUE

The major drawback of this command is the combination of the arguments con = 0.93 (which automatically sets the related argument con.msc to the same value) and maxstep = c(6, 6, 10). It is highly unclear how a regular user could ever arrive at this particular combination of numbers, other than by trial and error, which makes the usage experience cumbersome and prone to imprecision.

The function minimize(), and of course the wrapper function causalChain() from package QCA, also have arguments that allow modifying these consistency thresholds. For instance, the argument pi.cons is equivalent to the argument con.msc from function cna(), and the argument sol.cons is equivalent to the argument con from the same function. The solutions are again the same:

causalChain(d.pban, ordering = "C, F, T, V < PB", pi.cons = 0.93,
            sol.cons = 0.95)

M1: C[1] + F[2] + T[2] + C[2]*T[1] <-> PB[1]
M2: C[1] + F[2] + T[2] + C[2]*F[0] + F[1]*T[1] <-> PB[1]

In this command, it is important to note the removal of the argument method = "QMC", to allow employing the default method "CCubes", since tweaking the prime implicant consistency threshold is a feature foreign to the classical minimization algorithm.

Also, readers might notice that the argument sol.depth was not used, despite a less than perfect consistency threshold for the solution. When left unspecified, an automatic upper bound of 5 prime implicants is used, which is enough to produce the same solutions found by package cna (the second solution is a disjunction of exactly 5 prime implicants).
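
To get an intuition for why such an upper bound is needed, consider that the space of candidate solutions grows combinatorially with the allowed disjunction depth. A quick base R sketch (the pool of 20 prime implicants is an invented figure, purely for illustration):

```r
# Number of candidate disjunctions of d prime implicants that would
# have to be examined, out of a hypothetical pool of 20, for each
# solution depth from 1 to 5
sapply(1:5, function(d) choose(20, d))
# 20 190 1140 4845 15504
```

Even at depth 5, this hypothetical pool already implies over fifteen thousand candidate disjunctions, which is why an automatic bound keeps the search tractable.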

Introducing a coverage threshold of 0.95 for the solution finds six other, different solutions, some even more parsimonious. Needless to say, they are identical to those found by function cna():

causalChain(d.pban, ordering = "C, F, T, V < PB", pi.cons = 0.93,
            sol.cons = 0.95, sol.cov = 0.95)

M1: C[1] + F[2] + T[2] <-> PB[1]
M2: C[1] + T[2] + C[2]*T[1] <-> PB[1]
M3: F[2] + T[2] + F[1]*T[1] <-> PB[1]
M4: C[1] + F[2] + V[0] + C[0]*F[1] <-> PB[1]
M5: C[1] + T[2] + C[2]*F[0] + F[1]*T[1] <-> PB[1]
M6: F[2] + V[0] + C[0]*F[1] + F[1]*T[1] <-> PB[1]

Since it is now possible to deviate from exact minimization and plunge into fuzzy-set consistencies, the results can become even more spectacular while remaining highly consistent (albeit not perfect). The downside is that users are now over-equipped with multiple arguments, many referring to various consistency or coverage thresholds. They should know for certain what each argument does before changing the defaults.

It is perhaps a good moment to take stock of the overall landscape. While CNA does not use a traditional truth table (but a coincidence list), QCA is still a truth table based procedure, and I argue this is a feature of QCA, not a problem. Besides being a useful intermediate object to inspect before a minimization, truth tables provide a very synthetic image of the complex data researchers are analyzing, as described in chapter 7.

And this is nowhere more relevant than in examples using calibrated fuzzy data, such as the next one from package cna:

dat2 <- d.autonomy[15:30, c("AU","RE", "CN", "DE")]
fscna(dat2, ordering = list("AU"), con = .9, con.msc = .85, cov = .85,
      what = "a")
--- Coincidence Analysis (CNA) ---

Causal ordering:

Atomic solution formulas:
Outcome AU:
 solution             consistency coverage complexity inus
 RE*cn + re*CN <-> AU        0.92    0.851          4 TRUE
 re*DE + cn*DE <-> AU        0.90    0.862          4 TRUE

Again, the same solutions are obtained by function causalChain(), using the corresponding arguments sol.cons, pi.cons and sol.cov, with one additional piece of information. When using a certain threshold for the solution consistency, it is mandatory to use a similar (better yet, an equal) threshold for the argument incl.cut from function truthTable(), otherwise the generated truth table will transform the fuzzy sets to crisp truth tables using a full inclusion score of 1. This is valid for both crisp and fuzzy sets.

causalChain(dat2, ordering = "AU", sol.cons = 0.9, pi.cons = 0.85,
            sol.cov = 0.85)

M1: ~RE*CN + RE*~CN <-> AU
M2: ~RE*DE + ~CN*DE <-> AU

Unless otherwise specified, when either pi.cons or sol.cons is set below the default value of 1, the function causalChain() silently sets the argument incl.cut to 0.5, to accommodate all possible truth tables for all possible outcomes. The argument incl.cut is then passed to function minimize(), which in turn passes it to function truthTable(), which uses it to construct the truth table.

The fact that both algorithms found exactly the same solutions in all presented examples is yet more evidence of the similarity of the algorithms behind the cna and QCA packages. For full consistencies, one should expect a 100% exact overlap between the results of QCA and CNA, but these are not identical packages and there is no formal proof that their outcomes are identical every single time when lowering the consistency thresholds. Rarely, discrepancies might appear, and further investigation would be needed to uncover the reasons for their appearance.

But as can be seen, the arguments of the function causalChain() are perfectly compatible with those of the function cna(), although it was not built to completely replace all the features of the sibling package (additional features are rather easy to implement). It is a proof of concept that, given the similar minimization procedures, QCA can be successfully used to perform causal analysis and achieve the same results.

One crucial difference between CNA (version 2) and QCA is that CNA does not use a truth table (but a coincidence list), while QCA remains a truth table based method. Apart from the obvious advantages of creating a truth table, I argue that QCA preserves a net superiority over CNA because it allows an exhaustive search, therefore a guaranteed complete list of solutions.

CNA is not guaranteed to be exhaustive, even when a coincidence list is identical with a truth table. By contrast, there is only a single situation where QCA might use an upper bound, when searching for solutions with a lower consistency threshold (using the argument sol.depth). In all other situations the CCubes algorithm is exact and exhaustive.

As shown, package QCA shares none of the drawbacks identified in package cna, and benefits from all theoretical achievements related to the Standard Analysis and the Enhanced Standard Analysis: working with various types of counterfactuals, dealing with contradictions, having a single integrated function for all types of sets (crisp, multi-value, fuzzy), negating multi-value sets etc.

And best of all, one could only imagine the potential, fruitful interplay between causal chains and intermediate solutions based on the Enhanced Standard Analysis, and package QCA is able to provide such possibilities.

10.3 Panel / clustered data

For the information presented so far, the functions provided in package QCA are more than sufficient. There are, however, recent, cutting-edge theoretical developments that have not (yet) been implemented in this package. It is actually impossible for a single software package to keep track of every possible development, a good reason for other authors to write their own extensions.

R is very well suited for such situations, because the functions from the main package are public and can be subsequently used in another package, as long as the new package declares package QCA in the dependency chain. The new functionality becomes available by simply loading the extension package.

For the remaining sections of this chapter, several other packages will be introduced, along with their functionality. Since they are outside the scope of the main package, I will only make a brief introduction and invite users to further consult the papers and manuals written by their authors. It would actually be difficult for me to properly describe the functionality of functions developed by other programmers, beyond what can be read in their help files.

A first package, which is deeply integrated and provides the largest extension, is called SetMethods (Medzihorsky, Oana, Quaranta and Schneider, 2017), at version 2.3 at the time of this writing. Apart from some usual QCA functions that seem to duplicate part of the existing functionality (XY plots, parameters of fit etc.), this package has an original set of functions to deal with panel data, and others designed for what is called set-theoretic MMR (multi-method research).

Adapting QCA to panel data is a natural extension involving temporal events, as presented in section 10.1. Panel data also contain temporal events, but unlike the previous attempts, where the data is cross-sectional, panels are longitudinal: data for the same cases are gathered over a number of cross-sectional units, repeatedly over time.

Garcia-Castro and Ariño (2016) paved the way for such an adaptation, proposing a number of modified consistency and coverage measures, both across cases and over time. By facilitating this kind of analysis:

… we can infer sufficient and necessary causal conditions that are likely to remain hidden when researchers look at the data in a purely cross-sectional fashion. Further, across-time consistencies can be used as robustness checks in empirical studies.

The data being spread over cases and years, consistency and coverage measures can be calculated:

  • between cases, in the same year (one measure per year)
  • within cases, over the years (one measure per case)
  • pooled, combining the measurements for all cases and all years

The between consistency is the most common, with the same equation as any other consistency, measuring how consistent the cases are with respect to the outcome within each year. It may be seen as a sequence of regular cross-sectional consistencies, one for each time t:

\[\begin{equation} inclB_{X_{t}\phantom{.}\rightarrow\phantom{.}Y_{t}\phantom{.}} = \frac{\sum_{i = 1}^{N}{min(\mbox{X}_{it}, \phantom{.}\mbox{Y}_{it})}}{\sum{\mbox{X}_{it}}} \tag{10.1} \end{equation}\]

In this equation, t is held constant to indicate a certain year, and all cases from 1 to N contribute to the calculation of the between consistency.

The within consistency does the opposite, holding each case i constant (one measure per case), with the calculations performed using the different measurements of the same case over the years. It essentially measures how the changes of the same case in condition X relate to the outcome Y of the same case, across time:

\[\begin{equation} inclW_{X_{i}\phantom{.}\rightarrow\phantom{.}Y_{i}\phantom{.}} = \frac{\sum_{t = 1}^{T}{min(\mbox{X}_{it}, \phantom{.}\mbox{Y}_{it})}}{\sum{\mbox{X}_{it}}} \tag{10.2} \end{equation}\]

It can be seen that the two equations are very similar, measuring how consistent X is as a subset of Y, either between all cases of the same year or within each case over time.

Finally, the pooled consistency brings together all cases over all years, to calculate a composite (pooled) measure:

\[\begin{equation} inclP_{X\phantom{.}\rightarrow\phantom{.}Y\phantom{.}} = \frac{\sum_{t = 1}^{T}\sum_{i = 1}^{N}{min(\mbox{X}_{it}, \phantom{.}\mbox{Y}_{it})}}{\sum_{t = 1}^{T}\sum_{i = 1}^{N}{\mbox{X}_{it}}} = \frac{\sum{min(\mbox{X}, \phantom{.}\mbox{Y})}}{\sum{\mbox{X}}} \tag{10.3} \end{equation}\]

Equation (10.3) can be expressed either in the simplest form (not specifying what to sum over, meaning that everything is summed: all cases from all years), or with the double sums separated, to make it more obvious that the summation happens over both years and cases.

The coverage measures are calculated in exactly the same way, replacing the X in the denominator with Y. In principle, the consistency and coverage measures work even for two or three time observations, but the more time measurements in the panel, the more precise these measures become.
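
As a concrete illustration of equations (10.1) to (10.3), the three consistency measures can be computed with a few lines of base R. The tiny panel below is invented purely for demonstration (the cases, years and membership scores do not come from any real dataset):

```r
# Invented toy panel: 3 cases (A, B, C), each measured in 2 years,
# with fuzzy membership scores in condition X and outcome Y
dat <- data.frame(
    case = rep(c("A", "B", "C"), each = 2),
    year = rep(c(1, 2), times = 3),
    X = c(0.8, 0.7, 0.3, 0.4, 0.9, 0.6),
    Y = c(0.9, 0.6, 0.5, 0.5, 1.0, 0.7)
)

# Pooled consistency, equation (10.3): sum over all cases and years
inclP <- sum(pmin(dat$X, dat$Y)) / sum(dat$X)       # ~ 0.973

# Between consistency, equation (10.1): one value per year, across cases
inclB <- tapply(seq_len(nrow(dat)), dat$year, function(i) {
    sum(pmin(dat$X[i], dat$Y[i])) / sum(dat$X[i])
})

# Within consistency, equation (10.2): one value per case, across years
inclW <- tapply(seq_len(nrow(dat)), dat$case, function(i) {
    sum(pmin(dat$X[i], dat$Y[i])) / sum(dat$X[i])
})

# Pooled coverage simply replaces X with Y in the denominator
covP <- sum(pmin(dat$X, dat$Y)) / sum(dat$Y)        # ~ 0.857
```

The within values show, for instance, that case A is slightly inconsistent over time (1.4/1.5, about 0.933), while cases B and C are perfectly consistent within themselves.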

There are some other measures specific to panel data, which are best exemplified using a real dataset. The package SetMethods should be loaded first, to access its datasets and functions:

library(SetMethods)

This package provides a dataset called SCHLF, used by M. R. Schneider, Schulze-Bentrop, and Paunescu (2010) to explore the institutional capital of high-tech firms and their export performance, using data from 19 OECD countries between 1990 and 2003.

head(SCHLF)
              EMP BARGAIN  UNI OCCUP STOCK   MA EXPORT   COUNTRY YEAR
Australia_90 0.07    0.90 1.00  0.68  0.45 0.33   0.19 Australia 1990
Austria_90   0.70    0.98 0.01  0.91  0.01 0.05   0.25   Austria 1990
Belgium_90   0.94    0.95 0.14  0.37  0.26 0.14   0.14   Belgium 1990
Canada_90    0.04    0.21 0.99  0.11  0.62 0.31   0.28    Canada 1990
Denmark_90   0.59    0.78 0.10  0.55  0.53 0.10   0.34   Denmark 1990
Finland_90   0.70    0.97 0.20  0.95  0.02 0.13   0.17   Finland 1990

Apart from the usual conditions and outcome columns specific to QCA, the package SetMethods requires two more columns for a panel dataset analysis: one containing an identifier for the years when the data was collected for each case, and the other containing an identifier for the cases.

A regular data frame in R can provide the case identifiers through the row names, but row names have to be unique. Since the same cases are repeated for multiple measurements in time, they cannot be assigned to the row names and are instead specified in a separate column.

These columns are used by the function cluster(), with the argument unit_id referring to the case names and the argument cluster_id referring to the year. The overall structure of the function, including all arguments, is presented below:

cluster(data, results, outcome, unit_id, cluster_id, sol = 1,
        necessity = FALSE)

The arguments data and results can also be vectors, but their most natural usage refers to the original dataset. The argument sol specifies the number of the solution to be selected, in case of multiple models. The argument results can also be fed a minimization object of class "qca", in which case sol is specified as a string of type “c1p1i1”, indicating the exact combination of conservative, parsimonious and intermediate solutions to be selected.

For instance, the following object is created when minimizing the Schneider data for intermediate solutions (for demonstrative purposes, only the first four conditions were used, but the package documentation contains the complete example):

ttSL <- truthTable(SCHLF, conditions = "EMP, BARGAIN, UNI, OCCUP",
                   outcome = "EXPORT", incl.cut = .9, show.cases = TRUE)
sol_yi <- minimize(ttSL, include = "?", dir.exp = "0, 0, 0, 0")
cluster(results = sol_yi, data = SCHLF, outcome = "EXPORT",
        unit_id = "COUNTRY", cluster_id = "YEAR")
                   emp*bargain*OCCUP EMP*bargain*occup emp*BARGAIN*occup
Pooled                         0.909             0.960             0.924
Between 1990                   0.839             0.991             0.838
Between 1995                   0.903             0.991             0.912
Between 1999                   0.928             1.000             1.000
Between 2003                   0.951             0.878             0.954
Within Australia               1.000             1.000             0.791
Within Austria                 1.000             1.000             1.000
Within Belgium                 1.000             1.000             1.000
Within Canada                  1.000             1.000             1.000
Within Denmark                 1.000             1.000             0.774
Within Finland                 1.000             1.000             1.000
Within France                  1.000             1.000             1.000
Within Germany                 1.000             1.000             1.000
Within Ireland                 1.000             1.000             1.000
Within Italy                   1.000             1.000             1.000
Within Japan                   1.000             1.000             1.000
Within Netherlands             1.000             1.000             1.000
Within NewZealand              0.414             0.868             0.437
Within Norway                  0.965             0.958             0.948
Within Spain                   1.000             0.706             1.000
Within Sweden                  1.000             1.000             1.000
Within Switzerland             0.880             1.000             1.000
Within UK                      1.000             1.000             1.000
Within USA                     1.000             1.000             1.000

                       emp*bargain*OCCUP EMP*bargain*occup emp*BARGAIN*occup
From Between to Pooled             0.023             0.026             0.032
From Within to Pooled              0.031             0.017             0.033

                   emp*bargain*OCCUP EMP*bargain*occup emp*BARGAIN*occup
Pooled                         0.194             0.229             0.334
Between 1990                   0.231             0.289             0.399
Between 1995                   0.206             0.249             0.469
Between 1999                   0.174             0.206             0.261
Between 2003                   0.184             0.203             0.274
Within Australia               0.415             0.333             0.951
Within Austria                 0.075             0.075             0.442
Within Belgium                 0.138             0.138             0.372
Within Canada                  0.328             0.299             0.545
Within Denmark                 0.273             0.273             0.604
Within Finland                 0.059             0.059             0.332
Within France                  0.070             0.070             0.173
Within Germany                 0.236             0.251             0.308
Within Ireland                 0.113             0.103             0.580
Within Italy                   0.173             0.327             0.276
Within Japan                   0.161             0.656             0.064
Within Netherlands             0.150             0.169             0.355
Within NewZealand              1.000             0.917             0.861
Within Norway                  0.598             0.739             0.598
Within Spain                   0.204             0.828             0.204
Within Sweden                  0.061             0.075             0.189
Within Switzerland             0.738             0.315             0.337
Within UK                      0.075             0.080             0.282
Within USA                     0.037             0.052             0.045

Since this particular example contains exactly one intermediate solution, the default arguments to select it need not be respecified in the command. Following Garcia-Castro and Ariño’s paper, the function returns a complete set of pooled, between and within consistencies and coverages of the solution terms, for each unit and each cluster. There are also distances from between to pooled and from within to pooled, offering a complete snapshot of the complex panel relations that exist in this dataset.

All of these parameters of fit are calculated for the sufficiency relation, but the function cluster() can calculate the same parameters for the necessity relation, by activating the argument necessity = TRUE.

Package SetMethods has a large variety of functions that greatly extend the main package QCA, not only for panel data but also for what is probably the hottest QCA extension, called Set-Theoretic Multi-Method Research. Since both the method and the associated functions are still under heavy development, they are not presented here, but readers are warmly encouraged to follow the literature and the associated functions from this package.

10.4 Robustness tests

As a middle ground between the qualitative and the quantitative approaches, QCA has inevitably attracted a lot of attention, but also a lot of critique, especially from quantitative researchers. One of the particularly contested features of QCA is its ability to work with a small number of cases (fitting the small- and medium-N world), as opposed to statistical analyses, which require a large number of cases to perform inferential testing.

Arguably, it seems easier to collect a small dataset than to run a large-scale study involving thousands of individuals. But this is only apparent, because the data collection is very different in nature: while quantitative studies apply a standardized approach over thousands of individuals, the qualitative approach is much more intensive, requiring an in-depth study of each and every case.

In order to establish, for instance, at which particular values to set the calibration thresholds, the QCA researcher needs a deep understanding not only of each separate case, but also of each case in comparison with all the others. In the same situation, the quantitative researcher would simply calculate the mean and standard deviation, and chapter 4 shows clearly why such an approach is not suitable.

Nevertheless, quantitative researchers continue to raise doubts about inferences drawn from a small number of cases (a small “sample”) and their ability to generalize conclusions to a larger “population”. But this is very far from the QCA approach, which uses neither the Central Limit Theorem, nor the normal curve, nor confidence intervals or p-values. Instead, QCA works by observing patterns of consistency between various configurations of causal conditions, and how they are associated with the presence of the phenomenon of interest.

A well known, landmark qualitative study is Skocpol’s (1979) book on states and social revolutions, comparing no more than three cases: France, Russia and China. These are very rare events, and they all involve crucial moments in the history of humanity. Obviously, these cases cannot be analyzed using the standard statistical techniques and the quantitative researchers are powerless when asked to draw meaningful insights.

On the other hand, to abstain from analyzing these cases just because they don’t comply with the normal curve would be an even bigger mistake. There is a genuine possibility to analyze them using the comparative method, since they all display the same outcome and there are obvious similarities, as well as differences, between them.

And this is precisely what QCA does, searching for patterns of consistency among the causal conditions with the outcome. Three cases are clearly not enough to apply QCA, even for a moderately complex theoretical model, in terms of number of causal conditions. Marx and Dușa (2011) have paved the way towards a possible optimal ratio between the number of cases and the number of causal conditions, and that work should be extended with more tests using fuzzy sets. With such an optimal ratio, the Boolean minimization method can be applied to identify consistency patterns, yet quantitative researchers continue to compare the results of this algorithmic method with standard statistical techniques.

One such critique, emerging from the work of King, Keohane and Verba (1994, notorious for their interpretation of qualitative research from a quantitative perspective), refers to the selection of the cases being analyzed. Generally in qualitative research, and more specifically in QCA, cases are not selected using random sampling techniques (as is usually the situation in quantitative research) but quite the reverse: they are selected precisely because they display something theoretically meaningful.

In the language of “sampling”, this is similar to what is called purposive sampling (a non-probability method) where cases are selected because they are typical for some research scenario, or inversely because they are deviant compared to the other cases: something potentially interesting happens in those cases and we want to find out what.

What is probably most misunderstood is that QCA is a configurational analysis, where the focus on the (number of) cases is of secondary importance. The Boolean minimization process is not aware of the number of cases within each configuration, since its input consists of the configurations themselves. QCA is less concerned with the number of cases because of the limited diversity phenomenon: only a few causal configurations are actually observable.

There could be thousands of cases and only a handful of observed configurations, which is something different from the quantitative focus on the number of cases. The only situation where the number of cases does count is their effect on the output value of a particular truth table configuration, either through the inclusion threshold or via the frequency cut-off.
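
A tiny simulation makes this point concrete. The data generating scheme below is entirely invented: it allows only 4 of the 8 possible configurations of three crisp conditions to occur, no matter how many cases are drawn:

```r
set.seed(123)
# Only 4 of the 2^3 = 8 possible configurations are observable
allowed <- matrix(c(0,0,0,  1,0,0,  1,1,0,  1,1,1),
                  ncol = 3, byrow = TRUE)
# Draw 1000 cases at random from the allowed configurations
obs <- allowed[sample(1:4, 1000, replace = TRUE), ]
nrow(unique(obs))   # 4 distinct configurations, despite 1000 cases
```

Whatever the number of cases, the input to the minimization would still be those 4 truth table rows; the case counts matter only for the frequency cut-off and the inclusion scores.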

This is a rather long introduction to the topic of this section, the robustness of QCA results, but it was necessary in order to understand that robustness is a frequently misinterpreted topic. It all relates to a natural expectation that QCA results should be consistent with previous findings. In this respect, it overlaps with an overarching scientific requirement that a finding is valid (only) if it can be generalized to future similar research situations, or, even more strictly, if it can be generalized to a “population”.

But this is a very quantitative assessment of robustness, as if lab results from a micro research situation are expected to work in the real world scenarios, or inferences drawn from a sample are expected to be generalized to the entire population. It is an inferential way of looking at the QCA models, and expects consistent results on relatively similar input data.

Robustness is indeed an important issue, and it can be approached in a systematic way. There are so many publications dealing with this issue (some of the most recent including Skaaning 2011; Hug 2013; Lucas and Szatrowski 2014; Seawright 2014; Ragin 2014; Lucas 2014; Krogslund, Choi, and Poertner 2015), many of which are similar in nature, that it is difficult to address all of their arguments.

Arguably, however, many critiques misunderstand and sometimes misinterpret what has been written before. For instance, Lucas and Szatrowski (2014) argue that Ragin and Rihoux (2004):

…reject simulations, claiming that QCA is for researchers intrinsically interested in the cases they study.

Upon a careful reading of that paper, there is little evidence that Ragin and Rihoux explicitly reject simulations. It is true that QCA researchers are intrinsically interested in the cases they study, but declaring this general approach equivalent to rejecting simulations is a long stretch.

In a subsequent rejoinder meant to clarify matters, Lucas (2014) manages to bring even more confusion to some of them. For instance, Ragin’s example on the difference between a set membership and a probability is very clear, but Lucas invokes a particular Bayesian interpretation of probability as a degree of belief and declares the probability to be a mathematical function of the set membership. This is (at best) an opinion, since no such mathematical function exists.

The same type of misunderstanding, seemingly deliberate (Lucas appears to understand what QCA is all about, but still prefers his own reading), relates to equating configurations with interaction effects. These are strong indicators that, although he rejects looking at QCA through a quantitative perspective, that is precisely what he is doing.

Whether or not QCA is able to find the “correct” answer depends on the quality of the raw data and, following Skaaning (2011), on the choice of calibration thresholds, as well as on the choice of frequency and consistency thresholds when constructing the truth table. Measurement error is important in both micro-quantitative and macro-qualitative comparative analyses, but on a macro level it has a lower influence on the results. Societal indicators fluctuate a whole lot less than individual scores, and if we agree on this point, then it is obvious that “error” has a different meaning in QCA.

In particular, Hug (2013) attempted a more systematic way of looking at the effect of errors (deleting a case from the analysis, or changing the value of the outcome) to calculate the extent to which the final solution is preserved: the results are “robust” if the solution is preserved.

His attempt (exhaustive enumeration) has since been outperformed by Thiem, Spöhel and Dușa (2016), who used a proper combinatorial method to calculate the retention rate. The simulations include two types of assumptions: DPA (Dependent Perturbation Assumption), when perturbations depend on each other and are tied ex-ante to a fixed number of cases, and IPA (Independent Perturbation Assumption), when perturbations are assumed to occur independently of each other.

The entire procedure is enclosed in a function called retention() in package QCA, with the following structure:

retention(data, outcome = "", conditions = "", incl.cut = 1, n.cut = 1, 
          type = "corruption", dependent = TRUE, p.pert = 0.5, n.pert = 1)

Similar to the main truthTable() function, the arguments incl.cut and n.cut decide which configurations are coded positive, negative, or left as logical remainders. The argument p.pert specifies the probability of perturbation under the independent perturbations assumption, while the argument n.pert specifies the number of perturbations under the dependent perturbations assumption. At least one perturbation is needed to possibly change a csQCA solution; otherwise the solution remains the same (a retention rate equal to 100% if zero perturbations occur).

The argument type chooses between "corruption", when various values in the conditions are changed (either from 0 to 1 or from 1 to 0), and "deletion", when the result is the probability of retaining the same solution if a number of cases are deleted from the original data.

The effect of corrupting a case is to allocate it to a different truth table configuration. This may or may not have an influence on the final result, as the input to the minimization procedure is not the cases themselves but the truth table. A case leaving a certain configuration might change the consistency of that configuration, which in turn might affect its output value; the same holds for a new case arriving into another truth table configuration.
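
A minimal sketch of this reallocation, using a hypothetical crisp case with four conditions (the row numbering convention below is just one convenient choice for illustration, not the one used internally by package QCA):

```r
# Truth table row of a crisp configuration, treating the condition
# values as the bits of a binary number (plus 1 for 1-based indexing)
row.id <- function(x) sum(x * 2^((length(x) - 1):0)) + 1

case <- c(P = 1, U = 0, C = 1, S = 0)
corrupted <- replace(case, "U", 1)   # corrupt one condition: 0 -> 1

row.id(case)        # row 11
row.id(corrupted)   # row 15
```

The corrupted case now counts towards a different configuration, so the consistency (and possibly the output value) of both rows 11 and 15 may change.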

Deleting a case can affect its truth table configuration in the same way, either by modifying its consistency, or possibly by transforming it into a remainder if it no longer passes the frequency cut-off.

It is also possible for a corruption or a deletion to have no effect at all, depending on the number of cases and the consistency of each configuration. Even when configurations are changed, the minimization can still lead to the same solution if those configurations contain redundant (minimizable) conditions. In all these situations, the solution is retained. The solution changes in the opposite situation: when the number of cases per configuration is small, the consistency score of the configuration being changed is not very high, and that configuration plays an important role in the minimization.

This scenario is perfectly logical, however: in a macro-qualitative comparative setting, changing the value of a case (a country) has the same effect as replacing one country with another. Under these circumstances, a modified solution should not be surprising at all, especially when the group of countries being analyzed is small. What is truly surprising is Hug’s (and all the others’) expectation that the solution should be retained.

Using his own version of the data, the exact retention probability can be calculated with:

Hug <- data.frame(matrix(c(
    rep(1, 25), rep(0, 20), rep(c(0, 0, 1, 0, 0), 3),
    0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, rep(1, 7), 0, 1),
    nrow = 16, byrow = TRUE, dimnames = list(
    NULL, c("P", "U", "C", "S", "W"))))

retention(Hug, outcome = "W", type = "corruption", dependent = FALSE,
          p.pert = 0.025, incl.cut = 1)
[1] 0.7962228

Under the independent perturbation assumption, using a perturbation probability of 0.025 when corrupting cases, the exact retention rate is 0.796, which means the probability that the solution will change is 0.204. This is neither good nor bad: it is what it is.

A different type of simulation was performed by Seawright (2014), who generated a truth table based on a certain Boolean function, then randomly sampled (with replacement) from those configurations in various sizes between 20 and 100. Apart from the five conditions in the initial Boolean function, Seawright introduced three other conditions that are irrelevant, and tested to what extent the limited diversity problem (including the remainders) affects the final solutions.

In the absence of a replication file, it is unclear how this simulation study was performed. His study raises more questions than answers:

  • How many positive configurations did the generated truth table have?
  • How many negative configurations?
  • How did Seawright make absolutely sure the three additional conditions are causally irrelevant?

It is quite possible, even through random sampling, that causally relevant structures were introduced in the initial truth table for the additional three conditions. This should be tested, verified and ultimately validated, assuming, of course, that Seawright was careful enough to use a mandatory starting point for the sampling procedures via the base function set.seed(); otherwise none of his results can ever be replicated.
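The point about set.seed() is easily illustrated: fixing the random number generator's state before sampling makes any random draw exactly replicable (the seed value 12345 below is arbitrary):

```r
set.seed(12345)                          # fix the RNG starting point
first <- sample(1:32, 5, replace = TRUE) # e.g. sample truth table rows

set.seed(12345)                          # reset to the same starting point
second <- sample(1:32, 5, replace = TRUE)

identical(first, second)                 # TRUE: the draw is replicable
```

Without a published seed, a simulation based on random sampling cannot be reproduced, which is precisely the problem with the missing replication file.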

It is interesting to note that, although QCA is inherently applicable to a small and medium sized number of cases, Seawright treats the input as if it were drawn from a known large population. Leaving aside his justified intention to test the results against something known, in reality he performed a very dubious exercise, as it is also unclear whether he sampled from positive configurations only (in which case the underlying model should be preserved to a greater extent) or from both positive and negative output configurations.

Both positive and negative output configurations are important: the positive ones influence the underlying model, while the negative ones contribute to eliminating remainders that would otherwise incorrectly contribute to parsimony (thus artificially altering the final solutions).

A newer simulation, similar to Seawright’s in its use of a data generating structure, was performed by Baumgartner and Thiem (2017, online first), who also offer a comprehensive evaluation of the inappropriate use or inadequate tests in the previous critiques, especially those of Lucas and Szatrowski. Their approach is not to sample from the saturated truth table, but instead to delete all possible combinations of rows (from 1 to 14) and perform an exhaustive enumeration of the model preserving solutions. In contrast to Seawright, they found a retention rate of 100% for the parsimonious solution in all their simulation setups, with lower retention rates for the conservative and intermediate solutions.
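The scale of this exhaustive enumeration is easy to verify in base R: with four conditions, the saturated truth table has 2^4 = 16 rows, and deleting all possible combinations of 1 to 14 rows amounts to:

```r
# number of distinct ways to delete between 1 and 14 rows
# from a 16 row saturated truth table
sum(choose(16, 1:14))   # 65518 simulation runs
```

This is the complete 2^16 = 65536 subsets, minus the empty deletion and the two near-total deletions of 15 and 16 rows.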

Baumgartner and Thiem’s paper serves a dual purpose: it demonstrates why all previous attempts to measure the robustness of QCA are incorrect (thus defending QCA), while at the same time claiming that the parsimonious solution is the only correct solution and that the conservative and intermediate solutions are also incorrect (thus indirectly becoming QCA critics themselves).

Both Seawright (2014) and Baumgartner and Thiem (2017, online first) start from the same setup of a data generating structure defined by an underlying Boolean function, and they reach different conclusions using different methods to test robustness; yet it can be shown that both studies are incorrect.

In his attempt to verify the role of limited diversity, Seawright most likely included all remainders in the analysis to obtain the parsimonious solution, and this is definitely a mistake. The decision to produce only the parsimonious solution ignores years of effort by an entire community of QCA theoreticians and practitioners around the intermediate solutions. Directional expectations should be used, and untenable, difficult or even impossible counterfactuals should be excluded from the minimization process.

A similar mistake can be identified in Baumgartner and Thiem, although to their credit a proper effort was invested to calculate intermediate solutions. In their case, the method is partially right but the conclusion is plainly wrong, being based on an ad-hoc definition of what a “correct” solution is.

The community agreed standard is that both the conservative (CS) and the intermediate solutions (IS) are supersets of the parsimonious solution (PS). This means that every time a PS is “correct”, the other two solutions are also correct because they contain the elements of the PS. Baumgartner and Thiem’s definition, inspired by Seawright, is that a solution is correct only: “…iff it does not commit causal fallacies…” (i.e. if it does not contain causally irrelevant factors).

In a real limited diversity scenario (the norm in QCA research) the number of empirically observed configurations is low and the number of remainders is high. In such a situation, it is absolutely certain that CS will not be as parsimonious as PS (i.e. CS will almost always contain causally irrelevant factors); by their definition, therefore, CS is bound to be “incorrect”.

It is the very reason why the parsimonious solution is called “parsimonious”: it has a simpler structure, as opposed to the conservative solution, which is highly complex when diversity is very limited. To change the nature of what is “correct”, Baumgartner and Thiem should have provided some kind (any kind) of philosophical, formal logic or mathematical demonstration of the incorrectness. Instead, they resorted to a change in definition, one not embraced by the larger community, which issued a Statement on rejecting article submissions because of QCA solution type.

Defining a solution as “correct” if it does not contain any causally irrelevant factors automatically increases the likelihood that a conservative solution is “incorrect”, the more rows are eliminated from the saturated truth table.

Although the article does not refer to a specific minimization algorithm, their proof relies on a series of computer simulations performed with the package QCApro version 1.1-1 (a fork of an older version of the main package QCA), using the eQMC algorithm developed by myself in 2007 and described in section 9.1. Since the eQMC algorithm is pseudo-counterfactual, it implicitly uses all possible remainders, including all those untenable, difficult or even impossible counterfactuals. Such an approach has been known to be incorrect for many years, leading to ESA, the Enhanced Standard Analysis.

This fact alone should be sufficient to dismiss Baumgartner and Thiem’s conclusions, but there is even more to reveal. In their simulations, all possible combinations of rows are iteratively eliminated from the saturated truth table, testing whether any components of the parsimonious solution are retained. According to this procedure, the expected percentage of “correct” conservative solutions should approach zero as the number of deleted rows approaches the total number of rows in the saturated truth table (as diversity becomes more and more limited).

Figure 4 in their paper shows the correctness preserving curve for QCA-CS rapidly approaching zero after 5 deleted rows, then springing back to life after 11 deleted rows, with an implausible 100% correct ratio when all 16 rows are deleted. In this extreme situation, with every row of the truth table deleted, a solution is impossible since no data remains at all.

By mechanically applying their definition, an impossible situation (no solution) is counted as “correct” because it does not contain causally irrelevant factors. This is a logical fallacy: something unknown (no solution) cannot be considered correct or incorrect, it is simply unknown. Consider this simple example containing 7 values:

correct <- c(1, 0, 0, 0, NA, 0, 1)
mean(correct)
[1] NA

The result of averaging this vector is unknown, because the fifth value is unknown (not available). The average value can only be calculated by removing the unknown value, in which case it would be 2 out of 6, not 3 out of 7:

mean(correct, na.rm = TRUE)
[1] 0.3333333

In Baumgartner and Thiem’s case, this would imply calculating the percentage of “correct” conservative solutions only for those situations where a minimization is possible. After 11 rows are deleted from the truth table, what they report as the percentage of identified “correct” solutions is in fact the proportion of situations where no solution is possible. For instance, there are choose(16, 14) = 120 possible ways of removing 14 rows out of the total of 16 from the truth table. In 10 of these situations no solution is possible, giving a percentage of 10/120 = 0.083, which is exactly the percentage reported as “correct” in the article, while the proper percentage is 0/110 = 0.
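The arithmetic behind this point can be checked directly in R:

```r
choose(16, 14)              # 120 ways of removing 14 of the 16 rows
10 / choose(16, 14)         # 0.0833..., the proportion reported as "correct"
0 / (choose(16, 14) - 10)   # 0, the proper percentage over the 110 solvable cases
```

The 0.083 figure is thus not a measure of correctness at all, but the share of impossible minimizations among all 120 deletion patterns.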

It is not only the definition of “correctness” which is artificially bent towards their hypothesis, but also the operationalization of that definition. Counting a nonexistent solution as correct amounts to a programming error, and their entire simulation is flawed. Combined with the fact that a parsimonious solution is not guaranteed to be correct in a limited diversity scenario (because it involves difficult, untenable or impossible counterfactuals), their ambition to reset the de-facto QCA standards is suddenly deflated.

The same phenomenon can be identified for the intermediate solutions, whose correctness preserving curves display a very quick recovery after 11 eliminated rows, whereas they should approach zero the more rows are eliminated from the saturated truth table. With this logical fallacy removed, all curves except the one for the parsimonious solution should approach zero, the more limited diversity is accounted for.

The constant rate of 100% correctness for the parsimonious solutions is also highly implausible, and on closer inspection it appears to be the result of another programming artifact. For instance, when removing the third row from the saturated truth table (using the same underlying model aB+Bc+D), there are not one but two parsimonious solutions: aC+Bc+D and aB+Bc+D.

Clearly, the first one does not conform to their own definition, since the component aC is not part of the expression that generated the data; this particular simulation should therefore be counted as only a half success.

There are multiple other similar situations for all possible combinations of deleted rows. For instance, deleting rows 3 and 9 produces three parsimonious solutions: aC+Bc+D, aB+Bc+D, and aB+Ac+D, and in the most extreme situations there are no less than 12 possible parsimonious solutions, of which only one conforms to their own definition.

Trying to solve this problem, Baumgartner and Thiem resort to yet another ad-hoc definition of causal correctness, considering the entire set of models (i.e. solutions) correct if at least one of the models is correct. In justifying this decision, they quote Spirtes, Glymour, and Scheines (2000, 81), but that page is not even remotely related to this particular choice. The only correctness property theorem found there is:

If the input to any of the algorithms is data faithful to G, the output of each of the algorithms is a pattern that represents the faithful indistinguishability class of G.

Leaving aside the fact that Spirtes, Glymour, and Scheines (2000) present statistical (not Boolean) algorithms to identify causal structures in directed acyclic graphs, it is highly unclear how their correctness theorem justifies the claim that all models are correct if at least one is correct.

Baumgartner and Thiem’s paper relies not only on a correctness definition that is not embraced by the community but, it seems, on a definition further artificially bent to fit the computer generated results. In a normal scientific enquiry, data should be used to test a hypothesis or a definition; in this case the definition itself is constructed to fit, perfectly, the output produced by the Boolean minimization algorithm.

For each simulation the causal structure is known, but it must not be forgotten that in real life researchers have absolutely no idea which of the concurrent models is the true model representing the underlying causal structure; therefore all models should be given the same weight. In such situations displaying model ambiguity, the success rate should be calculated as 1 over the total number of models, given that only one model faithfully represents the true causal structure.
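This weighting is a one-line computation. Using the counts of tied parsimonious solutions mentioned above (one, two, three, and in the most extreme situations twelve), the expected success rates are:

```r
# with m equally plausible models and only one matching the true
# structure, the expected success rate is 1/m, not 100%
m <- c(1, 2, 3, 12)   # numbers of tied parsimonious solutions
1 / m                  # 1.0000 0.5000 0.3333 0.0833
```

Counting each of these situations as a full success, as Baumgartner and Thiem do, systematically inflates the reported correctness of the parsimonious solution.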

Counting such situations as 100% correct, when only one model is correct, could only be described as programming abuse if it was intentional, or otherwise as a programming error. Either way, it is clear they published misleading results, as a direct consequence of an artificially constructed initial definition.

From a Boolean minimization perspective, the fact that parsimonious solutions contain fewer causally irrelevant factors is perfectly logical: the eliminated rows are either irrelevant to the minimization process, or they are included back through the remainders, which means the simulations are arranged in such a way that the parsimonious solutions will always contain components from the data generating expression.

Contrary to their claim, the parsimonious solution has no magical property of uncovering causal structure; it is merely a direct consequence of the Boolean minimization procedure: the more remainders are included, the more parsimonious the solutions become (but never more parsimonious than the data generating causal structure).

At the other extreme, the more rows are eliminated from the saturated truth table, the fewer observed configurations are served as input to the Quine-McCluskey procedure, the more severe the limited diversity problem, and the more complex the conservative solutions become.

Their article describes the built-in properties of the Quine-McCluskey (QMC) Boolean minimization algorithm, yet they present it as a groundbreaking finding to claim, for instance, that QCA-CS (the conservative solution of QCA) is incorrect, by employing the QMC algorithm. But they fail to mention that QCA-CS is in fact QMC proper, thereby producing a logical impossibility: what they truly claim is that QMC demonstrates that QMC is incorrect.

As Baumgartner and Thiem do not present a formal demonstration of their claims (they resort to simulated results as a form of inductive pseudo-demonstration), their paper is subject to at least two logical fallacies. The first is called a “definist fallacy” (Bennett 2012, 96):

A has definition X. X is harmful to my argument. Therefore, A has definition Y.

Similarly, it is also a case of a circular definition fallacy: the simulated results are determined as “correct” using an ad-hoc definition, and the definition is “demonstrated” as correct by the simulated results.

Given this series of logical impossibilities, logical fallacies and (ab)use of programming to change the true nature of my own algorithm eQMC, it is difficult to attribute very much credibility to the conclusions presented in this paper.

Going back to Seawright (2014, 121), he presents one other interesting conclusion: that QCA is “fairly prone to false-positive results.” That is, for some of the solutions, at least one additional condition appears in the solution despite not being causally related to the outcome, according to the initial Boolean function.

According to Baumgartner and Thiem, this effect never happens for the parsimonious solution if the causally irrelevant conditions are truly random. As mentioned, this finding can also be regarded as highly subjective, since they consider a model correct if at least one solution preserves the causal structure, thus disregarding model ambiguity. It would have been interesting to validate Seawright’s own conclusions, but in the absence of a replication file this is not possible.

There are, however, some other packages on the official R repository (CRAN) dealing with QCA, among them one called QCAfalsePositive, created by B. Braumoeller (2015) as an applied programming tool to demonstrate the findings from B. F. Braumoeller (2015). This package has a number of interesting functions to calculate the probability of committing a type I error when including remainders. Applied to QCA, a type I error occurs when a solution contains components that appear by chance, rather than through configurational merit.

Further experimentation should be performed to assess the usefulness of these inferential tests. At first sight, and as acknowledged by Braumoeller:

…If a solution set is only found in a single case and the outcome of interest occurs in 90% of the observations, the probability of a false positive result is dangerously high…

This is very true, but on the other hand it might also reflect the inherent uncertainty that is specific to inferential tests when the number of observations is very small. It might be that the probability of a false positive is high, or it might be that an inferential test is highly imprecise with only one case. Nevertheless, such initiatives to involve statistical testing, wherever possible, are laudable.
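Braumoeller's warning can be made concrete with a back-of-the-envelope calculation (the figures below are illustrative, not output from the QCAfalsePositive package):

```r
# if the outcome occurs in 90% of the observations, a solution component
# covering a single case agrees with the outcome by chance alone with
# probability 0.9; with three covered cases this drops to 0.729
p_outcome <- 0.9
p_outcome^1   # 0.9
p_outcome^3   # 0.729
```

With so few cases, consistency with the outcome carries very little evidential weight, which is exactly why single-case solution components are flagged as dangerous.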

Although QCA is inherently a method that has absolutely nothing to do with probability and statistics, there are some possible intersections, especially for the simplest 2 × 2 tables and XY plots, where Bayesian analysis of (conditional) proportions might prove interesting, and possibly very useful (see for instance Barrenechea and Mahoney 2017, online first).

Another associated CRAN package dealing with robustness is braQCA (Gibson and Burrel 2017), built to assess how sensitive a QCA solution is to randomness, using bootstrapping. Its main function returns the probability that a given solution was reached by chance, as if it were the result of a random experiment.

What is particularly interesting about this package is its ability to offer possible threshold values for the inclusion and frequency cut-offs, which can serve as data specific guidelines to reduce the probability that a given QCA solution is spurious. This is another interesting combination of classical statistics with QCA, and more work is needed to further demonstrate its practical utility for QCA research.


Ambuehl, Mathias, and Michael Baumgartner. 2017. Cna: Causal Modeling with Coincidence Analysis.
Barrenechea, Rodrigo, and James Mahoney. 2017, online first. “A Set-Theoretic Approach to Bayesian Process Tracing.” Sociological Methods and Research.
Baumgartner, Michael. 2009. “Inferring Causal Complexity.” Sociological Methods and Research 38 (1): 71–101.
———. 2013. “Detecting Causal Chains in Small-n Data.” Field Methods 25 (1): 3–24.
———. 2015. “Parsimony and Causality.” Quality & Quantity 49: 839–56.
Baumgartner, Michael, and Alrik Thiem. 2017, online first. “Often Trusted but Never (Properly) Tested: Evaluating Qualitative Comparative Analysis.” Sociological Methods and Research.
Bennett, Bo. 2012. Logically Fallacious. The Ultimate Collection of over 300 Logical Fallacies. Sudbury:
Berg-Schlosser, Dirk. 2012. Mixed Methods in Comparative Politics. Principles and Applications. UK: Palgrave Macmillan.
Braumoeller, Bear. 2015. QCAfalsePositive: Tests for Type i Error in Qualitative Comparative Analysis (QCA).
Braumoeller, Bear F. 2015. “Guarding Against False Positives in Qualitative Comparative Analysis.” Political Analysis 23 (4): 471–87.
Caren, Neal, and Aaron Panofsky. 2005. “TQCA. A Technique for Adding Temporality to Qualitative Comparative Analysis.” Sociological Methods and Research 34 (2): 147–72.
De Meur, Gisele, Benoit Rihoux, and Sakura Yamasaki. 2009. “Addressing the Critiques of QCA.” In Configurational Comparative Methods: Qualitative Comparative Analysis (QCA) and Related Techniques, edited by Benoît Rihoux and Charles Ragin, 147–65. London: Sage Publications.
Garcia-Castro, Roberto, and Miguel A. Ariño. 2016. “A General Approach to Panel Data Set-Theoretic Research.” Journal of Advances in Management Sciences & Information Systems 2: 63–76.
Gibson, Ben C., and Vann Jr Burrel. 2017. braQCA: Bootstrapped Robustness Assessment for Qualitative Comparative Analysis.
Hak, Tony, Ferdinand Jaspers, and Jan Dul. 2013. “The Analysis of Temporally Ordered Configurations: Challenges and Solutions.” In Configurational Theory and Methods in Organizational Research, edited by Peer C. Fiss, Bart Cambré, and Axel Marx, 107–27. Bingley, UK: Emerald Group Publishing.
Hino, Airo. 2009. “Time-Series QCA: Studying Temporal Change Through Boolean Analysis.” Sociological Theory and Methods 24 (2): 247–65.
Hug, Simon. 2013. “Qualitative Comparative Analysis: How Inductive Use and Measurement Error Lead to Problematic Inference.” Political Analysis 21 (2): 252–65.
King, Gary, Robert O. Keohane, and Sidney Verba. 1994. Designing Social Inquiry. Scientific Inference in Qualitative Research. Princeton: Princeton University Press.
Krogslund, Chris, Donghyun Danny Choi, and Mathias Poertner. 2015. “Fuzzy Sets on Shaky Ground: Parameter Sensitivity and Confirmation Bias in fsQCA.” Political Analysis 23 (1): 21–41.
Krook, Mona Lena. 2010. “Women’s Representation in Parliament: A Qualitative Comparative Analysis.” Political Studies 58 (5): 886–908.
Lucas, Samuel R. 2014. “Rejoinder: Taking Heat and Giving Light -- Reflections on the Early Reception of ‘Qualitative Comparative Analysis in Critical Perspective’.” Sociological Methodology 44 (1): 127–58.
Lucas, Samuel R., and Alisa Szatrowski. 2014. “Qualitative Comparative Analysis in Critical Perspective.” Sociological Methodology 44 (1): 1–79.
Mahoney, James, Erin Kimball, and Kendra L. Koivu. 2009. “The Logic of Historical Explanation in the Social Sciences.” Comparative Political Studies 42 (1): 114–46.
Marx, Axel, and Adrian Dușa. 2011. “Crisp-Set Qualitative Comparative Analysis (csQCA), Contradictions and Consistency Benchmarks for Model Specification.” Methodological Innovations Online 6 (2): 103–48.
Marx, Axel, Benoît Rihoux, and Charles Ragin. 2014. “The Origins, Development, and Application of Qualitative Comparative Analysis: The First 25 Years.” European Political Science Review 6 (1): 115–42.
Medzihorsky, Juraj, Ioana-Elena Oana, Mario Quaranta, and Carsten Q. Schneider. 2017. SetMethods: Functions for Set-Theoretic Multi-Method Research and Advanced QCA.
Ragin, Charles. 1987. The Comparative Method. Moving Beyond Qualitative and Quantitative Strategies. Berkeley, Los Angeles & London: University Of California Press.
———. 2014. “Lucas and Szatrowski in Critical Perspective.” Sociological Methodology 44 (1): 80–94.
Ragin, Charles, and Benoît Rihoux. 2004. “Qualitative Comparative Analysis (QCA): State of the Art and Prospects.” Qualitative Methods 2: 3–13.
Ragin, Charles, and Sarah Ilene Strand. 2008. “Using Qualitative Comparative Analysis to Study Causal Order. Comment on Caren and Panofsky.” Sociological Methods and Research 36 (4): 431–41.
Schneider, Carsten, and Claudius Wagemann. 2012. Set-Theoretic Methods for the Social Sciences. A Guide to Qualitative Comparative Analysis. Cambridge: Cambridge University Press.
Schneider, M. R., C. Schulze-Bentrop, and M. Paunescu. 2010. “Mapping the Institutional Capital of High-Tech Firms: A Fuzzy-Set Analysis of Capitalist Variety and Export Performance.” Journal of International Business Studies 41: 246–66.
Seawright, Jason. 2014. “Comment: Limited Diversity and the Unreliability of QCA.” Sociological Methodology 44 (1): 118–21.
Skaaning, Svend-Erik. 2011. “Assessing the Robustness of Crisp-Set and Fuzzy-Set QCA Results.” Sociological Methods & Research 40 (2): 391–408.
Skocpol, Theda. 1979. States and Social Revolutions. A Comparative Analysis of France, Russia, and China. Cambridge: Cambridge University Press.
Spirtes, Peter, Clark Glymour, and Richard Scheines. 2000. Causation, Prediction, and Search. 2nd ed. Cambridge, Massachusetts: MIT Press.
Thiem, Alrik, Reto Spöhel, and Adrian Dușa. 2016. “Enhancing Sensitivity Diagnostics for Qualitative Comparative Analysis: A Combinatorial Approach.” Political Analysis 24: 104–20.