Chapter 6 Analysis of sufficiency

The analysis of sufficiency is the main purpose of the QCA methodology: to find the minimal configurations of conditions that are sufficient for a given outcome. All of the main algorithms, from constructing a truth table to performing the logical minimization (to name just the most important ones), are designed around finding sufficiency relations.

Initially, this chapter contained a lot of information on all of these topics and grew disproportionately larger than the other chapters, so much so that it almost had a table of contents of its own. Although truth tables and minimization are an integral part of the analysis of sufficiency, they ended up as separate chapters, in order to treat them in depth and to draw a clearer distinction between these important topics.

The current chapter introduces the basic concepts of sufficiency, which are structurally very similar to those presented for the analysis of necessity. As will become clear in the next sections, especially in the conceptual description, necessity and sufficiency are two complementary and actually mirrored concepts. For this reason, many of the common details are left out, to concentrate on the important differences.

Just like for necessity, the literature abounds with claims or hypotheses involving the sufficiency of a cause, or of a configuration of causes, for an outcome of interest. When studying an outcome, necessity statements are important, but sufficiency hypotheses come closest to what we usually conceptualize in terms of causes and effects.

It is therefore not a coincidence that sufficiency statements are denoted by the forward arrow “\(\rightarrow\)” sign. Given a cause X and an outcome Y, the statement “X is sufficient for Y” is written as:

\[\mbox{X} \rightarrow \mbox{Y}\]

6.1 Conceptual description

Just like in the case of necessity, the sufficiency claim can be graphically represented by a subset / superset relation, as in the Venn diagram from figure 6.1. If X is a sufficient condition for Y, then whenever X is present, Y is present as well.

Figure 6.1: X \(\rightarrow\) Y: causal condition X sufficient for the outcome Y

In terms of set relations, in the sufficiency claim Y is a superset of X, which means that X is never present in the absence (outside) of Y. When this relation is met, whenever X is present, Y is also present: there are situations where Y is present in the absence of X, but in every situation where X happens, Y happens as well. In other words, if we know that X is present, it is guaranteed that Y will be present as well (as a sufficient condition, X guarantees Y to happen).

The fact that Y is a bigger set than X (as it usually is) indicates that no single condition explains all of Y. There might be other conditions, or other combinations of conditions, which explain the rest of the set Y; the only way X could cover all of Y is when X is both necessary and sufficient for Y, a situation in which both sets are equally large.

The complementarity between necessity and sufficiency relations becomes evident from yet another perspective: if X is a sufficient condition for Y, then the absence of the condition X is a necessary condition for the absence of Y. In terms of set representations, if X is a subset of Y then \(\sim\)X covers a much larger region than \(\sim\)Y, therefore \(\sim\)X is a superset of (necessary for) \(\sim\)Y. For binary crisp sets, the same thing can be represented using 2 \(\times\) 2 crosstables:

Figure 6.2: X \(\rightarrow\) Y (left) and \(\sim\)X \(\leftarrow\) \(\sim\)Y (right)

It should now be clear that, if X is sufficient for Y and \(\sim\)X necessary for \(\sim\)Y, due to the asymmetrical nature of set relations this is not a proof that \(\sim\)X should be sufficient for \(\sim\)Y, nor that X should be necessary for Y. Two separate analyses of sufficiency must be performed, for the presence and for the absence of the outcome, leading to different sufficiency statements (expressions).
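This mirroring can be checked numerically with a small sketch using hypothetical binary vectors (the values below are made up, with x constructed as a perfect subset of y):

x <- c(1, 0, 0, 1, 0, 0, 1, 0)
y <- c(1, 1, 0, 1, 0, 1, 1, 0)
# sufficiency inclusion of x in y: every case with x = 1 also has y = 1
sum(x & y) / sum(x)
[1] 1
# necessity inclusion of ~x for ~y: every case with y = 0 also has x = 0
sum(!x & !y) / sum(!y)
[1] 1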

Mirroring the definitions from necessity, the following might be used for sufficiency:

Definition 6.1: X is a sufficient condition for Y when every time X is present, Y is also present (Y is always present when X occurs).

Definition 6.2: X is a sufficient condition for Y if X does not occur in the absence of Y.

And in terms of set theory, the following equivalent definition:

Definition 6.3: X is a sufficient condition for Y if X is a subset of Y.

Knowing that X occurs is enough (sufficient) evidence to know that Y occurs. As a concrete example, following Weber’s (1930) well-known theory about the relation between the Protestant ethic and the spirit of capitalism, we could imagine that, at the aggregate community level, having a Protestant ethic (causal condition X) is sufficient to accumulate capital (outcome Y). Naturally, capital can be accumulated in many other ways, therefore a Protestant ethic is not a necessary condition for accumulating capital; but knowing that a group lives by this kind of ethic and abides by certain Protestant traits is sufficient to conclude that the group must be accumulating capital.

If necessity for multi-value conditions is a bit difficult to understand (Y having to be a subset of a specific value of X, not of X as a whole), in the case of sufficiency things are a lot more straightforward. Assuming a multi-value condition X has three values (0, 1 and 2), sufficiency for value 2 is written as:

\[\mbox{X[2]} \rightarrow \mbox{Y}\]

As it turns out, X[2] is sufficient for Y if and only if:

  • all cases where X is equal to 2 are included in the set Y, and
  • there is no instance of X equal to 2 outside the set Y (X takes the value 2 only inside Y, and nowhere else).

Figure 6.3: X[2] \(\rightarrow\) Y: causal condition X is sufficient for Y when equal to 2

Only X[2] is a subset of the outcome Y, while both X[1] and X[0] can happen in the absence of Y; therefore only X[2] provides sufficient information to conclude that Y occurs in its presence.

Admittedly, perfect subset relations rarely happen in the real world. Instead, most of the time a set is more or less included in another set, a topic that will be extensively covered in the next section. For the moment, partial inclusion can be used as a decision criterion to conclude that one “set” is sufficient for another, as displayed in the figure below.

Figure 6.4: Almost but not complete inclusion of X into Y

This would be a good moment to pause and reflect upon the sufficiency relation \(\mbox{X} \rightarrow \mbox{Y}\): most readers would understand that “a” (single) causal condition X is sufficient for the outcome Y, and most of the time this is true. But “X” is only a notation, and the word “set” should not be automatically equated with a “single” causal condition.

A set may refer to many things, including entire conjunctions and/or disjunctions of multiple causal conditions. Imagine a situation with two causal conditions A and B, conjunctively sufficient for an outcome Y. Replacing X with A\(\cdot\)B, then X \(\rightarrow\) Y in fact means A\(\cdot\)B \(\rightarrow\) Y, with the “set” X being represented by the conjunction A\(\cdot\)B. The same thing applies to an entire expression like A\(\cdot\)B + C, interpreted as a “set” in its entirety.
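This is reflected directly in the pof() function, which accepts entire expressions. As a purely illustrative sketch (no claim is made here that this particular conjunction is actually sufficient), a conjunction from the fuzzy version of the Lipset data could be tested with:

# load the fuzzy version of the Lipset data, also used later in this chapter
data(LF)
# the conjunction DEV*STB treated as a single "set" X
pof("DEV*STB -> SURV", data = LF)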

Although Venn diagrams such as the one from figure 6.4 can tell a story about partial membership of a set X into Y, such diagrams are designed for crisp sets only. To be more precise, they are designed for bivalent crisp sets, as there are no Venn diagrams for multivalent sets.

In the case of fuzzy sets, sufficiency can be graphically represented through an XY plot. Figure 6.5 is another proof that necessity and sufficiency are mirrored: in the necessity case the points should be found below the main diagonal, while in the sufficiency case the points should be located in the grey area above the diagonal.

Figure 6.5: Fuzzy sufficiency

This is also a subset / superset relation: since all values in X are lower than the corresponding values in Y, Y is a fuzzy superset of X. Figure 6.5 shows a perfect fuzzy subset relation, given that all points are located above the diagonal, but there are situations where a small proportion of the points fall below the diagonal, lowering the inclusion score.
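Such an XY plot can be produced with the XYplot() function from the QCA package (also used later, in figure 6.8). As a sketch with the fuzzy version of the Lipset data (where the subset relation is not perfect, so some points will fall below the diagonal):

# load the data if not previously loaded
data(LF)
# XY plot for the sufficiency of DEV for SURV
XYplot(LF$DEV, LF$SURV, relation = "sufficiency")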

6.2 Inclusion / consistency

Inclusion and consistency are two words that refer to the same thing and are often used interchangeably. If the inclusion of X (in Y) is high, we say that X is highly consistent with Y, or that X has a high consistency score.

For binary crisp sets, the same kind of general 2 \(\times\) 2 table can represent a sufficiency relation:

Figure 6.6: General 2 \(\times\) 2 table for sufficiency

The focus here is on cells c and d, with the sufficiency inclusion score calculated with the formula:

\[\begin{equation} inclS_{X\phantom{.}\rightarrow\phantom{.}Y\phantom{.}} = \frac{\mbox{X} \phantom{.} \cap \phantom{.} \mbox{Y}}{\mbox{X}} = \frac{\mbox{c}}{\mbox{c} + \mbox{d}} \tag{6.1} \end{equation}\]

The inclusion is the proportion of cases where both X and Y happen (the intersection between X and Y, cell c), out of all cases where X happens (cells c and d together). In its most trivial form, this can be calculated in R as:

sum(X & Y)/sum(X)

Using the same crisp version of the Lipset data, the inclusion of the condition DEV in the outcome SURV is:

using(LC, sum(DEV & SURV) / sum(DEV))
[1] 0.8

There are 8 cases where both DEV and SURV are present, out of a total of 10 cases where DEV is present. Although not the highest possible, a sufficiency inclusion score of 0.8 is high enough to conclude that the level of development DEV is sufficient, or at least an important part of an expression that is sufficient, for the outcome SURV, the survival of democracy.
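The four cells of the corresponding 2 \(\times\) 2 table can also be inspected with base R’s with() and table(), a simple cross-tabulation equivalent to the counts above:

# rows: DEV (0/1), columns: SURV (0/1); 8 cases have both DEV and SURV,
# and 2 cases have DEV without SURV
with(LC, table(DEV, SURV))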

Chapter 7 will introduce the concept of an inclusion cut-off, to compare this score against.

The same result is obtained with either of the two versions of the pof() command below, specifying the relation as "sufficiency" (the abbreviation "suf" is accepted as well):

pof(DEV, SURV, data = LC, relation = "sufficiency")

or

pof("DEV -> SURV", data = LC)

        inclS   PRI   covS   covU  
---------------------------------- 
1  DEV  0.800  0.800  1.000    -   
---------------------------------- 

In the second version, the sufficiency relation is signaled with the help of the right-pointing arrow -> sign, making the specification of the argument relation = "sufficiency" redundant.

Calculations for multi-value data are just as simple, and extend a 2 \(\times\) 2 table to one with multiple columns:

Figure 6.7: General crosstable for multivalue sufficiency

Here, it is even clearer that each value of X has its own inclusion in the outcome Y, as if each value were a set separate from all other values of X. This is very close to what actually happens: each value behaves as a separate set, with the additional constraint that the “sets” corresponding to the different values are mutually exclusive.

\[\begin{equation} {inclS}_{X[v]\phantom{.}\rightarrow\phantom{.}Y\phantom{.}} = \frac{\mbox{X[}v\mbox{]} \phantom{.} \cap \phantom{.} \mbox{Y} }{\mbox{X[}v\mbox{]}} \tag{6.2} \end{equation}\]

Replacing \(v\) with the value 2, as highlighted in the crosstable from figure 6.7, the focus is on the last column where X takes the value 2, specifically on cell e (where the intersection occurs) and cell f. The equation becomes:

\[{inclS}_{X[2]\phantom{.}\rightarrow\phantom{.}Y\phantom{.}} = \frac{\mbox{e}}{\mbox{e + f}}\]

As a simple practical example, replicating this calculation in R involves first loading the multi-value version of the Lipset data (if not already loaded); the actual inclusion score is given by the second line of code:

data(LM)
using(LM, sum(DEV == 2 & SURV) / sum(DEV == 2))
[1] 1

There is a perfect consistency score of 1: the value 2 of the development condition is sufficient for the survival of democracy, which makes it very likely that this expression will be found in the final minimization solution. This consistency can be further inspected by subsetting the dataset with:

LM[LM$DEV == 2, c(1, 6)]
   DEV SURV
BE   2    1
FR   2    1
NL   2    1
SE   2    1
UK   2    1

The consistency is perfect: all countries where the level of development is equal to 2 have a surviving democracy. The same conclusion is drawn using the parameters of fit function:

pof("DEV[2] -> SURV", data = LM)

           inclS   PRI   covS   covU  
------------------------------------- 
1  DEV[2]  1.000  1.000  0.625    -   
------------------------------------- 

By comparison, the situation is different for the cases where the level of development is equal to 1:

LM[LM$DEV == 1, c(1, 6)]
   DEV SURV
AU   1    0
CZ   1    1
FI   1    1
DE   1    0
IE   1    1

Out of the five countries where that happens, democracy survives in only three, which means that DEV[1] is not sufficient for the outcome because its inclusion score of 0.6 is too low:

pof("DEV[1] -> SURV", data = LM)

           inclS   PRI   covS   covU  
------------------------------------- 
1  DEV[1]  0.600  0.600  0.375    -   
------------------------------------- 

The equation for fuzzy sets is only a tiny bit more challenging to calculate. Much like in the case of necessity, there are no crosstables with simple cell frequencies, but the logic of fuzzy inclusion remains the same: a straightforward matter of summing the fuzzy intersection (the minimum of each pair of values of X and Y) and dividing it by the sum of all values in X. This gives a score that reflects how much of X is included in Y (or how consistent X is with the outcome Y).

\[\begin{equation} inclS_{X\phantom{.}\rightarrow\phantom{.}Y\phantom{.}} = \frac{\sum{min(\mbox{X}, \phantom{.} \mbox{Y})}}{\sum{\mbox{X}}} \tag{6.3} \end{equation}\]

Using the fuzzy version of the Lipset data:

# load the data if not previously loaded
data(LF)
# then
using(LF, sum(fuzzyand(DEV, SURV)) / sum(DEV))
[1] 0.7746171
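For two sets, fuzzyand() returns the pairwise minimum, so the same score can be obtained with base R’s pmin():

using(LF, sum(pmin(DEV, SURV)) / sum(DEV))
[1] 0.7746171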

The same inclusion score (rounded to three decimals) is obtained using the parameters of fit function:

pof("DEV -> SURV", data = LF)

        inclS   PRI   covS   covU  
---------------------------------- 
1  DEV  0.775  0.743  0.831    -   
---------------------------------- 

Much like for the necessity relation, simultaneous subset relations appear for sufficiency relations as well. The inclusion in the negation of the outcome set is not simply the complement of the inclusion in the set itself. With an inclusion of 0.775 of the level of development in the survival of democracy outcome, one would expect exactly 0.225 to fall outside (be included in the negation of) the outcome set, but the actual value is 0.334:

pof("DEV -> ~SURV", data = LF)

        inclS   PRI   covS   covU  
---------------------------------- 
1  DEV  0.334  0.241  0.322    -   
---------------------------------- 
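The 0.334 score can be verified directly with the fuzzy intersection formula from equation (6.3), applied to the negation of the outcome; together with the 0.775 obtained earlier, the two inclusions clearly do not sum to 1:

# fuzzy inclusion of DEV in the negation of SURV (approximately 0.334, not 0.225)
using(LF, sum(fuzzyand(DEV, 1 - SURV)) / sum(DEV))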

6.3 The PRI score

To further analyze the issue of simultaneous subset relations, table 5.1 will be reexamined with the subset relations reversed, the values in set X being switched with the values in set Y:

X <- c(0.2, 0.4, 0.45, 0.5, 0.6)
Y <- c(0.3, 0.5, 0.55, 0.6, 0.7)

Since all the elements of X are now lower than the corresponding elements in Y, we expect a full consistency score of 1 (full inclusion) for the sufficiency of X for Y:

pof(X, Y, relation = "sufficiency") # "suf" is also accepted

       inclS   PRI   covS   covU  
--------------------------------- 
1   X  1.000  1.000  0.811    -   
--------------------------------- 

As expected, the inclusion score reaches a maximum, and we should expect the inclusion in the negation of Y to be very low or even equal to zero. As it turns out, the inclusion of X into the negation of Y is also very high:

pof(X, 1 - Y, relation = "sufficiency")

       inclS   PRI   covS   covU  
--------------------------------- 
1   X  0.814  0.000  0.745    -   
--------------------------------- 

This is a typical situation of simultaneous subset relations, with X appearing to be sufficient for both Y and \(\sim\)Y, which is a logical contradiction. A decision has to be made to declare X sufficient for either Y or \(\sim\)Y, as logically it cannot be sufficient for both.

This is where the PRI measure comes into play; it appears in the output of the parameters of fit function for the sufficiency relation. PRI stands for Proportional Reduction in Inconsistency. It was introduced in the fs/QCA software by Ragin (2006), further explained by C. Schneider and Wagemann (2012), and has the following formula, faithfully implemented in the QCA package:

\[\begin{equation} \mbox{PRI} = \frac{\sum{min(\mbox{X}, \mbox{Y})} - \sum{min(\mbox{X}, \mbox{Y}, {\sim}\mbox{Y})}}{\sum{\mbox{X}} - \sum{min(\mbox{X}, \mbox{Y}, {\sim}\mbox{Y})}} \tag{6.4} \end{equation}\]

Equation (6.4) is an extension of equation (6.3). The first part is exactly the same; then both numerator and denominator subtract the intersection between X, Y and \(\sim\)Y, so that the same formula takes into account the negation of the outcome as well as its presence.
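To make the computation concrete, the PRI formula can be reproduced manually for the X and Y vectors above, using base R’s pmin() for the fuzzy intersections (a sketch; the helper function pri() is introduced here only for illustration):

pri <- function(x, y) {
    # numerator and denominator both subtract the intersection of X, Y and ~Y
    xy3 <- sum(pmin(x, y, 1 - y))
    (sum(pmin(x, y)) - xy3) / (sum(x) - xy3)
}
pri(X, Y)     # 1, matching the pof() output for Y
pri(X, 1 - Y) # 0, matching the pof() output for ~Y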

When simultaneous subset relations occur, a decision has to be made to declare X sufficient for either Y or \(\sim\)Y. This decision is based on the PRI score: X is attributed to whichever of the two relations (with Y or with \(\sim\)Y) has the highest product between the consistency score and the PRI.

In our example, for the sufficiency relation with the presence of the outcome, the PRI score is equal to the maximum of 1. For the negation of the outcome, the PRI score drops to 0, therefore it can safely be concluded that X is sufficient for Y, rejecting the possibility that X could be sufficient for \(\sim\)Y.

C. Schneider and Wagemann (2012) explain that simultaneous subset relations appear when there is at least one logically contradictory case, for at least one of the relations with Y and/or \(\sim\)Y. In the example above, X is a perfect subset of Y, hence logically contradictory cases can only appear in the sufficiency relation with the negation of the outcome:

XYplot(X, 1 - Y, relation = "sufficiency")

Figure 6.8: Sufficiency relation between X and the negation of Y

Indeed, there is one case in the lower right part of the XY plot, the region where deviant cases in kind are found. Such careful attention to sufficiency scores (for both the presence of the outcome and its negation), combined with an analysis of XY plots to identify deviant cases, and possibly with a more in-depth analysis of the deviant cases themselves, defines an important preliminary step before deciding that a certain condition or combination of conditions may enter the minimization process, as will be shown in chapter 8.

6.4 Coverage: raw and unique

Unlike necessity, where coverage is a measure of how trivial a condition is for an outcome, in the sufficiency relation coverage measures how much of the entire outcome Y is “explained” by a causal condition X.

Readers who are familiar with standard statistical measures might also be familiar with a regression based measure called \(R^2\), which reports how much of the variation in the dependent variable is explained by a certain regression model.

Obviously, QCA has nothing to do with statistics and variation, but the overall interpretation is very similar. For sufficiency relations, the more of Y a set X covers, the more important X is for Y. In the limit, when X covers exactly 100% of Y, it becomes not only sufficient but also necessary for the outcome.

An extension of X beyond 100% of Y would still render it necessary, but it would no longer be sufficient, given that X needs to be a subset of Y for a valid sufficiency relation.

A high coverage does not necessarily mean that X needs to be large. When Y is a small set, even a small X can cover a lot of Y; if Y is a large set, then X needs to be large in order to cover a significant area of Y. What matters is the relative difference between X and Y: if the relative difference is small, X covers a lot, and if the relative difference is large, X covers very little of Y.

This kind of coverage is called raw coverage (denoted covS in the QCA package, from coverage for sufficiency), to differentiate it from the so-called unique coverage (denoted covU), specific to each set that covers Y. Again, this is very similar to the regression model, where two independent variables can each have their own \(R^2\) explaining the dependent variable Y, but if the independent variables are collinear then much of their individual explanations overlap, and the overall \(R^2\) for the entire regression model is not equal to the sum of the individual explanations.

It is more or less similar in the sufficiency relation, raw coverage showing how much of the outcome Y is explained by a set, and unique coverage showing how much of that explanation can be uniquely attributed to that set, and to no other.
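For instance, the raw coverage of 0.831 reported earlier for DEV \(\rightarrow\) SURV can be reproduced by dividing the same fuzzy intersection by the sum of the outcome instead of the condition (a sketch, assuming LF is still loaded):

# sufficiency coverage: how much of SURV is covered by DEV (approximately 0.831)
using(LF, sum(fuzzyand(DEV, SURV)) / sum(SURV))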

Figure 6.9: Unique coverage for X

Figure 6.9 displays two sufficient expressions A and B, together covering a certain part of Y. That is the raw coverage of the union between the sets A and B (the disjunctive expression A + B). Out of that total coverage, a good part is covered by both sets, and whatever is covered uniquely by A is shown in the striped area.

That is called the unique coverage of A, and it can be calculated by subtracting the coverage of the intersection A\(\cdot\)B from the entire (raw) coverage of A:

\[covU_{A\phantom{.}\rightarrow{}\phantom{.}Y}\phantom{.} = \phantom{.} {covS}_{A\phantom{.}\rightarrow{}\phantom{.}Y} \phantom{.} - \phantom{.} {covS}_{A{\cdot}B\phantom{.}\rightarrow{}\phantom{.}Y}\]

As coverage for sufficiency has the same formula as the inclusion for necessity, this is translated as:

\[{covU}_{A\phantom{.}\rightarrow{}\phantom{.}Y} \phantom{.} = \phantom{.} \frac{\sum{min(\mbox{Y, A})}}{\sum{\mbox{Y}}} - \frac{\sum{min(\mbox{Y, A, B})}}{\sum{\mbox{Y}}}\]

Here, the set B can itself stand for a combination of individual sets. For example, if we had any number of sets A, B, C and so on, the general formula for the unique coverage of A would be the raw coverage of A minus the coverage of the intersection between Y, A, and the union of all other sets B, C and so on:

\[\begin{equation} {covU}_{A\phantom{.}\rightarrow{}\phantom{.}Y} \phantom{.} = \phantom{.} \frac{\sum{min(\mbox{Y, A})}}{\sum{\mbox{Y}}} - \frac{\sum{min(\mbox{Y, A}, \phantom{.} max(\mbox{B, C, ...)})}}{\sum{\mbox{Y}}} \tag{6.5} \end{equation}\]
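Equation (6.5) can be sketched in R with the fuzzy Lipset data, letting DEV play the role of A and URB the role of B (both chosen purely for illustration, with no claim that they are actually sufficient here); with more sets, URB would be replaced by pmax() over all the other sets:

# unique coverage of DEV: its raw coverage minus the coverage of its overlap with URB
using(LF, sum(pmin(SURV, DEV)) / sum(SURV) -
          sum(pmin(SURV, DEV, URB)) / sum(SURV))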

It is perhaps less obvious that raw and unique coverage apply not only to individual (atomic) sets but also to any type of sufficient expression. Any kind of SOP (sum of products) expression can be interpreted as an individual set, which has a certain coverage within the outcome set Y.

To put it differently, a “set” is not always atomic (formed by the elements of a single condition). A conjunction or a disjunction of conditions is also a set, and so is a more complex expression of multiple conjunctions and disjunctions. No matter how complex the expression is, the resulting set has a coverage within the outcome set, provided it is sufficient.

References

Ragin, Charles C. 2006. User’s Guide to Fuzzy-Set/Qualitative Comparative Analysis 2.0. Tucson, Arizona: Department of Sociology, University of Arizona.
Schneider, Carsten, and Claudius Wagemann. 2012. Set-Theoretic Methods for the Social Sciences. A Guide to Qualitative Comparative Analysis. Cambridge: Cambridge University Press.
Weber, Max. 1930. The Protestant Ethic and the Spirit of Capitalism. London, New York: Routledge.