SetMethods: an Add-on R Package for Advanced QCA

This article presents the functionalities of the R package SetMethods, aimed at performing advanced set-theoretic analyses. This includes functions for performing set-theoretic multi-method research, set-theoretic theory evaluation, Enhanced Standard Analysis, diagnosing the impact of temporal, spatial, or substantive clusterings of the data on the results obtained via Qualitative Comparative Analysis (QCA), indirect calibration, and visualising QCA results via XY plots or radar charts. Each functionality is presented in turn: the conceptual idea and the logic behind the procedure are first summarized and then illustrated with data from Schneider et al. (2010).


Introduction
Set-theoretic methods in general (Goertz and Mahoney, 2012), and Qualitative Comparative Analysis (QCA) in particular, are becoming increasingly popular within different disciplines in the social sciences and neighboring fields (Rihoux et al., 2013). Parallel to conceptual developments and an increasing number of applied studies, software development has accelerated. While less than a decade ago only two functioning software packages were available to users (fsQCA (Ragin et al., 2006) and Tosmana (Cronqvist, 2011)), there are now over a dozen different software solutions on offer (see http://compasss.org/software.htm). Many of them are developed within the R software environment, with the R package QCA (Dusa, 2007) being not only the one with the longest history, but also the most complete and complex.
In this paper, we discuss the different functionalities of the R package SetMethods (Medzihorsky et al., 2016). It is best understood as an add-on tool to package QCA that allows applied researchers to perform advanced set-theoretic analyses. More precisely, SetMethods enables researchers to perform Set-Theoretic Multi-Method Research, the Enhanced Standard Analysis (ESA), and Set-Analytic Theory Evaluation, to run diagnostics in the presence of clustered data structures, and to display their results in various ways.
We proceed as follows. Each of the different functionalities within SetMethods is presented in a separate section. Within each section, we first briefly summarize the conceptual idea behind the analysis in question, then describe the computational logic of the function for performing the analysis, and finally demonstrate the use of the function by displaying the R syntax and selected output for an example from published research.
Even though the main purpose is to present the functionality of R package SetMethods, this article is also useful for researchers who perform their QCA in software environments other than R because we present the logic of several of the main advanced set-analytic procedures in a concise and transparent manner.

The empirical example
In order to illustrate the use of the different functions in SetMethods, we use the empirical example by Schneider et al. (2010), which uses fuzzy sets to explain capitalist variety and export performance in high-tech industries. More precisely, the research question focuses on the institutional determinants of export performance in high-tech industries. The outcome consists of the export performance in high-tech industries (EXPORT). The conditions used are: employment protection (EMP), collective bargaining (BARGAIN), university training (UNI), occupational training (OCCUP), stock market size (STOCK), and mergers and acquisitions (MA). The authors analyze 76 cases, representing 19 countries at four time points.
For the sake of simplicity, we use the same data for illustrating all the functions. Our goal is not, of course, to contribute to the substantive discussion on varieties of capitalism or institutional context. This is why we will take the liberty to alter the analytic setup if needed for demonstration purposes by, for instance, dropping cases or conditions or by changing the outcome to be explained.

Set-theoretic multi-method research (MMR)
The term Set-Theoretic Multi-Method Research (MMR) captures all those empirical approaches for combining cross-case analyses with within-case studies in which both levels of analysis follow the goal of investigating sets and their relations. At the cross-case level, the most common methodological tool is Qualitative Comparative Analysis (QCA); at the within-case level, it is process tracing. Both tools can be rooted in (fuzzy) set theory (see, e.g., Ragin (2008) for QCA and Mikkelsen (2017) for process tracing). In principle, any sequence of analyses can be performed, and in practice various sequences are found in the applied literature. In the following, we focus on the sequence 'cross-case QCA first, followed by within-case analyses'. This sequence may or may not be continued by another QCA. Like any decent QCA, it is certainly preceded by a thorough accumulation of case knowledge in order to select and calibrate conditions. These crucial research steps, however, fall outside of the definitional scope of set-theoretic MMR.¹
In the following, we limit our discussion to set-theoretic MMR after a cross-case analysis of sufficiency, as discussed in Schneider and Rohlfing (2013, 2016), Schneider and Rohlfing (manuscript), and Rohlfing and Schneider (2018).² We first briefly summarize the different types of cases and the purpose of their within-case analysis (Section 2.3.1). After this, we discuss the four different feasible comparative within-case analyses in Figure 1 and their analytic purposes (Section 2.3.2). For each form of within-case analysis, we explain the use of the mmr() function and the formula used for finding cases (Schneider and Rohlfing, manuscript).³ The general structure of function mmr() is illustrated in Figure 1. Users need to specify whether they want to perform single or comparative within-case analysis and then on which cases the analysis is performed.

Identifying types of cases
Key for combining QCA with process tracing is the sorting of cases into different case types based on the QCA solution formula. The literature identifies five different types (Schneider and Rohlfing, 2013). Membership in a type is defined by the membership scores of a case in the outcome Y, on the one hand, and the sufficient term T or the solution formula S, on the other hand. Table 1 summarizes the definition of each case type and the analytic purpose of the within-case analysis in single cases. Figure 2 visualizes the location of each case type in an XY plot.
Typical cases and deviant cases consistency are defined based on their membership in a sufficient term T, whereas deviant cases coverage and IIR cases are defined based on their membership in the solution formula S. Deviant cases consistency are subdivided into deviant in degree and deviant in kind. The latter are always preferable for within-case analysis. IIR cases are not useful for single-case studies, but they play an important role for comparative within-case analyses (see Section 2.3.2).
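The sorting of cases into types can be sketched in a few lines of base R. This is a minimal illustration and not the code used inside SetMethods: the helper classify_case() is hypothetical, the 0.5 thresholds are assumed from the usual fuzzy-set crossover rather than read from Table 1, and a single membership score s stands in for membership in the term T or the solution S, whichever applies to the case type.

```r
# Hypothetical helper: classify one case from its fuzzy membership in the
# sufficient term / solution formula (s) and in the outcome (y).
# The thresholds below are assumptions made for this sketch.
classify_case <- function(s, y) {
  if (s == 0.5 || y == 0.5) return("ambiguous (membership of exactly 0.5)")
  if (s > 0.5 && y > 0.5) {
    if (s <= y) "typical" else "deviant consistency in degree"
  } else if (s > 0.5 && y < 0.5) {
    "deviant consistency in kind"
  } else if (s < 0.5 && y > 0.5) {
    "deviant coverage"
  } else {
    "individually irrelevant (IIR)"
  }
}

# Made-up membership scores, one case per type
cases <- data.frame(
  case = c("A", "B", "C", "D", "E"),
  s = c(0.8, 0.9, 0.7, 0.2, 0.1),
  y = c(0.9, 0.6, 0.3, 0.8, 0.2),
  stringsAsFactors = FALSE
)
cases$type <- mapply(classify_case, cases$s, cases$y)
print(cases)
```

In an XY plot with s on the horizontal and y on the vertical axis, these five labels correspond to the regions on either side of the diagonal and of the 0.5 lines.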
Table 1 is adapted from Schneider and Rohlfing (manuscript).
¹ For a systematic discussion of the pre-QCA case studies, see Rihoux and Lobe (2009).
² For MMR after an analysis of necessity, see Rohlfing and Schneider (2013).
³ For a systematic test of the mathematical formulas used for selecting single cases or pairs of cases for set-theoretic MMR, see the Appendix of this paper.

Typical cases
Process tracing in typical cases aims at empirically probing the causal mechanism(s) linking the sufficient term S to outcome Y. For conjunction S to be causal, each conjunct C of S must be causal, i.e. it must make a difference to outcome Y by making a difference to mechanism M. This requires as many within-case analyses of typical cases as there are conjuncts in the sufficient conjunction.
For each analysis, one conjunct is the focal conjunct FC and the others are the complementary conjuncts CC. The focal conjunct FC is the conjunct for which we want to find out whether it makes a difference for the mechanism M, while the complementary conjuncts CC represent the other conjuncts of the sufficient term S (Schneider and Rohlfing, manuscript). For causal inference on the configuration, we proceed by taking each conjunct in turn as the focal conjunct FC.
Additionally, we also apply the test severity principle. With fuzzy sets, the membership in mechanism M can only vary within the corridor established by the membership in FC (the lowest value M can take) and Y (the highest value M can take) in order to preserve the causal chain FC → M → Y (Schneider and Rohlfing, manuscript). The smaller the corridor, the smaller the range of membership values M can take. Therefore, the most severe test for M is the one in which FC = S = Y, because the only consistent membership score in M then equals FC = S = Y.
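The corridor argument can be written compactly. In the sketch below, the symbol w for the width of the corridor is introduced here for illustration and does not appear in the original sources.

```latex
\[
  FC \le M \le Y, \qquad w = Y - FC \ge 0 .
\]
% When the focal conjunct defines term membership (FC = S), the width becomes
\[
  w = Y - S, \qquad w = 0 \iff FC = S = Y .
\]
```

A width of zero leaves exactly one consistent membership score for M, which is why cases on the diagonal provide the most severe test.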
The best available typical case fulfills the following criteria: a) the focal conjunct is the one that defines the membership of the typical case in the term (FC ≤ CC); b) the corridor for mechanism M, as defined by the sufficient term S (from a) it follows that S = FC) and Y, is small; c) membership in the sufficient term S is high; d) the case is uniquely covered by the sufficient term S.
Figure 3 visualizes the test severity principle in two different ways. The XY plot in the upper panel shows that test severity increases for cases closer to the diagonal. The length of the vertical and horizontal arrows, respectively, visualizes the range of fuzzy set membership scores for M that would still be consistent. The larger this range, the less severe the test. The Euler diagram in the lower panel visualizes the same by contrasting S1, which is almost as big as Y, with S2, which is much smaller than Y. The former leaves little and the latter a lot of room for M.
The ideal typical case is located in the upper-right corner of the XY plot in Figure 3 with FC = S = Y = 1. In applied QCA, such cases usually do not exist in the data at hand. Function mmr() identifies the best available typical case in a given data set.
Function mmr() first sorts each typical case based on whether FC ≤ CC (rank 1) or FC > CC (rank 2). Cases in each rank are then further sorted according to Formula 1. Smaller values indicate more suitable cases.⁴

$$\mathrm{TYP} = \underbrace{(Y - S)}_{\text{small corridor for mechanism}} + \underbrace{(1 - S)}_{\text{large membership in the sufficient term}} \tag{1}$$

where Y = outcome, S = sufficient term.

Applied to our example, function mmr() works as follows. After minimizing the truth table TT_y and producing the parsimonious and intermediate solutions sol_yp and sol_yi using package QCA, we input these solutions⁵ into the mmr() function while setting argument match to FALSE and argument cases to 2. As argument term is set to 1, the output shows the typical cases for each focal conjunct in the first sufficient term, together with some additional information. The output comprises the membership values of the typical cases in the focal conjunct, the complementary conjuncts, the whole sufficient term, and the outcome (in this case EXPORT); the formula values St; whether the case is the most typical according to the formula; the rank of the case; and whether the case is uniquely covered by the sufficient term. Users should read this output in the following order: whether the case is uniquely covered, which rank the case is in (the smaller, the better), and which formula value St the case has (the smaller, the better). For example, for focal conjunct emp in sufficient term emp * bargain * OCCUP, Switzerland_03 appears to be the best available typical case: it is uniquely covered, it is in Rank 1, and it has the smallest formula value (St=0.59).
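The two-step sorting of typical cases (first by rank, then by Formula 1) can be reproduced by hand. The following base R sketch uses made-up membership scores and hypothetical case names; it assumes that term membership is the fuzzy minimum of FC and CC and is meant as an illustration of the logic, not as the internal code of mmr().

```r
# Made-up typical cases: membership in the focal conjunct (FC), the
# complementary conjuncts (CC), and the outcome (Y)
typical <- data.frame(
  case = c("Case_A", "Case_B", "Case_C"),
  FC = c(0.6, 0.9, 0.7),
  CC = c(0.8, 0.7, 0.9),
  Y  = c(0.9, 0.8, 0.7),
  stringsAsFactors = FALSE
)

# Membership in the sufficient term: fuzzy AND (minimum) of FC and CC
typical$S <- pmin(typical$FC, typical$CC)

# Rank 1 if the focal conjunct defines term membership (FC <= CC), else rank 2
typical$rank <- ifelse(typical$FC <= typical$CC, 1, 2)

# Formula 1: small corridor (Y - S) plus high term membership (1 - S)
typical$TYP <- (typical$Y - typical$S) + (1 - typical$S)

# Smaller rank first; within a rank, smaller formula values first
typical <- typical[order(typical$rank, typical$TYP), ]
print(typical)
```

Note that Case_B has a small formula value but still ends up last, because rank takes precedence over the formula value.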

# We create the truth table:
TT_y <- truthTable(SCHF, outcome = "EXPORT", conditions = c("EMP", "BARGAIN", "UNI", "OCCUP", "STOCK", "MA"),

... parsimonious solution according to the order in the "qca" object the user wants to work with. However, since QCA solutions (conservative, parsimonious, intermediate) are in a subset relationship with each other, they tend to have more complicated structures in which model ambiguity is tied from one solution to the other. For these cases, the argument sol allows users to specify the models they want to choose by using a character string of the form "c1p3i2", where c = conservative solution, p = parsimonious solution, and i = intermediate solution.
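To make the structure of such a model string concrete, the sketch below splits "c1p3i2" into its three components. The helper parse_sol() is hypothetical and written only for this illustration; it is not a function of SetMethods, and how mmr() reads the string internally is not shown here.

```r
# Hypothetical helper: split a model string such as "c1p3i2" into the
# model numbers chosen for the conservative (c), parsimonious (p), and
# intermediate (i) solution.
parse_sol <- function(sol) {
  stopifnot(is.character(sol), length(sol) == 1)
  # Each component is one of the letters c, p, i followed by digits
  parts <- regmatches(sol, gregexpr("[cpi][0-9]+", sol))[[1]]
  idx <- as.integer(sub("^[cpi]", "", parts))
  names(idx) <- substr(parts, 1, 1)
  idx
}

parse_sol("c1p3i2")
```

Read this way, "c1p3i2" selects the first conservative, the third parsimonious, and the second intermediate model.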

Deviant cases consistency
Deviant cases consistency are puzzling because their membership in the sufficient term S exceeds that in the outcome Y, i.e. S > Y. This becomes even more puzzling if S > 0.5 and Y < 0.5, that is, if we have deviant cases consistency in kind rather than just in degree (see Table 1). The more S exceeds Y, the bigger the empirical puzzle, especially if membership in S is high. Within-case analysis of a deviant case consistency aims at identifying the reasons why mechanism M is either absent or prevented from producing Y. The reason must be an INUS condition omitted from S. Formula 2 identifies the best available deviant case consistency in a data set.
[Formula 2; its terms are labeled "far from the diagonal" and "large membership in the sufficient term"] (2)
where Y = outcome, S = sufficient term.
The R Journal Vol. 10/1, July 2018 ISSN 2073-4859
Using the same data from Schneider et al. (2010) and focusing on the parsimonious solution, function mmr() identifies the deviant consistency cases for each sufficient term. To obtain them, we keep argument match set to FALSE, as we are identifying single cases, but set argument cases to 3, the identifier for deviant cases consistency (see Figure 1). The output shows the deviant consistency cases (first column) grouped by sufficient term (second column), together with term membership, outcome membership, formula value Sd, and whether the case is the most deviant for a particular term. In the output we see that, for example, for term emp * OCCUP the most deviant case consistency is Switzerland_90, with the smallest formula value (Sd=0.67). Figure 4 shows all the deviant cases consistency (cases in the lower right corner) for the first sufficient path emp * OCCUP of the parsimonious solution.
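One expression that is consistent with the two criteria for deviant cases consistency (distance from the diagonal and high membership in S) and with the smaller-is-better reading of the Sd values is sketched below. Both the label DCONS and the exact form of the two terms are assumptions made here by analogy with Formula 1; they should be checked against the original Formula 2.

```latex
\[
  \mathrm{DCONS} =
  \underbrace{\bigl[\,1 - (S - Y)\,\bigr]}_{\text{far from the diagonal}}
  + \underbrace{(1 - S)}_{\text{large membership in the sufficient term}}
\]
```

Under this reading, a case with S = 0.9 and Y = 0.1 scores 0.3, whereas a case with S = 0.6 and Y = 0.4 scores 1.2, so the first case would be the more deviant one.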

Deviant cases coverage
Deviant cases coverage are puzzling because they are members of the outcome without, however, being members of any known sufficient term. Within-case analysis aims at identifying a sufficient term S⁺, omitted from the solution formula, which triggers a mechanism M and outcome Y.
Since deviant cases coverage are defined by what they are not, namely members of the solution formula (see Table 1), this solution formula is not a good place to start selecting the best available deviant cases coverage. Instead, this type of case is selected based on its membership in its truth table row TT. For each TT with at least one deviant case coverage, a within-case analysis can be performed. If more than one deviant case coverage populates the same TT, Formula 3 identifies the best available case for within-case analysis.
In Formula 3, Y = outcome and TT = membership in the truth table row. Similar to Formula 1 for identifying the best available typical case, the goal is to minimize the difference between the membership scores in Y and TT and to prefer higher membership in TT. Since the primary goal in this within-case analysis is not to draw causal inference but to identify a missing conjunction, there is no need to decompose TT into its constituent sets.
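The selection logic for deviant cases coverage can be illustrated in base R. The sketch below computes each case's membership in a truth table row as the fuzzy minimum across conditions and then scores the cases. The scoring expression abs(Y - TT) + (1 - TT) is an assumption that merely encodes the two stated goals (small difference between Y and TT, high TT); it is not a verified copy of Formula 3. The helper row_membership(), the two conditions, and all scores are made up.

```r
# Hypothetical helper: fuzzy membership in one truth table row, i.e. the
# minimum across conditions, taking x where the row requires presence (1)
# and 1 - x where it requires absence (0).
row_membership <- function(data, row) {
  adj <- lapply(names(row), function(cond) {
    if (row[[cond]] == 1) data[[cond]] else 1 - data[[cond]]
  })
  do.call(pmin, adj)
}

# Made-up deviant cases coverage located in the same truth table row
dcov <- data.frame(
  case = c("Case_A", "Case_B"),
  EMP = c(0.9, 0.6),
  UNI = c(0.2, 0.4),
  Y = c(0.8, 0.9),
  stringsAsFactors = FALSE
)
tt_row <- c(EMP = 1, UNI = 0)  # the row EMP * ~UNI

dcov$TT <- row_membership(dcov, tt_row)

# ASSUMED scoring rule (see lead-in): smaller values are better
dcov$score <- abs(dcov$Y - dcov$TT) + (1 - dcov$TT)
dcov <- dcov[order(dcov$score), ]
print(dcov)
```

Because the comparison happens within a truth table row, the sufficient terms of the solution play no role in this ranking.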
Applied to our example, the following code displays the list of deviant cases coverage (notice that argument cases is set to 4), the membership they have in the entire solution formula, their values on Formula 3, the truth table row they belong to (columns starting with TT indicate the specific combination of conditions the case presents), the membership they have in that specific truth table row, and their membership in the outcome. The cases are sorted by truth table row and ranked according to their appropriateness using formula values Sd. For example, we can notice that truth ...

# Get deviant cases coverage for the parsimonious solution:
mmr(results = sol_yp, outcome = "EXPORT", neg.out = FALSE, sol = 1, match = FALSE, cases = 4)

Identifying best-matching pairs of cases for comparative process tracing
The literature identifies four feasible within-case comparisons after a QCA, each between two types of cases (Schneider and Rohlfing, manuscript). With each comparison a different analytic goal is pursued. Figure 5 summarizes these goals. The two comparisons 'along the main diagonal' pursue a causal inference goal, whereas the two 'vertical' comparisons aim at improving the QCA model specification by identifying either an INUS condition missing from a known sufficient term or an entire new sufficient term missing from the solution formula.

Matching typical and individually irrelevant (IIR) cases
The purpose of the within-case comparison between a typical case and an IIR case is to empirically investigate whether a sufficient term is a difference-maker, i.e. causal, not only for the outcome (Y) at the cross-case level, but also for the mechanism M at the within-case level. Similarly, the within-case comparison of two typical cases empirically probes whether the same mechanism M links the sufficient term S to outcome Y in typical cases that are as different from each other as possible. For both forms of comparison, it holds that if S is a conjunction, each of its conjuncts C must be a difference-maker. Hence, the comparisons between a typical case and an IIR case (or another typical case) must be performed for each single conjunct C at a time. The following sections provide more details for each form of comparison and spell out the sorting mechanisms and mathematical formulas that underlie the respective functions in SetMethods.
Function mmr() first sorts each pair of typical and IIR cases into ranks 1-8 as defined in Table 2. Pairs in smaller rank numbers are more adequate for the analytic goal of the comparative within-case analysis of these two case types. For case pairs in rank 1, for example, it holds that the difference-making quality can be attributed to the focal conjunct FC both in the typical and in the IIR case, and that it is determinate.
Within each rank, Formula 4 implements the following criteria: between both cases, the difference in FC and in Y, respectively, should be small; both should have high membership in CC; and both should be close to the diagonal. Within each rank, case pairs with smaller formula values are more appropriate. Additionally, typical cases should be uniquely covered by the sufficient term under investigation, while IIR cases should be globally uncovered.
For switching to comparative MMR and identifying pairs of cases, argument match must be set to TRUE. Additionally, for getting the best available pairs of typical and IIR cases, we set cases to 2. In the output for the first sufficient term (notice that argument term is set to 1), we get the best available pairs for each focal conjunct in turn as separate tables. The output lists the names of the typical and the IIR case, their value on the above formula (Distance), the rank the pair comes from, whether the typical case is uniquely covered, and whether the IIR case is globally uncovered. Researchers should strive to pick cases that are uniquely covered and globally uncovered, have the smallest rank possible, and have the smallest Distance value. For example, for focal conjunct OCCUP, typical case Denmark_99 and IIR case New Zealand_90 are the best pair available: they are in Rank 1, they have the smallest formula value (Distance=1.24), the typical case is uniquely covered by the term, and the IIR case is globally uncovered by the solution.
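The reading order recommended for the pair output (coverage status first, then rank, then Distance) amounts to a lexicographic sort. The base R sketch below applies it to a made-up table of pairs; the column names, case names, and values are hypothetical and only mimic the kind of information the output contains.

```r
# Made-up candidate pairs of typical and IIR cases
pairs <- data.frame(
  typical = c("Case_A", "Case_A", "Case_B", "Case_C"),
  iir = c("Case_X", "Case_Y", "Case_X", "Case_Y"),
  distance = c(1.9, 1.3, 1.1, 1.2),
  rank = c(1, 1, 2, 1),
  uniquely_cov = c(TRUE, TRUE, TRUE, FALSE),
  glob_uncov = c(TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Preferred pairs: typical case uniquely covered AND IIR case globally uncovered
preferred <- pairs$uniquely_cov & pairs$glob_uncov

# Sort: preferred pairs first, then smaller rank, then smaller distance
sorted_pairs <- pairs[order(!preferred, pairs$rank, pairs$distance), ]
best_pair <- sorted_pairs[1, ]
print(best_pair)
```

The pair with the smallest distance overall (Case_B and Case_X) is not selected, because it sits in a higher rank than the chosen pair.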

Matching two typical cases
The matching of two typical cases follows a logic similar to the one between a typical and an IIR case. The goal is to probe the difference-making properties of each conjunct (FC) in sufficient term S to mechanism M. Table 3 defines the four ranks that can occur based on two typical cases' membership in FC and the complementary conditions CC. After sorting each possible pair of typical cases into one of these ranks, Formula 5 further ranks those pairs such that their difference in FC and the outcome, respectively, is minimized; that their membership in CC is maximized; and that both are close to the diagonal (test severity principle). Additionally, the two typical cases should be uniquely covered by the sufficient term.
[Formula 5; the surviving term labels read "large difference in outcome" and "small difference in complementary conditions"]
For getting the best available pairs of two typical cases, argument cases in function mmr() must be set to 1. The output is similar to the one for typical-IIR pairs of cases: the best available pair of typical cases for each focal conjunct is situated in as low a rank as possible, has the smallest formula value (Distance), and consists of two uniquely covered cases. Looking at the first term of the parsimonious solution, we can see that pair Switzerland_03-Denmark_03 is the best available for focal conjunct emp, while pair Switzerland_99-Denmark_99 is the best available for focal conjunct OCCUP. Both pairs contain uniquely covered typical cases, are in Rank 1, and have the smallest Distance value for their respective focal conjunct.

Matching typical and deviant cases consistency
The comparative within-case analysis of a typical and a deviant case consistency aims at identifying the INUS condition missing from the sufficient term S in question. The best available pair of cases meets the following criteria: their membership in S should be as high and as similar as possible, and their membership in Y should be as different as possible. Formula 6 translates these matching criteria into practice.
Setting cases to 3, we get the best available pair of typical and deviant consistency cases for each sufficient term in the parsimonious solution sol_yp. For identifying a missing INUS condition in sufficient term emp * OCCUP, the best available pair of cases for process tracing is the one between typical case Switzerland_03 and deviant consistency case Australia_90, as they have the smallest formula value (Distance=0.84).

Matching deviant cases coverage and IIR cases
The comparative within-case analysis of a deviant case coverage and an IIR case aims at identifying the sufficient conjunction S⁺ missing from the sufficient solution formula generated with QCA. The point of reference for matching cases is their membership in the truth table row TT to which they belong. Analogous to the within-case comparison of a typical case and a deviant case consistency, the goal is to maximize both cases' membership and their similarity in TT and their difference in Y.
Cases for this fourth type of comparison can be identified by setting cases to 4. Since for deviant coverage and IIR cases we are interested in identifying an entire missing sufficient term, the output for these pairs is focused on matching pairs in truth table rows, rather than in sufficient terms. Therefore, the output is sorted by truth table rows (the columns starting with TT show the combination of conditions), and for each truth table row we can identify a best matching pair of cases according to the formula values in column distance. For example, if we focus on truth table row EMP * BARGAIN * uni * OCCUP * stock * ma (rows 6, 7, 8, 9, 10 in the output), the deviant case coverage France_95 and the IIR case Finland_90 constitute the best matching pair, having the smallest formula value for this specific truth table row (distance=1.43).

Enhanced standard analysis (ESA)
Limited empirical diversity is an omnipresent feature in social science data. The treatment of logical remainder rows has been a major theme since Ragin's path-breaking book (Ragin, 1987, esp. chapter 7). In Ragin (2008, chapters 8 and 9), three approaches towards remainders are proposed under the label of the Standard Analysis (SA). Researchers can decide not to include them in the logical minimization (yielding the conservative or complex solution, CS), to include all remainders that are simplifying (yielding the most parsimonious solution, PS), or to include only those simplifying assumptions that are easy based on so-called directional expectations (yielding the intermediate solution, IS). Schneider and Wagemann (2012, chapter 8) propose the Enhanced Standard Analysis (ESA), which argues that simplifying assumptions on specific remainders can be untenable. There are three sources of untenability: incoherent counterfactuals, which are either logical remainders contradicting claims of necessity⁶ or assumptions made for the negated outcome⁷, and implausible counterfactuals, which consist of claims about impossible remainders⁸. ESA simply stipulates that no QCA solution formula can be based on untenable assumptions.
Figure 6 provides a graphical representation of the different types of assumptions as defined by SA and ESA. Both approaches only allow for simplifying assumptions⁹ (i.e. those in the inner circle) and both distinguish between difficult and easy counterfactuals (i.e. the vertical line inside the circle)¹⁰. ESA, but not SA, blocks any untenable assumption (i.e. the gray area in the lower part). There is a risk of making untenable assumptions whenever a researcher claims the presence of a necessary condition, when statements of sufficiency for both the outcome and its negation are made, and/or when two or more conditions with mutually exclusive categories are used in a truth table. ESA requires that researchers identify those logical remainder rows whose inclusion in the logical minimization would amount to an untenable claim. As a result, one obtains the enhanced PS and the enhanced IS.¹¹
Function esa() provides a straightforward tool for avoiding untenable assumptions and thus putting ESA into practice. First, function esa() can exclude remainders that contradict single necessary conditions, unions of necessary conditions, or more complicated expressions of necessity. For example, assuming that the disjunction STOCK + MA is necessary for the outcome EXPORT, we ban all remainder rows implied by this necessity claim in the nec_cond argument. All the logical remainder rows that are subsets of ¬STOCK¬MA are subsequently set to OUT = 0 in the truth table object newtt and thus excluded from further logical minimization.
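The exclusion of remainders that contradict a necessity claim can be traced on a toy truth table. The base R sketch below is not the implementation of esa(): the three-condition table is made up, and coding remainder rows as "?" in column OUT is an assumption of this illustration.

```r
# Toy truth table with three conditions; "?" marks logical remainders
# (this coding is an assumption of the sketch)
tt <- data.frame(
  STOCK = c(0, 0, 1, 1),
  MA    = c(0, 0, 0, 1),
  UNI   = c(0, 1, 0, 1),
  OUT   = c("?", "1", "?", "0"),
  stringsAsFactors = FALSE
)

# If STOCK + MA is necessary for the outcome, any row in which both STOCK
# and MA are absent contradicts the necessity claim
violates_necessity <- tt$STOCK == 0 & tt$MA == 0
is_remainder <- tt$OUT == "?"

# Ban only the remainder rows: set them to OUT = 0 so that they cannot be
# used as simplifying assumptions in the minimization
tt$OUT[violates_necessity & is_remainder] <- "0"
print(tt)
```

The empirically observed row with STOCK = 0 and MA = 0 keeps its outcome value, and the remainder in the third row stays available because it does not contradict the necessity claim.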
# Ban logical remainders contradicting necessity:
# Let's assume that "STOCK + MA" is necessary for "EXPORT":
newtt <- esa(oldtt = TT_y, nec_cond = "STOCK + MA")

Secondly, the esa() function can also ban implausible counterfactuals to produce truth tables in which specific logical remainders identified through conjunctions are excluded. For example, we can ban all remainder rows that have BARGAIN * ∼OCCUP by using the Boolean expression in argument untenable_LR. Finally, the function can exclude contradictory simplifying assumptions (which are another form of untenable assumptions) and empirically observed rows that are part of simultaneous subset relations¹² by just using the unique truth table row identifier in the argument contrad_rows. While argument untenable_LR accepts Boolean expressions for excluding only logical remainders, argument contrad_rows can exclude both empirically observed rows and remainder rows through their unique identifier (row number).¹³

# Ban impossible logical remainders:
newtt <- esa(oldtt = TT_y, untenable_LR = "BARGAIN*~OCCUP")
# Ban contradictory rows:
newtt <- esa(oldtt = TT_y, contrad_rows = c("19", "14", "46", "51"))

Set-analytic theory evaluation
Ragin (1987, chapter 7) spells out the notion of theory evaluation. In essence, it consists of identifying the overlap between a researcher's theory (T), as formulated prior to the empirical analysis, and the empirical results (S) obtained via QCA. With both T and S being represented in the form of Boolean expressions, all four logically possible combinations between T and S can be expressed in Boolean terms as well, and each case's membership in each of these four expressions can be calculated. Theory evaluation reveals which aspects of T are empirically corroborated by S and which ones are not. It also reveals how strong this empirical support is. Last but not least, theory evaluation can serve as a case selection device by identifying cases that display membership scores in the empirical solution and the outcome that are expected or utterly unexpected based on T.
Schneider and Wagemann (2012, chapter 11) refine Theory Evaluation by taking into account each case's membership score not only in T and S, but also in outcome Y. This is necessary because, with the development of parameters of fit, it has become widespread practice to allow for solution formulas with less than perfect consistency and coverage scores. This means that in applied QCA, there are cases of S&¬Y and of ¬S&Y. This gives rise to eight different areas. Each area can be defined as a Boolean expression, provides different analytic information, and defines different types of cases. Each case has partial membership in all areas, but a membership higher than 0.5 in only one. Figure 7 provides a visualization of the areas and the kinds of cases in each area.¹⁴
Function theory.evaluation() performs the theory evaluation procedure between a theory specified in Boolean terms and results obtained using the QCA package. Assuming that the theory can be summarized as EMP*∼MA + STOCK, the example below shows how theory evaluation works using the second intermediate solution for outcome EXPORT. The first part of the output shows the names and proportion of cases in each of the intersections between theory and the empirical solution. The second part of the output shows parameters of fit for the solution, the theory, and their intersections, which indicate how much each of these areas is in line with a statement of sufficiency for EXPORT. Additionally, the function also stores the membership of each case in each intersection between theory and empirics, which can be accessed by setting argument print.data to TRUE.
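The eight areas that arise from intersecting theory T, solution S, and outcome Y can be computed directly with fuzzy-set operations (minimum for intersection, 1 - x for negation). The base R sketch below uses made-up membership scores and hypothetical case names; it illustrates the logic described above and does not reproduce the output of theory.evaluation().

```r
# Made-up fuzzy membership scores in theory (T), solution (S), outcome (Y)
d <- data.frame(
  case = c("Case_A", "Case_B", "Case_C", "Case_D"),
  theory   = c(0.8, 0.7, 0.2, 0.1),
  solution = c(0.9, 0.3, 0.6, 0.2),
  outcome  = c(0.7, 0.8, 0.4, 0.1),
  stringsAsFactors = FALSE
)
th <- d$theory; so <- d$solution; ou <- d$outcome

# Membership in each of the eight areas: fuzzy AND is pmin(), NOT is 1 - x
areas <- cbind(
  "T*S*Y"    = pmin(th, so, ou),
  "T*S*~Y"   = pmin(th, so, 1 - ou),
  "T*~S*Y"   = pmin(th, 1 - so, ou),
  "T*~S*~Y"  = pmin(th, 1 - so, 1 - ou),
  "~T*S*Y"   = pmin(1 - th, so, ou),
  "~T*S*~Y"  = pmin(1 - th, so, 1 - ou),
  "~T*~S*Y"  = pmin(1 - th, 1 - so, ou),
  "~T*~S*~Y" = pmin(1 - th, 1 - so, 1 - ou)
)
rownames(areas) <- d$case

# The area in which each case has its highest membership
d$area <- colnames(areas)[max.col(areas, ties.method = "first")]
print(round(areas, 2))
print(d[, c("case", "area")])
```

As long as no score equals exactly 0.5, each case exceeds 0.5 in exactly one of the eight columns, which is what allows cases to be assigned to a single area.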

Diagnostic tools for clustered data structures
Most of the data analyzed in the social sciences and neighboring disciplines contains structures, or layers, that might be analytically relevant but are not captured by the models used to analyze that data.García-Castro and Arino (2016) discuss clustering along a temporal dimension.This can be days, years, decades, or substantively important periods (before -after a crisis).Clusters can also be of different origin.For instance, cases can be clustered along geographic units, such as world regions or subnational units.There can also be clusterings along substantive lines, such as e.g. economic sectors, parties, political regime types.
Whenever a researcher is not capturing these differences via a condition in her QCA model, she de facto assumes that the analytic difference does not matter.There are often good reasons to not include additional conditions into an analysis, with keeping limited diversity at bay being one of them.It should, however, been put to an empirical test whether it is ok to pool cases across different time periods, geographic units, and/or substantive areas.
Function cluster() provides the tools for performing such a test.It analyzes whether the QCA solution formula obtained from the pooled data also can be found in each of the sub-populations in the data.If it can, then pooling the data is fine.If it cannot, then pooling the data is not fine because it produces a solution that does not hold for all sub-populations.Rather, it is an artifact of having pooled cases that follow different causal logics.In this case, researchers might decide to drop from their analysis those sub-populations that do not follow the general pattern or to include a condition The R Journal Vol.10/1, July 2018 ISSN 2073-4859 into their model that captures the difference.
To use the cluster() function, researchers need data in the long format, with one column identifying the unit of analysis and one column identifying the clustering element. In our example, using the Schneider et al. (2010) data, the column identifying the units is the COUNTRY column, while the cluster element is stored in the YEAR column. Once the data are in the long format, we can obtain the diagnostic of how our solution holds across the different units and clusters by passing the solution (in this case sol_yi) to the cluster() function, while also specifying the data, the outcome, the unit identifier, and the cluster identifier.
The first part of the output shows the consistency of sufficiency in the overall, pooled sample for each sufficient term and for the entire solution to be diagnosed. This first row of the output should be equivalent to the consistency measures obtained when producing the solution. The rows below show consistency values of the same terms and solution, but for each cluster and each unit separately. These are the values we would obtain if the analysis were to be run for each cluster subsample and each unit subsample separately. For example, the pooled consistency of sufficient term emp*UNI*OCCUP*ma is 0.919, but only 0.733 for the year 1990. This might indicate to the researcher that her sufficiency statement is less consistent, and might not work in the same way, for the cluster of cases from 1990. In general, if consistency values between clusters differ greatly from the pooled consistency for a term, we might want to re-examine the setup of the analysis to account for this. The Distances: section of the output reports how much the parameters of fit differ between the clusters and the pooled data. Finally, the last part of the output displays a similar table for coverage, with pooled, between-cluster, and within-unit measures.
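What the pooled, between-cluster, and within-unit consistency values mean can be illustrated with a short base R sketch. It is not the implementation used by cluster(): the data are invented, the column TERM stands for the membership of each case in some sufficient term, and the helper cons_suf() is an assumption introduced only for this illustration.

```r
# Toy long-format data (invented values): one row per COUNTRY-YEAR case
dat <- data.frame(
  COUNTRY = rep(c("A", "B", "C"), times = 2),
  YEAR    = rep(c(1990, 2003), each = 3),
  TERM    = c(0.8, 0.6, 0.9, 0.7, 0.2, 0.9),  # membership in a sufficient term
  EXPORT  = c(0.4, 0.7, 0.6, 0.9, 0.3, 0.8)   # membership in the outcome
)

# Consistency of sufficiency: sum(min(x, y)) / sum(x)
cons_suf <- function(x, y) sum(pmin(x, y)) / sum(x)

# Pooled consistency over all cases
pooled <- cons_suf(dat$TERM, dat$EXPORT)

# Between consistencies: one value per cluster (here, per YEAR)
between <- sapply(split(dat, dat$YEAR),
                  function(d) cons_suf(d$TERM, d$EXPORT))

# Within consistencies: one value per unit (here, per COUNTRY)
within_units <- sapply(split(dat, dat$COUNTRY),
                       function(d) cons_suf(d$TERM, d$EXPORT))

print(round(pooled, 3))
print(round(between, 3))
print(round(within_units, 3))
```

In this toy example the 1990 cluster is visibly less consistent than the pooled data, which is the kind of discrepancy the diagnostic is meant to reveal.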

QCAradar
Another function included in the package is QCAradar(), which allows the visualization of QCA results or simple Boolean expressions in the form of a radar chart.16 In its argument results, the function accepts sufficient solutions obtained through the function minimize() in package QCA, or Boolean expressions involving more than three conditions, as in the second example below.
# Display radar chart for the second intermediate solution:
QCAradar(results = sol_yi, outcome = "EXPORT", fit = TRUE, sol = 2)
Figure 8a shows a radar chart for the second intermediate solution formula. The different sufficient terms overlap on the radar in different shades. For example, for the first term, emp*bargain*OCCUP, condition EMP is absent and therefore set to 0 at its corner, condition BARGAIN is likewise absent and set to 0, and condition OCCUP is present and set to 1. Since the remaining conditions are not specified in this term, they are all left at -.
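How a single sufficient term translates into positions on the radar can be sketched in base R. The helper term_to_levels() is hypothetical and not part of SetMethods, and the list of six conditions is an assumption assembled from the condition names mentioned in this article. The sketch follows the notation used above: upper case marks a present condition (1), lower case an absent one (0), and unspecified conditions stay at -.

```r
# Conditions assumed for illustration (taken from names used in the text)
conditions <- c("EMP", "BARGAIN", "UNI", "OCCUP", "STOCK", "MA")

# Hypothetical helper: map a conjunction such as "emp*bargain*OCCUP"
# to one radar level per condition ("1", "0", or "-")
term_to_levels <- function(term, conditions) {
  literals <- trimws(strsplit(term, "*", fixed = TRUE)[[1]])
  lv <- setNames(rep("-", length(conditions)), conditions)
  for (lit in literals) {
    cond <- toupper(lit)
    if (!cond %in% conditions) stop("unknown condition: ", lit)
    # Upper case = condition present, lower case = condition absent
    lv[cond] <- if (lit == cond) "1" else "0"
  }
  lv
}

radar_levels <- term_to_levels("emp*bargain*OCCUP", conditions)
print(radar_levels)
```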

Indirect calibration
SetMethods also includes a function for performing the indirect calibration procedure described by Ragin (2008).17 This procedure assumes that the cases included in the analysis have interval-scale raw scores which can initially be sorted broadly into different levels of fuzzy-set membership. Subsequently, the raw scores are transformed into calibrated scores using a binomial or a beta regression. Assuming that vector x contains the initial raw scores, while vector x_cal contains the rough grouping of those values into set membership scores, function indirectCalibration() can produce a vector of fuzzy-set scores by fitting x to x_cal, using a binomial regression if binom is set to TRUE.
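The idea behind the procedure can be sketched with base R alone. This is a minimal illustration, not necessarily what indirectCalibration() does internally: the raw scores and the rough grouping are invented, and a fractional logit fitted with glm() and the quasibinomial family stands in for the binomial regression mentioned above.

```r
# Invented raw scores and their rough, researcher-assigned membership levels
x     <- c(2, 5, 9, 14, 20, 27, 35, 44, 54, 65)
x_cal <- c(0, 0, 0.2, 0.2, 0.4, 0.6, 0.6, 0.8, 1, 1)

# Fractional logit: regress the rough membership levels on the raw scores.
# quasibinomial() accepts non-integer responses in [0, 1] without warnings.
fit <- glm(x_cal ~ x, family = quasibinomial(link = "logit"))

# The fitted values are the calibrated fuzzy-set scores
fs <- unname(predict(fit, type = "response"))

print(round(data.frame(x, x_cal, fs), 2))
```

The fitted scores smooth the stepwise rough grouping into a continuous membership function that increases with the raw scores.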

Conclusions
In this article, we have presented the main functionalities of the R package SetMethods. It is true that starting to perform QCA in R is more onerous than starting with point-and-click software. Yet, the flexibility offered by R is also its strength, especially for a young method like QCA. As set methods continue to develop, software implementations need to be updated and improved at a fast rate. Package SetMethods is designed to do precisely this: to provide a tool for implementing new ideas that enhance set-theoretic analyses for applied researchers.

Figure 1: Types of Post-QCA Case Studies

Figure 2: Types of Cases, XY Plot

Figure 3: Two visualizations of test severity

Figure 6: Graphical Representation of Different Types of Assumptions on Remainders

Table 1: Types of cases in fsQCA of sufficiency
UNI*occup*STOCK*MA (rows 9, 10, 4, and 5 of the output) is populated by 4 deviant coverage cases, of which UK_90 is the best available for within-case analysis, having the smallest formula value (Sd=0.51).
table row emp * bargain *
Even if not useful for single case studies, identifying IIR cases is informative, as their list, together with the deviant cases coverage, indicates the diversity among cases without the outcome. The more different truth table rows are populated by IIR cases (and deviant cases coverage), the more heterogeneous this group of cases is. Function mmr() lists all individually irrelevant cases with respect to the entire solution formula (also called globally uncovered IIR cases) and sorts each of them into the truth table row to which it belongs best. Since these cases are not informative for single case studies and are used only to indicate diversity among the cases without the outcome, the function does not involve a formula ranking of IIR cases.
Individually irrelevant (IIR) cases owe their name to the fact that a single within-case analysis of this type of case is not useful. IIR cases do, however, play a crucial role in two forms of comparative within-case analysis (see Section 2.3.2).

Table 2: Possible membership constellations between focal (FC) and complementary conjuncts (CC) in comparison of typical and IIR case

Table 3: Possible membership constellations between focal (FC) and complementary conjuncts (CC) in comparison of two typical cases
data in the long format: