Identifying Counterfactual Queries with the R Package cfid

In the framework of structural causal models, counterfactual queries describe events that concern multiple alternative states of the system under study. Counterfactual queries often take the form of “what if” type questions such as “would an applicant have been hired if they had over 10 years of experience, when in reality they only had 5 years of experience?” Such questions and counterfactual inference in general are crucial, for example when addressing the problem of fairness in decision-making. Because counterfactual events contain contradictory states of the world, it is impossible to conduct a randomized experiment to address them without making several restrictive assumptions. However, it is sometimes possible to identify such queries from observational and experimental data by representing the system under study as a causal model, and the available data as symbolic probability distributions. Shpitser and Pearl (2007) constructed two algorithms, called ID* and IDC*, for identifying counterfactual queries and conditional counterfactual queries, respectively. These two algorithms are analogous to the ID and IDC algorithms by Shpitser and Pearl (2006b,a) for identification of interventional distributions, which were implemented in R by Tikka and Karvanen (2017) in the causaleffect package. We present the R package cfid that implements the ID* and IDC* algorithms. Identification of counterfactual queries and the features of cfid are demonstrated via examples.

Santtu Tikka (Department of Mathematics and Statistics, University of Jyväskylä)
2023-11-01

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2023-053.zip

References

C. Avin, I. Shpitser and J. Pearl. Identifiability of path-specific effects. In Proceedings of international joint conference on artificial intelligence, pages 357–363, 2005.
A. Balke and J. Pearl. Counterfactual probabilities: Computational methods, bounds and applications. In Proceedings of the 10th conference on uncertainty in artificial intelligence, pages 46–54, 1994a.
A. Balke and J. Pearl. Probabilistic evaluation of counterfactual queries. In Proceedings of the 12th AAAI national conference on artificial intelligence, pages 230–237, 1994b.
E. Bareinboim and J. Pearl. Causal inference by surrogate experiments: \(z\)-identifiability. In Proceedings of the 28th conference on uncertainty in artificial intelligence, pages 113–120, 2012.
M. Chen, V. Chernozhukov, I. Fernandez-Val and B. Melly. Counterfactual: Estimation and inference methods for counterfactual analysis. 2020. URL https://CRAN.R-project.org/package=Counterfactual. R package version 1.2.
G. Csardi and T. Nepusz. The igraph software package for complex network research. InterJournal, Complex Systems: 1695, 2006. URL https://igraph.org.
J. Y. Halpern. Axiomatizing causal reasoning. In Proceedings of the 14th conference on uncertainty in artificial intelligence, pages 202–210, 1998.
P. W. Holland. Statistics and causal inference. Journal of the American Statistical Association, 81(396): 945–960, 1986. URL https://doi.org/10.1080/01621459.1986.10478354.
Y. Huang and M. Valtorta. Pearl’s calculus of intervention is complete. In Proceedings of the 22nd conference on uncertainty in artificial intelligence, pages 217–224, 2006. AUAI Press.
J. Karvanen. R6causal: R6 class for structural causal models. 2022. R package version 0.6.1.
Y. Kivva, E. Mokhtarian, J. Etesami and N. Kiyavash. Revisiting the general identifiability problem. In Proceedings of the 38th conference on uncertainty in artificial intelligence, pages 1022–1030, 2022. PMLR.
M. J. Kusner, J. Loftus, C. Russell and R. Silva. Counterfactual fairness. In Proceedings of the 31st international conference on neural information processing systems, pages 4069–4079, 2017.
S. Lee, J. D. Correa and E. Bareinboim. General identifiability with arbitrary surrogate experiments. In Proceedings of the 35th conference on uncertainty in artificial intelligence, pages 389–398, 2019. PMLR.
J. Pearl. Causal diagrams for empirical research. Biometrika, 82(4): 669–710, 1995. URL https://doi.org/10.1093/biomet/82.4.669.
J. Pearl. Causality: Models, reasoning and inference. 2nd ed. Cambridge University Press, 2009.
I. Shpitser and J. Pearl. Complete identification methods for the causal hierarchy. Journal of Machine Learning Research, 9(64): 1941–1979, 2008.
I. Shpitser and J. Pearl. Identification of conditional interventional distributions. In Proceedings of the 22nd conference on uncertainty in artificial intelligence, pages 437–444, 2006a. AUAI Press.
I. Shpitser and J. Pearl. Identification of joint interventional distributions in recursive semi-Markovian causal models. In Proceedings of the 21st national conference on artificial intelligence - volume 2, pages 1219–1226, 2006b. AAAI Press.
I. Shpitser and J. Pearl. What counterfactuals can be tested. In Proceedings of the 23rd conference on uncertainty in artificial intelligence, pages 352–359, 2007. AUAI Press.
H. Stoll, G. King, L. Zeng, C. Gandrud and B. Sabath. WhatIf: Software for evaluating counterfactuals. 2020. URL https://CRAN.R-project.org/package=WhatIf. R package version 1.5-10.
J. Textor, B. van der Zander, M. S. Gilthorpe, M. Liśkiewicz and G. T. Ellison. Robust causal inference using directed acyclic graphs: The R package dagitty. International Journal of Epidemiology, 45(6): 1887–1894, 2017. URL https://doi.org/10.1093/ije/dyw341.
J. Tian and J. Pearl. A general identification condition for causal effects. In Proceedings of the 19th AAAI national conference on artificial intelligence, pages 567–573, 2002.
S. Tikka, A. Hyttinen and J. Karvanen. Causal effect identification from multiple incomplete data sources: A general search-based approach. Journal of Statistical Software, 99(5): 1–40, 2021. URL https://doi.org/10.18637/jss.v099.i05.
S. Tikka and J. Karvanen. Identifying causal effects with the R package causaleffect. Journal of Statistical Software, 76(12): 1–30, 2017. URL https://doi.org/10.18637/jss.v076.i12.
S. Tikka and J. Karvanen. Surrogate outcomes and transportability. International Journal of Approximate Reasoning, 108: 21–37, 2019.
T. S. Verma and J. Pearl. Equivalence and synthesis of causal models. In Proceedings of the 6th conference on uncertainty in artificial intelligence, pages 255–270, 1990.
J. Zhang and E. Bareinboim. Fairness in decision-making — the causal explanation formula. In Proceedings of the 32nd AAAI conference on artificial intelligence, pages 2037–2045, 2018.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Tikka, "Identifying Counterfactual Queries with the R Package cfid", The R Journal, 2023

BibTeX citation

@article{RJ-2023-053,
  author = {Tikka, Santtu},
  title = {Identifying Counterfactual Queries with the R Package cfid},
  journal = {The R Journal},
  year = {2023},
  note = {https://doi.org/10.32614/RJ-2023-053},
  doi = {10.32614/RJ-2023-053},
  volume = {15},
  issue = {2},
  issn = {2073-4859},
  pages = {330-343}
}