Matching with Clustered Data: the CMatching Package in R

Matching is a well-known technique for balancing covariate distributions between treated and control units in non-experimental studies. In many fields, observational data are clustered, and the clustering can carry potentially useful information. Matching algorithms should therefore be adapted to properly exploit the hierarchical structure. In this article we present the CMatching package, which implements matching algorithms for clustered data. The package provides functions for obtaining a matched dataset along with estimates of the most common parameters of interest and model-based standard errors. A propensity score matching analysis relating math proficiency to homework completion for students belonging to different schools (based on the NELS-88 data) illustrates the use of the algorithms in detail.

Massimo Cannas , Bruno Arpino

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at

CRAN packages used

CMatching, Matching, designmatch, optmatch, MatchIt, quickmatch, multiwayvcov

CRAN Task Views implied by cited packages

SocialSciences, Econometrics, ExperimentalDesign, HighPerformanceComputing, Optimization


Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".


For attribution, please cite this work as

Cannas & Arpino, "The R Journal: Matching with Clustered Data: the CMatching Package in R", The R Journal, 2019

BibTeX citation

  author = {Cannas, Massimo and Arpino, Bruno},
  title = {The R Journal: Matching with Clustered Data: the CMatching Package in R},
  journal = {The R Journal},
  year = {2019},
  note = {},
  doi = {10.32614/RJ-2019-018},
  volume = {11},
  issue = {1},
  issn = {2073-4859},
  pages = {7-21}