Two-Stage Sampling Design and Sample Selection with the R Package R2BEAT

R2BEAT (“R ‘to’ Bethel Extended Allocation for Two-stage sampling”) is an R package for the optimal allocation of a sample. Its distinctive feature is that it properly addresses allocation problems for two-stage and complex sampling designs with multi-domain and multi-purpose aims. Such designs are common in many official and non-official statistical surveys, so R2BEAT could become an essential tool for planning a sample survey. The functions implemented in R2BEAT support different workflows, depending on the information available on one or more variables of interest. The package covers all phases, from the optimization of the sample to the selection of the Primary and Secondary Stage Units. It also provides several outputs for evaluating the allocation results.

Giulio Barcaroli (Independent consultant), Andrea Fasulo (Italian National Institute of Statistics), Alessio Guandalini (Italian National Institute of Statistics), Marco D. Terribili (Italian National Institute of Statistics)
2023-12-18

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2023-069.zip

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Barcaroli, et al., "Two-Stage Sampling Design and Sample Selection with the R Package R2BEAT", The R Journal, 2023

BibTeX citation

@article{RJ-2023-069,
  author = {Barcaroli, Giulio and Fasulo, Andrea and Guandalini, Alessio and Terribili, Marco D.},
  title = {Two-Stage Sampling Design and Sample Selection with the R Package R2BEAT},
  journal = {The R Journal},
  year = {2023},
  note = {https://doi.org/10.32614/RJ-2023-069},
  doi = {10.32614/RJ-2023-069},
  volume = {15},
  issue = {3},
  issn = {2073-4859},
  pages = {191-213}
}