The R Journal: article published in 2018, volume 10:1

Collections in R: Review and Proposal
Timothy Barry, The R Journal (2018) 10:1, pages 455-471.

Abstract: R is a powerful tool for data processing, visualization, and modeling. However, R is slower than other languages used for similar purposes, such as Python. One reason for this is that R lacks base support for collections, abstract data types that store, manipulate, and return data (e.g., sets, maps, stacks). An exciting recent trend in the R extension ecosystem is the development of collection packages, packages that provide classes that implement common collections. At least 12 collection packages are available across the two major R extension repositories, the Comprehensive R Archive Network (CRAN) and Bioconductor. In this article, we compare collection packages in terms of their features, design philosophy, ease of use, and performance on benchmark tests. We demonstrate that, when used well, the data structures provided by collection packages are in many cases significantly faster than the data structures provided by base R. We also highlight current deficiencies among R collection packages and propose avenues of possible improvement. This article provides useful recommendations to R programmers seeking to speed up their programs and aims to inform the development of future collection-oriented software for R.

Received: 2017-11-05; online 2018-06-13
CRAN packages: Rcpp, hashr, hashFunction, filehashSQLite, tictoc, DSL, bit64, bit, Oarray, sets, filehash, hash, hashmap, rstackdeque, rstack, liqueueR, dequer, flifo, listenv, stdvectors, microbenchmark, neuroim, FindMinIC
CRAN Task Views implied by cited CRAN packages: HighPerformanceComputing, MedicalImaging, NumericalMathematics
Bioconductor packages: S4Vectors

CC BY 4.0
This article is licensed under a Creative Commons Attribution 4.0 International license.

  author = {Timothy Barry},
  title = {{Collections in R: Review and Proposal}},
  year = {2018},
  journal = {{The R Journal}},
  url = {},
  pages = {455--471},
  volume = {10},
  number = {1}