A Computational Analysis of the Dynamics of R Style Based on 108 Million Lines of Code from All CRAN Packages in the Past 21 Years

The flexibility of R and the diversity of the R community lead to a large number of programming styles in R packages. We analyzed 108 million lines of R code from CRAN and quantified the evolution in popularity of 12 style elements from 1998 to 2019. We attribute changes in programming style to three main factors: the effect of style guides, the effect of introducing new features, and the effect of editors. The data show that a consensus in programming style is forming, such as using lower snake case for function names (e.g. softplus_func) and <- rather than = for assignment.

Chia-Yi Yen (Mannheim Business School, Universität Mannheim), Mia Huai-Wen Chang (Amazon Web Services), Chung-hong Chan (Mannheimer Zentrum für Europäische Sozialforschung, Universität Mannheim)
2022-06-21

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2022-006.zip

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Yen, et al., "A Computational Analysis of the Dynamics of R Style Based on 108 Million Lines of Code from All CRAN Packages in the Past 21 Years", The R Journal, 2022

BibTeX citation

@article{RJ-2022-006,
  author = {Yen, Chia-Yi and Chang, Mia Huai-Wen and Chan, Chung-hong},
  title = {A Computational Analysis of the Dynamics of R Style Based on 108 Million Lines of Code from All CRAN Packages in the Past 21 Years},
  journal = {The R Journal},
  year = {2022},
  note = {https://doi.org/10.32614/RJ-2022-006},
  doi = {10.32614/RJ-2022-006},
  volume = {14},
  issue = {1},
  issn = {2073-4859},
  pages = {6-21}
}