Working with CRSP/COMPUSTAT in R: Reproducible Empirical Asset Pricing

SAS and Stata manuals are common in academic empirical finance research. Yet, despite the popularity of open-source programming languages such as R, there are fewer resources covering popular databases such as CRSP and COMPUSTAT in R. This article aims to bridge that gap and illustrates how to leverage R in working with both datasets. As an application, we form size-value portfolios following Fama and French (1993) and study the sensitivity of the results to different inputs. Ultimately, the article advocates reproducible finance research and contributes to the recent idea of “Open Source Cross-Sectional Asset Pricing” proposed by Chen and Zimmermann (2020).

Majeed Simaan (School of Business, Stevens Institute of Technology)
2021-06-08

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2021-047.zip

References

A. Y. Chen and T. Zimmermann. Open source cross-sectional asset pricing. Available at SSRN, 2020.
E. F. Fama and K. R. French. Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 1993.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Simaan, "Working with CRSP/COMPUSTAT in R: Reproducible Empirical Asset Pricing", The R Journal, 2021

BibTeX citation

@article{RJ-2021-047,
  author = {Simaan, Majeed},
  title = {Working with CRSP/COMPUSTAT in R: Reproducible Empirical Asset Pricing},
  journal = {The R Journal},
  year = {2021},
  note = {https://doi.org/10.32614/RJ-2021-047},
  doi = {10.32614/RJ-2021-047},
  volume = {13},
  issue = {1},
  issn = {2073-4859},
  pages = {426-443}
}