Tidy Data Neatly Resolves Mass-Spectrometry’s Ragged Arrays

Mass spectrometry (MS) is a powerful tool for measuring biomolecules, but the data it produces are often difficult to handle computationally because they are stored as a ragged array. In R, this format is typically encoded in complex S4 objects built around environments, so even simple tasks require an extensive background in R. The adoption of tidy data (Wickham 2014) provides an alternative data structure that is highly intuitive and works neatly with base R functions and common packages, as well as with other programming languages. Here, we discuss the current state of R-based MS data processing, weigh the convenience and challenges of integrating tidy data techniques into that processing, and present RaMS, a package that produces tidy representations of MS data.
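The contrast between a ragged array and a tidy table can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the scan values are invented, the column names `rt`, `mz`, and `int` are chosen for the example, and the code does not call RaMS or reproduce its internals.

```r
# Hypothetical toy data: three scans, each holding a different number of peaks.
# All values are invented for illustration and do not come from a real instrument.
scans <- list(
  list(rt = 1.0, mz = c(100.05, 118.09),         int = c(2000, 15000)),
  list(rt = 1.5, mz = c(100.05, 118.09, 136.10), int = c(2200, 14000, 800)),
  list(rt = 2.0, mz = c(118.09),                 int = c(12000))
)

# The list above is "ragged": its elements have unequal lengths, so it cannot
# be stored directly as a rectangular matrix.

# Tidy form: one row per observation (retention time, m/z, intensity).
# The scalar rt is recycled to match the number of peaks in each scan.
tidy_ms <- do.call(rbind, lapply(scans, function(scan) {
  data.frame(rt = scan$rt, mz = scan$mz, int = scan$int)
}))

# With a flat table, extracting a chromatogram for one m/z value
# (here 118.09, within an assumed tolerance of 0.01) is ordinary subsetting.
chrom <- tidy_ms[abs(tidy_ms$mz - 118.09) < 0.01, ]
```

Once the data are in this shape, the same table can be passed to base R plotting, common data-manipulation packages, or written to a flat file for use in another language.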

William Kumler (University of Washington School of Oceanography), Anitra E. Ingalls (University of Washington School of Oceanography)

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2022-050.zip

H. Wickham. Tidy data. Journal of Statistical Software, 59(10): 1–23, 2014. URL https://doi.org/10.18637/jss.v059.i10.



