The R Journal: accepted article

This article will be copy edited and may be changed before publication.

Wide-to-tall Data Reshaping Using Regular Expressions and the nc Package
Toby Dylan Hocking

Abstract Regular expressions are powerful tools for extracting tables from non-tabular text data. Capturing regular expressions that describe information to extract from column names can be especially useful when reshaping a data table from wide (few rows with many regularly named columns) to tall (fewer columns with more rows). We present the R package nc (short for named capture), which provides functions for wide-to-tall data reshaping using regular expressions. We describe the main new ideas of nc, and provide detailed comparisons with related R packages (stats, utils, data.table, tidyr, tidyfast, tidyfst, reshape2, cdata).
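
As a rough illustration of the idea in the abstract, the sketch below reshapes a small wide table to tall form using capture groups applied to the column names. It uses only base R and does not call nc, so it should not be read as the package's interface or implementation. The table `wide`, the regular expression `pattern`, and the output column names `part`, `dim`, and `value` are all assumptions made for this example.

```r
# Hypothetical wide table: one column per (part, dim) combination.
wide <- data.frame(
  id = 1:2,
  Sepal.Length = c(5.1, 4.9),
  Sepal.Width = c(3.5, 3.0),
  Petal.Length = c(1.4, 1.4),
  Petal.Width = c(0.2, 0.2)
)

# Two capture groups separated by a literal dot: part, then dim.
pattern <- "^(.*)[.](.*)$"

# Only columns whose names match the pattern are reshaped; id has no dot.
measure.cols <- grep(pattern, names(wide), value = TRUE)

# Stack one small table per matching column, storing the captured
# pieces of the column name as new columns.
tall <- do.call(rbind, lapply(measure.cols, function(col.name) {
  data.frame(
    id = wide$id,
    part = sub(pattern, "\\1", col.name),
    dim = sub(pattern, "\\2", col.name),
    value = wide[[col.name]],
    stringsAsFactors = FALSE
  )
}))

print(tall)
```

Here the two rows and four measurement columns of `wide` become eight rows in `tall`, one per combination of `id`, `part`, and `dim`.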

Received: 2020-04-30; online 2021-06-07
CRAN packages: ggplot2, nc, namedCapture, rematch2, rex, stringr, stringi, tidyr, re2r, reshape2, tidyfast, tidyfst, cdata, microbenchmark
CRAN Task Views implied by cited CRAN packages: Graphics, NaturalLanguageProcessing, Phylogenetics, TeachingStatistics

CC BY 4.0
This article is licensed under a Creative Commons Attribution 4.0 International license.

  author = {Toby Dylan Hocking},
  title = {{Wide-to-tall Data Reshaping Using Regular Expressions and
          the nc Package}},
  year = {2021},
  journal = {{The R Journal}},
  doi = {10.32614/RJ-2021-029},
  url = {}