The R Journal: accepted article

This article will be copy edited and may be changed before publication.

The Landscape of R Packages for Automated Exploratory Data Analysis
Mateusz Staniak and Przemysław Biecek

Abstract The increasing availability of large but noisy data sets with many heterogeneous variables has led to growing interest in automating common data analysis tasks. The most time-consuming part of this process is Exploratory Data Analysis, which is crucial for better domain understanding, data cleaning, data validation, and feature engineering. A growing number of libraries attempt to automate some of the typical Exploratory Data Analysis tasks to make the search for new insights easier and faster. In this paper, we present a systematic review of existing tools for Automated Exploratory Data Analysis (autoEDA). We explore the features of fifteen popular R packages to identify the parts of the analysis that can be effectively automated with current tools and to point out new directions for further autoEDA development.

Received: ; online 2019-08-17; supplementary material (1.6 Kb)
CRAN packages: cranlogs, radiant, visdat, archivist, xtable, arsenal, DataExplorer, dataMaid, dlookr, ExPanDaR, explore, shiny, exploreR, funModeling, inspectdf, RtutoR, SmartEDA, data.table, summarytools, knitr, ggplot2, xray, tableone, describer, skimr, prettyR, Hmisc, ggfortify, autoplotly, gpairs, GGally, survminer, cr17, DALEX, iml
CRAN Task Views implied by cited CRAN packages: ReproducibleResearch, MissingData, TeachingStatistics, WebTechnologies, Bayesian, ClinicalTrials, Econometrics, Finance, Graphics, HighPerformanceComputing, Multivariate, OfficialStatistics, Phylogenetics, SocialSciences, Survival


CC BY 4.0
This article and supplementary materials are licensed under a Creative Commons Attribution 4.0 International license.

@article{RJ-2019-033,
  author = {Mateusz Staniak and Przemysław Biecek},
  title = {{The Landscape of R Packages for Automated Exploratory Data
          Analysis}},
  year = {2019},
  journal = {{The R Journal}},
  doi = {10.32614/RJ-2019-033},
  url = {https://journal.r-project.org/archive/2019/RJ-2019-033/index.html}
}