
title: 'The NoiseFiltersR Package: Label Noise Preprocessing in R'
description: |
  In Data Mining, the value of extracted knowledge is directly related to the quality of the data used. This makes data preprocessing one of the most important steps in the knowledge discovery process. A common problem affecting data quality is the presence of noise. A training set with label noise can reduce the predictive performance of classification learning techniques and increase the overfitting of classification models. In this work we present the NoiseFiltersR package. It contains the first extensive R implementation of classical and state-of-the-art label noise filters, which are the most common techniques for preprocessing label noise. The algorithms used for the implementation of the label noise filters are appropriately documented and referenced. They can be called in an R-user-friendly manner, and their results are unified by means of the “filter” class, which also benefits from adapted print and summary methods.
author:
- Pablo Morales
- Julián Luengo
- Luís P.F. Garcia
- Ana C. Lorena
- André C.P.L.F. de Carvalho
- Francisco Herrera
date: '2017-05-10'
date_received: '2016-07-12'
journal:
  title: The R Journal
  issn: 2073-4859
  firstpage: 219
  lastpage: 228
volume: 9
issue: 1
slug: RJ-2017-027
packages:
  cran:
  - MICE
  - Amelia
  - caret
  - FSelector
  - mvoutlier
  - robustDA
  - probFDA
  - NoiseFiltersR
  - unbalanced
  - RWeka
  bioc: ~
preview: preview.png
CTV:
- MachineLearning
- Multivariate
- Robust
- HighPerformanceComputing
- NaturalLanguageProcessing
- OfficialStatistics
- SocialSciences
output:
  distill::distill_article:
    self_contained: no
    toc: no
    legacy_pdf: yes
pdf_url: RJ-2017-027.pdf
citation_url: https://doi.org/10.32614/RJ-2017-027
doi: 10.32614/RJ-2017-027
creative_commons: CC BY
csl: /home/mitchell/R/x86_64-pc-linux-gnu-library/4.1/rjtools/rjournal.csl

CRAN packages used

MICE, Amelia, caret, FSelector, mvoutlier, robustDA, probFDA, NoiseFiltersR, unbalanced, RWeka

CRAN Task Views implied by cited packages

MachineLearning, Multivariate, Robust, HighPerformanceComputing, NaturalLanguageProcessing, OfficialStatistics, SocialSciences

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Morales et al., "The NoiseFiltersR Package: Label Noise Preprocessing in R", The R Journal, 2017

BibTeX citation

@article{RJ-2017-027,
  author = {Morales, Pablo and Luengo, Julián and Garcia, Luís P.F. and Lorena, Ana C. and Carvalho, André C.P.L.F. de and Herrera, Francisco},
  title = {The NoiseFiltersR Package: Label Noise Preprocessing in R},
  journal = {The R Journal},
  year = {2017},
  note = {https://doi.org/10.32614/RJ-2017-027},
  doi = {10.32614/RJ-2017-027},
  volume = {9},
  issue = {1},
  issn = {2073-4859},
  pages = {219-228}
}