The R Journal: accepted article

This article will be copy edited and may be changed before publication.

dGAselID: An R Package for Selecting a Variable Number of Features in High Dimensional Data
Nicolae Teodor Melita and Stefan Holban

Abstract The dGAselID package proposes an original approach to feature selection in high-dimensional data. The method is built upon a diploid genetic algorithm. The genotype-to-phenotype mapping is modeled after Incomplete Dominance Inheritance, which removes the need to define a dominance scheme. Fitness is evaluated by user-selectable supervised classifiers, chosen from a broad range of options. Cross-validation options are also available. A new approach to crossover, inspired by the random assortment of chromosomes during meiosis, is included. Several mutation operators, inspired by genetics, are also proposed. The package is fully compatible with the data formats used in Bioconductor and the MLInterfaces package and is readily applicable to microarray studies, but it is flexible enough for other feature selection applications on high-dimensional data. Several options for visualizing the evolution and the outcomes are implemented to facilitate the interpretation of results. The package's functionality is illustrated by examples.
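
The genotype-to-phenotype idea mentioned in the abstract can be illustrated with a minimal sketch in base R. This is not the dGAselID implementation and it uses none of the package's functions. It assumes a hypothetical rule in which a feature is selected when both homologous chromosomes carry the "select" allele, is not selected when neither does, and is selected with probability `p_het` at heterozygous loci. The function name `express_phenotype` and the parameter `p_het` are invented for illustration.

```r
# Illustrative sketch only; NOT the dGAselID implementation.
set.seed(1)

n_features <- 10

# A diploid individual: two homologous binary chromosomes
# (1 = allele "select this feature", 0 = allele "do not select").
maternal <- rbinom(n_features, size = 1, prob = 0.5)
paternal <- rbinom(n_features, size = 1, prob = 0.5)

# Assumed (hypothetical) incomplete-dominance rule:
#   homozygous 1/1 -> feature selected
#   homozygous 0/0 -> feature not selected
#   heterozygous   -> feature selected with probability p_het
express_phenotype <- function(maternal, paternal, p_het = 0.5) {
  stopifnot(length(maternal) == length(paternal))
  dosage <- maternal + paternal          # 0, 1 or 2 copies of the "select" allele
  phenotype <- as.integer(dosage == 2)   # homozygous 1/1 loci are always expressed
  het <- which(dosage == 1)              # heterozygous loci
  phenotype[het] <- rbinom(length(het), size = 1, prob = p_het)
  phenotype
}

phenotype <- express_phenotype(maternal, paternal)
selected_features <- which(phenotype == 1)  # indices of the selected features
```

Under this assumed rule no locus-by-locus dominance table has to be specified, and the number of selected features is not fixed in advance, since it depends on the alleles carried by both chromosomes.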

Received: 2016-08-25; online 2017-08-25
CRAN packages: dGAselID, genalg, GA, nsga2R, gaoptim, STPGA, kofnGA, mogavs, gaselect, scales; CRAN Task Views implied by cited CRAN packages: Optimization; Bioconductor packages: MLInterfaces, ALL, genefilter, hgu95av2.db

CC BY 4.0
This article is licensed under a Creative Commons Attribution 4.0 International license.

  author = {Nicolae Teodor Melita and Stefan Holban},
  title = {{dGAselID: An R Package for Selecting a Variable Number of
          Features in High Dimensional Data}},
  year = {2017},
  journal = {{The R Journal}},
  url = {}