ordinalClust: An R Package to Analyze Ordinal Data

Ordinal data are used in many domains, especially when measurements are collected from people through observations, tests, or questionnaires. ordinalClust is an R package dedicated to ordinal data that provides tools for modeling, clustering, co-clustering, and classifying such data. Ordinal data are modeled using the BOS distribution, which has two meaningful parameters referred to as "position" and "precision". The former indicates the mode of the distribution and the latter describes how scattered the data are around the mode, so users can easily interpret the distribution of their data from these two parameters. The package is built on the co-clustering framework, in which rows and columns are clustered simultaneously. The co-clustering approach uses the Latent Block Model (LBM) and the SEM-Gibbs algorithm for parameter inference, while the clustering and classification methods rely on simplified versions of the SEM-Gibbs algorithm. For classification, two approaches are proposed. In the first, the BOS parameters are estimated from the training dataset in the conventional way. In the second, parsimony is introduced by estimating both the parameters and the column-clusters from the training dataset. We empirically show that this second approach can yield better results. For clustering and co-clustering, the ICL-BIC criterion is used for model selection. An overview of these methods is given, and their use with the ordinalClust package is illustrated on real datasets. The latest stable version of the package is available on the Comprehensive R Archive Network (CRAN).

Margot Selosse (Université de Lyon, Lyon 2, ERIC EA 3083), Julien Jacques (Université de Lyon, Lyon 2, ERIC EA 3083), Christophe Biernacki (Inria, Université de Lille, CNRS Université Lille - UFR de Mathématiques - Cité Scientifique - 59655 Villeneuve d’Ascq Cedex)
2021-01-14

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2021-011.zip

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Selosse, et al., "ordinalClust: An R Package to Analyze Ordinal Data", The R Journal, 2021

BibTeX citation

@article{RJ-2021-011,
  author = {Selosse, Margot and Jacques, Julien and Biernacki, Christophe},
  title = {ordinalClust: An R Package to Analyze Ordinal Data},
  journal = {The R Journal},
  year = {2021},
  note = {https://doi.org/10.32614/RJ-2021-011},
  doi = {10.32614/RJ-2021-011},
  volume = {12},
  issue = {2},
  issn = {2073-4859},
  pages = {61--81}
}