ClusTorus: An R Package for Prediction and Clustering on the Torus by Conformal Prediction

Protein structure data consist of several dihedral angles, which lie on a multidimensional torus. Analyzing such data has been, and continues to be, key to understanding the functional properties of proteins. However, most existing statistical methods assume that data lie in Euclidean space and are therefore ill-suited to angular data. In this paper, we introduce the package ClusTorus, which specializes in analyzing multivariate angular data. The package collects tools and routines for algorithmic and model-based clustering of data on the torus. In particular, it enables the construction of conformal prediction sets and predictive clustering based on kernel density estimates and mixture model estimates. A novel hyperparameter selection strategy for predictive clustering is also implemented, with improved stability and computational efficiency. We demonstrate the use of the package by clustering protein dihedral angles from two real data sets.
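
To make the idea of a kernel-density-based conformal prediction set concrete, here is a minimal sketch in plain Python. It does not use or reproduce the ClusTorus API. The function names, the product von Mises kernel, the concentration value kappa, and the synthetic two-cluster data are all assumptions chosen for illustration. The sketch follows the generic inductive (split) conformal recipe: estimate a density on one half of the data, score the other half, and keep every point of the torus whose density is at least the floor((n + 1) * alpha)-th smallest calibration score.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def kde_score(y, fit_points, kappa):
    """Product von Mises kernel density at the angle tuple y, up to a constant.

    The normalizing constant is dropped because conformal prediction only
    uses the ranking of the scores. (Illustrative helper, not a ClusTorus call.)
    """
    total = 0.0
    for x in fit_points:
        total += math.exp(kappa * sum(math.cos(a - b) - 1.0 for a, b in zip(y, x)))
    return total / len(fit_points)


def conformal_threshold(fit_points, calib_points, alpha, kappa):
    """Density cutoff defining an inductive conformal prediction set."""
    scores = sorted(kde_score(x, fit_points, kappa) for x in calib_points)
    k = math.floor((len(scores) + 1) * alpha)
    # k == 0 means the level is too strict for this sample size,
    # so the prediction set is the whole torus (cutoff 0).
    return scores[k - 1] if k >= 1 else 0.0


def in_prediction_set(y, fit_points, threshold, kappa):
    """True if y belongs to the level-(1 - alpha) prediction set."""
    return kde_score(y, fit_points, kappa) >= threshold


# Synthetic stand-in for pairs of dihedral angles: two wrapped clusters.
random.seed(0)
centers = [(1.0, 1.0), (4.0, 4.0)]
data = []
for _ in range(200):
    center = random.choice(centers)
    data.append(tuple(random.gauss(mu, 0.3) % TWO_PI for mu in center))

# Split once: the first half fits the density, the second half calibrates.
fit, calib = data[:100], data[100:]
alpha, kappa = 0.1, 10.0

threshold = conformal_threshold(fit, calib, alpha, kappa)
coverage = sum(in_prediction_set(x, fit, threshold, kappa) for x in calib) / len(calib)

print("share of calibration points inside the set:", coverage)
print("(1.0, 1.0) inside:", in_prediction_set((1.0, 1.0), fit, threshold, kappa))
print("(1.0, 4.0) inside:", in_prediction_set((1.0, 4.0), fit, threshold, kappa))
```

Because the kernel depends on angles only through cosines of differences, the score is automatically periodic, which is the property a Euclidean kernel lacks. A predictive clustering step would then group the connected pieces of the resulting set; that step is not sketched here.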

Seungki Hong (Department of Statistics, Seoul National University), Sungkyu Jung (Department of Statistics, Seoul National University)
2022-10-11

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2022-032.zip.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Hong & Jung, "ClusTorus: An R Package for Prediction and Clustering on the Torus by Conformal Prediction", The R Journal, 2022

BibTeX citation

@article{RJ-2022-032,
  author = {Hong, Seungki and Jung, Sungkyu},
  title = {ClusTorus: An R Package for Prediction and Clustering on the Torus by Conformal Prediction},
  journal = {The R Journal},
  year = {2022},
  note = {https://doi.org/10.32614/RJ-2022-032},
  doi = {10.32614/RJ-2022-032},
  volume = {14},
  issue = {2},
  issn = {2073-4859},
  pages = {186-207}
}