Most clustering strategies have not changed considerably since their initial definition. The common improvements are either related to the distance measure used to assess dissimilarity, or the function used to calculate prototypes. Time-series clustering is no exception, with the Dynamic Time Warping distance being particularly popular in that context. This distance is computationally expensive, so many related optimizations have been developed over the years. Since no single clustering algorithm can be said to perform best on all datasets, different strategies must be tested and compared, so a common infrastructure can be advantageous. In this manuscript, a general overview of shape-based time-series clustering is given, including many specifics related to Dynamic Time Warping and associated techniques. At the same time, a description of the dtwclust package for the R statistical software is provided, showcasing how it can be used to evaluate many different time-series clustering procedures.
Supplementary materials are available in addition to this article. They can be downloaded at RJ-2019-023.zip
CRAN packages used: dtwclust, flexclust, cluster, TSdist, TSclust, pdc, dtw, proxy, clue, foreach, RcppParallel, doParallel
CRAN Task Views implied by cited packages: TimeSeries, Cluster, Multivariate, HighPerformanceComputing, Environmetrics, Optimization, Robust
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Sardá-Espinosa, "Time-Series Clustering in R Using the dtwclust Package", The R Journal, 2019
BibTeX citation
@article{RJ-2019-023,
  author = {Sardá-Espinosa, Alexis},
  title = {Time-Series Clustering in R Using the dtwclust Package},
  journal = {The R Journal},
  year = {2019},
  note = {https://doi.org/10.32614/RJ-2019-023},
  doi = {10.32614/RJ-2019-023},
  volume = {11},
  issue = {1},
  issn = {2073-4859},
  pages = {22-43}
}