Three-Way Correspondence Analysis in R

Three-way correspondence analysis is a multivariate method for visualising the association in three-way categorical data, modelling the global dependence, or reducing dimensionality. This paper describes CA3variants, an R package for performing three-way correspondence analysis. The functions in this package allow the analyst to perform several variants of the analysis, depending on the research question being posed and/or the properties of the data. Users can choose the classical (symmetric) approach or the non-symmetric variant; the latter is particularly useful when one of the three categorical variables is treated as a response variable. The required three-way decomposition can be carried out with either a Tucker3 decomposition or a trivariate moment decomposition based on orthogonal polynomials. The Tucker3 decomposition can be used when one or more of the categorical variables is nominal, while the trivariate moment decomposition can be used for ordinal variables. The package also provides a function for choosing the model dimensionality.
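As a rough illustration of what "global dependence" means here: in the classical symmetric setting, departures from complete independence in a three-way table are commonly summarised by Pearson's three-way chi-squared statistic, which three-way correspondence analysis then decomposes. The sketch below computes that statistic in base R only. It does not use the CA3variants API, and the choice of the built-in `HairEyeColor` table and all variable names are assumptions made for illustration.

```r
# Illustrative three-way table from base R (Hair x Eye x Sex); not from the paper.
N <- HairEyeColor
n <- sum(N)
P <- N / n                          # joint proportions p_ijk

# One-way marginal proportions p_i.., p_.j., p_..k
p_i <- apply(P, 1, sum)
p_j <- apply(P, 2, sum)
p_k <- apply(P, 3, sum)

# Expected proportions under complete independence: p_i.. * p_.j. * p_..k
E <- outer(outer(p_i, p_j), p_k)

# Total inertia (mean-square contingency) and the chi-squared statistic n * phi2
phi2 <- sum((P - E)^2 / E)
chi2 <- n * phi2

# Degrees of freedom for complete independence: IJK - (I-1) - (J-1) - (K-1) - 1
df <- prod(dim(N)) - sum(dim(N) - 1) - 1
```

The value of `chi2` can be checked against `summary(HairEyeColor)`, which reports the same test of mutual independence of all factors. The non-symmetric variant replaces this measure with one that treats a single variable as the response.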

Rosaria Lombardo (Department of Economics, University of Campania “Luigi Vanvitelli”), Michel van de Velden (Econometric Institute, Erasmus University), Eric J. Beh (National Institute for Applied Statistics Research Australia (NIASRA), University of Wollongong)
2023-11-09

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded as RJ-2023-049.zip.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Lombardo, et al., "Three-Way Correspondence Analysis in R", The R Journal, 2023

BibTeX citation

@article{RJ-2023-049,
  author = {Lombardo, Rosaria and Velden, Michel van de and Beh, Eric J.},
  title = {Three-Way Correspondence Analysis in R},
  journal = {The R Journal},
  year = {2023},
  note = {https://doi.org/10.32614/RJ-2023-049},
  doi = {10.32614/RJ-2023-049},
  volume = {15},
  issue = {2},
  issn = {2073-4859},
  pages = {237-262}
}