ClusROC: An R Package for ROC Analysis in Three-Class Classification Problems for Clustered Data

This paper introduces an R package for ROC analysis in three-class classification problems, for clustered data in the presence of covariates, named ClusROC. The clustered data that we address have some hierarchical structure, i


Introduction
In clinical studies, receiver operating characteristic (ROC) surface analysis is widely used to evaluate the accuracy of a diagnostic test (or biomarker) when there are three ordinal disease classes (or diagnostic groups).See Nakas (2014) for a comprehensive review.Although the clinical context is where ROC surface analysis finds its natural application, other contexts, such as economics and engineering, very often face the problem of evaluating the accuracy of a classifier.
Within the ROC surface analysis framework, the following quantities are typically objects of interest: ROC surface, VUS (volume under ROC surface), and optimal pair of thresholds.Statistical methods for evaluating such quantities have been widely discussed in the statistical literature.We cite, among others, papers by Nakas and Yiannoutsos (2004); Xiong et al. (2006); Nakas et al. (2010); Attwood et al. (2014); Bantis et al. (2017); To Duc et al. (2016).Moreover, on the Comprehensive R Archive Network (CRAN), there are several packages implementing estimation methods for ROC surface analysis, for instance, trinROC (Noll and Reinhard, 2022), ThresholdROC (Sara et al., 2021) and bcROCsurface (To, 2021).
Most of the existing methods for ROC surface analysis have focused on a standard setting, in which measurements on statistical units are realizations of independent random variables, and the diagnostic test, or, more broadly, the classifier, is not influenced by any covariate.In some studies, however, not only the classifier can be affected by some covariates that characterize the units themselves (e.g.age, gender), but statistical units can be enrolled in clusters (e.g., families, genotype, communities, etc.).When statistical units are drawn from clusters, they can no longer be treated as independent.Indeed, units from the same cluster are typically more similar to each other than they will be to statistical units from other clusters.Therefore, unobserved variables may induce statistical dependence between observations within clusters that may be uncaptured by covariates.For such kinds of clustered data, which have a hierarchical structure and are dependent, Xiong et al. (2018) proposed the use of a standard linear mixed-effects model (McCulloch and Searle, 2001) to account for clusters and covariates' effects on the classifier.Then, the authors developed an approach to estimate the VUS, under an assumption of normality.Based on the model in Xiong et al. (2018), To et al. (2022) developed an estimation procedure for the ROC surface and methods for choosing the optimal pair of thresholds.The authors also discussed a variant of their approach, based on the Box-Cox transformation, useful when the normality assumption is (not heavily) violated.
In this paper, we introduce our R package for ROC surface analysis with clustered data, named ClusROC.The package (with related details) is available on CRAN at http://CRAN.Rproject.org/package=ClusROC.In the package, we implement procedures for estimating the parameters of the models (with and without the Box-Cox transformation), and for making inferences about the ROC surface, and the optimal pair of thresholds, by following methods outlined in To et al. (2022).In addition, we also implement a procedure for estimating the VUS, as discussed in Xiong et al. (2018).
In the following sections, we first briefly present the reference model and the inferential procedures for ROC surface analysis with clustered data.Then, we describe the ClusROC package and illustrate its use through two real datasets.The last section provides a brief conclusion.
The R Journal Vol.15/1, March 2023 ISSN 2073-4859 2 The ROC surface analysis for clustered data Here we briefly review methods proposed by Xiong et al. (2018) and To et al. (2022).Details and theoretical results can be found in the original articles.For convenience, as in the quoted papers, we refer to some clinical studies in the presentation and use appropriate language for that area.

The models
Let Y be the diagnostic test result, on a continuous scale, and let Y 1 , Y 2 , Y 3 be the test results for subjects in classes 1, 2, and 3, respectively.We assume that higher values of test results are associated with higher severity of the disease, and the severity of the disease grows with the class (i.e., class 3 is the worst).Let X 1 , . . ., X p be p covariates, possibly associated with the test Y.
Let c be the total number of clusters, randomly selected from the population.For the k-th cluster, k = 1, . . ., c, let n ki be the total number of subjects belonging to class i, i = 1, 2, 3 and let n k = n k1 + n k2 + n k3 be the total sample size within the cluster.Note that n ki might be equal to 0 for some clusters.The linear mixed-effects model for the clustering effect on the test result Y, as well as for covariates' effects, is written as follows (Xiong et al., 2018;To et al., 2022): ) is a triplet of test scores from three randomly sampled subjects from the three disease classes, (k 1 , k 2 , k 3 ), k i ∈ {1, . . ., c}, are cluster memberships indicating the clusters from which Y 1 , Y 2 , Y 3 are observed, z i = (1, x 1i , . . ., x pi ) ⊤ are fixed (i.e., not random) covariates values, and , are vectors of parameters representing covariates effects.In model (1), α k are random effects accounting for the presence of clusters, and ε i are subject-level random errors.We assume that: (i) the random effects α k and the subject-level random errors . ., α c and ε 1 , ε 2 , ε 3 are all independent (see also, McCulloch and Searle, 2001).
In some practical situations, data distributions may be skewed and the normality assumption might be violated.For such cases, To et al. (2022) considered the application of Box-Cox transformation for linear mixed-effects models (Lipsitz et al., 2000;Gurka et al., 2006): and λ is the transformation parameter (Box and Cox, 1964).Assumptions about the random effects α k and the subject-level random errors ε i are the same as in model (1).To obtain λ and the REML estimator γ, To et al. (2022) applied the method proposed by Gurka and Edwards (2011) which is based on the scaled Box-Cox transformation model (Gurka et al., 2006).The estimator of the variance-covariance matrix of the REML estimator γ is obtained again by applying the sandwich formula.
For given thresholds t 1 and t 2 (t 1 < t 2 ), the covariate-specific true class fractions (TCFs) are written The R Journal Vol.15/1, March 2023 ISSN 2073-4859 , where Φ(•) is the cumulative distribution function of the standard normal distribution.
Moreover, by setting TCF 1 (t 1 ; z) = p 1 and TCF 3 (t 2 ; z) = p 3 , the covariate-specific ROC surface can be defined as a function of (p 1 , p 3 ), i.e., Based on the above expressions, for clustered data, To et al. (2022) considered covariate-specific estimation of the TCFs at a fixed pair of thresholds (t 1 , t 2 ) and estimation of the ROC surface for each pair (p 1 , p 3 ).Moreover, the authors proposed methods to estimate covariate-specific optimal pairs of thresholds (t + 1 , t + 2 ) by considering three different criteria: (i) maximization of the covariate-specific generalized Youden index (GYI); (ii) minimization of the covariate-specific Euclidean distance between the ideal point (1, 1, 1) and the point (TCF 1 (t 1 ; z), TCF 2 (t 1 , t 2 ; z), TCF 3 (t 2 ; z)) (CtP); (iii) maximization of the covariate-specific volume of the cuboid under the covariate-specific ROC surface (MV).Resulting estimators are shown to be consistent and asymptotically normal, with asymptotic covariance matrices estimable by using the plug-in method.The normal approximation results can be used to construct suitable (joint) confidence regions.
Starting from model (1), covariate-specific inference on the VUS is discussed in Xiong et al. (2018), where maximum likelihood (ML) methods for point and interval estimation are proposed.In particular, Xiong et al. (2018) showed that covariate-specific ML VUS estimator θ(z) is the summation of five components: θ1 In cases where the assumption of normality for Y 1 , Y 2 and Y 3 is unrealistic, inferential procedures for covariate-specific TCFs, ROC surface, optimal pair of thresholds (and VUS) can be obtained starting from model (2); see Section 3.3 in To et al. (2022).

Overview of R package ClusROC
The ClusROC package implements techniques for ROC surface analysis, in case of clustered data and the presence of covariates.The package comprises five major functions: • clus_lme(): This function fits the linear mixed-effects model (1) under the normality assumption, or the model ( 2) when the Box-Cox transformation is used.
• clus_roc_surface(): This function estimates and makes a 3D plot of covariate-specific ROC surfaces.
• clus_opt_thres3(): This function estimates the covariate-specific optimal pair of thresholds.
• clus_tcfs(): This function estimates the covariate-specific TFCs at a specified pair of thresholds.

Description
The REML estimation in function clus_lme() is based on the function lme() of package nlme (Pinheiro et al., 2022).The function clus_lme() needs, firstly, specification of a fixed_formula which is a twosided linear formula object describing the fixed-effects part of the model for three classes, with the response on the left of ~operator and the terms, separated by + operators, on the right.Secondly, the arguments name_class and name_clust are needed to specify the name of the variables indicating the disease classes (or diagnostic groups) and the clusters in the data, respectively.To enable the Box-Cox transformation, users need to set the argument boxcox as TRUE.The Box-Cox parameter λ is estimated by a grid search on the interval (−2, 2).This interval is suggested by Gurka and Edwards (2011), but users can change this range by setting the argument interval_lambda.Before fitting the model, clus_lme() determines the ordering of the disease groups based on the average values of test results in each disease group.If an ordering is provided by the user via the argument levl_class, the ordering of the mean values is still obtained to confirm the input ordering.In case of disagreement between the two orderings, the one based on the averages of test results is adopted.The function plot() provides three diagnostic plots for the model fitted by clus_lme(), namely, a Q-Q plot for residuals, a Fitted vs. Residuals plot, and a Q-Q plot for cluster effects.These plots exploit the ggplot2 package (Hadley et al., 2022).
The functions clus_roc_surface(), clus_opt_thres3(), clus_vus() and clus_tcfs() are the main functions to perform the ROC surface analysis for clustered data.All of them require the output of clus_lme() as an argument.When one of the above functions is called, a check on the monotone ordering assumption is performed.That is, for a given value of the covariates, say z, the three predicted mean values of the test results in the three diagnostic groups, i.e., z ⊤ β 1 , z ⊤ β 2 and z ⊤ β 3 , are computed and compared.If the assumption (z ⊤ β 1 < z ⊤ β 2 < z ⊤ β 3 ) is not met, the ROC surface analysis is not performed at z.
The function clus_roc_surface() estimates a covariate-specific ROC surface at a single point for covariates and makes a 3D plot to display the estimated covariate-specific ROC surface by using rgl package (Duncan et al., 2021).This function also allows plotting an ellipsoidal confidence region for TCFs at a given pair of thresholds, in the ROC surface space.If the constructed confidence region is outside the unit cube, a probit transformation (Bantis et al., 2017) is automatically applied to obtain an appropriate confidence region, which is inside the unit cube.
The function clus_opt_thres3() gives the estimated covariate-specific optimal pair of thresholds as defined by the criteria GYI, CtP, and MV at multiple points of the covariates.The optimization is done by using the function optim() with the "L-BFGS-B" method; however, users can select other optimization methods, such as, "BFGS" or "Nelder-Mead".The function returns also the estimated asymptotic variance-covariance matrix of the (estimated) covariate-specific optimal pair of thresholds.Under the normality assumption, the asymptotic variance-covariance matrix is estimated by the Delta method.If the Box-Cox transformation is applied, a nonparametric bootstrap procedure for clustered data is used to estimate the asymptotic variance-covariance matrix.To speed up the bootstrap procedure, users can set the argument parallel as TRUE to enable parallel computing support.Users can also select the number of CPUs needed for the computation.After calling clus_opt_thres3(), the function plot() can be used to display confidence regions (and point estimates) of covariate-specific optimal pairs of thresholds.
The function clus_vus() estimates the covariate-specific VUS at multiple points of the covariates by using the integrate() routine.This function also performs the statistical test for the null hypothesis H 0 : VUS = 1/6 versus the alternative H A : VUS > 1/6.This statistical test is a formal assessment of the adequacy of a diagnostic test in a three-class classification problem via its VUS at given covariates values.After calling clus_vus(), users can apply ci_clus_vus() to obtain confidence intervals for covariate-specific VUS.Three types of confidence intervals are computed: normal approximationbased; after logit and probit transformations.
The function clus_tcfs() estimates covariate-specific TCFs at a specified pair of thresholds, given one or multiple points for the covariates.This function also implements the Delta method to estimate the asymptotic variance-covariance matrix of the estimated covariate-specific TCFs.Note that, if the Box-Cox transformation is applied for the linear mixed-effects model, the pair of threshold values must be provided in the original scale.

Applications
To illustrate the use of the ClusROC package, we provide two examples with the MouseNeurons dataset and EnergyEthiopia, which are included in the package.
The R Journal Vol.15/1, March 2023 ISSN 2073-4859 The glutamatergic neurons This dataset includes 860 observations (brain cells) and the following variables: the expression of the Lamp5 gene (diagnostic test/biomarker), the mouse genotype (which yields 23 clusters), the class labels (L2/3 IT, L4, and L5 PT), and the sex and age (in days) of the mouse.Below, we illustrate the use of the ClusROC package using this data.
Step 1: Load library and data > library(ClusROC) > data("MouseNeurons") The above code loads the ClusROC package and the MouseNeurons data into R.
Step 2: Model fitting In this step, we fit a linear mixed-effects model with Lamp5_cpm as a response and age_days as a covariate, under the Box-Cox transformation, by using function clus_lme(): > out_md <-clus_lme(fixed_formula = Lamp5_cpm ~age_days, + name_class = "subclass_label", name_clust = "genotype_id", + data = MouseNeurons, boxcox = TRUE) The ordered levels of classes are specified by the order of averages of the test values for each class: L4 < L5 PT < L2/3 IT As, in this case, the rank-ordered nature of the biomarker concerning the classes is not given, the monotone ordering was specified by ordering the classes according to the rank of the biomarker's sample means in the three groups.The results are shown as follows.

> plot(out_md)
The plots suggest that the assumptions of normality and homogeneity of variances are satisfied.To help the reader evaluate the need for transforming the data, we also fitted the linear mixed-effects model without the Box-Cox transformation.The results are reported in Appendix, showing that the assumptions are violated.
Step 3: ROC surface analysis After fitting the linear mixed-effects models of interest and verifying the normality assumption and the homogeneity of variances assumption, the ROC surface analysis can be performed.First, let us start with the estimation for covariate-specific VUS.In this case, we are interested in computing the (estimated) covariate-specific VUS at three different ages of mice, i.e., 54, 60, and 66 days.The 95% confidence intervals for covariate-specific VUS are obtained by:

> ci_clus_vus(out_vus)
The 95% confidence intervals for covariate-specific VUS: Covariates As shown in the results, we can see three types of confidence intervals: normal approximation-based; after logit and probit transformations.
(a) (b) The results consist of three-point estimates of covariate-specific TCFs at the fixed pair of thresholds (350, 1350), corresponding to three different values of age, 54, 58, and 62 (days), and the associated standard errors (Se).Note that, for the function roc_surface() and clus_tcfs(), if the Box-Cox transformation is applied, the pair of thresholds values must be provided in the original scale.
The following call performs the estimation of the covariate-specific optimal pair of thresholds, and the corresponding asymptotic variance-covariance matrices, at three different ages of mice: 55, 65 and 75 days.Here, we consider all criteria, i.e., GYI, CtP and MV.

House cooking fuel choice
We use the dataset EnergyEthiopia to illustrate the use of the package for evaluating a classifier in a three-class setting with panel data.The EnergyEthiopia data, included in the package, is a subset of the panel dataset discussed in Alem et al. (2016).The full dataset is collected in 4 cities of Ethiopia (Addis Ababa, Awassa, Dessie and Mekelle) three different years : 2000, 2004 and 2009.Each household participated in the study at most three times (three years).For each time, the information on household energy choice and some covariates such as expenditure, demographic indicators, household size and educational status, are recorded.Alem et al. (2016)   • the cooking energy states of a household (energy2) with three categories: clean fuel only (denoted as 1), a mix of clean and biomass fuel (2) and biomass fuel only (1); • log real consumption per adult equivalent units (lrconsaeu); • the household size (hhs); • the log of firewood price (lfirewood_pr); • the log of charcoal price (lcharcol_pr); • the log of kerosene price (lkerosene_pr); • the log of electricity price (lelectric_pr).
We are interested in evaluating the ability of the log real consumption per adult equivalent units to discriminate three cooking energy states for a household, given the information provided by some covariates, such as the household size, the prices of firewood, charcoal and kerosene.Above, the term "clean fuel only" refers to clean energy sources such as electricity, gas and kerosene; whereas, "biomass fuel only" refers to energy sources such as firewood, charcoal, dung and crop residues.In our analysis, the 1123 households make up the clusters.
> out_md_enery <-clus_lme( + fixed_formula = lrconsaeu ~hhs_ft + lfirewood_pr + lcharcol_pr + lkerosene_pr, + name_class = "energy2", name_clust = "uqid", + data = EnergyEthiopia, boxcox = FALSE + ) The ordered levels of classes are specified by the order of averages of the test values for each class: 3 < 2 < 1 As in the first application, we still have no information about the rank-ordered nature of the classifier (lrconsaeu) concerning the classes, so the monotone ordering was specified by ordering the classes according to the rank of the sample means of lrconsaeu inside the three groups.The results are shown as follows.

> print(out_md_enery)
The R Journal Vol.The diagnostic plots shown in Figure 4 seem to suggest that the normality assumption and the homogeneity of variances are reliable.

> plot(out_md_enery)
We estimate the covariate-specific VUS at four different combinations of the covariates, when the value of hhs_ft changes from "small" to "very large", the values of lfirewood_pr, lcharcol_pr and lkerosene_pr are fixed as 1, −1 and 2, respectively.Here, the values "1" and "2" of lfirewood_pr and lkerosene_pr refer to the very high price of the firewood, and the kerosene, respectively, while the value "-1" of lcharcol_pr refers to the very low price of the charcoal.The results of the covariatespecific VUS estimation procedure are displayed below.

Conclusion
This paper introduces the ClusROC package, the first R package for ROC surface analysis in threeclass classification problems, for clustered data and in the presence of covariates.The package allows The R Journal Vol.15/1, March 2023 ISSN 2073-4859 obtaining point estimates and confidence regions for true class fractions, ROC surface estimates and plots, point and interval estimates of VUS, and point estimates and confidence regions for optimal pair of thresholds.In the last case, three different criteria can be used: GYI, CtP, and MV.The package is available on Comprehensive R Archive Network (CRAN) at http://CRAN.R-project.org/package=ClusROC.The functions in the package are described and implemented using, as examples, the real datasets MouseNeurons and EnergyEthiopia provided in the package itself.
The linear mixed-effects model relies on the assumptions of normality and homoscedasticity.When such assumptions do not hold, the Box-Cox transformation can be employed.In these cases, a bootstrap procedure is needed to estimate the elliptical confidence regions for the optimal thresholds.This procedure can take a long computation time depending on the size of the data (either the number of clusters or the sample size within the clusters).For this reason, we recommend users enable the parallel computation option, which is already implemented within the function clus_opt_thres3().

5
This research was supported by the Ministero dell'Istruzione, dell'Università e della Ricerca-Italy (grant number DIFO_ECCELLENZA18_01).The authors acknowledge the Editor and two anonymous reviewers whose valuable suggestions contributed to improving the presentation of the contents.

Appendix
To help the reader evaluate the need for transforming the data, we report here the results of the analysis based on the linear mixed-effects model without the Box-Cox transformation.The results show that the assumptions of normality and homoscedasticity are violated.

Figure 1 :
Figure 1: The diagnostic plots for the linear mixed-effects model.

Figure 2 :
Figure 2: The plots of covariate-specific ROC surface at Age as 58 days: (a) without an ellipsoidal confidence region; (b) with an ellipsoidal confidence region for TCFs at t 1 = 350 and t 2 = 1350.

Figure 3 :
Figure 3: The 95% confidence regions for the age-specific optimal pairs of thresholds for three values of age: 55, 65, and 75.

Figure 4 :
Figure 4: The diagnostic plots for the linear mixed-effects model (EnergyEthiopia data).

Figure 6 :
Figure 6:The 95% confidence regions for the age-specific optimal pairs of thresholds for four covariate points (in the house cooking fuel choice example).
of classes are specified by the order of averages of the test values for each class: L4 < L5 PT < L2/3 IT The R Journal Vol.15/1, March 2023 ISSN 2073-4859 To et al. (2022)used the MouseNeurons dataset to evaluate the ability of the Lysosomal Associated Membrane Protein Family Member 5 (Lamp5) gene to discriminate three types of glutamatergic neurons, namely Layer 2/3 Intratelencephalic (L2/3 IT), Layer 4 (L4) and Layer 5 Pyramidal Tract (L5 PT) neurons.A full version of the data is publicly available at http://portal.brain-map.org/atlases-and-data/rnaseq/mouse-v1-and-alm-smart-seq.