This paper introduces the package ROCnReg that allows estimating the pooled ROC curve, the covariate-specific ROC curve, and the covariate-adjusted ROC curve by different methods, both from (semi) parametric and nonparametric perspectives and within Bayesian and frequentist paradigms. From the estimated ROC curve (pooled, covariate-specific, or covariate-adjusted), several summary measures of discriminatory accuracy, such as the (partial) area under the ROC curve and the Youden index, can be obtained. The package also provides functions to obtain ROC-based optimal threshold values using several criteria, namely, the Youden index criterion and the criterion that sets a target value for the false positive fraction. For the Bayesian methods, we provide tools for assessing model fit via posterior predictive checks, while the model choice can be carried out via several information criteria. Numerical and graphical outputs are provided for all methods. This is the only package implementing Bayesian procedures for ROC curves.
The receiver operating characteristic (ROC) curve (Metz 1978) is, unarguably, the most popular tool used for evaluating the discriminatory ability of continuous-outcome diagnostic tests. The ROC curve displays the true positive fraction (TPF) against the false positive fraction (FPF) for all possible threshold values that can be used to dichotomize the test result. The ROC curve thus provides a global description of the trade-off between the FPF and the TPF of the test as the threshold changes. Plenty of parametric and semi/nonparametric methods are available for estimating the ROC curve, either from frequentist or Bayesian viewpoints, and we refer the interested reader to Pepe (2003, chap. 5), Zhou et al. (2011, chap. 4), Inácio et al. (2020), and references therein.
It is known that in many situations, the outcome of a test and, possibly, its discriminatory capacity can be affected by covariates. Two different ROC-based measures that incorporate covariate information have been proposed: the covariate-specific or conditional ROC curve (see, e.g., Pepe 2003, chap. 6) and the covariate-adjusted ROC curve (Janes and Pepe 2009). The formal definition of both curves is given in Section 2. Succinctly, a covariate-specific ROC curve is an ROC curve that conditions on a specific covariate value, thus describing the accuracy of the test in the ‘subpopulation’ defined by that covariate value. On the other hand, the covariate-adjusted ROC curve is a weighted average of covariate-specific ROC curves. Regarding estimation, since the seminal paper of Pepe (1998), a plethora of methods have been proposed in the literature for the estimation of the covariate-specific ROC curve and associated summary measures. Without being exhaustive, we mention the work of Faraggi (2003), Rodríguez-Álvarez et al. (2011b,a), Inácio de Carvalho et al. (2013), and Inácio de Carvalho et al. (2017). Detailed reviews can be found in Rodríguez-Álvarez et al. (2011c), Pardo-Fernández et al. (2014), and Inácio et al. (2020). With respect to the covariate-adjusted ROC curve, estimation has been discussed in Janes and Pepe (2009), Rodríguez-Álvarez et al. (2011b), Guan et al. (2012), and Inácio de Carvalho and Rodríguez-Álvarez (2018).
A few R packages for ROC curve analysis are available on the Comprehensive R Archive Network and, as far as we are aware, all of them implement frequentist approaches. The package sROC (Wang 2012) contains functions to perform nonparametric, kernel-based estimation of ROC curves. pROC (Robin et al. 2011) offers a set of tools to visualize, smooth, and compare ROC curves, and nsROC (Pérez Fernández et al. 2018) also allows estimating ROC curves, building confidence bands, and comparing several curves for both dependent and independent data (i.e., data arising from paired and unpaired study designs, respectively). However, covariate information cannot be explicitly taken into account in any of these packages. The packages ROCRegression (available at https://bitbucket.org/mxrodriguez/rocregression) and npROCRegression (Rodriguez-Alvarez and Roca-Pardinas 2017) provide routines to estimate the covariate-specific ROC curve semiparametrically and nonparametrically under a frequentist framework. We also mention OptimalCutpoints (López-Ratón et al. 2014) and ThresholdROC (Perez Jaume et al. 2017), which provide a collection of functions for point and interval estimation of optimal thresholds for continuous diagnostic tests. To the best of our knowledge, there is no statistical software package implementing Bayesian inference for ROC curves and associated summary indices and optimal thresholds.
To close this gap, in this paper we introduce the ROCnReg package that allows conducting Bayesian inference for the (pooled or marginal) ROC curve, the covariate-specific ROC curve, and the covariate-adjusted ROC curve. For the sake of generality, frequentist approaches are also implemented. Specifically, in what concerns estimation of the pooled ROC curve, ROCnReg implements the frequentist empirical estimator described in Hsieh and Turnbull (1996), the kernel-based approach proposed by Zou et al. (1997), the Bayesian bootstrap method of Gu et al. (2008), and the Bayesian nonparametric method based on a Dirichlet process mixture of normal distributions model proposed by Erkanli et al. (2006). Regarding the covariate-specific ROC curve, ROCnReg implements the frequentist normal method of Faraggi (2003) and its semiparametric counterpart as described in Pepe (1998), the kernel-based approach of Rodríguez-Álvarez et al. (2011b), and the Bayesian nonparametric model based on a single-weights dependent Dirichlet process mixture of normal distributions proposed by Inácio de Carvalho et al. (2013). As for the covariate-adjusted ROC curve, the ROCnReg package allows estimation using the frequentist semiparametric approach of Janes and Pepe (2009), the frequentist nonparametric method discussed in Rodríguez-Álvarez et al. (2011b), and the recently proposed Bayesian nonparametric estimator of Inácio de Carvalho and Rodríguez-Álvarez (2018). Table 1 shows a summary of all methods implemented in the package. In addition, ROCnReg also provides functions to obtain ROC-based optimal thresholds to perform the classification/diagnosis of individuals as, say, diseased or nondiseased, using two different criteria, namely, the Youden index and the criterion that sets a target value for the false positive fraction. These are implemented for the pooled ROC curve, the covariate-specific ROC curve, and the covariate-adjusted ROC curve.
Method | Description |
---|---|
Pooled ROC curve | |
emp | (Frequentist) empirical estimator (Hsieh and Turnbull 1996). |
kernel | (Frequentist) kernel-based approach (Zou et al. 1997). |
BB | Bayesian bootstrap method (Gu et al. 2008). |
dpm | Nonparametric Bayesian approach based on a Dirichlet process mixture of normal distributions (Erkanli et al. 2006). |
Covariate-specific ROC curve | |
sp | (Frequentist) parametric and semiparametric induced ROC regression approach (Pepe 1998; Faraggi 2003). |
kernel | Nonparametric (kernel-based) induced ROC regression approach (Rodríguez-Álvarez et al. 2011b). |
bnp | Nonparametric Bayesian model based on a single-weights dependent Dirichlet process mixture of normal distributions (Inácio de Carvalho et al. 2013). |
Covariate-adjusted ROC curve | |
sp | (Frequentist) semiparametric method (Janes and Pepe 2009). |
kernel | Nonparametric (kernel-based) induced ROC regression approach (Rodríguez-Álvarez et al. 2011b). |
bnp | Nonparametric Bayesian model based on a single-weights dependent Dirichlet process mixture of normal distributions and the Bayesian bootstrap (Inácio de Carvalho and Rodríguez-Álvarez 2018). |
The remainder of the paper is organized as follows. In Section 2, we formally introduce the (pooled or marginal) ROC curve, the covariate-specific ROC curve, and the covariate-adjusted ROC curve. The description of the Bayesian estimation methods implemented in the ROCnReg package is given in Section 3. In Section 4, the usage of the main functions and capabilities of ROCnReg are described and illustrated using a synthetic dataset mimicking endocrine data. The paper concludes with a discussion in Section 5.
This section sets out the formal definition of the pooled or marginal ROC curve, the covariate-specific ROC curve, and the covariate-adjusted ROC curve. It also describes the most commonly used summary measures of discriminatory accuracy, namely, the area under the ROC curve, the partial area under the ROC curve, and the Youden Index. For conciseness, we intentionally avoid giving too many details and refer the interested reader to (Pepe 2003) (and references therein) for an extensive account of many aspects of ROC curves with and without covariates.
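As a concrete companion to the quantities named above, the following base R sketch computes an empirical ROC curve, the AUC, and the Youden index on simulated data. It is only an illustration under stated assumptions: the data are simulated, larger test values are taken to indicate disease, and the object names (`y_h`, `y_d`, `fpf`, `tpf`) are hypothetical and not part of ROCnReg.

```r
# Simulated test outcomes (assumption: larger values are more indicative of disease)
set.seed(2021)
y_h <- rnorm(300, mean = 25, sd = 4)   # nondiseased group
y_d <- rnorm(100, mean = 29, sd = 4)   # diseased group

# Candidate thresholds: every observed value, plus -Inf/Inf to close the curve
thresholds <- c(Inf, sort(unique(c(y_h, y_d)), decreasing = TRUE), -Inf)

# A subject is classified as diseased when the test result is >= threshold
fpf <- vapply(thresholds, function(thr) mean(y_h >= thr), numeric(1))
tpf <- vapply(thresholds, function(thr) mean(y_d >= thr), numeric(1))

# Empirical AUC via its Mann-Whitney form: P(Y_D > Y_H) + 0.5 * P(Y_D = Y_H)
auc <- mean(outer(y_d, y_h, ">")) + 0.5 * mean(outer(y_d, y_h, "=="))

# Youden index: maximum over thresholds of TPF - FPF, and the maximizing threshold
best   <- which.max(tpf - fpf)
yi     <- tpf[best] - fpf[best]
cutoff <- thresholds[best]
```

Plotting `tpf` against `fpf` (for instance with `plot(fpf, tpf, type = "s")`) gives the empirical ROC curve; `auc`, `yi`, and `cutoff` are its usual numerical summaries.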
In what follows, we denote as
In the case of a continuous-outcome diagnostic test, the classification
is usually made by comparing the test result
Another summary index of diagnostic accuracy is the Youden index
(Youden 1950; Shapiro 1999)
The conditional or covariate-specific ROC curve, given a covariate value
The covariate-specific ROC curve and associated AUC, pAUCs, and YI
described in the previous section depict the accuracy of the test for
specific covariate values. However, it would be undoubtedly useful to
have a global summary measure that also takes covariate information into
account. Such summary measure was developed by (Janes and Pepe 2009), who proposed
the covariate-adjusted ROC (AROC) curve, defined as
In contrast to the pooled ROC curve (see Expressions (1) and
(6)) and the covariate-specific ROC curve (see Expressions
(10) and (11)), the AROC
curve (and its
For space reasons, we focus here on the Bayesian methods for ROC curve inference (with and without covariates) implemented in the ROCnReg package. A detailed description of the frequentist approaches, as well as usage examples, is available as Supplementary Material at https://bitbucket.org/mxrodriguez/rocnreg.
In what follows, let
The function `pooledROC.BB` implements the Bayesian bootstrap (BB)
approach proposed by Gu et al. (2008). Their estimator relies on the notion of
placement value (Pepe 2003, chap. 5), which is simply a standardization
of the test outcomes with respect to a reference group. Specifically,
Computation of the placement value based on the BB.
For
Generate a realization of the ROC curve. Based on
(24), generate a realization of
The BB estimate of the ROC curve is obtained by averaging over the
ensemble of ROC curves
The Bayesian bootstrap estimator leads to closed-form expressions for
the AUC and pAUC, which are, respectively, given by
The Bayesian nonparametric approach for estimating the pooled ROC
curve, based on a Dirichlet process mixture (DPM) of normal
distributions (Erkanli et al. 2006), is implemented in the `pooledROC.dpm`
function. In this case, as implied by the name, the CDFs of the test outcomes in
each group are estimated via a Dirichlet process mixture of normal
distributions. That is, it is assumed that the CDF, say in the diseased
group (the one in the nondiseased group, pooledROC.dpm
function. Note that here,
For ease of posterior simulation and because it provides a highly
accurate approximation, we make use of the truncated stick-breaking
representation of the DP (Ishwaran and James 2001), according to which pooledROC.dpm
function and the default values are as follows
pooledROC.dpm
, and in such a case, the defaults for the centering
distribution hyperparameters’ values are as follows
pooledROC.dpm
function, we set pooledROC.dpm
may be post-processed, and (informal) model fit
diagnostics obtained; see more in Section 4 and in
the Supplementary Materials.
Because the full conditional distributions for all model parameters are
available in closed form, posterior simulation can be easily conducted
through a Gibbs sampler (see the details, for instance, in Ishwaran and James (2002)).
At iteration
We now let
The function cROC.bnp
implements the Bayesian nonparametric approach
for conducting inference about the covariate-specific ROC curve of
(Inácio de Carvalho et al. 2013), which is based on a single-weights dependent Dirichlet
process mixture of normal distributions (De Iorio et al. 2009). Specifically,
under this method, the conditional CDF in the diseased group is modeled
as follows
cROC.bnp
in ROCnReg
allows modeling the mean function of each component using an additive
smooth structure
cROC.bnp
can also deal with categorical covariates, linear
effects of continuous covariates, as well as interactions. For the
reasons mentioned before, we recommend that all continuous covariates
are modeled as in (27). Nonetheless, posterior predictive
checks, as illustrated in Section 4, can also be
used to informally validate the fitted model. We write
The regression coefficients and variances associated with each of the
cROC.bnp
, test outcomes and covariates are
standardized, which not only facilitates specification of the
hyperparameter values but also improves the mixing of the Markov chain
Monte Carlo (MCMC) chains. The default values are as follows
Similarly to the analogous model for the no-covariate case, at iteration
We finish this section by noting that a particular case of the above
estimator arises when the effect of all continuous covariates is assumed
to be linear and only one component is considered, i.e.,
The estimation of the AROC curve rests on the following three steps:
Estimation of the conditional distribution of test outcomes in the
nondiseased group,
Computation of the placement value
Estimation of the cumulative distribution function of
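The three steps listed above can be sketched in a few lines of base R. This is only an illustration: a homoscedastic normal linear model stands in for the conditional distribution in the nondiseased group, and the empirical CDF is used for the placement values, which is not the Bayesian nonparametric approach implemented in the package. The data and all object names (`sim`, `fit_h`, `u_d`, `aroc`) are hypothetical.

```r
# Simulated data: one continuous covariate x, disease indicator d, test outcome y
set.seed(7)
n   <- 600
sim <- data.frame(x = runif(n), d = rbinom(n, 1, 0.4))
sim$y <- 2 * sim$x + 1.2 * sim$d + rnorm(n)

# Step 1: conditional distribution of test outcomes in the nondiseased group
# (assumption: normal linear model with constant variance)
fit_h   <- lm(y ~ x, data = subset(sim, d == 0))
sigma_h <- summary(fit_h)$sigma

# Step 2: placement values of diseased subjects, i.e. the conditional survival
# function of the nondiseased group evaluated at each diseased outcome
dis <- subset(sim, d == 1)
u_d <- 1 - pnorm(dis$y, mean = predict(fit_h, newdata = dis), sd = sigma_h)

# Step 3: the AROC curve is the CDF of the placement values
p    <- seq(0, 1, length.out = 101)
aroc <- ecdf(u_d)(p)

# Area under the AROC curve: integral of P(U <= p) over [0, 1] equals 1 - E(U)
aauc <- 1 - mean(u_d)
```

Replacing the linear model in Step 1 by a more flexible conditional distribution, or the empirical CDF in Step 3 by Bayesian bootstrap weights, changes the estimator but not the structure of the computation.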
The approach proposed by Inácio de Carvalho and Rodríguez-Álvarez (2018) for estimating the AROC curve is
implemented in the function `AROC.bnp`, and it combines a single-weights
dependent Dirichlet process mixture of normal distributions in Step 1
and the Bayesian bootstrap in Step 3. Again, here, in Step 1, we also
recommend using cubic B-spline transformations of all continuous
covariates. Using the same notation as before, we model the conditional
density as
cROC.bnp
function are adopted for
A point estimate for
This section describes the main functions in the ROCnReg package and
illustrates their usage. For confidentiality reasons, we use a
synthetic dataset mimicking endocrine data from a cross-sectional study
carried out by the Galician Endocrinology and Nutrition Foundation. A
detailed description of the original dataset can be found in Tomé Martínez de Rituerto et al. (2009).
The original data have also been previously analyzed in Rodríguez-Álvarez et al. (2011b,a)
and Inácio de Carvalho and Rodríguez-Álvarez (2018). The synthetic data can be found in the ROCnReg
package under the name `endosyn`, and a summary of it follows.
R> library("ROCnReg")
R> data("endosyn")
R> summary(endosyn)
cvd_idf age gender bmi
Min. :0.0000 Min. :18.25 Men :1317 Min. :12.60
1st Qu.:0.0000 1st Qu.:29.57 Women:1523 1st Qu.:23.19
Median :0.0000 Median :39.28 Median :26.24
Mean :0.2433 Mean :41.43 Mean :26.69
3rd Qu.:0.0000 3rd Qu.:50.84 3rd Qu.:29.74
Max. :1.0000 Max. :84.66 Max. :46.20
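The dataset comprises 2840 individuals (1317 men and 1523 women; variable `gender`), with ages
(variable `age`) ranging between 18.25 and 84.66 years. Variable `bmi`
contains the body mass index (BMI) values, and
`cvd_idf` is the variable that indicates the presence (1) or
absence (0) of two or more cardiovascular disease (CVD) risk factors.
Following previous studies, the CVD risk factors considered include
raised triglycerides, reduced HD-cholesterol, raised blood pressure, and
raised fasting plasma glucose. Note that, from the summary above, about 24% of
the individuals (mean of `cvd_idf` equal to 0.2433) present two or more CVD risk factors.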
Using the ROCnReg package, in the subsequent sections, we will
illustrate how to ascertain, through the pooled ROC curve, the
discriminatory capacity of the BMI (which acts as our diagnostic test in
this example) in differentiating individuals with two or more CVD risk
factors (those belonging to the diseased class) from those having none or
only one CVD risk factor (the nondiseased class). We will then use the
covariate-specific ROC curve to evaluate the possible effect of `age`
and `gender` on the discriminatory capacity of the BMI. Finally, the
last part of this section focuses on the covariate-adjusted ROC curve as
a global summary measure of the BMI discriminatory ability when taking
the `age` and `gender` effects into account. In the Supplementary
Material, we show the usage of the package for those methods not
described here in the main text.
The ROCnReg package allows estimating the pooled ROC curve by means of
the four methods listed in Table 1. Here, we only
present the syntax for the functions `pooledROC.BB` and `pooledROC.dpm`,
which correspond, respectively, to the Bayesian bootstrap estimator and
the approach based on a Dirichlet process mixture (of normal
distributions). The function `pooledROC.emp`, which implements an
empirical estimator, and the function `pooledROC.kernel`, which is based
on kernel methods, are illustrated in the Supplementary Material. The
input arguments in the functions are method-specific (details can be
found in the manual accompanying the package), but in all cases,
numerical and graphical summaries can be obtained by calling the
functions `print.pooledROC`, `summary.pooledROC`, and `plot.pooledROC`,
which can be abbreviated by `print`, `summary`, and `plot`. Recall that
our aim is to ascertain, using the `endosyn` dataset, the discriminatory
capacity of the BMI in differentiating individuals with two or more CVD
risk factors from those having one or no CVD risk factors.
R> set.seed(123, "L'Ecuyer-CMRG") # for reproducibility
R> pROC_dpm <- pooledROC.dpm(marker = "bmi", group = "cvd_idf", tag.h = 0,
+ data = endosyn, standardise = TRUE, p = seq(0, 1, l = 101), ci.level = 0.95,
+ compute.lpml = TRUE, compute.WAIC = TRUE, compute.DIC = TRUE,
+ pauc = pauccontrol(compute = TRUE, focus = "FPF", value = 0.1),
+ density = densitycontrol(compute = TRUE),
+ prior.h = priorcontrol.dpm(L = 10), prior.d = priorcontrol.dpm(L = 10),
+ mcmc = mcmccontrol(nsave = 8000, nburn = 2000, nskip = 1),
+ parallel = "snow", ncpus = 2, cl = NULL)
Before describing in detail the previous call, we first present the control functions that are used. In particular,
pauccontrol(compute = FALSE, focus = c("FPF", "TPF"), value = 1)
can be used to indicate whether the pAUC should be computed (by default
it is not computed) and, in case it is computed (i.e., `compute = TRUE`),
whether the `focus` should be placed on restricted FPFs (pAUC; see
(4)) or on restricted TPFs. The bound that delimits the restricted range (an
upper bound for the FPF if `focus = "FPF"`, a lower bound for the TPF if
`focus = "TPF"`) is specified through `value`. In addition to the pooled ROC curve,
AUC, and pAUC (if required), the function `pooledROC.dpm` also allows
computing the probability density function (PDF) of the test outcomes in
both the diseased and nondiseased groups. In order to do so, we use
densitycontrol(compute = FALSE, grid.h = NA, grid.d = NA)
By default, PDFs are not returned by the function `pooledROC.dpm`, but
this can be changed by setting `compute = TRUE`, and through `grid.h`
and `grid.d`, the user can specify a grid of test results where the PDFs
are to be evaluated in, respectively, the nondiseased and diseased
groups. Value `NA` signals automatic initialization of the grid. The
hyperparameters of the DPM model in each group are set through
priorcontrol.dpm(m0 = NA, S0 = NA, a = 2, b = NA, alpha = 1, L = 10)
A detailed description of these hyperparameters is found in Section 3. Finally, to set the various parameters controlling the MCMC procedure (which in our case is simply a Gibbs sampler), we use
mcmccontrol(nsave = 8000, nburn = 2000, nskip = 1)
Here, `nsave` is an integer value with the total number of scans to be
saved, `nburn` is the number of burn-in scans, and `nskip` is the
thinning interval. Unless memory usage is a concern, we recommend not
thinning and instead monitoring the effective sample size of the MCMC
chain.
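To make the recommendation about monitoring the effective sample size concrete, the sketch below computes a simple autocorrelation-based estimate in base R for two simulated chains. It is an illustrative approximation (autocorrelations are summed up to the first negative lag), not the estimator used by any particular package, and the function `ess` and the simulated chains are hypothetical.

```r
# Crude effective sample size: n / (1 + 2 * sum of positive-lag autocorrelations),
# truncating the sum at the first negative sample autocorrelation
ess <- function(x, max_lag = 200) {
  n   <- length(x)
  rho <- acf(x, lag.max = min(max_lag, n - 1), plot = FALSE)$acf[-1]  # drop lag 0
  cut <- which(rho < 0)[1]
  if (!is.na(cut)) rho <- rho[seq_len(cut - 1)]
  n / (1 + 2 * sum(rho))
}

set.seed(99)
chain_iid <- rnorm(5000)                                            # well-mixing chain
chain_ar  <- as.numeric(arima.sim(model = list(ar = 0.9), n = 5000)) # strongly autocorrelated chain

ess_iid <- ess(chain_iid)
ess_ar  <- ess(chain_ar)
```

For the autocorrelated chain, the effective sample size is a small fraction of the 5000 saved iterations, which is the kind of diagnostic that indicates whether more iterations (rather than thinning) are needed.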
Coming back to the `pooledROC.dpm` function, through `marker`, the user
specifies the name of the variable containing the test results. In our
case, these are the values of the BMI. The name of the variable that
distinguishes diseased (two or more CVD risk factors) from nondiseased
individuals is given in `group`, and the value codifying nondiseased
individuals in `group` is specified by `tag.h`. The `data` argument is a
data frame containing the data and all needed variables. Setting
`standardise = TRUE` (the default) will standardize (i.e., subtract the
mean and divide by the standard deviation) the test outcomes. The set of
FPFs at which to estimate the pooled ROC curve is specified in the
argument `p`, and argument `ci.level` allows specifying the level for
the credible intervals (0.95 in our call). The LPML, WAIC, and DIC are
computed by setting `compute.lpml`, `compute.WAIC`, and `compute.DIC` to
`TRUE`. Argument `pauc` is an
(optional) list of values to replace the default values returned by the
function `pauccontrol`. Here, we ask for the pAUC to be computed, with
the focus on restricted FPFs and upper bound 0.1. Argument `density`
is an (optional) list of values to replace the
default values returned by the function `densitycontrol`, and the same
holds for argument `mcmc` with respect to the function `mcmccontrol`.
Through the `prior.h` and `prior.d` arguments, we specify
the hyperparameters in the nondiseased and diseased groups,
respectively. Again, both arguments are (optional) lists of values to
replace the default values returned by the function `priorcontrol.dpm`.
Recall that different default values for the hyperparameters are set
depending on whether test outcomes are standardized or not. Finally,
arguments `parallel`, `ncpus`, and `cl` allow performing parallel
computations (based on the R package parallel). In particular, through
`parallel`, the user specifies the type of parallel operation: either
`"no"` (default), `"multicore"` (not available on Microsoft Windows
operating systems), or `"snow"`. Argument `ncpus` is used to indicate
the number of processes to be used in a parallel operation (when
`parallel = "multicore"` or `parallel = "snow"`), and `cl` is an
optional parallel or snow cluster to be used when `parallel = "snow"`.
If `cl` is not supplied (as in our example), a cluster on the local
machine is created for the duration of the call.
A numerical summary of the fitted model can be obtained by calling the
function `summary`, which provides, among other information, the
estimated AUC (posterior mean) and its credible interval (here at the
95% level, since `ci.level = 0.95`) and, if required,
the LPML, WAIC, and DIC, separately, in the nondiseased (denoted here as
`Group H`) and diseased (`Group D`) groups.
R> summary(pROC_dpm)
Call:
pooledROC.dpm(marker = "bmi", group = "cvd_idf", tag.h = 0, data = endosyn,
standardise = TRUE, p = seq(0, 1, l = 101), ci.level = 0.95,
compute.lpml = TRUE, compute.WAIC = TRUE, compute.DIC = TRUE,
pauc = pauccontrol(compute = TRUE, focus = "FPF", value = 0.1),
density = densitycontrol(compute = TRUE), prior.h = priorcontrol.dpm(L = 10),
prior.d = priorcontrol.dpm(L = 10), mcmc = mcmccontrol(nsave = 8000,
nburn = 2000, nskip = 1), parallel = "snow", ncpus = 2, cl = NULL)
Approach: Pooled ROC curve - Bayesian DPM
----------------------------------------------
Area under the pooled ROC curve: 0.759 (0.74, 0.777)*
Partial area under the pooled ROC curve (FPF = 0.1): 0.168 (0.139, 0.199)*
* Credible level: 0.95
Model selection criteria:
Group H Group D
WAIC 12490.485 4017.063
WAIC (Penalty) 8.431 5.468
LPML -6245.247 -2008.541
DIC 12490.276 4016.920
DIC (Penalty) 8.326 5.396
Sample sizes:
Group H Group D
Number of observations 2149 691
Number of missing data 0 0
To complement these numerical results, the ROCnReg package also
provides graphical results that can be used to further explore the
fitted model. Specifically, the function `plot` depicts the estimated
pooled ROC curve and AUC (posterior means), jointly with the pointwise
credible band and the credible interval, respectively, at the level set in `ci.level`.
R> plot(pROC_dpm, cex.main = 1.5, cex.lab = 1.5, cex.axis = 1.5, cex = 1.5)
The result of the above code is shown in Figure 2.
Figure 2: Output of the `plot.pooledROC` function for an object of class `pooledROC.dpm`. Posterior mean and 95%
pointwise credible band for the pooled ROC curve and corresponding
posterior mean and 95% credible interval for the AUC.

By means of `density = densitycontrol(compute = TRUE)` in the call to
the function, the estimates of the PDFs of the BMI in both groups are
returned. This information can be accessed through component `dens`
in the object `pROC_dpm` (i.e., `pROC_dpm$dens`), which is a
list with elements `h` and `d` associated with the nondiseased and
diseased groups, respectively. Each of the two elements is itself
another list of two components: (1) `grid`, a vector that contains the
grid of test results at which the PDFs have been evaluated (estimated);
and (2) `dens`, a matrix with the PDFs at each iteration of the MCMC
procedure. We can use these results to plot, e.g., the posterior mean
(and 95% pointwise credible bands) of the PDF of the BMI in the healthy
and diseased populations (see Figure 3a, obtained using the R package
ggplot2 by Wickham (2016)). As can be observed, the estimated densities obtained
under the DPM method follow very closely the histograms of the data.
Further, the estimated densities available in `dens` can be used, as
advised by Gelman et al. (2013, p. 553), to monitor convergence of the MCMC
chains. The well-known label switching problem often leads to poor
mixing of the chains of the component-specific parameters, but this may
not impact convergence and mixing of the induced density/distribution of
interest. For instance, Figure 4 shows the
trace plots of the MCMC iterations (after burn-in) of the PDFs of the
BMI in the two groups for different (and randomly selected) values of
the BMI, and Figure 5 depicts the
corresponding effective sample sizes and Geweke statistics (obtained
using the R package coda by Plummer et al. (2006)). Note that all plots give evidence of good mixing and do
not suggest a lack of convergence. For conciseness, the R code for
reproducing Figures 3a, 4, and 5 is not
provided here but in the replication code that accompanies the paper.
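Figure 3: Histograms of the BMI and estimated PDFs (posterior mean and 95% pointwise credible bands) obtained with (a) the DPM model (object `pROC_dpm`); and (b) a normal model (object `pROC_normal`). Left: Nondiseased individuals (none or one CVD risk factor). Right: Diseased individuals (two or more CVD risk factors).

Figure 4: Trace plots of the MCMC iterations (after burn-in) of the PDFs of the BMI for object `pROC_dpm`. Results are shown separately for the nondiseased and diseased populations and for different values of the BMI.

Figure 5: Effective sample sizes and Geweke statistics of the PDFs of the BMI for object `pROC_dpm` in the nondiseased and diseased populations. In both cases, results are shown along BMI values.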
It is worth noting that the function `pooledROC.dpm` also allows fitting
a normal distribution in each group. This is just a particular case of
the DPM model, specified by setting `L = 1` through the `prior.d` and
`prior.h` arguments. The code follows.
R> set.seed(123, "L'Ecuyer-CMRG") # for reproducibility
R> pROC_normal <- pooledROC.dpm(marker = "bmi", group = "cvd_idf", tag.h = 0,
+ data = endosyn, standardise = TRUE, p = seq(0, 1, l = 101), ci.level = 0.95,
+ compute.lpml = TRUE, compute.WAIC = TRUE, compute.DIC = TRUE,
+ pauc = pauccontrol(compute = TRUE, focus = "FPF", value = 0.1),
+ density = densitycontrol(compute = TRUE),
+ prior.h = priorcontrol.dpm(L = 1), prior.d = priorcontrol.dpm(L = 1),
+ mcmc = mcmccontrol(nsave = 8000, nburn = 2000, nskip = 1),
+ parallel = "snow", ncpus = 2)
For the sake of space, we omit the call to the function from the summary.
R> summary(pROC_normal)
Call: [...]
Approach: Pooled ROC curve - Bayesian DPM
----------------------------------------------
Area under the pooled ROC curve: 0.748 (0.728, 0.768)*
Partial area under the pooled ROC curve (FPF = 0.1): 0.224 (0.194, 0.253)*
* Credible level: 0.95
Model selection criteria:
Group H Group D
WAIC 12639.952 4049.004
WAIC (Penalty) 2.431 2.267
LPML -6319.976 -2024.502
DIC 12639.505 4048.714
DIC (Penalty) 1.986 1.987
Sample sizes:
Group H Group D
Number of observations 2149 691
Number of missing data 0 0
The fit of the DPM and normal models in each group can be compared on the basis of the WAIC, DIC, and/or the LPML. Remember that for the LPML, the higher its value, the better the model fit, while for the WAIC and DIC, it is the other way around. By comparing these values, provided in the summary of each fitted model, we can conclude that the three criteria favor, in both the diseased and (especially in the) nondiseased groups, the more general DPM model. This is also corroborated by the plot of the fitted densities in each group shown in Figure 3b.
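To illustrate how criteria of this kind are obtained from MCMC output, the sketch below computes the WAIC (with its penalty term) and the LPML from a matrix of pointwise log-likelihood values, using their standard definitions. The setting is a deliberately simple assumption (a normal model with known variance and a flat prior on the mean, so that posterior draws can be simulated directly), and all object names are hypothetical; it is not the code used inside ROCnReg.

```r
set.seed(10)
n <- 100
y <- rnorm(n)                                   # simulated data, sd assumed known (= 1)
S <- 2000
mu_draws <- rnorm(S, mean(y), 1 / sqrt(n))      # posterior draws of the mean (flat prior)

# S x n matrix: log-likelihood of observation i (column) under posterior draw s (row)
loglik <- sapply(y, function(yi) dnorm(yi, mean = mu_draws, sd = 1, log = TRUE))

# WAIC = -2 * (lppd - penalty), with the penalty given by the sum of the
# posterior variances of the pointwise log-likelihoods
lppd   <- sum(log(colMeans(exp(loglik))))
p_waic <- sum(apply(loglik, 2, var))
waic   <- -2 * (lppd - p_waic)

# LPML = sum of log conditional predictive ordinates (CPO),
# where CPO_i is the harmonic mean of the likelihood of observation i
cpo  <- 1 / colMeans(exp(-loglik))
lpml <- sum(log(cpo))
```

In this example, `waic` and `-2 * lpml` are nearly identical, mirroring the close agreement between the WAIC and LPML rows of the summaries shown above.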
We now estimate the pooled ROC curve using the Bayesian bootstrap
estimator (function `pooledROC.BB`), and comparisons with the results
obtained using the DPM approach are provided.
R> set.seed(123, "L'Ecuyer-CMRG") # for reproducibility
R> pROC_BB <- pooledROC.BB(marker = "bmi", group = "cvd_idf", tag.h = 0, data = endosyn,
+ p = seq(0, 1, l = 101), pauc = pauccontrol(compute = TRUE, focus = "FPF", value = 0.1),
+ B = 5000, ci.level = 0.95, parallel = "snow", ncpus = 2)
R> summary(pROC_BB)
Call: [...]
Approach: Pooled ROC curve - Bayesian bootstrap
----------------------------------------------
Area under the pooled ROC curve: 0.76 (0.74, 0.779)*
Partial area under the pooled ROC curve (FPF = 0.1): 0.17 (0.14, 0.201)*
* Credible level: 0.95
Sample sizes:
Group H Group D
Number of observations 2149 691
Number of missing data 0 0
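For readers who want to see the mechanics behind a Bayesian bootstrap estimate of the AUC, the base R sketch below follows the general two-step idea of placement values with Dirichlet weights. It is a simplified illustration on simulated data, not the package's implementation, and the helper names (`rdirichlet1`, `bb_auc`) are hypothetical.

```r
set.seed(1)
y_h <- rnorm(200, mean = 0,   sd = 1)   # nondiseased outcomes (simulated)
y_d <- rnorm(150, mean = 1.5, sd = 1)   # diseased outcomes (simulated)

# One draw from a Dirichlet(1, ..., 1) via normalized standard exponentials
rdirichlet1 <- function(n) {
  g <- rexp(n)
  g / sum(g)
}

bb_auc <- function(y_h, y_d, B = 500) {
  replicate(B, {
    # Step 1: placement values, i.e. the weighted survival function of the
    # nondiseased group evaluated at each diseased outcome
    q <- rdirichlet1(length(y_h))
    u <- vapply(y_d, function(y) sum(q[y_h >= y]), numeric(1))
    # Step 2: the ROC curve is the weighted CDF of the placement values,
    # so its area is 1 minus their weighted mean
    r <- rdirichlet1(length(y_d))
    1 - sum(r * u)
  })
}

auc_draws <- bb_auc(y_h, y_d)
auc_est   <- mean(auc_draws)                       # point estimate
auc_ci    <- quantile(auc_draws, c(0.025, 0.975))  # 95% interval from the draws
```

Each replicate yields one realization of the AUC; averaging over replicates gives the point estimate, and the quantiles of the replicates give the interval.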
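Note that the posterior means for the AUC and pAUC obtained using the
DPM method (0.759 and 0.168, respectively) and the Bayesian bootstrap
estimator (0.76 and 0.17, respectively) are very similar, as are the
corresponding 95% credible intervals.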
We finish this section by showing how to use ROCnReg to obtain an
(optimal) threshold value which could be further used to ‘diagnose’ an
individual as diseased (two or more CVD risk factors) or
healthy/nondiseased (none or only one CVD risk factor). To that aim, and
for `pooledROC` objects (i.e., those obtained using functions
`pooledROC.dpm`, `pooledROC.BB`, `pooledROC.emp`, and
`pooledROC.kernel`), we use the function `compute.threshold.pooledROC`,
which allows obtaining (optimal) threshold values using two criteria:
the YI and the one that sets a target value for the FPF. For
illustration, we show here the results using the YI criterion.
R> th_pROC_dmp <- compute.threshold.pooledROC(pROC_dpm, criterion = "YI",
+ ci.level = 0.95, parallel = "snow", ncpus = 2)
R> th_pROC_dmp
$call
compute.threshold.pooledROC(object = pROC_dpm, criterion = "YI",
ci.level = 0.95, parallel = "snow", ncpus = 2)
$thresholds
est ql qh
26.46877 26.07129 26.85029
$YI
est ql qh
0.4045776 0.3721684 0.4366298
$FPF
est ql qh
0.3808336 0.3469580 0.4159478
$TPF
est ql qh
0.7854112 0.7528575 0.8161865
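The function returns the posterior mean (`est`) and the credible
interval at the level set in `ci.level` (here 95%, since
`ci.level = 0.95`; lower bound: `ql`, upper bound: `qh`) for the YI and
associated threshold value, as well as for the FPF and TPF associated
with this cutoff value. For our example, the (posterior mean of the) YI
is 0.40 (95% credible interval: (0.37, 0.44)), and it is attained at a
BMI threshold of 26.47 (26.07, 26.85).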
We now turn our attention to the inclusion of covariates in ROC
analysis. As shown in Table 1, with ROCnReg, the user
can estimate the covariate-specific ROC curve by means of three
approaches. As for the functions in ROCnReg for estimating the pooled
ROC curve, the input arguments are method-specific, and we refer the
reader to the manual for details. For all methods, numerical and
graphical summaries are obtained using functions `print.cROC`,
`summary.cROC`, and `plot.cROC`. Here, we describe how to use the
function `cROC.bnp`, which implements the Bayesian nonparametric approach
for estimating the covariate-specific ROC curve detailed in Section
3. Also, for objects of this class, ROCnReg provides
the function `predictive.checks`, which implements tools for assessing
model fit via posterior predictive checks.
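To make the notion of a covariate-specific ROC curve concrete before turning to the package, the following base R sketch uses normal linear models in each group and the resulting closed-form (binormal) expressions for the conditional ROC curve and AUC. It is a frequentist illustration on simulated data under the stated normality assumption, not the Bayesian approach implemented in `cROC.bnp`; the data and the function `croc_normal` are hypothetical.

```r
set.seed(42)
n   <- 1000
sim <- data.frame(age = runif(n, 20, 80), d = rbinom(n, 1, 0.3))
# The separation between groups grows with age (assumed for illustration)
sim$y <- 22 + 0.05 * sim$age + sim$d * (1 + 0.08 * sim$age) + rnorm(n, sd = 3)

# Separate normal linear models in the nondiseased (h) and diseased (d) groups
fit_h <- lm(y ~ age, data = subset(sim, d == 0))
fit_d <- lm(y ~ age, data = subset(sim, d == 1))

# ROC(p | x) = 1 - pnorm(a(x) + b * qnorm(1 - p)), with
# a(x) = (mu_h(x) - mu_d(x)) / sigma_d and b = sigma_h / sigma_d
croc_normal <- function(p, newdata) {
  mu_h <- predict(fit_h, newdata)
  mu_d <- predict(fit_d, newdata)
  s_h  <- summary(fit_h)$sigma
  s_d  <- summary(fit_d)$sigma
  a <- unname((mu_h - mu_d) / s_d)
  b <- s_h / s_d
  roc <- sapply(a, function(ai) 1 - pnorm(ai + b * qnorm(1 - p)))  # one column per covariate value
  list(ROC = roc, AUC = 1 - pnorm(a / sqrt(1 + b^2)))
}

p   <- seq(0, 1, length.out = 101)
res <- croc_normal(p, data.frame(age = c(30, 70)))
```

Here `res$AUC` holds one AUC per covariate value, so comparing its entries shows directly how the discriminatory capacity changes with age in the simulated data.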
Recall that, when including covariate information in ROC analysis,
interest resides in evaluating if and how the discriminatory capacity of
the test varies with such covariates. In particular, in our endocrine
study, we aim at evaluating the possible effect of both `age` and
`gender` on the discriminatory capacity of the BMI. In what follows,
with this aim in mind, two different models are fitted using the
function `cROC.bnp`: a first one that considers a normal distribution in each
group and incorporates the `age` effect in a linear way, and a
second one that caps the maximum number of mixture components in each
group at 10 (i.e., `L = 10`) and models the `age`
effect using cubic B-splines (and thus allows for a nonlinear effect of
age). Following Rodríguez-Álvarez et al. (2011b,a), both models consider the interaction
between `age` and `gender`. For clarity, we first focus on the code that
models the `age` effect in a linear way and use it to describe in detail
the different arguments of the `cROC.bnp` function.
R> # Dataframe for predictions
R> agep <- seq(22, 80, l = 30)
R> endopred <- data.frame(age = rep(agep,2), gender = factor(rep(c("Women", "Men"),
+ each = length(agep))))
R> set.seed(123, "L'Ecuyer-CMRG") # for reproducibility
R> cROC_bp <- cROC.bnp(formula.h = bmi ~ gender*age, formula.d = bmi ~ gender*age,
+ group = "cvd_idf", tag.h = 0, data = endosyn, newdata = endopred,
+ standardise = TRUE, p = seq(0, 1, l = 101), ci.level = 0.95, compute.lpml = TRUE,
+ compute.WAIC = TRUE, compute.DIC = TRUE, pauc = pauccontrol(compute = FALSE),
+ prior.h = priorcontrol.bnp(L = 1), prior.d = priorcontrol.bnp(L = 1),
+ density = densitycontrol(compute = TRUE),
+ mcmc = mcmccontrol(nsave = 8000, nburn = 2000, nskip = 1),
+ parallel = "snow", ncpus = 2)
As can be seen, many arguments coincide with those of the function
pooledROC.dpm (described in the previous section). We thus focus here
on those that are specific to cROC.bnp. The arguments formula.h and
formula.d are formula objects specifying the model for the
regression function (see Equation (28)) in, respectively,
the nondiseased and diseased groups. They are similar to the formula
used with the glm function, except that nonlinear functions (modeled
by means of cubic B-splines) can be added using function f (an example
will follow later in this section). Note that in both cases, the
left-hand side of the formulas should include the name of the
test/marker (in our case bmi). In our application, and for both
groups, the model for the components' means includes, in addition to the
linear effects of age and gender, the (linear) interaction between
these two covariates (i.e., gender*age is equivalent to
gender + age + gender:age). Through the newdata argument, the user
can specify a new data frame containing the values of the covariates at
which the covariate-specific ROC curve and AUC (and also pAUC and PDFs,
if required) are to be computed. Finally, prior.h (the same holds for
prior.d) is an (optional) list of values to replace the defaults
returned by priorcontrol.bnp, which allows setting the hyperparameters
for the single-weights dependent Dirichlet process mixture of normals
model (see Section 3 and the manual accompanying the
package for more details):
priorcontrol.bnp(m0 = NA, S0 = NA, nu = NA, Psi = NA, a = 2, b = NA,
alpha = 1, L = 10)
In our example, we only modified the upper bound for the number of
components in the mixture model, which by default is 10, setting it to 1
(L = 1). In this case, the summary of the fitted model provides the
following information.
R> summary(cROC_bp)
Call: [...]
Approach: Conditional ROC curve - Bayesian nonparametric
----------------------------------------------------------
Parametric coefficients
Group H:
Post. mean Post. quantile 2.5% Post. quantile 97.5%
(Intercept) 26.1459 25.8765 26.4096
genderWomen -0.9160 -1.2726 -0.5680
age 1.1949 0.9180 1.4690
genderWomen:age 1.1948 0.8455 1.5394
Group D:
Post. mean Post. quantile 2.5% Post. quantile 97.5%
(Intercept) 29.1865 28.7625 29.6115
genderWomen 2.0826 1.3705 2.7665
age 0.6578 0.2162 1.0904
genderWomen:age -0.7711 -1.4655 -0.0956
ROC curve:
Post. mean Post. quantile 2.5% Post. quantile 97.5%
(Intercept) -0.6959 -0.8177 -0.5776
genderWomen -0.6863 -0.8695 -0.5046
age 0.1229 0.0045 0.2415
genderWomen:age 0.4499 0.2745 0.6245
b 0.9391 0.8824 0.9975
Model selection criteria:
Group H Group D
WAIC 12174.986 4007.980
WAIC (Penalty) 6.283 5.646
LPML -6087.492 -2003.990
DIC 12173.664 4007.329
DIC (Penalty) 4.994 5.053
Sample sizes:
Group H Group D
Number of observations 2149 691
Number of missing data 0 0
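The coefficient labels in the summary above (genderWomen, age, genderWomen:age) follow from the way R expands the formula bmi ~ gender*age. As a minimal sketch using base R only, one can check the expansion and the resulting design-matrix columns; the toy data frame below is made up for illustration and is not the endosyn data.

```r
# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  bmi    = c(24.1, 27.3, 30.2, 22.8, 26.5, 29.9),
  age    = c(25, 40, 63, 31, 52, 70),
  gender = factor(c("Men", "Women", "Men", "Women", "Men", "Women"))
)

# gender*age is shorthand for gender + age + gender:age
term_labels <- attr(terms(bmi ~ gender * age), "term.labels")
print(term_labels)

# Columns of the design matrix; "Men" is the reference level because
# factor levels are sorted alphabetically by default
design_cols <- colnames(model.matrix(bmi ~ gender * age, data = toy))
print(design_cols)
```

With "Men" as the reference level, the genderWomen and genderWomen:age terms measure how the intercept and the age slope for women differ from those for men.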
The first aspect to note is that, in this case, the summary function
does not provide the estimated AUC as there is one (possibly different)
AUC for each combination of covariate values. Also, given that (1) only
one component has been considered for modeling the CDFs of test results
in the diseased and nondiseased groups, and (2) covariate effects have
been modeled in a linear way, the summary function provides the
posterior mean (and quantiles) of the (parametric) coefficients
associated with the regression functions (Equation (30))
and with the covariate-specific ROC curve (Equation (32)).
We note that since in the call to the function we have specified
standardise = TRUE (and consequently both the test outcomes and
covariates are standardized), the regression coefficients are on the
scale of the standardized covariates. If we focus on the coefficients
for the covariate-specific ROC curve, it seems that the discriminatory
capacity of the BMI decreases with age, with the decrease being more
pronounced in women (note that the expression of the covariate-specific
ROC curve in Equation (31) implies that positive
coefficients correspond to a decrease in discriminatory capacity). These
results are possibly better judged by plotting the estimated
covariate-specific ROC curves and associated AUCs. This can be done
using the plot function. For the covariate-specific ROC curve, the
depicted graphics will depend on the number and nature of the covariates
included in the analyses. In particular, for our application, we obtain,
separately for men and women, the covariate-specific ROC curves (and
AUCs) along age. These are shown in Figure 7, obtained
using the code
R> op <- par(mfrow = c(2,2))
R> plot(cROC_bp, ask = FALSE)
R> par(op)
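To see how the signs of the ROC-curve coefficients translate into discriminatory capacity, the following sketch evaluates an induced AUC from the posterior means reported in the summary. It assumes the usual binormal form ROC(p | x) = 1 - Phi{a(x) + b * Phi^{-1}(1 - p)}, with AUC(x) = 1 - Phi{a(x) / sqrt(1 + b^2)}. This form is an assumption made here for illustration (it is consistent with the sign interpretation given in the text). Plugging in posterior means only approximates the posterior mean of the AUC, gender is treated as a 0/1 indicator for women, age is on the standardized scale, and the helper names are hypothetical.

```r
# Posterior means of the ROC-curve coefficients (copied from the summary)
coefs <- c(intercept = -0.6959, women = -0.6863, age = 0.1229,
           women_age = 0.4499)
b <- 0.9391

# Linear predictor a(x); 'age_std' is standardized age, 'women' is 0/1
a_fun <- function(age_std, women) {
  unname(coefs["intercept"] + coefs["women"] * women +
    coefs["age"] * age_std + coefs["women_age"] * women * age_std)
}

# Assumed binormal expressions (see the lead-in paragraph)
roc_fun <- function(p, age_std, women) {
  1 - pnorm(a_fun(age_std, women) + b * qnorm(1 - p))
}
auc_fun <- function(age_std, women) {
  1 - pnorm(a_fun(age_std, women) / sqrt(1 + b^2))
}

age_grid <- c(-1, 0, 1)  # one SD below the mean, the mean, one SD above
auc_men   <- auc_fun(age_grid, women = 0)
auc_women <- auc_fun(age_grid, women = 1)
print(round(rbind(men = auc_men, women = auc_women), 3))
```

Under these assumptions, the AUC falls as standardized age increases in both groups, and the drop is larger for women, in line with the positive age and interaction coefficients.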
Figure 7: Output of the plot.cROC function for an object of class
cROC.bnp. Results for the model that includes the linear interaction
between age and gender and one mixture component. Top row: Posterior
mean of the covariate-specific ROC curve along age, separately for men
and women. Bottom row: Posterior mean and 95% pointwise credible band
for the covariate-specific AUC along age, separately for men and women.
Although in this example we have modeled the age
effect linearly and
only one mixture component was considered, ROCnReg also allows for
modeling the effect of continuous covariates in a nonlinear way, either
using cubic B-spline basis expansions (through the function cROC.bnp)
or kernel-based smoothers (via the function cROC.kernel, which is
described in the Supplementary Material). Also, as noted before, using
only one mixture component for the single-weights dependent Dirichlet
process mixture of normals model (function cROC.bnp) is equivalent to
considering a (Bayesian) normal model, which might be too restrictive
for most data applications. In what follows, we provide more flexibility
to the model for the covariate-specific ROC curve by means of (1)
increasing the number of mixture components and (2) modeling the age
effect in a nonlinear way (recall our considerations in Section
3.2 about the lack of flexibility of the single-weights
dependent Dirichlet process mixture of normals model when covariate
effects on the components' means are modeled linearly). The former is
done by modifying the value of L in the arguments prior.h and
prior.d, and the latter by using function f when specifying the
components' mean functions through formula.h and formula.d. In
particular, in our application we are interested in modeling the
factor-by-curve interaction between age and gender (i.e., we model
the age effect 'separately' for men and women). This is done using,
e.g., bmi ~ gender + f(age, by = gender, K = c(3,5)). Through argument
K, we indicate the number of internal knots used for constructing the
cubic B-spline basis used to approximate the nonlinear effect of age
(with the quantiles of age used to anchor the knots). Note that we can
specify a different number of internal knots for men and women
(K = c(3,5)), where the order of vector K should match the ordering
of the levels of the factor gender. We also note that to assist in the
selection of the number of interior knots (in ROCnReg, the location is
always based on the quantiles of the corresponding covariates), the user
can make use of the WAIC, DIC, and/or LPML. For instance, for this
application, we fitted different models with different numbers of
internal knots, and we chose the model that provided the lowest
WAIC (this was done in both the nondiseased and diseased populations,
and we remark that the number of knots does not need to be the same in
the two populations). The final model is shown below.
R> # Levels of gender, and its ordering.
R> # Needed if we want to specify different
R> # number of knots for men and women
R> levels(endosyn$gender)
[1] "Men" "Women"
R> set.seed(123, "L'Ecuyer-CMRG") # for reproducibility
R> cROC_bnp <- cROC.bnp(
+ formula.h = bmi ~ gender + f(age, by = gender, K = c(0,0)),
+ formula.d = bmi ~ gender + f(age, by = gender, K = c(4,4)),
+ group = "cvd_idf", tag.h = 0, data = endosyn, newdata = endopred,
+ standardise = TRUE, p = seq(0, 1, l = 101), ci.level = 0.95, compute.lpml = TRUE,
+ compute.WAIC = TRUE, compute.DIC = TRUE, pauc = pauccontrol(compute = FALSE),
+ prior.h = priorcontrol.bnp(L = 10), prior.d = priorcontrol.bnp(L = 10),
+ density = densitycontrol(compute = TRUE),
+ mcmc = mcmccontrol(nsave = 8000, nburn = 2000, nskip = 1),
+ parallel = "snow", ncpus = 2)
R> summary(cROC_bnp)
Call: [...]
Approach: Conditional ROC curve - Bayesian nonparametric
----------------------------------------------------------
Model selection criteria:
Group H Group D
WAIC 11833.000 3909.828
WAIC (Penalty) 31.236 38.583
LPML -5916.766 -1955.449
DIC 11829.750 3904.532
DIC (Penalty) 29.611 35.934
Sample sizes:
Group H Group D
Number of observations 2149 691
Number of missing data 0 0
R> op <- par(mfrow = c(2,2))
R> plot(cROC_bnp, ask = FALSE)
R> par(op)
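The role of the argument K can be illustrated outside ROCnReg with the splines package that ships with R. The sketch below builds a cubic B-spline basis for a simulated age variable with K interior knots placed at quantiles. It is only meant to convey the idea and does not claim to reproduce the basis constructed internally by function f; the helper name make_basis and the simulated ages are assumptions.

```r
library(splines)

set.seed(1)
age <- runif(200, min = 22, max = 80)  # simulated ages, illustration only

# Cubic B-spline basis with K interior knots at equally spaced quantiles
make_basis <- function(x, K) {
  knots <- if (K > 0) {
    quantile(x, probs = seq_len(K) / (K + 1), names = FALSE)
  } else {
    NULL
  }
  bs(x, knots = knots, degree = 3, intercept = FALSE)
}

B0 <- make_basis(age, K = 0)  # no interior knots: a cubic polynomial
B4 <- make_basis(age, K = 4)  # four interior knots: a more flexible curve

# Without an intercept column, the basis has K + 3 columns
print(c(K0 = ncol(B0), K4 = ncol(B4)))
```

Each additional interior knot adds one basis function, which is why criteria such as the WAIC are useful for balancing flexibility against complexity.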
Figure 8: Output of the plot.cROC function for an object of class
cROC.bnp. Results for the model that includes the factor-by-curve
interaction between age and gender and 10 mixture components. Top row:
Posterior mean of the covariate-specific ROC curve along age, separately
for men and women. Bottom row: Posterior mean and 95% pointwise credible
band for the covariate-specific AUC along age, separately for men and
women.
The graphical results are shown in Figure 8. Note
that, especially for women, age displays a marked nonlinear effect.
Recall that for objects of class cROC.bnp, and if required in the call
to the function, the summary function provides, separately for the
diseased and nondiseased/healthy groups, the WAIC, LPML, and DIC. Note
that, in both groups, the three criteria support the use of the more
flexible model that uses cubic B-splines and 10 mixture components for
modeling the distribution of the BMI (model cROC_bnp) over the
more restrictive Bayesian normal linear model (model cROC_bp).
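The comparison just described can be tabulated directly from the two summaries. The sketch below simply copies the reported values into a data frame (the object and column names are hypothetical) and checks that lower WAIC and DIC and higher LPML all favor the flexible model.

```r
# Values copied from the two summary() outputs shown in the text
crit <- data.frame(
  criterion = rep(c("WAIC", "LPML", "DIC"), times = 2),
  group     = rep(c("H", "D"), each = 3),
  cROC_bp   = c(12174.986, -6087.492, 12173.664,
                 4007.980, -2003.990,  4007.329),
  cROC_bnp  = c(11833.000, -5916.766, 11829.750,
                 3909.828, -1955.449,  3904.532)
)

# Smaller is better for WAIC and DIC, larger is better for LPML
crit$difference <- crit$cROC_bnp - crit$cROC_bp
crit$favors_bnp <- ifelse(crit$criterion == "LPML",
                          crit$difference > 0, crit$difference < 0)
print(crit)
```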
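Because the WAIC, LPML, and DIC are relative criteria (they rank
competing models but do not indicate whether any of them fits the data
adequately), posterior predictive checks are also available in ROCnReg
through the function predictive.checks. Specifically, the function
generates replicated datasets from the posterior predictive distribution
in the two groups and compares them with the observed test outcomes
through the requested statistics (here, kurtosis and skewness).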
R> op <- par(mfrow = c(2,3))
R> pc_cROC_bp <- predictive.checks(cROC_bp,
+ statistics = c("kurtosis", "skewness"), devnew = FALSE)
R> par(op)
R> op <- par(mfrow = c(2,3))
R> pc_cROC_bnp <- predictive.checks(cROC_bnp,
+ statistics = c("kurtosis", "skewness"), devnew = FALSE)
R> par(op)
Results are shown in Figure 9. As can be seen,
the model that includes the factor-by-curve interaction between age
and gender and 10 mixture components performs quite well in capturing
both quantities, while the Bayesian normal linear model fails to do so.
Also shown in Figure 9 (and provided by function predictive.checks)
are the kernel density estimates of the observed BMI and of datasets
simulated from the posterior predictive distribution in each group.
Figure 9: Output of the predictive.checks function for an object of
class cROC.bnp. Histograms of the statistics skewness
and kurtosis computed from 8000 draws from the posterior predictive
distribution in the diseased and nondiseased groups. The red line is the
estimated statistic from the observed BMI values. The right-hand side
plots show the kernel density estimate of the observed BMI (solid black
line), jointly with the kernel density estimates for 500 simulated
datasets drawn from the posterior predictive distributions.
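The logic of such a check can be sketched in base R. The example below is a deliberately simplified stand-in for predictive.checks: it fits a single normal model with a noninformative prior to simulated right-skewed data, draws replicated datasets from the posterior predictive distribution, and compares the replicated skewness with the observed one. The data, the prior, and the helper names are assumptions for illustration only.

```r
set.seed(42)
y <- rgamma(300, shape = 2, rate = 0.5)  # simulated right-skewed "marker"
n <- length(y)

# Moment-based sample skewness
skewness <- function(x) {
  z <- x - mean(x)
  mean(z^3) / mean(z^2)^1.5
}

# Posterior draws for a normal model under the prior
# p(mu, sigma^2) proportional to 1 / sigma^2
n_draws <- 2000
sigma2 <- (n - 1) * var(y) / rchisq(n_draws, df = n - 1)
mu <- rnorm(n_draws, mean = mean(y), sd = sqrt(sigma2 / n))

# One replicated dataset per posterior draw, summarised by its skewness
skew_rep <- vapply(seq_len(n_draws), function(i) {
  skewness(rnorm(n, mean = mu[i], sd = sqrt(sigma2[i])))
}, numeric(1))

skew_obs <- skewness(y)
# Posterior predictive p-value: share of replicates at least as skewed
ppp <- mean(skew_rep >= skew_obs)
print(c(observed = skew_obs, ppp = ppp))
```

A posterior predictive p-value close to 0 or 1 signals that the model cannot reproduce the chosen feature of the data, which is the situation described above for the Bayesian normal linear model.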
As for the pooled ROC curve, ROCnReg also provides a function that
allows obtaining (optimal) threshold values for the covariate-specific
ROC curve. For illustration, instead of the threshold values based on
the Youden index, we now use the criterion that sets a target value for
the FPF. The code for model cROC_bnp, when setting the FPF at 0.3,
follows.
R> th_fpf_cROC_bnp <- compute.threshold.cROC(cROC_bnp, criterion = "FPF", FPF = 0.3,
+ newdata = endopred, ci.level = 0.95, parallel = "snow", ncpus = 2)
R> names(th_fpf_cROC_bnp)
[1] "newdata" "thresholds" "TPF" "FPF" "call"
In addition to the data frame newdata containing the covariate values
at which the thresholds are computed, the function
compute.threshold.cROC also returns the covariate-specific
thresholds corresponding to the specified FPF as well as the
covariate-specific TPF attached to these thresholds. In both cases,
the function returns the posterior mean and the ci.level × 100%
pointwise credible interval. From the output of
compute.threshold.cROC, graphical results can be easily obtained. For
simplicity, we only show here the code for the covariate-specific
threshold values (thresholds), but similar code can be used to plot
the covariate-specific TPFs. Both plots are shown in Figure
10. As can be observed, for an FPF of
R> library(ggplot2) # needed for the plotting code below
R> df <- data.frame(age = th_fpf_cROC_bnp$newdata$age,
+ gender = th_fpf_cROC_bnp$newdata$gender, y = th_fpf_cROC_bnp$thresholds[[1]][,"est"],
+ ql = th_fpf_cROC_bnp$thresholds[[1]][,"ql"],
+ qh = th_fpf_cROC_bnp$thresholds[[1]][,"qh"])
R> g0 <- ggplot(df, aes(x = age, y = y, ymin = ql, ymax = qh)) + geom_line() +
+ geom_ribbon(alpha = 0.2) +
+ labs(title = "Covariate-specific thresholds for an FPF = 0.3",
+ x = "Age (years)", y = "BMI") +
+ theme(strip.text.x = element_text(size = 20),
+ plot.title = element_text(hjust = 0.5, size = 20),
+ axis.text = element_text(size = 20),
+ axis.title = element_text(size = 20)) + facet_wrap(~gender)
R> print(g0)
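The FPF-based criterion has a simple interpretation: for a given covariate value, the threshold is the (1 - FPF) quantile of the test outcome in the nondiseased group, and the TPF is the probability that a diseased subject exceeds that threshold. A minimal sketch under a normal linear model in each group follows. The coefficients and standard deviations are made-up values, not estimates from the endocrine data, and compute.threshold.cROC itself works with the fitted model rather than with these closed-form expressions.

```r
# Hypothetical conditional means (linear in age) and standard deviations
mean_h <- function(age) 24 + 0.05 * age  # nondiseased group
mean_d <- function(age) 28 + 0.03 * age  # diseased group
sd_h <- 4
sd_d <- 4.5

# Threshold giving the target FPF at each age, and the TPF attached to it
threshold_fpf <- function(age, fpf) {
  thr <- mean_h(age) + sd_h * qnorm(1 - fpf)
  tpf <- 1 - pnorm(thr, mean = mean_d(age), sd = sd_d)
  data.frame(age = age, threshold = thr, FPF = fpf, TPF = tpf)
}

res <- threshold_fpf(age = c(30, 50, 70), fpf = 0.3)
print(res)

# Sanity check: the FPF implied by each threshold equals the target
fpf_check <- 1 - pnorm(res$threshold, mean = mean_h(res$age), sd = sd_h)
```

Because the nondiseased mean changes with age in this toy setting, holding the FPF fixed requires a different threshold at each age, which is what the covariate-specific thresholds capture.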
For conciseness, we have not shown here how to perform convergence
diagnostics of the MCMC chains for models fitted using the function
cROC.bnp. In very much the same way as shown in the previous section
for the object pROC_dpm, using the information contained in
component dens in the list of returned values (if required), one can
produce trace plots of the conditional densities at some sampled values,
as well as obtain the corresponding effective sample sizes and Geweke
statistics. Some results are provided in the Supplementary Material, and
the associated code can be found in the replication code that
accompanies this paper.
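For readers unfamiliar with these diagnostics, the sketch below shows the idea behind the effective sample size on simulated chains rather than on the output of cROC.bnp. It uses a simple autocorrelation-based approximation written in base R (dedicated packages such as coda offer more refined estimators); the AR(1) chain and the helper ess_simple are assumptions for illustration.

```r
set.seed(7)
n_iter <- 5000

# A strongly autocorrelated chain (AR(1)) and an independent one
chain_ar  <- as.numeric(arima.sim(list(ar = 0.9), n = n_iter))
chain_iid <- rnorm(n_iter)

# ESS approximated as n / (1 + 2 * sum of leading positive autocorrelations)
ess_simple <- function(x, max_lag = 200) {
  rho <- acf(x, lag.max = max_lag, plot = FALSE)$acf[-1]
  first_nonpos <- which(rho <= 0)[1]
  if (!is.na(first_nonpos)) rho <- rho[seq_len(first_nonpos - 1)]
  length(x) / (1 + 2 * sum(rho))
}

ess <- c(ar1 = ess_simple(chain_ar), iid = ess_simple(chain_iid))
print(round(ess))
```

The autocorrelated chain carries far fewer effectively independent draws than its nominal length, which is the kind of information one would inspect for the sampled conditional densities.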
In this section, we illustrate how to conduct inference about the
covariate-adjusted ROC (AROC) curve using ROCnReg. As for the
covariate-specific ROC curve, three approaches are available for
estimating the AROC curve. The function AROC.bnp is illustrated below,
while AROC.sp and AROC.kernel are exemplified in the Supplementary
Material.
Recall that the AROC curve is a global summary measure of diagnostic
accuracy that takes covariate information into account. In the context
of our endocrine application, we seek to study the overall
discriminatory capacity of the BMI for detecting the presence of CVD
risk factors when adjusting for age and gender. Here, we focus on how to
estimate the AROC curve using the AROC.bnp function. The function
syntax is analogous to that of cROC.bnp, with the only
difference being that we only need to specify the arguments related to
the nondiseased population. The code and respective summary follow.
R> set.seed(123, "L'Ecuyer-CMRG") # for reproducibility
R> AROC_bnp <- AROC.bnp(
+ formula.h = bmi ~ gender + f(age, by = gender, K = c(0,0)),
+ group = "cvd_idf", tag.h = 0, data = endosyn, standardise = TRUE,
+ p = seq(0, 1, l = 101), ci.level = 0.95, compute.lpml = TRUE, compute.WAIC = TRUE,
+ compute.DIC = TRUE, pauc = pauccontrol(compute = FALSE),
+ prior.h = priorcontrol.bnp(L = 10), density = densitycontrol(compute = TRUE),
+ mcmc = mcmccontrol(nsave = 8000, nburn = 2000, nskip = 1),
+ parallel = "snow", ncpus = 2)
R> summary(AROC_bnp)
Call: [...]
Approach: AROC Bayesian nonparametric
----------------------------------------------
Area under the covariate-adjusted ROC curve: 0.656 (0.629, 0.684)*
* Credible level: 0.95
Model selection criteria:
Group H
WAIC 11833.000
WAIC (Penalty) 31.236
LPML -5916.766
DIC 11829.750
DIC (Penalty) 29.611
Sample sizes:
Group H Group D
Number of observations 2149 691
Number of missing data 0 0
The area under the AROC curve is 0.656 (95% credible interval: 0.629,
0.684). A plot function is also available (result in Figure
11a).
R> plot(AROC_bnp, cex.main = 1.5, cex.lab = 1.5, cex.axis = 1.5, cex = 1.3)
Finally, we compare the AROC curve with the pooled ROC curve that was obtained earlier by using a DPM model with 10 components in each group. In Figure 11b, we show the plots of the two curves, and, as can be noticed, the pooled ROC curve lies well above the AROC curve, thus evidencing the need for incorporating covariate information into the analysis.
R> plot(AROC_bnp$p, AROC_bnp$ROC[,1], type = "l", xlim = c(0,1), ylim = c(0,1),
+ xlab = "FPF", ylab = "TPF", main = "Pooled ROC curve vs AROC curve", cex.main = 1.5,
+ cex.lab = 1.5, cex.axis = 1.5, cex = 1.5)
R> lines(AROC_bnp$p, AROC_bnp$ROC[,2], col = 1, lty = 2)
R> lines(AROC_bnp$p, AROC_bnp$ROC[,3], col = 1, lty = 2)
R> lines(pROC_dpm$p, pROC_dpm$ROC[,1], col = 2)
R> lines(pROC_dpm$p, pROC_dpm$ROC[,2], col = 2, lty = 2)
R> lines(pROC_dpm$p, pROC_dpm$ROC[,3], col = 2, lty = 2)
R> abline(0, 1, col = "grey", lty = 2)
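Why a pooled ROC curve can lie above the AROC curve is easy to reproduce by simulation. In the sketch below, a covariate raises the test outcome in the same way in both groups and also takes larger values among the diseased, so part of the pooled separation is due to the covariate alone. The AROC curve is computed through placement values, that is, AROC(p) is the probability that one minus the nondiseased conditional CDF, evaluated at a diseased subject's outcome and covariate, is at most p. A normal linear model is used for the nondiseased group; this is a simple semiparametric stand-in, not the Bayesian nonparametric estimator of AROC.bnp, and all data and names are simulated assumptions.

```r
set.seed(2021)
n_h <- 1000
n_d <- 500

# Covariate x is higher in the diseased group and raises y in both groups
x_h <- rnorm(n_h, mean = 0)
x_d <- rnorm(n_d, mean = 1)
y_h <- 1.0 * x_h + rnorm(n_h)
y_d <- 0.5 + 1.0 * x_d + rnorm(n_d)

# Pooled (unadjusted) empirical AUC: Pr(Y_D > Y_H)
auc_pooled <- mean(outer(y_d, y_h, ">"))

# Normal linear model for the nondiseased group
fit_h <- lm(y ~ x, data = data.frame(y = y_h, x = x_h))
sigma_h <- summary(fit_h)$sigma

# Placement value of each diseased subject: 1 - F_H(y_d | x_d)
mu_h_at_xd <- predict(fit_h, newdata = data.frame(x = x_d))
pv <- 1 - pnorm((y_d - mu_h_at_xd) / sigma_h)

# AROC(p) = Pr(PV <= p); its area equals 1 - mean(PV)
aroc <- function(p) vapply(p, function(u) mean(pv <= u), numeric(1))
auc_adjusted <- 1 - mean(pv)

print(round(c(pooled = auc_pooled, adjusted = auc_adjusted), 3))
```

In this simulated setting, the pooled AUC exceeds the covariate-adjusted one because the covariate itself separates the two groups, which is the pattern the comparison in Figure 11b is meant to expose.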
We finish this section with some comments on computational aspects. In
our experience, the methods with the largest computing times are those
implemented in cROC.bnp when [...] and in cROC.kernel
when confidence bands are to be constructed. In the first case, the main
reason behind the computational burden is the need to invert [...].
For the cROC.kernel function, the
computing time is mainly driven by the number of bootstrap samples used
for constructing the confidence bands. In
Table 2, we show the time, in seconds, needed for
fitting the pooled, the covariate-specific, and the covariate-adjusted
ROC curve using the Bayesian nonparametric and the kernel approaches for
the synthetic endocrine data and when both parallel (with 2 and 4
processes) and no parallel options are used. We note that for the
Bayesian approaches, we also computed the densities/conditional
densities, as well as the WAIC, LPML, and DIC, which further increase
the computing time (in the case of the AROC curve, these were only
computed in the nondiseased population). With respect to the
kernel-based approach (in this case, the fit is done separately for men
and women and the corresponding results are presented in the
Supplementary Material), we have used
Table 2: Computing times (in seconds).
Function | No parallel | Snow (2 cores) | Snow (4 cores)
pooledROC.dpm | 138 | 118 | 111
pooledROC.kernel | 376 | 196 | 105
cROC.bnp | 2052 | 1117 | 680
cROC.kernel (men) | 1159 | 528 | 279
cROC.kernel (women) | 1885 | 916 | 466
AROC.bnp | 126 | 115 | 112
AROC.kernel (men) | 847 | 404 | 214
AROC.kernel (women) | 1707 | 833 | 438
In this paper, we have introduced the capabilities of the R package ROCnReg for conducting inference about the pooled ROC curve, the covariate-specific ROC curve, and the covariate-adjusted ROC curve and their associated summary indices. As we have illustrated, the current version of the package provides several options for estimating ROC curves, both under frequentist and Bayesian paradigms, either parametrically, semiparametrically, or nonparametrically. To the best of our knowledge, this is the first software package implementing Bayesian inference for ROC curves. Several additions/extensions are planned in the future, and these, among others, include:
The results in this paper were obtained using R 4.0.3 with the ROCnReg 1.0-5 package. The ROCnReg package has several dependencies: graphics, grDevices, parallel, splines, stats, moments (Komsta and Novomestky 2015), nor1mix (Maechler 2019), Matrix (Bates and Maechler 2019), spatstat (Baddeley and Turner 2005), np (Hayfield and Racine 2008), lattice (Sarkar 2008), MASS (Venables and Ripley 2002), and pbivnorm (Genz and Kenkel 2015). R itself and all packages used are available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/.
We acknowledge the reviewer for their constructive comments that led to an improved version of the article. MX Rodríguez-Álvarez was funded by project MTM2017-82379-R (AEI/FEDER, UE), by the Basque Government through the BERC 2018-2021 program and Elkartek project 3KIA (KK-2020/00049) and by the Spanish Ministry of Science, Innovation, and Universities (BCAM Severo Ochoa accreditation SEV-2017-0718).
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Rodríguez-Álvarez & Inácio, "ROCnReg: An R Package for Receiver Operating Characteristic Curve Inference With and Without Covariates", The R Journal, 2021
BibTeX citation
@article{RJ-2021-066, author = {Rodríguez-Álvarez, María Xosé and Inácio, Vanda}, title = {ROCnReg: An R Package for Receiver Operating Characteristic Curve Inference With and Without Covariates}, journal = {The R Journal}, year = {2021}, note = {https://rjournal.github.io/}, volume = {13}, issue = {1}, issn = {2073-4859}, pages = {525-555} }