One major goal in clinical applications of time-to-event data is the estimation of survival with censored data. The usual nonparametric estimator of the survival function is the time-honored Kaplan-Meier product-limit estimator. Though this estimator has been implemented in several R packages, the development of the condSURV R package has been motivated by recent contributions that allow the estimation of the survival function for ordered multivariate failure time data. The condSURV package provides three different approaches all based on the Kaplan-Meier estimator. In one of these approaches these quantities are estimated conditionally on current or past covariate measures. Illustration of the software usage is included using real data.
One major goal in survival studies is the estimation of the survival function. The most popular method for estimating this function is the well-known product-limit estimator also known as Kaplan-Meier estimator (Kaplan and Meier 1958). The popularity of the product-limit estimator is explained by its simplicity and intuitive appeal while requiring very week assumptions. It simply takes into account the empirical probability of surviving over a certain time. The method does not take into account of covariates, so it is mainly descriptive. Discrete covariates can be included by splitting the sample for each level of the covariate and applying the product-limit estimator for each subsample. This approach is not recommended for continuous covariates. To account for this extra difficulty several generalizations to the Kaplan-Meier estimator have been proposed throughout the last decades. Beran (1981) was the first one who proposed an estimator of the conditional distribution (survival) function with censored data in a fully nonparametric way. His estimator was further studied among others by Dabrowska (1987), Akritas (1994), Gonzalez-Manteiga and Cadarso-Suárez (1994) and Van Keilegom et al. (2001). All these estimators can be used to estimate the distribution (or survival) function conditional on a continuous covariable in a regression model, when data are subject to censoring. However, none of the above methods can be used to estimate the conditional survival when the covariate is censored.
Several software packages in the form of R packages have been developed to estimate the survival function. Though this function can be estimated parametrically or using nonparametric maximum likelihood estimation, the product limit Kaplan-Meier estimator is still one of the best options for estimating the survival function. Several R packages have been developed to implement the product-limit Kaplan-Meier estimator. For instance, the survival package (Therneau 2015) and the prodlim package (Gerds 2015) can be used to obtain Kaplan-Meier estimates. A comprehensive list of the available packages which can be used to estimate the survival function can be seen in the CRAN Task View “Survival Analysis” (Allignol and Latouche 2016) of the Comprehensive R Archive Network (CRAN).
In many longitudinal medical studies, patients may experience several
events through a follow-up period. In these studies, the analysis of
sequentially ordered events are often of interest. The events of concern
can be of the same nature (e.g., recurrent disease episodes in cancer
studies) or represent different states in the disease process (e.g.,
“alive and disease-free”, “alive with recurrence” and “dead”). If the
events are of the same nature, this is usually referred as recurrent
events (Cook and Lawless 2007), whereas if they are based on different disease
categories they are usually modeled through their intensity functions
(Meira-Machado et al. 2009). Again, several R packages have been developed to
deal with problems that arise in these processes (see for example,
Allignol and Latouche (2016)). Some of these packages can be used to estimate occupation
probabilities, transition probabilities and the cumulative incidence
functions. However, none can be used to estimate conditional survival
probabilities such as:
This paper describes the R package
condSURV (available
from the Comprehensive R Archive Network at
https://CRAN.R-project.org/package=condSURV/) and its capabilities for
implementing nonparametric and semiparametric estimators for conditional
survival probabilities for two multivariate ordered times. The package
can also be used to estimate more general functions involving more than
two successive event times. The estimation of these quantities is
essential for long-term survival prognosis which arises in many medical
contexts such as cancer studies, asthma, HIV/AIDS, heart disease,
dementia and Alzheimer’s disease, etc. The methods implemented in the
package also deal with the possible effect of covariates on the
conditional survival probabilities (e.g.,
The remainder of this paper is organized as follows. The following section provides a brief introduction to the methodological background. All estimators for the conditional survival function are presented. Then, a detailed description of the package is presented, and its usage is illustrated through the analysis of a real data set; and finally, the last section contains the main conclusions of this work. The use of the package to more than two consecutive events is illustrated in the Appendix.
Suppose that an individual may experience
Let
Let
The Kaplan-Meier estimator, also known as the product-limit estimator,
is the most frequently used method to estimate survival for censored
data. The most used representation of the Kaplan-Meier estimator of the
total time is through a product of the following form
In this work we are interested in the estimation of the conditional
survival function,
The conditional survival function
The standard error of the nonparametric landmark estimator (LDM
) may
be large when the censoring is heavy, particularly with a small sample
size. Interestingly, the variance of this estimator may be reduced by
presmoothing (Dikta 1998). Here, the idea of presmoothing involves
replacing the censoring indicators (in the expression of the
Kaplan-Meier weights) by some smooth fit before the Kaplan-Meier formula
is applied. This preliminary smoothing may be based on a certain
parametric family such as the logistic (thus leading to a semiparametric
estimator), or on a nonparametric estimator of the binary regression
curve. The corresponding presmoothed landmark estimator is then given by
Here,
Note that
The practical performance of the proposed estimators for the conditional
survival function has been investigated through simulations
(Meira-Machado et al. 2016). It has been demonstrated that all of the studied
estimators perform well, approaching their targets as the sample size
increases. Besides, simulation results reveal that the landmark
estimator (LDM
) performs favorably when compared with the first method
(KMW
). Furthermore, the reported simulation results reveal relative
benefits of presmoothing (PLDM
) in the heavily censored scenarios or
for small sample sizes.
Now we will explain how to introduce covariate information in the
conditional survival function. Discrete covariates can be also included
by splitting the sample for each level of the covariate and repeating
the described procedures for each subsample. This approach is not valid
for continuous covariates. To estimate the survival probabilities
conditionally on continuous covariates we propose to use local smoothing
which is introduced using regression weights. Without loss of generality
the methodology will be explained in the build of the conditional
survival probability
In the absence of censoring, to estimate these quantities, we may use
kernel smoothing techniques by calculating a local average of the
indicator functions. For example,
In our case, however, we allow the data to be right-censored. To handle
right-censoring, inverse probability of censoring weighting (IPCW; see
for example, Satten and Datta (2001)) can be used. In order to introduce our
estimators, note that, assuming that the support of the conditional
distribution of
The estimation of the conditional survival function, given a covariate
under random censoring has been considered in many papers. This topic
was introduced by Beran (1981) and was further studied by several authors
(Dabrowska 1987; Akritas 1994; Gonzalez-Manteiga and Cadarso-Suárez 1994; Van Keilegom et al. 2001). Their
proposals can also be used to estimate the conditional survival function
of
In order to introduce our estimators we propose to plug-in Beran’s
estimator
Similarly, we propose to plug-in Beran’s estimator
Then, we may introduce the IPCW
estimator as follows:
This section introduces an overview of how the package is structured.
condSURV is a shortcut for “conditional survival” and this is its
major functionality: to provide estimates of the survival function
conditional to previous (possibly censored) events. This software
enables both numerical and graphical outputs to be displayed for all
methods (KMW
, LDM
, PLDM
and IPCW
) described in the previous
section. This software is intended to be used with the R environment for
statistical computing and graphics. Our package is composed of 12
functions that allow users to obtain estimates for all proposed methods.
Details on the usage of the functions (described in Table
1) can be obtained with the corresponding help pages.
Function | Description |
---|---|
survCOND |
Conditional survival probabilities based on Kaplan-Meier weights and the Landmark approaches. This function also implements estimation methods for these quantities conditionally on current or past covariate measures. |
survCS |
Create a "survCS" object, usually used as a response variable in a model formula. |
plot.survCS |
Plot for an object of class "survCS". |
summary.survCS |
Summary for an object of class "survCS". |
print.survCS |
Print for an object of class "survCS". |
KM |
Computes the Kaplan-Meier product-limit of survival. |
PKM |
Computes the presmoothed Kaplan-Meier product-limit of survival. |
Beran |
Computes the conditional survival probability |
KMW |
Returns a vector with the Kaplan-Meier weights. |
PKMW |
Returns a vector with the presmoothed Kaplan-Meier weights. |
LLW |
Returns a vector with the local linear weights. |
NWW |
Returns a vector with the Nadaraya-Watson weights. |
It should be noted that to implement the methods described in
Section 2 one needs the following variables of data
in a specific order (as shown): time1
, event1
, Stime
and event
.
The variable time1
represents the observed time to the first event of
interest, and event1
the corresponding status/censoring indicator (if
the survival time is a censored observation, the value is 0 and
otherwise the value is 1). The variable Stime
represents the total
survival time. If event1
= 0, then the total survival time is equal to
the observed time to the first event. The variable event
is the final
status of the individual (takes the value 1 if the final event of
interest is observed and 0 otherwise). The illustration of the
condSURV package for more than two event times is discussed in
the Appendix.
For illustration, we apply the proposed methods to data from a large clinical trial on Duke’s stage III patients, affected by colon cancer, that underwent a curative surgery for colorectal cancer (Moertel et al. 1990). This data set is freely available as part of the R survival package. The data is also available as part of the R package condSURV. From the total of 929 patients, 468 developed a recurrence and among these 414 died. For each individual, an indicator of his/her final vital status (censored or not), the survival times (time to recurrence, time to death) from the entry of the patient in the study (in days), and a vector of covariates including age (in years) and recurrence (coded as 1 = yes; 0 = no) were recorded. The covariate recurrence is a time-dependent covariate which can be expressed as an intermediate event. We will use the methods described in Section 2 to study survival as well as the effect of recurrence on the final outcome (death).
In the following, we will demonstrate the package capabilities using this data. Below is an excerpt of the data.frame with one row per individual.
> library("condSURV")
> data(colonCS)
> head(colonCS[, 1:7])
time1 event1 Stime event rx sex age
1 968 1 1521 1 Lev+5FU 1 43
2 3087 0 3087 0 Lev+5FU 1 63
3 542 1 963 1 Obs 0 71
4 245 1 293 1 Lev+5FU 0 66
5 523 1 659 1 Obs 1 69
6 904 1 1767 1 Lev+5FU 0 57
Individuals represented in lines 1, 3, 4, 5 and 6 experienced a
recurrence of the tumor and have died; the individual represented in
line 2 is alive and without recurrence at the end of follow-up. We note
that event1 = 1
and event = 0
corresponds to individuals with an
observed recurrence that remain alive at the end of the follow-up.
The development of the condSURV R package has been motivated by recent
contributions that allow the estimation of the (conditional) survival
function for ordered multivariate failure time data. This package
contains the function survCS
which takes the input data as an R
formula and creates a survival object among the chosen variables for
analysis. This function will verify if the data has been introduced
correctly and will create a "survCS" object. Arguments in this
function must be introduced in the following order time1
, event1
,
time2
, event2
, …, Stime
and event
, where time1
, time2
,
…, Stime
are ordered event times and event1
, event2
, …,
event
their corresponding indicator statuses. This function plays a
similar role as the Surv
function in the survival R package.
The effect of “recurrence” is important on the patient outcome and can
be studied through the ordered multivariate event time data of
time-to-event from enrolment, to recurrence and to death. Results
obtained from the estimation of the conditional survival probabilities,
Estimates for the conditional survival probabilities are obtained using
function survCOND
. The first argument of this function is a formula
object with the response on the left of a ~
operator. The response
must be a "survCS" object which is obtained using the survCS
function. A single covariate (qualitative or quantitative) can be
included in the right hand side of the formula allowing the estimation
of survival probabilities conditionally on current or past covariate
measures. The use of the main function survCOND
is explained below.
In the absence of covariates, two methods can be used to estimate the
conditional survival probabilities: the method based on the use of
Kaplan-Meier weights (KMW
) and the method based on the landmark
approach (KMW
). A smoothed version of the landmark approach is also
implemented. Given survCOND
with the method based on the use of Kaplan-Meier weights
(method = "KMW"
):
> set.seed(123)
> colon.kmw.1 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 365, y = 1825,
+ data = colonCS, method = "KMW")
> summary(colon.kmw.1)
P(T>y|T1>365)
y estimate lower 95% CI upper 95% CI
1825 0.7303216 0.697005 0.7562444
As can be seen, the survCOND
function provides, by default, 95%
pointwise confidence intervals (conf = TRUE
) using 200 bootstrap
replicates (n.boot = 200
). The construction of the pointwise
confidence intervals is obtained by means of the bootstrap percentile
method by randomly sampling the conf.level
to the desired level.
Given a fixed value of
> colon.kmw.2 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 365,
+ y = 365 * 1:7, data = colonCS, method = "KMW")
> summary(colon.kmw.2)
P(T>y|T1>365)
y estimate lower 95% CI upper 95% CI
365 1.0000000 1.0000000 1.0000000
730 0.9441430 0.9265015 0.9599035
1095 0.8624983 0.8353103 0.8843765
1460 0.7750519 0.7389898 0.8090082
1825 0.7303216 0.6920535 0.7664671
2190 0.6879923 0.6511133 0.7249555
2555 0.6548414 0.6114144 0.6940938
If argument y
is omitted, then the survCOND
function allows the user
to obtain estimates for all possible y
values. Then, one can use the
summary function to get the estimated values at the desired values
(through argument times
of the summary function). A truncated output
for the following input commands is shown below:
> colon.kmw.3 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 365,
data = colonCS, method = "KMW")
> summary(colon.kmw.3)
P(T>y|T1>365)
y estimate lower 95% CI upper 95% CI
365.0 1.0000000 1.0000000 1.0000000
421.0 0.9985694 0.9956263 1.0000000
430.0 0.9971388 0.9928077 1.0000000
448.0 0.9957082 0.9900836 1.0000000
454.5 0.9942758 0.9871742 0.9985958
465.0 0.9928434 0.9853940 0.9985509
485.0 0.9914111 0.9826719 0.9971681
486.0 0.9899787 0.9808888 0.9958287
499.0 0.9885463 0.9797218 0.9956079
..... ......... ......... .........
Similarly, one can obtain the results for the landmark methods (LDM
and PLDM
) using the same function survCOND
. The unsmoothed landmark
estimator is obtained using argument method = "LDM"
whereas for
obtaining the presmoothed landmark estimator the argument
presmooth = TRUE
is also required.
> colon.ldm.1 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 365,
+ data = colonCS, method = "LDM")
> summary(colon.ldm.1, times = 365 * 1:7)
y estimate lower 95% CI upper 95% CI
365 1.0000000 1.0000000 1.0000000
730 0.9441319 0.9296298 0.9614001
1095 0.8624695 0.8418715 0.8877858
1460 0.7750019 0.7413340 0.8002003
1825 0.7302521 0.6957642 0.7584186
2190 0.6878056 0.6515754 0.7196251
2555 0.6543273 0.6119221 0.6916915
> colon.pldm.1 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 365,
+ data = colonCS, method = "LDM", presmooth = TRUE)
> summary(colon.pldm.1, times = 365 * 1:7)
y estimate lower 95% CI upper 95% CI
365 1.0000000 1.0000000 1.0000000
730 0.9429609 0.9236513 0.9590418
1095 0.8624778 0.8373879 0.8844013
1460 0.7788757 0.7430835 0.8137728
1825 0.7411599 0.7046557 0.7710392
2190 0.6795849 0.6377276 0.7118881
2555 0.6467549 0.6028921 0.6821533
In addition, one may also be interested in calculating the conditional
survival function, survCOND
by considering the
argument lower.tail = TRUE
:
> colon.ldm.2 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 365,
+ data = colonCS, method = "LDM", lower.tail = TRUE)
> summary(colon.ldm.2, times=c(90, 180, 365, 730, 1095, 1460, 1825))
y estimate lower 95% CI upper 95% CI
90 0.96956522 0.94541818 0.99122998
180 0.89565217 0.85836820 0.93278055
365 0.66086957 0.60452616 0.73021864
730 0.25652174 0.20552495 0.30834239
1095 0.10434783 0.07551852 0.14219702
1460 0.06956522 0.04000000 0.10000000
1825 0.06086957 0.03553539 0.09006894
It is worth mentioning that, given x, lower.tail = TRUE
provides the
survival estimates conditional to lower.tail = FALSE
provides the survival estimates conditional to
survCOND
.
The package also provides plots for all methods. The following input
commands (shown below) provide the plots for the conditional survival
function LDM
and
PLDM
) are shown in Figure 1. The plots were obtained for
fixed values
To illustrate the usage of the graphical parameter arguments of function
plot.survCS
, plots shown in the first row in Figure 1 were
obtained using arguments col
, confcol
, xlab
, ylab
and ylim
.
Plots shown on the second row were obtained using the default values.
For more details about the graphical parameter arguments, see the
corresponding help file.
> colon.ldm.1 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 365,
+ data = colonCS, method = "LDM")
> plot(colon.ldm.1, col = 1, confcol = 2, xlab = "Time (days)", ylab = "S(y|365)",
+ ylim = c(0.3, 1))
> colon.pldm.1 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 365,
+ data = colonCS, method = "LDM", presmooth = TRUE)
> plot(colon.ldm.1, col = 1, confcol = 2, xlab = "Time (days)", ylab = "S(y|365)",
+ ylim = c(0.3, 1))
> colon.ldm.2 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 1095,
+ data = colonCS, method = "LDM")
> plot(colon.ldm.1)
> colon.pldm.2 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 1095,
+ data = colonCS, method = "LDM", presmooth = TRUE)
> plot(colon.ldm.1)
When comparing the results obtained through the two methods (LDM
and
PLDM
), it is seen that the semiparametric estimator PLDM
has less
variability with more jump points, specially at the right tail. It can
also be seen that the semiparametric estimator takes higher values at
the right tail.
One important goal is to obtain estimates for the above estimated
quantities (conditional survival probabilities) conditionally on current
or past covariate measures. The current version of the package allows
the inclusion of a single covariate. Below we illustrate its usage using
two qualitative covariates rx (treatment: Obs(ervation), Lev(amisole),
Lev(amisole)+5FU), sex (1 – male) and a continuous covariate age (in
years). The following input commands provide the estimates of the
conditional survival
> colon.rx.ldm <- survCOND(survCS(time1, event1, Stime, event) ~ rx, x = 365,
+ data = colonCS, method = "LDM")
> summary(colon.rx.ldm, times = 365 * 1:6)
rx = Obs
y estimate lower 95% CI upper 95% CI
365 1.0000000 1.0000000 1.0000000
730 0.9469212 0.9095018 0.9780613
1095 0.8672736 0.8247419 0.9131253
1460 0.7655017 0.7187629 0.8095914
1825 0.7123480 0.6608400 0.7638970
2190 0.6562687 0.6038035 0.7170123
rx = Lev
y estimate lower 95% CI upper 95% CI
365 1.0000000 1.0000000 1.0000000
730 0.9411765 0.9070484 0.9695116
1095 0.8280543 0.7757624 0.8773140
1460 0.7375566 0.6787002 0.7929553
1825 0.7102667 0.6541314 0.7709486
2190 0.6704293 0.6101410 0.7309958
rx = Lev+5FU
y estimate lower 95% CI upper 95% CI
365 1.0000000 1.0000000 1.0000000
730 0.9442231 0.9142632 0.9716684
1095 0.8884462 0.8431373 0.9243251
1460 0.8165244 0.7672650 0.8666731
1825 0.7639544 0.7092081 0.8144077
2190 0.7314409 0.6709950 0.7813303
Results obtained for the three treatment groups reveal that the combined treatment of levamisole plus fluorouracil have a benefit on overall survival. This is confirmed by the plot shown in Figure 2 which can be obtained using the following input command:
> plot(colon.rx.ldm, xlab = "Time (days)", ylab = "S(y|365)", conf = FALSE)
Similarly, one can obtain the corresponding survival probabilities
factor
.
> colon.sex.ldm <- survCOND(survCS(time1, event1, Stime, event) ~ factor(sex), x = 365,
+ data = colonCS, method = "LDM")
> summary(colon.sex.ldm, times = 365 * 1:6)
factor(sex) = 0
y estimate lower 95% CI upper 95% CI
365 1.0000000 1.0000000 1.0000000
730 0.9569231 0.9370979 0.9792300
1095 0.8769231 0.8417722 0.9172158
1460 0.7876565 0.7424245 0.8314511
1825 0.7475015 0.7084168 0.7982421
2190 0.6940773 0.6518322 0.7563382
factor(sex) = 1
y estimate lower 95% CI upper 95% CI
365 1.0000000 1.0000000 1.0000000
730 0.9329893 0.9037317 0.9615385
1095 0.8498782 0.8109783 0.8846094
1460 0.7639861 0.7247797 0.8053898
1825 0.7152471 0.6742324 0.7612994
2190 0.6822945 0.6336874 0.7277440
The condSURV package also allows the user to estimate the conditional
survival given a continuous covariate (i.e., objects of class
"integer" or "numeric"). For example, estimates and plots for the
conditional survival for individuals aged 48 years,
> colon.ipcw.age <- survCOND(survCS(time1, event1, Stime, event) ~ age, x = 365,
+ z.value = 48, data = colonCS, lower.tail = FALSE)
> summary(colon.ipcw.age, times = 365 * 1:7)
y estimate lower 95% CI upper 95% CI
365 1.0000000 1.0000000 1.0000000
730 0.9582900 0.8993620 0.9960546
1095 0.8994077 0.8354449 0.9570992
1460 0.8069071 0.7154049 0.8968507
1825 0.7490154 0.6531423 0.8387582
2190 0.7211058 0.6265042 0.8126480
2555 0.6860070 0.5588995 0.8012140
> plot(colon.ipcw.age, col = 1, confcol = 2, xlab = "Time (days)",
+ ylab = "P(T>y|T1>365,age=48)", ylim = c(0.5, 1))
The plot shown in Figure 3 depicts the conditional survival
estimates taking into account the influence of the covariate age
together with the 95% pointwise confidence bands based on the percentile
bootstrap which resamples each datum with probability cluster = TRUE
). This allows to run those repeated operations (for
example, the estimation of the conditional probability in each of the
bootstrap replicates) on multiple processors/cores on your computer, or
on multiple nodes of a cluster. Thus, we can reduce the execution time
in the construction of the bootstrap-based confidence interval.
The use of the condSURV package to more than two consecutive events is illustrated in the Appendix.
This paper discusses the implementation in R of some newly developed methods for the estimation of the conditional survival function. The condSURV package implements nonparametric and semiparametric estimators for these quantities. The package also introduces and implements feasible estimation methods for these quantities conditionally on current or past covariate measures. Other related estimators are also implemented in the package. One of these estimators is the Kaplan-Meier estimator typically assumed to estimate the survival function. A modification of the Kaplan-Meier estimator based on a preliminary estimation (presmoothing) of the censoring probability for the survival time, given the available information is also implemented.
Software for multi-state survival analysis has been developed recently.
These models deal with problems that are similar to those implemented in
package condSURV. Among other quantities these packages deal with the
estimation of the transition probabilities. It can be shown that in the
progressive model with three states the conditional survival function
We mention some important topics that we shall consider in future versions of the package. One important issue is about the extension of the proposed methods for interval censoring. Another topic of much practical interest is to establish a more formal relation between our software and the survival package.
The results in this paper were obtained using R 3.2.5. The condSURV package is available from the Comprehensive R Archive Network at https://CRAN.R-project.org/package=condSURV/.
This research was financed by Portuguese Funds through FCT - “Fundação para a Ciência e a Tecnologia”, within Project UID/MAT/00013/2013 and by research grant SFRH/BPD/93928/2013. We thank the reviewers for their constructive comments.
To illustrate the use of the condSURV package to more than two event times we use data from a bladder cancer study (Byar 1980) conducted by the Veterans Administration Cooperative Urological Research Group. In this study, patients had superficial bladder tumors that were removed by transurethral resection. Many patients had multiple recurrences of tumors during the study, and new tumors were removed at each visit. For illustration purposes we use data from 85 individuals in the placebo and thiotepa treatment groups.
Here, only the first three recurrence times (in months) and the
corresponding event times,
Below we illustrate how to obtain estimates for the conditional survival
function survCS
function as the response. The three
event times and their corresponding indicator statuses have to be
specified in this function. Then, the conditional survival function can
be estimated using function survCOND
by considering the argument
lower.tail = c(TRUE, FALSE)
. Below we show the corresponding input
commands for the landmark method (LDM
) and for its presmoothed
version.
> bladder.ldm <- survCOND(survCS(t1, e1, t2, e2, t3, e3) ~ 1, x = c(8, 12),
+ lower.tail = c(TRUE, FALSE), data = bladderCS, method = "LDM")
> summary(bladder.ldm)
P(T>y|T1<=8,T2>12)
y estimate lower 95% CI upper 95% CI
12 1.0000000 1.0000000 1.0000000
19 0.9444444 0.8174641 1.0000000
22 0.8854167 0.7141053 1.0000000
23 0.7083333 0.4991667 0.9230769
24 0.6493056 0.4117563 0.8461538
25 0.5902778 0.3656623 0.8000893
46 0.3935185 0.0000000 0.7692308
47 0.0000000 0.0000000 0.6675000
> bladder.pldm <- survCOND(survCS(t1, e1, t2, e2, t3, e3) ~ 1, x = c(8, 12),
+ lower.tail = c(TRUE, FALSE), data = bladderCS, method = "LDM", presmooth = TRUE)
> summary(bladder.pldm)
P(T>y|T1<=8,T2>12)
y estimate lower 95% CI upper 95% CI
12 1.0000000 1.00000000 1.0000000
14 0.9734802 0.92306857 1.0000000
18 0.9473931 0.89833894 1.0000000
19 0.8935818 0.80656969 0.9932548
22 0.8663338 0.74760747 0.9691555
23 0.7795092 0.57123322 0.9414341
24 0.7485065 0.49188561 0.9316808
25 0.7163915 0.43525931 0.9189147
26 0.6475479 0.32652371 0.8674366
27 0.6102029 0.32640074 0.8358788
29 0.5709044 0.32624995 0.8159263
43 0.4475656 0.23840137 0.6916276
46 0.3217597 0.12021360 0.6010603
47 0.2062917 0.03004109 0.6010603
When comparing the unsmoothed estimator with the semiparametric presmoothed estimator it can be seen that the later has less variability with more jump points, specially at the right tail. Differences obtained for the estimates using the two methods are explained by the small sample size and the high censoring percentage. For such cases we recommend the use of the presmoothed estimator.
It is worth mentioning that the condSURV package can be used to
estimate other quantities involving all possible combinations in the
argument lower.tail
. For example:
ClinicalTrials, Econometrics, Survival
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Meira-Machado & Sestelo, "condSURV: An R Package for the Estimation of the Conditional Survival Function for Ordered Multivariate Failure Time Data", The R Journal, 2017
BibTeX citation
@article{RJ-2016-059, author = {Meira-Machado, Luis and Sestelo, Marta}, title = {condSURV: An R Package for the Estimation of the Conditional Survival Function for Ordered Multivariate Failure Time Data}, journal = {The R Journal}, year = {2017}, note = {https://rjournal.github.io/}, volume = {8}, issue = {2}, issn = {2073-4859}, pages = {460-473} }