It has been well documented that ignoring measurement error may severely bias inference results. In recent years, there has been limited but increasing research on causal inference with measurement error. In the presence of a misclassified binary outcome variable, Shu and Yi (2017) considered inverse probability weighted estimation of the average treatment effect and proposed valid estimation methods to correct for misclassification effects in various settings. To expedite the application of those methods in situations where misclassification in the binary outcome variable is a real concern, we implement the correction methods proposed by Shu and Yi (2017) in an R package, ipwErrorY, for general users. Simulated datasets are used to illustrate the use of the developed package.
Causal inference methods have been widely used in empirical research (e.g., Rothman et al. 2008; Imbens and Rubin 2015; Hernán and Robins 2019). The propensity score, defined as the probability that an individual receives the treatment, plays an important role in conducting causal inference (Rosenbaum and Rubin 1983). Many causal inference methods have been developed based on the propensity score (e.g., Rosenbaum 1987, 1998; Robins et al. 2000; Lunceford and Davidian 2004). These methods commonly require modeling the treatment assignment, which can be difficult in some applications. To protect against misspecification of the treatment model, various methods have been proposed (e.g., Robins et al. 1994; Scharfstein et al. 1999; Bang and Robins 2005). Among them, doubly robust methods are often advocated since the resulting estimators remain consistent when either the treatment model or the outcome model (but not both) is misspecified; this attractive property is referred to as double robustness.
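To fix ideas, the standard IPW estimator of the ATE with a logistic treatment model can be sketched in a few lines of R. This is a generic illustration on simulated data, not code from the package discussed later:

```r
# Illustrative sketch: inverse probability weighting (IPW) for the ATE.
set.seed(1)
n  <- 5000
X1 <- rnorm(n)
A  <- rbinom(n, 1, plogis(0.2 + X1))      # treatment assignment depends on X1
Y  <- rbinom(n, 1, plogis(0.2 + A + X1))  # binary outcome depends on A and X1

# Step 1: estimate the propensity score e(X) = P(A = 1 | X) by logistic regression
ps <- fitted(glm(A ~ X1, family = binomial))

# Step 2: weighted means of the outcome under treatment and under control
tau_ipw <- mean(A * Y / ps) - mean((1 - A) * Y / (1 - ps))
tau_ipw
```

The weighting in step 2 balances the covariate distribution between the treated and untreated groups, which is why a correctly specified treatment model yields consistent estimation of the ATE.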
Although many methods are available for causal inference such as for the estimation of average treatment effects (ATE), those methods are vulnerable to poor quality data. Typically when data are error-contaminated, most existing methods would be inapplicable. It has been well documented that measurement error in variables can often lead to seriously biased inference results in many settings (e.g., Fuller 1987; Gustafson 2003; Carroll et al. 2006; Buonaccorsi 2010; Yi 2017).
In the context of causal inference with error-prone data, there has been limited but increasing research on the impact of measurement error on causal inference and on the development of methods to correct for it. For instance, McCaffrey et al. (2013) proposed a correction estimation method for settings where baseline covariates are error-prone. Babanezhad et al. (2010) examined the bias arising from ignoring misclassification in the treatment variable. Braun et al. (2016) developed a method to correct for treatment misclassification using validation data.
In settings with misclassification in the binary outcome variable, Shu and Yi (2017) explored the estimation of the ATE using the inverse probability weighted (IPW) method. They derived the asymptotic bias caused by misclassification and developed consistent estimation methods to eliminate the misclassification effects. Their development covers practical scenarios where (1) the misclassification probabilities are known, or (2) the misclassification probabilities are unknown but validation data or replicates of outcome measurements are available for their estimation. They further proposed a doubly robust estimator to provide protection against possible misspecification of the treatment model.
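The bias mechanism, and its closed-form correction when sensitivity and specificity are known, can be illustrated with a simple proportion. This is a hedged sketch of the general idea (in the spirit of the classical Rogan-Gladen correction), not the IPW estimator of Shu and Yi (2017) itself:

```r
# With sensitivity p and specificity q, E(Y*) = p*E(Y) + (1 - q)*(1 - E(Y)),
# so a naive proportion based on the misclassified Y* can be corrected as
# (mean(Y*) + q - 1) / (p + q - 1).
set.seed(1)
n <- 100000
Y <- rbinom(n, 1, 0.4)               # true binary outcome, prevalence 0.4
sens <- 0.95; spec <- 0.85
Ystar <- ifelse(Y == 1, rbinom(n, 1, sens), rbinom(n, 1, 1 - spec))

naive     <- mean(Ystar)                               # biased (about 0.47 here)
corrected <- (mean(Ystar) + spec - 1) / (sens + spec - 1)
c(naive = naive, corrected = corrected)
```

The corrected proportion recovers the true prevalence; the estimators implemented in the package embed this kind of bias correction within the IPW estimation of the ATE.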
The methods developed by Shu and Yi (2017) enjoy wide applications, because misclassified binary outcome data arise commonly in practice. For example, self-reported smoking status, when not confirmed by biochemical tests, is subject to misclassification; results of screening tests are often subject to false positive and/or false negative errors. For datasets with outcome misclassification, ignoring misclassification effects may lead to severely biased results. To expedite the application of the correction methods for general users, we develop an R package, called ipwErrorY (Shu and Yi 2019), to implement the methods of Shu and Yi (2017) for practical settings where the commonly-used logistic regression model is employed for the treatment model and the outcome model. The package focuses on measurement error in the outcome variable.
The remainder of this article is organized as follows. First, we introduce the notation and framework. Next, we describe the methods to be implemented in R. We then present the implementation steps and illustrate the use of the package with examples. Finally, a discussion is given. The developed R package ipwErrorY is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=ipwErrorY.
Let
To be consistent with the usual causal inference framework (e.g., Lunceford and Davidian 2004), we make the following standard assumptions:
Assumption 1 (Consistency):
Assumption 2 (No Unmeasured Confounding):
Assumption 3 (Positivity):
The objective is to estimate the ATE, defined as
In the presence of outcome misclassification, instead of
In this section we present estimation methods for
In this subsection we assume that misclassification probabilities
The estimator
To estimate the propensity score
Let
Let
In this subsection we assume that misclassification probabilities
With the validation data,
To estimate
Although the validation-data-based estimator
Shu and Yi (2017) considered the linear combination of
For any
To obtain
Finally, Shu and Yi (2017) pointed out the two associated conditions:
In this subsection we assume that misclassification probabilities
Let
Finally, we comment that when implementing (9), one
of the following constraints is often used in applications: (C1)
sensitivity equals specificity (i.e.,
Choosing a suitable identifiability constraint is primarily driven by
the nature of data. When the false positive rate
To protect against model misspecification, Shu and Yi (2017) proposed a
doubly robust estimator of
The estimator
Suppose the outcome model is postulated as
By (1) and (13), the observed
likelihood function contributed by individual
Suppose that the outcome model is postulated as
To obtain a consistent estimator
Consistency and asymptotic normality of
Define
We construct the following unbiased estimating function for
By the theory of estimating functions (e.g., Newey and McFadden 1994; Heyde 1997; Yi 2017, Ch. 1), solving
and
Under regularity conditions, we have that
Finally, we comment that Scenario 2 allows for different covariate-outcome associations in the treated and untreated groups and thus provides more flexibility than Scenario 1. However, implementing Scenario 2 involves separately estimating the parameters of the outcome model for the treated and untreated groups. When one group has a small size, estimation results may be unsatisfactory. In this case, imposing common covariate effects for the treated and untreated groups, as in Scenario 1, can help achieve reasonable estimation results.
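In terms of model fitting, the two scenarios correspond to a shared outcome regression versus group-specific regressions. A minimal sketch with glm on simulated data (illustrative only, not the package's internal code) is:

```r
# Illustrative comparison of the two outcome-model scenarios.
set.seed(1)
n <- 500
X <- rnorm(n)
A <- rbinom(n, 1, 0.5)
Y <- rbinom(n, 1, plogis(-0.5 + A + X))
dat <- data.frame(Y = Y, A = A, X = X)

# Scenario 1: common covariate effects for treated and untreated subjects
fit_shared <- glm(Y ~ A + X, family = binomial, data = dat)

# Scenario 2: separate outcome models for the treated and untreated groups
fit_trt <- glm(Y ~ X, family = binomial, data = dat, subset = A == 1)
fit_ctl <- glm(Y ~ X, family = binomial, data = dat, subset = A == 0)

length(coef(fit_shared))                        # 3 parameters, shared slopes
length(coef(fit_trt)) + length(coef(fit_ctl))   # 4 parameters in total
```

Scenario 2 doubles the number of covariate parameters, which is why it can behave poorly when one group is small.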
We develop an R package ipwErrorY, which implements the methods of Shu and Yi (2017) described in the previous section. The developed package imports the R packages stats and nleqslv (Hasselman 2016). To illustrate the use of ipwErrorY, for each method we simulate a dataset and then apply a function to analyze it. To make sure users can reproduce the results, we use the function set.seed to generate data. Moreover, the simulated data give users a clear sense of the required data structure.
The function KnownError produces the ATE estimate obtained with the correction method for known misclassification probabilities, along with the sandwich-variance-based standard error and a confidence interval. KnownError is defined as
KnownError(data, indA, indYerror, indX, sensitivity, specificity, confidence=0.95)
with arguments described in detail in the ipwErrorY documentation. Below is an example to illustrate the use of KnownError.
We first load the package in R:
R> library("ipwErrorY")
Using sensitivity 0.95 and specificity 0.85, we create da, a dataset of size 2000 with “X1”, “A” and “Yast” being the column names for the covariate, treatment and misclassified outcome, respectively:
R> set.seed(100)
R> X1 = rnorm(2000)
R> A = rbinom(2000, 1, 1/(1 + exp(-0.2 - X1)))
R> Y = rbinom(2000, 1, 1/(1 + exp(-0.2 - A - X1)))
R> y1 = which(Y == 1)
R> y0 = which(Y == 0)
R> Yast = Y
R> Yast[y1] = rbinom(length(y1), 1, 0.95)
R> Yast[y0] = rbinom(length(y0), 1, 0.15)
R> da = data.frame(X1 = X1, A = A, Yast = Yast)
By using the function head, we print the first six observations of dataset da so that the data structure is clearly shown as follows:
R> head(da)
X1 A Yast
1 -0.50219235 1 1
2 0.13153117 1 1
3 -0.07891709 1 1
4 0.88678481 0 1
5 0.11697127 1 1
6 0.31863009 1 1
We call the developed function KnownError with sensitivity 0.95 and specificity 0.85, and obtain a list of the estimate, the sandwich-variance-based standard error and a 95% confidence interval:
R> KnownError(data = da, indA = "A", indYerror = "Yast", indX = "X1",
+ sensitivity = 0.95, specificity = 0.85, confidence=0.95)
$Estimate
[1] 0.1702513
$Std.Error
[1] 0.02944824
$`95% Confidence Interval`
[1] 0.1125338 0.2279688
The function EstValidation produces the results for the method with validation data; they include the optimal linear combination estimate, the sandwich-variance-based standard error, the confidence interval, and the estimated sensitivity and specificity. EstValidation is defined as
EstValidation(maindata, validationdata, indA, indYerror, indX, indY, confidence=0.95)
with arguments described in detail in the ipwErrorY documentation. Below is an example to illustrate the use of EstValidation.
Using sensitivity 0.95 and specificity 0.85, we create mainda, the non-validation main data of size 1200, and validationda, the validation data of size 800:
R> set.seed(100)
R> X1 = rnorm(1200)
R> A = rbinom(1200, 1, 1/(1 + exp(-0.2 - X1)))
R> Y = rbinom(1200, 1, 1/(1 + exp(-0.2 - A - X1)))
R> y1 = which(Y == 1)
R> y0 = which(Y == 0)
R> Yast = Y
R> Yast[y1] = rbinom(length(y1), 1, 0.95)
R> Yast[y0] = rbinom(length(y0), 1, 0.15)
R> mainda = data.frame(A = A, X1 = X1, Yast = Yast)
R> X1 = rnorm(800)
R> A = rbinom(800, 1, 1/(1 + exp(-0.2 - X1)))
R> Y = rbinom(800, 1, 1/(1 + exp(-0.2 - A - X1)))
R> y1 = which(Y == 1)
R> y0 = which(Y == 0)
R> Yast = Y
R> Yast[y1] = rbinom(length(y1), 1, 0.95)
R> Yast[y0] = rbinom(length(y0), 1, 0.15)
R> validationda = data.frame(A = A, X1 = X1, Y = Y, Yast = Yast)
We print the first six observations of the non-validation data mainda and the validation data validationda:
R> head(mainda)
A X1 Yast
1 1 -0.50219235 0
2 0 0.13153117 0
3 1 -0.07891709 1
4 1 0.88678481 1
5 0 0.11697127 1
6 1 0.31863009 1
R> head(validationda)
A X1 Y Yast
1 0 -0.0749961081 0 0
2 1 -0.9470827924 1 1
3 1 0.0003758095 1 1
4 0 -1.5249574007 0 0
5 1 0.0983516474 0 0
6 0 -1.5266078213 1 1
The preceding output clearly shows that the non-validation data and the validation data differ in structure. The non-validation data mainda record measurements of the treatment, covariate and misclassified outcome, indicated by the column names “A”, “X1” and “Yast”, respectively. In comparison, the validation data validationda record measurements of the treatment, covariate, misclassified outcome and the true outcome, indicated by the column names “A”, “X1”, “Yast”, and “Y”, respectively.
To apply the optimal linear combination method with validation data, we call the developed function EstValidation and obtain a list of the estimate, the sandwich-variance-based standard error, a 95% confidence interval, and the estimated sensitivity and specificity:
R> EstValidation(maindata = mainda, validationdata = validationda, indA = "A",
+ indYerror = "Yast", indX = "X1", indY = "Y", confidence=0.95)
$Estimate
[1] 0.1714068
$Std.Error
[1] 0.02714957
$`95% Confidence Interval`
[1] 0.1181946 0.2246189
$`estimated sensitivity and estimated specificity`
[1] 0.9482072 0.8557047
The function Est2Replicates produces the results for the method with replicates; they include the estimate, the sandwich-variance-based standard error, the confidence interval, the imposed constraint, and the assumed or estimated sensitivity and specificity. Est2Replicates is defined as
Est2Replicates(data, indA, indYerror, indX, constraint=c("sensitivity equals
specificity", "known sensitivity", "known specificity", "known prevalence"),
sensitivity = NULL, specificity = NULL, prevalence = NULL, confidence=0.95)
with arguments described in detail in the ipwErrorY documentation. Below is an example to illustrate the use of Est2Replicates.
Using sensitivity 0.95 and specificity 0.85, we create da, a dataset of size 2000 with “A”, “X1”, and {“Yast1”, “Yast2”} being the column names for the treatment, covariate, and two replicates of the outcome, respectively:
R> set.seed(100)
R> X1 = rnorm(2000)
R> A = rbinom(2000, 1, 1/(1 + exp(-0.2 - X1)))
R> Y = rbinom(2000, 1, 1/(1 + exp(-0.2 - A - X1)))
R> y1 = which(Y == 1)
R> y0 = which(Y == 0)
R> Yast1 = Y
R> Yast1[y1] = rbinom(length(y1), 1, 0.95)
R> Yast1[y0] = rbinom(length(y0), 1, 0.15)
R> Yast2 = Y
R> Yast2[y1] = rbinom(length(y1), 1, 0.95)
R> Yast2[y0] = rbinom(length(y0), 1, 0.15)
R> da = data.frame(A = A, X1 = X1, Yast1 = Yast1, Yast2 = Yast2)
By using the function head, we print the first six observations of dataset da so that the data structure is clearly shown as follows:
R> head(da)
A X1 Yast1 Yast2
1 1 -0.50219235 1 1
2 1 0.13153117 1 1
3 1 -0.07891709 1 1
4 0 0.88678481 1 0
5 1 0.11697127 1 1
6 1 0.31863009 1 1
To apply the method with replicates, we call the developed function Est2Replicates with the imposed constraint that the specificity equals 0.85. The following list of the estimate, the sandwich-variance-based standard error, a 95% confidence interval, the imposed constraint, and the estimated sensitivity together with the assumed specificity is obtained:
R> Est2Replicates(data = da, indA = "A", indYerror = c("Yast1", "Yast2"),
+ indX = "X1", constraint = "known specificity", sensitivity = NULL,
+ specificity = 0.85, prevalence = NULL, confidence=0.95)
$Estimate
[1] 0.1908935
$Std.Error
[1] 0.02687287
$`95% Confidence Interval`
[1] 0.1382236 0.2435634
$`imposed constraint`
[1] "known specificity"
$`estimated sensitivity and assumed specificity`
[1] 0.95 0.85
The function KnownErrorDR produces the ATE estimate obtained with the doubly robust correction method, along with the sandwich-variance-based standard error and a confidence interval. KnownErrorDR is defined as
KnownErrorDR(data, indA, indYerror, indXtrt, indXout, sensitivity, specificity,
sharePara=FALSE, confidence=0.95)
with arguments described in detail in the ipwErrorY documentation. Below is an example to illustrate the use of KnownErrorDR.
Using sensitivity 0.95 and specificity 0.85, we create da, a dataset of size 2000 with “A”, {“X”, “xx”} and “Yast” being the column names for the treatment, covariates and misclassified outcome, respectively:
R> set.seed(100)
R> X = rnorm(2000)
R> xx = X^2
R> A = rbinom(2000, 1, 1/(1 + exp(-0.1 - X - 0.2*xx)))
R> Y = rbinom(2000, 1, 1/(1 + exp(1 - A - 0.5*X - xx)))
R> y1 = which(Y == 1)
R> y0 = which(Y == 0)
R> Yast = Y
R> Yast[y1] = rbinom(length(y1), 1, 0.95)
R> Yast[y0] = rbinom(length(y0), 1, 0.15)
R> da = data.frame(A = A, X = X, xx = xx, Yast = Yast)
By using the function head, we print the first six observations of dataset da so that the data structure is clearly shown as follows:
R> head(da)
A X xx Yast
1 1 -0.50219235 0.252197157 1
2 1 0.13153117 0.017300447 1
3 1 -0.07891709 0.006227907 1
4 0 0.88678481 0.786387298 0
5 1 0.11697127 0.013682278 1
6 1 0.31863009 0.101525133 0
When applying the doubly robust method with sensitivity 0.95 and specificity 0.85, the covariates indicated by the column names “X” and “xx” are both included in the treatment model and the outcome model. Let the outcome model be fit for the treated and untreated groups separately. We call the developed function KnownErrorDR and obtain a list of the estimate, the sandwich-variance-based standard error, and a 95% confidence interval:
R> KnownErrorDR(data = da, indA = "A", indYerror = "Yast", indXtrt = c("X", "xx"),
+ indXout = c("X", "xx"), sensitivity = 0.95, specificity = 0.85,
+ sharePara = FALSE, confidence=0.95)
$Estimate
[1] 0.2099162
$Std.Error
[1] 0.02811472
$`95% Confidence Interval`
[1] 0.1548124 0.2650201
Misclassified binary outcome data arise frequently in practice and present a challenge for conducting causal inference. Discussion of this issue is rather limited in the literature. Shu and Yi (2017) developed IPW estimation methods for the ATE that account for the effects of a mismeasured outcome. To expedite the application of these correction methods, we develop an R package ipwErrorY. For practical settings where the treatment model and the outcome model are specified as logistic regression models, we implement the correction methods developed by Shu and Yi (2017) for settings with known misclassification probabilities, validation data, or replicates of the outcome data, as well as the doubly robust method with known misclassification probabilities. Our package offers a useful and convenient tool for data analysts to perform valid inference about the ATE when the binary outcome variable is subject to misclassification.
For each function of ipwErrorY, we implement the sandwich variance estimate to construct a normality-based confidence interval. Confidence intervals can also be constructed by bootstrapping (Efron 1982; Efron and Tibshirani 1993), which can be done by leveraging available functions of ipwErrorY. Below we provide example code to produce normality-based and percentile-based bootstrap confidence intervals for a doubly robust estimate with 200 bootstrap replicates.
R> drFUN <- function(dt) {
+ KnownErrorDR(data = dt, indA = "A", indYerror = "Yast", indXtrt = c("X", "xx"),
+ indXout = c("X", "xx"), sensitivity = 0.95, specificity = 0.85,
+ sharePara = FALSE, confidence = 0.95)$`Estimate`
+ }
R> EST = drFUN(dt = da)
R> set.seed(100)
R> resultsBoot = replicate(200, drFUN(dt = da[sample(1:nrow(da), replace = TRUE), ]))
R> STD = sd(resultsBoot)
R> lowN = EST - qnorm(1 - (1 - 0.95)/2) * STD
R> upN = EST + qnorm(1 - (1 - 0.95)/2) * STD
R> CIn = c(lowN, upN)
R> lowP = as.numeric(quantile(resultsBoot, probs = 0.025))
R> upP = as.numeric(quantile(resultsBoot, probs = 0.975))
R> CIp = c(lowP, upP)
We print the resultant bootstrap normality-based and percentile-based confidence intervals, respectively, as follows.
R> CIn
[1] 0.1562355 0.2635969
R> CIp
[1] 0.1610038 0.2655065
To make sure users can reproduce the results, here we call the function set.seed before the bootstrap resampling step. If set.seed is not used, then the variance estimates generated at different times can differ due to the inherent randomness of the bootstrap method.
This example code can be easily modified to produce bootstrap confidence intervals for an estimate obtained from a different method; one needs only to replace KnownErrorDR with the ipwErrorY function that corresponds to that method.
Package ipwErrorY requires the data to be complete (i.e., no missing values). An error message is shown when NAs are detected in the dataset. For example, if we artificially introduce an NA into dataset da and call the developed function KnownErrorDR, an error message is displayed:
R> da[1,1]=NA
R> KnownErrorDR(data = da, indA = "A", indYerror = "Yast", indXtrt = c("X", "xx"),
+ indXout = c("X", "xx"), sensitivity = 0.95, specificity = 0.85,
+ sharePara = FALSE, confidence=0.95)
Error in KnownErrorDR(data = da, indA = "A", indYerror = "Yast", indXtrt = c("X", :
invalid dataset with NAs (missing data detected)
Upon seeing this error message, users need to check their dataset to see whether the NAs can be replaced with suitable values. If missing values do occur, the simplest approach is to conduct the analysis on the subsample of complete observations. The resulting point estimates are reasonable if the missing data mechanism is missing completely at random (MCAR) (Little and Rubin 2002), although efficiency loss can occur. However, when missing data are not MCAR, this procedure often yields biased results.
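A complete-case subsample can be formed with na.omit before calling any ipwErrorY function; the small data frame below is purely illustrative:

```r
# Complete-case analysis: drop rows containing NAs before analysis.
da <- data.frame(A = c(NA, 1, 0), X1 = c(0.5, -0.2, 1.1), Yast = c(1, 0, 1))
da_cc <- na.omit(da)  # keeps only fully observed rows
nrow(da_cc)           # the row with the NA treatment value is dropped
```

The resulting da_cc can then be passed to the package functions in place of the original dataset, subject to the MCAR caveat discussed above.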
Our implementation uses a logistic regression model with a linear function of covariates for both the treatment and the outcome processes, as it is perhaps the most widely-used parametric tool for modeling a binary variable. Such a logistic regression model can be generalized to include additional terms, such as higher-order terms, nonlinear functions, or interactions of the covariates. In this case, users need only first create an expanded dataset with those terms included as additional columns of new “covariates”, and then use the ipwErrorY package to analyze the expanded dataset.
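As a concrete sketch of this expansion (the dataset and the new column names below are hypothetical, not from the package), one can add the extra columns by hand and point the covariate arguments at them:

```r
# Hypothetical example: expand a dataset with a quadratic term and an
# interaction, then treat the new columns as additional "covariates".
set.seed(100)
da <- data.frame(X1 = rnorm(10), X2 = rnorm(10),
                 A = rbinom(10, 1, 0.5), Yast = rbinom(10, 1, 0.5))
da$X1sq <- da$X1^2        # higher-order term
da$X1X2 <- da$X1 * da$X2  # interaction of covariates
names(da)

# The expanded dataset can then be analyzed as usual, e.g. (not run):
# KnownError(data = da, indA = "A", indYerror = "Yast",
#            indX = c("X1", "X2", "X1sq", "X1X2"),
#            sensitivity = 0.95, specificity = 0.85)
```

The package then treats X1sq and X1X2 like any other covariate columns when fitting the logistic treatment and outcome models.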
The authors thank the editor and two reviewers for their helpful comments and suggestions which improved the manuscript. This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and was partially supported by a Collaborative Research Team Project of Canadian Statistical Sciences Institute (CANSSI).
For attribution, please cite this work as
Shu & Yi, "ipwErrorY: An R Package for Estimation of Average Treatment Effect with Misclassified Binary Outcome", The R Journal, 2019
BibTeX citation
@article{RJ-2019-029,
  author  = {Shu, Di and Yi, Grace Y.},
  title   = {ipwErrorY: An R Package for Estimation of Average Treatment Effect with Misclassified Binary Outcome},
  journal = {The R Journal},
  year    = {2019},
  note    = {https://rjournal.github.io/},
  volume  = {11},
  issue   = {1},
  issn    = {2073-4859},
  pages   = {337-351}
}