We describe the R package rbw, which implements the method of residual balancing weights (RBW) for estimating marginal structural models. In contrast to other methods such as inverse probability weighting (IPW) and covariate balancing propensity scores (CBPS), RBW involves modeling the conditional means of post-treatment confounders instead of the conditional distributions of the treatment to construct the weights. RBW is thus easier to use with continuous treatments, and the method is less susceptible to model misspecification issues that often arise when modeling the conditional distributions of treatments. RBW is also advantageous from a computational perspective. As its weighting procedure involves a convex optimization problem, RBW typically locates a solution considerably faster than other methods whose optimization relies on nonconvex loss functions — such as the recently proposed nonparametric version of CBPS. We explain the rationale behind RBW, describe the functions in rbw, and then use real-world data to illustrate their applications in three scenarios: effect estimation for point treatments, causal mediation analysis, and effect estimation for time-varying treatments with time-varying confounders.
This paper describes the R package rbw, which implements the method of residual balancing weights for estimating marginal structural models (MSMs) (Zhou and Wodtke 2020). MSMs seek to estimate causal effects in the presence of post-treatment confounding — a common issue in the social sciences. In studies of time-varying treatments, prior treatments may affect the confounders of future treatments. For example, research has shown that political candidates’ decision to run negative advertisements is shaped by their positions in recent polling data, which are in turn affected by their previous decisions to run negative advertisements (Lau et al. 2007; Blackwell 2013). Post-treatment confounding can also occur in causal mediation analysis when confounders of the mediator-outcome relationship are affected by the treatment. For example, such a problem arises in a study of the mediating role of morality in the effect of shared democracy on public support for war (Tomz and Weeks 2013). Post-treatment variables, such as respondents’ beliefs about the likelihood of victory, may affect both perceptions of morality and support for military intervention.
MSMs aim to address two types of bias associated with conventional regression methods that adjust naively for post-treatment confounders: overcontrol and collider-stratification bias (Robins 1986, 2000). Conditioning naively on post-treatment confounders can create overcontrol bias as it blocks the effect of the treatment on the outcome that passes through these variables. Additionally, it can lead to collider-stratification bias when post-treatment confounders are affected by unobserved determinants of the outcome. This is because the adjustment will create a spurious association between the treatment and the unobserved variables.
Researchers often use inverse probability weighting (IPW) to fit MSMs (for an in-depth exposition of the method, see Robins et al. 2000; Robins 2000; Cole and Hernán 2008). In longitudinal settings, MSMs with IPW involve fitting a model for the conditional mean of the outcome given the treatment history using weights that break the dependence between past confounders and the treatment at each time point. In essence, the weights create a pseudo-population where the marginal mean of the potential outcomes under a treatment history equals the conditional mean of the observed outcome given the treatment history. The R package ipw provides functions for estimating inverse probability weights (van der Wal and Geskus 2011; Geskus and van der Wal 2015).
However, IPW’s success depends on correct specification of the models for the conditional distributions of exposure to treatment and/or mediator (hereafter jointly referred to as “exposures”), which is difficult to achieve in practice. Moreover, even when these models are correctly specified, IPW is inefficient and susceptible to finite-sample bias (Wang et al. 2006; Zhou and Wodtke 2020). Finally, when the exposures are continuous, IPW may perform poorly due to unreliable estimation of conditional densities (Vansteelandt 2009; Naimi et al. 2014).
Alternative methods have attempted to mitigate these shortcomings. In particular, Imai and Ratkovic’s (Imai and Ratkovic 2014, 2015) covariate balancing propensity score (CBPS) method proposes a set of balancing conditions when estimating the propensity scores. Because it seeks to maximize the covariate balance between the treatment and control groups, this method is less sensitive to model misspecification than IPW. Fong et al. (2018) expand CBPS to accommodate continuous exposures, but the challenges involved in modeling conditional densities persist. With this in mind, they have also developed a nonparametric extension that constructs weights that maximize the empirical likelihood while meeting a set of balancing conditions. Though the nonparametric CBPS (npCBPS) circumvents the need for specifying a functional form for the propensity score, it does so at a cost: since the empirical likelihood is not generally convex, the optimization procedure is often slow and may fail to find a solution. The latter can happen, for example, when we have a large number of covariates. The authors advance a workaround that adds flexibility to the covariate balancing conditions and penalizes the remaining imbalance. In doing so, they ensure that a weighting solution exists. Users can implement CBPS in R with the CBPS package (Fong et al. 2021).
Recently, Zhou and Wodtke (2020) propose the method of
residual balancing weights (RBW) for estimating MSMs. RBW involves
fitting models for the conditional means of post-treatment confounders
given past treatments and confounders and extracting their residual
terms. It then uses Hainmueller’s
(2012) entropy balancing method to
find weights such that, in the weighted sample, 1) the residuals are
orthogonal to future exposures, past treatments, and past confounders,
and 2) the relative entropy between the weights and a set of base
weights (e.g., survey sampling weights) is minimized. RBW is similar to
npCBPS in that it relies on a set of balancing conditions to find the
weights and does not require modeling the conditional distributions of
the exposures. Both methods can, therefore, be easily adapted to cases
where exposures are continuous.
In the sections that follow, we present an overview of the residual balancing method and how, in addition to contexts involving time-varying treatments, we can use it in cases of point treatments and causal mediation analysis. We then discuss the package that implements the method in R (rbw). Next, we describe the functions included in rbw and illustrate their use with various real-world data sets. The final section concludes.
This section gives an overview of RBW. We first describe the notation used throughout the paper and briefly review MSMs. Next, we explain the underlying logic of RBW and provide an intuition for how the method works using a directed acyclic graph (DAG).
Assume we have a study with
An MSM is a model for the marginal mean of the potential outcomes under some treatment history:
consistency: if
sequential ignorability: at each time point
positivity: at each time point
Under these assumptions, Robins (1986) shows that the
expected value of the potential outcome
Now consider the formula for the conditional mean of the observed
outcome
By comparing Equations (2) and (3), we see that weighting the observed population by
Zhou and Wodtke (2020) demonstrate that the condition
Residual balancing aims to emulate the moment conditions ((5))
that would hold in the pseudo-population if it were possible to weight
the observed population by
Moreover, Zhou and Wodtke (2020) follow
Hainmueller (2012) and minimize the relative entropy
between
We can then use the method of Lagrange multipliers to find a weighting
solution that minimizes the relative entropy between rbw::eb2()
. The convexity of the relative entropy metric renders it
considerably more computationally efficient than nonconvex loss
functions that can also be used to construct weights, such as the
empirical likelihood (Fong et al. 2018).
A typical implementation of residual balancing can be summarized in three steps:
For each covariate
Find a set of weights,
in the weighted sample, the residuals
the relative entroy between
Use the weights to fit an MSM.
Figure 1 depicts the logic of RBW in a DAG. Following the
notation described above,
Finally, since
The rationale described in the previous section is targeted at estimating the causal effects of time-varying treatments. With minor adaptations, we can expand the use of RBW to two other contexts commonly encountered in the social and biomedical sciences: point treatment situations and causal mediation analysis.
RBW can be easily adapted to a point treatment situation where the user aims only to adjust for a set of baseline (i.e., time-invariant) confounders to estimate the average treatment effect. To this end, we modify the procedure above as follows:
Compute the response residuals
Find a set of weights,
in the weighted sample, the residuals
the relative entropy between
Use the weights to fit an MSM.
The DAG of Figure 2 illustrates the point treatment situation.
In causal mediation analysis, researchers are often concerned with the
joint effects of a one-shot treatment,
As in the point treatment scenario, compute the response residuals
Estimate the response residuals
Find a set of weights,
in the weighted sample, i) the baseline residuals
the relative entropy between
Use the weights to fit an MSM for the joint effects of the treatment and the mediator on the outcome:
In causal mediation analysis, the potential outcomes of interest
are denoted by
Alternatively, the baseline confounders can be included in the
MSM if users decide to skip the first step:
Finally, the controlled direct effects (CDE) of the treatment
can be estimated as
The R package rbw contains four functions:
eb2()
, for generating minimum entropy weights subject to a set of
balancing constraints.rbwPoint()
, for constructing residual balancing weights to
estimate the causal effects of one-shot treatments.rbwMed()
, for constructing residual balancing weights to estimate
controlled direct effects in causal mediation analysis.rbwPanel()
, for constructing residual balancing weights to
estimate the marginal effects of time-varying treatments.Next, we explain each of these functions. The package also includes
several real-world data sets (advertisement
, peace
, campaign_long
,
and campaign_wide
), which we describe and analyze in the examples
below.
eb2()
This function is an adaptation of ebal::eb()
(Hainmueller 2014). It is called
internally by other functions in
rbw to implement the method
of Lagrange multipliers for locating a weighting solution that minimizes
the relative entropy between
We can then insert Equation (9) into
eb2()
implements the algorithm. If convergence is successful, the function
tells the user that “Entropy minimization converged within tolerance
level.” Otherwise, it warns that entropy minimization did not converge
and suggests increasing the number of iterations or reducing the number
of balancing constraints.
The convexity of our optimization problem leads to appreciable computational gains over other methods that use alternative loss functions. This will be demonstrated later when we compare the performance of RBW with that of npCBPS — which uses the empirical likelihood — for the same problem.
eb2()
is used as:
eb2(C, M, Q, Z = rep(0, ncol(C)), max_iter = 200, tol = 1e-04, print_level = 1)
and takes the following arguments:
C
is a constraint matrix, with each column corresponding to a
balancing constraint.M
is a vector of moment conditions to be met in the reweighted
sample (per Equation (6), this is a vector of zeros with
length equal to the number of columns of C
when the other
functions in rbw call
eb2()
internally).Q
is a vector of base weights.Z
is a vector of Lagrange multipliers to be initialized.max_iter
determines the maximum number of iterations for Newton’s
method.tol
is a tolerance parameter used to determine convergence.
Specifically, convergence is achieved if tol
is greater than the
maximum absolute value of the deviations between the moments of the
reweighted data and the target moments (i.e., M
).print_level
determines the level of printing:
The output returned by eb2()
is a list containing the following
elements:
W
is a vector of normalized minimum entropy weights.Z
is a vector of Lagrange multipliers.converged
is a logical indicator for convergence.maxdiff
is a scalar indicating the maximum absolute value of the
deviation between the moments of the reweighted data and the target
moments.rbwPoint()
This function produces residual balancing weights to be used in a point
treatment situation. It first takes a set of baseline confounders and
computes the residuals for each confounder by centering it around its
sample mean. Then it calls eb2()
to find a set of weights, rbwPoint()
calls a
function that ensures that the matrix of balancing constraints comprises
only linearly independent columns.
rbwPoint()
is used as:
rbwPoint(treatment, data, baseline_x, base_weights, max_iter = 200,
tol = 1e-04, print_level = 1)
and takes the following arguments:
max_iter
, print_level
, and tol
have the same definitions as in
eb2()
.treatment
is a symbol or character string for the treatment
variable.data
is a data frame containing all variables in the model.baseline_x
is an expression for a set of baseline confounders
stored in data
or a character vector of the names of these
variables.base_weights
is an optional vector of base weights (or its
name). If no value is supplied, the function sets a vectors of ones
with length equal to the sample size as the base weights.The output returned by rbwPoint()
is a list containing the following
elements:
weights
is a vector of residual balancing weights.constraints
is a matrix of linearly independent residual balancing
constraints.eb_out
contains the results from calling the eb2()
function.call
is the matched call (the function call with all arguments
specified by their full names).rbwMed()
This function produces residual balancing weights for causal mediation
analysis. It takes an optional set of baseline confounders (as
explained above, users can opt instead to include these covariates in
the MSM later) and a list of model objects for the conditional mean of
each post-treatment confounder given the treatment and baseline
confounders. It then calls eb2()
to find a set of weights,
rbwMed()
takes an additional argument, interact
, a logical variable
indicating whether the baseline and post-treatment confounders should be
balanced against the treatment-mediator interaction. This argument is
set to FALSE
by default, but users suspecting an interaction effect
may find it prudent to balance against it. Like rbwPoint()
, rbwMed()
calls a function internally to ensure that only linearly independent
columns constitute the matrix of balancing constraints.
It is used as:
rbwMed(treatment, mediator, zmodels, data, baseline_x, interact = FALSE,
base_weights, max_iter = 200, tol = 1e-04, print_level = 1)
and takes the following arguments:
treatment
, data
, base_weights
, max_iter
, print_level
, and
tol
are defined as in rbwPoint()
.baseline_x
is an optional expression for a set of baseline
confounders stored in data
or a character vector of the names of
these variables.mediator
is a symbol or character string representing the mediator
variable.zmodels
is a list of fitted lm
or glm
objects for
post-treatment confounders of the mediator-outcome relationship.
Users should set this argument to NULL
if there are no
post-treatment confounders.interact
is a logical variable indicating whether baseline and
post-treatment confounders should be balanced against the
treatment-mediator interaction term(s).The output returned by rbwMed()
is a list containing the same elements
as the output from rbwPoint()
.
rbwPanel()
This function produces residual balancing weights for estimating the
marginal effects of time-varying treatments. It takes a list of model
objects for the conditional mean of each post-treatment confounder given
past treatments and past confounders. Then it calls eb2()
to find a
set of weights, rbwPanel()
ensures that the matrix of balancing constraints consists only of
linearly independent columns.
It is used as:
rbwPanel(treatment, xmodels, id, time, data, base_weights, future = 1L,
max_iter = 200, tol = 1e-04, print_level = 1)
and takes the following arguments:
data
, base_weights
, max_iter
, print_level
, and tol
are
defined as in rbwPoint()
and rbwMed()
.treatment
is a symbol or character string for the time-varying
treatment.xmodels
is a list of fitted lm
or glm
objects for time-varying
confounders.id
is a symbol or character string for the unit id variable.time
is a symbol or character string for the time variable.future
is an integer indicating the number of future treatments in
the balancing conditions. When future > 0
, the residualized
time-varying covariates are balanced not only with respect to
current treatment future = 1
, assumes away higher-ordered lagged effects of the
covariates on the treatment. Users can leave out lagged effects
entirely by setting future
to zero.The output returned by rbwPanel()
is essentially the same as those
from rbwPoint()
and rbwMed()
. The only difference is that the
weights
object, instead of a vector, is now a data frame with two
columns: the id variable and the residual balancing weights (the column
storing the weights is called rbw
).
We now illustrate the functions rbwPoint()
, rbwMed()
, and
rbwPanel()
with data sets advertisement
, peace
, campaign_long
,
and campaign_wide
, which are included in
rbw. We expect users to
rarely need to call eb2()
manually since its primary use is to be
called internally by the other functions. Hence, we see little gain in
providing a separate example for it.
Urban and Niebler (2014) studied the effects of televised political advertisements on campaign contributions. Presidential candidates do not tend to deliberately advertise in states where competition for electoral votes is tame. Yet, some areas of noncompetitive states have overlapping media markets with battleground states. Because these media market spillovers do not encompass other forms of campaigning (e.g., rallies, speeches, etc.), the authors can isolate the effect of television advertising by restricting their analyses to noncompetitive states. Their original method involved estimating the propensity score with a logistic model and then conducting propensity score matching. To do so, they dichotomized the political advertising variable to indicate whether a zip code received more than 1000 advertisements.
Deeming this approach inadequate — in part because balancing against a dichotomous treatment does not ensure covariate balance on the underlying continuous variable — Fong et al. (2018) revisit the study using the CBPS method applied to a continuous treatment. Next, we examine how RBW fares compared with CBPS and IPW in this point treatment situation.
Importantly, CBPS assumes that the treatment variable is normally
distributed. To satisfy this assumption,
Fong et al. (2018) search across Box-Cox
transformations to find the most appropriate transformation of the
treatment. Since Q-Q plots show no sizable differences between the
Box-Cox approach and a simple log transformation in achieving normality,
we favor the latter to avoid extraneous details that would divert the
example from its primary objective of illustrating RBW’s implementation.
Hence, we have
Our MSM includes the transformed treatment variable and state dummies
The data set advertisement
contains the variables necessary for the
analyses. We start by loading the necessary packages and our data set:
library("rbw")
data("advertisement")
rbwPoint()
will construct the residual balancing weights by following
the steps described above. It first computes the baseline residuals
advertisement
does not include sampling weights, rbwPoint()
will set
a vector of ones with length equal to the sample size as the base
weights.
rbwPoint_fit <-
rbwPoint(
treatment = treat,
baseline_x = c(
log_TotalPop,
PercentOver65,
log_Inc,
PercentHispanic,
PercentBlack,
density,
per_collegegrads,
CanCommute
),
data =
advertisement
)
Next, we attach the
advertisement$rbwPoint_weights <- rbwPoint_fit$weights
Following most applications of MSMs, we compute standard errors using
the robust (“sandwich”) variance estimator. This can be implemented with
the function survey::svydesign()
, which allows us to specify a complex
survey design and estimate standard errors consistent with this
specification.
library("survey")
rbwPoint_design <- svydesign(ids = ~ 1,
weights = ~ rbwPoint_weights,
data = advertisement)
We then use the residual balancing weights to fit the MSM defined in Equation (12):
rbwPoint_msm <- svyglm(Cont ~ treat + factor(StFIPS),
design = rbwPoint_design)
Since we have transformed our treatment variable, we need to make the
necessary adjustments to compute the treatment effect. As
Urban and Niebler (2014) suggest, it is informative to study the
effect of going from zero to 1,000 political advertisements on campaign
contributions. Hence, we create a dose
variable to account for this.
If
dose <- log(1000 + 1)
To find the estimate
rbwPoint_tau <- 1000 * coef(rbwPoint_msm)[2] * dose
Next, we use the basic properties of the variance operator to derive the standard error. Again, we multiply the result by 1,000 given the scale of the outcome variable:
rbwPoint_vcov <- stats::vcov(rbwPoint_msm)
rbwPoint_se <- 1000 * dose * sqrt(rbwPoint_vcov[2, 2])
We also report the results using the functions from the CBPS and ipw packages. Recall from the introduction that the parametric CBPS is similar to IPW in that it requires explicit models for the conditional distributions of the treatment. However, it improves IPW by considering a set of balancing conditions during the propensity score estimation, thereby reducing sensitivity to model misspecification. By contrast, the nonparametric CBPS (npCBPS) does not require direct estimation of the propensity score; instead, it finds weights that maximize the empirical likelihood while meeting a set of balancing constraints. As such, like RBW, it avoids the need to specify a propensity score model. Readers can find the code detailing the construction of the CBPS and IPW weights in the supplementary material.
Table 1 summarizes the findings. All methods yield relatively similar point estimates and standard errors. In particular, they indicate that going from zero to 1,000 political advertisements increases campaign contributions by around four thousand dollars, on average, although the point estimates from CBPS and IPW are slightly larger than those produced by RBW and npCBPS. The last column shows that different loss functions for the optimization problem can lead to stark differences in computation time. While RBW’s relative entropy metric leads to a convex optimization problem that Newton’s method can solve in less than one second, npCBPS’s algorithm takes much longer to run.
Method | Estimate | Standard Error | Computation Time (in Seconds) |
---|---|---|---|
RBW | 4043 | 2131 | 0.51 |
npCBPS | 3914 | 2091 | 118.70 |
CBPS | 4181 | 2118 | 7.71 |
IPW | 4118 | 2078 | 0.04 |
Computation time may differ depending on system setup. | |||
System setup used to generate the results: | |||
MacBook Pro (15-inch, 2018), 2.2 GHz 6-Core Intel Core i7, 16GB RAM. |
We next illustrate the use of rbwMed()
and rbwPanel()
.
A stylized fact in political science is that democracies do not engage in war with one another. To assess the role of public opinion in keeping the peace, Tomz and Weeks (2013) designed survey experiments that presented participants with a situation where a country was developing nuclear weapons. When describing the situation, the authors randomly and independently changed three characteristics of the country: its political regime (whether it was a democracy), alliance status (whether it had signed a military alliance with the United States), and economic ties (whether it had high levels of trade with the US). The outcome of interest was the respondent’s support for military action on a five-point scale ranging from “oppose strongly” to “favor strongly.” The authors found that respondents were considerably less supportive of military action against democracies than otherwise identical autocratic regimes.
Tomz and Weeks (2013) then went on to investigate the causal mechanisms through which shared democracy reduces public enthusiasm for war. In particular, they measured individuals’ beliefs about the level of threat posed by the potential adversary (number of adverse events respondents considered probable if the US did not engage in war); the cost of the intervention (number of negative consequences anticipated if the US engaged in war); and the likelihood of success (a three-point scale assigning values of 0, 1, and 2 to the following beliefs, respectively: the operation had less than a 50-50 chance of working even in the short run, it would succeed only in the short run, and it would be successful both in the short and long run). They also collected data on people’s moral concerns about using military force.
Their methodological framework focused on estimating the natural direct and natural indirect effects (Imai et al. 2010, 2011), whose identification assumptions require that no post-treatment confounding of the mediator-outcome relationship exists. As such, the authors examined the role of each mediator separately by assuming they operate independently of one another. Yet, as discussed in Zhou and Wodtke (2020), beliefs about the threat, cost, and likelihood of success may affect an individual’s perceptions of morality while also influencing support for war directly. By treating these variables as post-treatment confounders, Zhou and Wodtke (2020) analyzed the mediating role of morality using controlled direct effects.
Let
Our MSM is thus defined as:
We can alternatively include the baseline confounders in the MSM instead of balancing them across the treatment and the mediator with baseline residuals:
The peace
data set includes the variables we will use in the analyses.
Let us first consider the approach where the baseline confounders are
balanced across the treatment and the mediator using the baseline
residuals computed from centering each covariate around its sample mean.
As explained above, RBW in causal mediation analysis requires models for
the conditional means of each post-treatment confounder threatc
,
cost
, and successc
are measured on a continuous scale and fit linear
models for each:
data("peace")
z1 <- lm(threatc ~ ally + trade + h1 + i1 + p1 + e1 + r1 +
male + white + age + ed4 + democ,
data = peace)
z2 <- lm(cost ~ ally + trade + h1 + i1 + p1 + e1 + r1 +
male + white + age + ed4 + democ,
data = peace)
z3 <- lm(successc ~ ally + trade + h1 + i1 + p1 + e1 + r1 +
male + white + age + ed4 + democ,
data = peace)
We then store the three model objects together in a list to be passed
later to the zmodels
argument in rbwMed()
:
zmodels <- list(z1, z2, z3)
To construct the residual balancing weights, rbwMed()
will 1) compute
the baseline residuals peace
does not include
sampling weights. We also pass the name of our mediator to the
mediator
argument:
rbwMed_fit <- rbwMed(
treatment = democ,
mediator = immoral,
zmodels = zmodels,
baseline_x = c(ally, trade, h1, i1,
p1, e1, r1, male, white, age, ed4),
data = peace
)
#> Entropy minimization converged within tolerance level
Next, we attach the weights to peace
:
peace$rbwMed_weights <- rbwMed_fit$weights
We use survey::svydesign()
to estimate robust standard errors:
rbwMed_design <- svydesign(ids = ~ 1,
weights = ~ rbwMed_weights,
data = peace)
Finally, we use the residual balancing weights to fit the MSM from Equation (13):
rbwMed_msm <- svyglm(strike ~ democ * immoral,
design = rbwMed_design)
Steps are similar for the case where the baseline confounders are not
balanced against the treatment and the mediator but instead adjusted in
the MSM, as defined in Equation (14). The models for the
conditional means of the post-treatment confounders are the same as
before, but we now leave the baseline_x
argument empty:
rbwMed2_fit <- rbwMed(
treatment = democ,
mediator = immoral,
zmodels = zmodels,
data = peace
)
#> Entropy minimization converged within tolerance level
peace$rbwMed2_weights <- rbwMed2_fit$weights
rbwMed2_design <- svydesign(ids = ~ 1,
weights = ~ rbwMed2_weights,
data = peace)
rbwMed2_msm <- svyglm(strike ~ ally + trade + h1 + i1 + p1 +
e1 + r1 + male + white + age + ed4 + democ * immoral,
design = rbwMed2_design)
We summarize the results in Table 2. The
estimated CDE if respondents lacked any moral qualms about military
intervention, i.e.,
Baseline Confounders Balanced | Baseline Confounders Adjusted in the MSM | |
---|---|---|
Shared democracy | ||
(0.098) | (0.080) | |
Moral concerns | ||
(0.175) | (0.135) | |
Shared democracy * Moral concerns | 0.138 | |
(0.210) | (0.157) | |
Robust standard errors in parentheses. |
We conclude this section with an example of RBW applied to a context involving time-varying treatments and confounders. Political scientists have shown interest in examining the cumulative effect of negative campaign advertising on election outcomes (Lau et al. 2007; Blackwell 2013; Imai and Ratkovic 2015). This is an intricate process because while polling results affect current campaign strategies, they are also constantly shifting, as they respond both to previous results and candidates’ use of negative advertising in the past. For their ability to accommodate dynamic causal relationships — particularly since they allow past treatments to affect current outcomes (i.e., “carryover effects”) and past outcomes to influence current treatment (i.e., “feedback effects”) (Imai and Kim 2019) — MSMs are suitable for this research question.
We use a dataset on the campaign of 113 Democratic candidates in US
Senate and Gubernatorial Elections from 2000 to 2006. Let
Zhou and Wodtke (2020) define an MSM as follows:
rbw contains two data sets
associated with this problem: campaign_long
and campaign_wide
. They
represent, respectively, the long-format and wide-format data on
negative campaign advertising.
Recall that RBW requires us to fit a model for the conditional mean of
each covariate at each time point given the observed past. We thus
estimate regression models of our time-varying confounders
data("campaign_long")
data("campaign_wide")
x1 <-
lm(dem.polls ~ (neg.dem.l1 + dem.polls.l1 + undother.l1) * factor(week),
data = campaign_long)
x2 <-
lm(undother ~ (neg.dem.l1 + dem.polls.l1 + undother.l1) * factor(week),
data = campaign_long)
We then create a list with the model objects to be passed later to the
xmodels
argument in rbwPanel()
:
xmodels <- list(x1, x2)
To construct the residual balancing weights, rbwPanel()
first extracts
the residual terms future
argument to zero to replicate the results from
Zhou and Wodtke (2020) since the authors balance the
residualized time-varying confounders only with respect to the current
treatment, thus assuming away lagged effects of the covariates on the
treatment.future > 0
, up to future = 4
.
rbwPanel_fit <- rbwPanel(
treatment = neg.dem,
xmodels = xmodels,
id = id,
time = week,
data = campaign_long,
future = 0
)
#> Entropy minimization converged within tolerance level
We now attach the weights to campaign_wide
using the pipe operator and
the left_join()
function from the package
dplyr (this merging is
permitted because the weights
object returned by rbwPanel()
is a
data frame containing the id variable and the residual balancing
weights):
library("dplyr")
campaign_wide <- campaign_wide %>%
left_join(rbwPanel_fit$weights, by = "id")
We use the functions from dplyr to rename the weights so that our variable’s name is consistent with the previous examples:
campaign_wide <- campaign_wide %>%
rename(rbwPanel_weights = rbw)
Again, we use survey::svydesign()
to compute robust standard errors:
rbwPanel_design <- svydesign(ids = ~ 1,
weights = ~ rbwPanel_weights,
data = campaign_wide)
Finally, we use the residual balancing weights to fit the MSM from Equation (15):
rbwPanel_msm <- svyglm(demprcnt ~ ave_neg * deminc + camp.length +
factor(year) + office,
design = rbwPanel_design)
We report the model results in Table 3. The
estimate for nonincumbents is
Average proportion | 0.490 |
(0.315) | |
Incumbency | 14.971 |
(2.823) | |
Average proportion * Incumbency | |
(0.531) | |
Robust standard errors in parentheses. |
We conclude by comparing RBW’s computational performance to CBPS for a dichotomized version of the treatment representing whether more than 10% of the candidate’s advertising was negative for each week. We use this dichotomized variable because CBPS has not been extended to work with continuous treatments in longitudinal settings. Additionally, we do not compare RBW to npCBPS because the latter’s use is restricted to point treatment situations. Lastly, though we can construct IPW weights almost immediately, Zhou and Wodtke (2020) show that IPW yields considerably larger effect estimates than RBW and CBPS for this particular case (likely due to the method’s susceptibility to model misspecification), so we do not report the IPW results.
Let us first construct the RBW weights. The steps are identical to the
ones detailed above for the continuous treatment, the only change being
the name of our treatment variable. Next, we use the CBMSM()
function
from the CBPS
package to generate our CBPS weights. To facilitate
comparisons of computation time, we also call Sys.time()
before and
after running the functions that produce the weights in each package.
The scalar objects rbwPanel_time
and CBPSPanel_time
store how long
each method takes to construct the corresponding weights.
m1 <-
lm(dem.polls ~ (d.gone.neg.l1 + dem.polls.l1 + undother.l1) * factor(week),
data = campaign_long)
m2 <-
lm(undother ~ (d.gone.neg.l1 + dem.polls.l1 + undother.l1) * factor(week),
data = campaign_long)
xmodels <- list(m1, m2)
rbwPanel_start <- Sys.time()
rbwPanel_fit <- rbwPanel(
treatment = d.gone.neg,
xmodels = xmodels,
id = id,
time = week,
data = campaign_long,
future = 0
)
#> Entropy minimization converged within tolerance level
rbwPanel_end <- Sys.time()
rbwPanel_time <- rbwPanel_end - rbwPanel_start
CBPSPanel_form <-
"d.gone.neg ~ d.gone.neg.l1 + dem.polls + undother + camp.length + deminc + office + factor(year)"
CBPSPanel_start <- Sys.time()
CBPSPanel_fit <-
CBMSM(
formula = CBPSPanel_form,
time = campaign_long$week,
id = campaign_long$demName,
data = campaign_long,
type = "MSM",
iterations = NULL,
twostep = TRUE,
msm.variance = "approx",
time.vary = TRUE
)
CBPSPanel_end <- Sys.time()
CBPSPanel_time <- CBPSPanel_end - CBPSPanel_start
The output below shows the computation times:
rbwPanel_time
#> Time difference of 0.107008 secs
CBPSPanel_time
#> Time difference of 1.059287 mins
Whereas RBW takes less than one second to construct the weights, CBPS takes much longer. Hence, the longitudinal setting presents the same pattern we saw above for the point treatment situation: RBW has considerable gains in computational performance over alternative methods of constructing weights for MSMs. Since our focus here is computational performance, we omit the effect estimates for the dichotomized treatment. As shown in Zhou and Wodtke (2020), the results are broadly consistent with those based on the continuous treatment, with RBW and CBPS yielding similar point estimates.
Compared to other methods of constructing weights for MSMs, RBW has several advantages. In particular, it does not require modeling the conditional distributions of exposures and is thus easy to use with continuous treatments. Previous simulation studies suggest that it is often more efficient and more robust to model misspecification than alternative weighting strategies (Zhou and Wodtke 2020). RBW is also favorable from a computational perspective. Its procedure for finding weights involves a convex optimization problem, allowing RBW to locate a solution substantially faster than alternative methods whose optimization relies on nonconvex loss functions — such as the recently proposed nonparametric version of CBPS, which uses the empirical likelihood (Fong et al. 2018). Table 4 sums up these comparisons.
Method | Models for Conditional Distributions of Exposures | Balancing Constraints | Implemented for Time-varying Treatments? | R Package |
---|---|---|---|---|
IPW | Required | Absent | Yes | ipw |
CBPS | Required | Present | Yes | CBPS |
npCBPS | Not Required | Present | No | CBPS |
RBW | Not Required | Present | Yes | rbw |
After explaining the underlying logic of RBW, we have described its implementation in the R package rbw. With examples from several data sets, we have demonstrated the use of rbw in three distinct contexts: effect estimation for point treatments, causal mediation analysis, and effect estimation for time-varying treatments with time-varying confounders.
Nonetheless, RBW is not without limitations. In particular, it depends
on models for the conditional means of post-treatment confounders. When
these models are incorrectly specified, the pseudo-population generated
by residual balancing weights will fail to mimic the original unweighted
population, and our estimates will be biased. Even when these models are
correctly specified, RBW may also yield biased estimates when we have
insufficient balancing conditions. Adding more functions such as
cross-products and high-order terms to the set
Despite these limitations, RBW will be useful to many social scientists interested in using marginal structural models to study causality in dynamic settings. We will continue to upgrade the package by expanding RBW’s ranges of applicability — specifically, to censored data, to contexts involving repeated outcome measures including survival data (Hernán et al. 2000, 2002), and to cases where, as discussed above, exact balance is infeasible and approximate balance must be pursued.
We thank Geoffrey Wodtke and two anonymous reviewers for their valuable comments on a previous version of this paper.
CausalInference, Databases, MissingData, ModelDeployment
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
future > 0
, up to future = 4
.[↩]Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Baum & Zhou, "rbw: An R Package for Constructing Residual Balancing Weights", The R Journal, 2022
BibTeX citation
@article{RJ-2022-049, author = {Baum, Derick S. and Zhou, Xiang}, title = {rbw: An R Package for Constructing Residual Balancing Weights}, journal = {The R Journal}, year = {2022}, note = {https://rjournal.github.io/}, volume = {14}, issue = {3}, issn = {2073-4859}, pages = {174-192} }