Meta-analysis, a statistical procedure that compares, combines, and synthesizes research findings from multiple studies in a principled manner, has become popular in a variety of fields. Meta-analyses using study-level (equivalently, aggregate) data are of particular interest due to data availability and modeling flexibility. In this paper, we describe an R package, metapack, that introduces a unified formula interface for both meta-analysis and network meta-analysis. The user interface—and therefore the package—allows flexible variance-covariance modeling for multivariate meta-analysis models and univariate network meta-analysis models, whose complicated computation has so far prevented their widespread adoption. The package also provides functions to generate relevant plots and perform statistical inferences such as model assessments. Use cases are demonstrated using two real data sets contained in metapack.
The U.S. Food and Drug Administration provides a clear definition of meta-analysis as “the combining of evidence from relevant studies using appropriate statistical methods to allow inference(s) to be made to the population of interest” (U.S. Food and Drug Administration et al. 2018). In fields like medicine, pharmacology, and epidemiology, meta-analysis has become popular for reconciling conflicting results or corroborating consistent ones in multiple studies (Chalmers et al. 2002; Borenstein et al. 2011; Hartung et al. 2011; Balduzzi et al. 2019). Findings produced from meta-analyses are often placed at the apex of the evidence hierarchy (U.S. Food and Drug Administration et al. 2018).
R already has a large supply of meta-analysis packages. meta (Schwarzer 2007) and rmeta (Lumley 2018) use the method of moments introduced in DerSimonian and Laird (1986). metafor (Viechtbauer 2010) further contains moderator analyses and fits meta-regression (Berkey et al. 1995) through weighted least squares. On the other hand, metaLik (Guolo 2012) takes a likelihood approach based on the second-order approximation of the modified likelihood ratio test statistic (Skovgaard et al. 1996). metatest (Huizenga et al. 2011) further includes hypothesis testing capabilities through the likelihood ratio test with Barlett correction, and mvmeta (Gasparrini et al. 2012) fits multivariate meta-analysis and meta-regression models via the method of maximum likelihood. There are packages for Bayesian meta-analytic inference as well. bayesmeta (Röver 2020) assumes the normal-normal hierarchical random-effect model and allows the user to choose prior distributions with a great deal of flexibility, both informative and noninformative. nmaINLA (Guenhan et al. 2018) provides functionalities for network meta-analysis and meta-regression with integrated nested Laplace approximations (INLA) as an alternative to the Markov chain Monte Carlo (MCMC) algorithm. On the other hand, bmeta (Ding and Baio 2016) delivers flexible meta-analytic modeling by interfacing with JAGS (Plummer 2003). MetaStan (Guenhan 2020) provides binomial-normal hierarchical models with weakly informative priors, building upon the probabilistic language Stan (Stan Development Team 2020).
Despite its importance and the wide array of R packages available, meta-analysis is still regarded as a niche field that interests a narrow group of researchers and remains relatively low in impact. We partially attribute this to the fact that the R community has yet to settle on a user interface that unifies the theoretical distinctions between univariate and multivariate models, and between meta-analysis and network meta-analysis, even as the models have grown more complicated over the years. Furthermore, while a large class of simple Bayesian meta-analytic models is handled by probabilistic programming languages like Stan (Stan Development Team 2020) or BUGS (Sturtz et al. 2005), many complex models are not easily programmable in probabilistic languages and are not readily available in R. In particular, in the context of variance-covariance matrix modeling in (network) meta-analysis, metapack is, to the best of our knowledge, the first attempt in the R ecosystem to provide easy access to regression modeling of the variances (of the treatment effects), as well as a wide array of options for modeling the response covariance matrices when the aggregate responses are multivariate with only a partially observed within-study sample covariance matrix.
The metapack package presented in this paper proposes a formula structure that flexibly represents the types of responses (univariate and multivariate) and the number of treatments (meta-analysis and network meta-analysis). The package also provides functions to assess model fit, such as the deviance information criterion (DIC) and the logarithm of the pseudo-marginal likelihood (LPML), and to generate diagnostic plots. Some theoretical and computational complications in these model selection criteria may break the algorithm or erode the statistical inference when left unaddressed (see Section 5.0.0.1); metapack takes care of them by default—an advantage over model-agnostic programming languages.
The rest of this paper is organized as follows. Section 2 briefly reviews the (network) meta-analysis model. Section 3 describes the general form of a meta-analysis data set to establish a generic data structure for meta-analysis and network meta-analysis. Section 4 explains how the data structure can be represented using R’s extended formula and how metapack’s main function parses it, and lays down the various modeling options for meta-analysis and network meta-analysis. Section 5 further introduces the S3 methods available for performing statistical inferences and comparing models. Some computational considerations to accelerate the computation are detailed as well. Section 6 provides demonstrations using the cholesterol data and TNM data included in metapack. Finally, Section 7 concludes the paper by offering a cautionary remark for multivariate meta-analysis models regarding the number of observations required to perform valid inferences on the correlation matrix, and discussing future research and package development directions.
In this section, we briefly review the models considered in metapack. There are largely two umbrella models: univariate or multivariate meta-analysis based on Yao et al. (2015), and univariate network meta-analysis based on Li et al. (2021). The various modeling options for each model introduced in this section encompass those in Yao et al. (2015) and Li et al. (2021). In what follows, the model description deals with a general multivariate model including both meta-analysis and network meta-analysis, which remains valid when a univariate response is assumed, i.e., \(J=1\). Occasionally, the univariate model description will be provided side by side to avoid confusion.
Assume \(K\) randomized controlled trials (RCTs) where the \(k\)-th trial includes \(T_k\) treatment arms. Meta-analysis refers to a special case where \(T_k=2\) for all \(k=1,\ldots,K\). We adopt the notational abuse of omitting the trial indicator in the treatment’s subscript. For instance, for the sample size of the \(t\)-th treatment arm of the \(k\)-th trial, we write \(n_{kt}\), not \(n_{t_kk}\). Furthermore, for readability, note that boldface lowercase Latin letters are vectors, while uppercase Latin letters are matrices. Boldface Greek letters can either be vectors or matrices and will be defined contextually. Let \(\mathbf{y}_{kt}\) be a \(J\)-dimensional aggregate response vector for the \(t\)-th treatment of the \(k\)-th trial. Similarly, let \(\boldsymbol{x}_{ktj}\in \mathbf{R}^{p_j}\) be the treatment-within-trial level covariate corresponding to the \(j\)-th response, reflecting the fixed effects of the \(t\)-th treatment arm, and let \(\boldsymbol{w}_{ktj} \in \mathbf{R}^{q_j}\) be the vector of covariates for the random effects. The model \(\boldsymbol{y}_{kt}=\boldsymbol{X}_{kt}\boldsymbol{\beta} + \boldsymbol{W}_{kt} \boldsymbol{\gamma}_k + \boldsymbol{\epsilon}_{kt}\) describes the aggregate data model, where \(\boldsymbol{\epsilon}_{kt} = (\epsilon_{kt1},\ldots,\epsilon_{ktJ})^\top\), \(\boldsymbol{X}_{kt} = \oplus_{j=1}^J x_{ktj}^\top\) for which \(\oplus\) indicates direct sum, \(\boldsymbol{\beta} = (\boldsymbol{\beta}_1^\top,\ldots,\boldsymbol{\beta}_J^\top)^\top\), \(\boldsymbol{W}_{kt}=\oplus_{j=1}^J \boldsymbol{w}_{ktj}^\top\), and \(\boldsymbol{\gamma}_k = (\boldsymbol{\gamma}_{k1}^\top,\boldsymbol{\gamma}_{k2}^\top,\ldots,\boldsymbol{\gamma}_{kJ}^\top)^\top\). 
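The direct-sum construction of \(\boldsymbol{X}_{kt}\) can be sketched in a few lines of base R (an illustrative helper, not part of metapack's API):

```r
# Illustrative helper (not metapack API): build the block-diagonal
# design matrix X_kt as the direct sum of the per-endpoint covariate
# row vectors x_ktj^T.
direct_sum <- function(vecs) {
  J <- length(vecs)
  p <- vapply(vecs, length, integer(1))
  X <- matrix(0, nrow = J, ncol = sum(p))
  col_end <- cumsum(p)
  col_start <- col_end - p + 1
  for (j in seq_len(J)) X[j, col_start[j]:col_end[j]] <- vecs[[j]]
  X
}

x_kt1 <- c(1, 0.5)    # endpoint 1 covariates (p_1 = 2)
x_kt2 <- c(1, -1, 2)  # endpoint 2 covariates (p_2 = 3)
X_kt <- direct_sum(list(x_kt1, x_kt2))
dim(X_kt)  # 2 x 5: one row per endpoint, covariate blocks on the "diagonal"
```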
The aggregate (network) meta-analysis model becomes \[\label{eq:inital-mvmr} \begin{dcases} \mathbf{y}_{kt} = \mathbf{X}_{kt}\mathbf{\beta} + \mathbf{W}_{kt} \mathbf{\gamma}_k + \mathbf{\epsilon}_{kt} \;\; and \;\; (n_{kt}-1)\mathbf{S}_{kt} \sim \mathcal{W}_{n_{kt}-1}(\Sigma_{kt}),&\text{if J }\geq \text{ 2 (multivariate)}\\ y_{kt} = \mathbf{x}_{kt}^\top\mathbf{\beta} + \mathbf{w}_{kt}^\top \mathbf{\gamma}_k + \epsilon_{kt} \;\; and \;\; (n_{kt}-1)s^2_{kt}/\sigma^2 \sim \chi^2_{n_{kt}-1},&\text{if J=1 (univariate)} \end{dcases}, \tag{1}\] where \(\mathbf{\epsilon}_{kt} \sim \mathcal{N}(\mathbf{0}, \Sigma_{kt}/n_{kt})\) or \(\epsilon_{kt} \sim \mathcal{N}(0,\sigma_{kt}^2/n_{kt})\), \(\mathbf{S}_{kt}\) is the sample covariance matrix, \(s_{kt}^2\) is the sample variance, and \(\Sigma_{kt} \in \mathcal{S}_{++}^J\) for which \(\mathcal{S}_{++}^J\) is the space of \(J\times J\) symmetric positive-definite matrices. In Equation (1), \(\mathcal{W}_{\nu}(\Sigma)\) is the Wishart distribution with \(\nu\) degrees of freedom and a \(J\times J\) scale matrix \(\Sigma\) with density \[p(X\mid \nu, \Sigma) = \dfrac{1}{2^{J\nu/2}|\Sigma|^{\nu/2}\Gamma_J(\nu/2)} |X|^{(\nu-J-1)/2}\exp\left(-\dfrac{1}{2}\mathrm{tr}(\Sigma^{-1}X) \right)I(X\in \mathcal{S}_{++}^J),\] where \(\Gamma_J\) is the multivariate gamma function defined by \(\Gamma_J(z) = \pi^{J(J-1)/4}\prod_{j=1}^J \Gamma[z+(1-j)/2]\). \(\chi_\nu^2\) indicates the chi-squared distribution with \(\nu\) degrees of freedom.
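As a sanity check on the density above, the following base R sketch implements the Wishart log-density and the multivariate gamma function, and verifies the \(J=1\) special case, where \(\mathcal{W}_\nu(\sigma^2)\) coincides with a Gamma distribution with shape \(\nu/2\) and rate \(1/(2\sigma^2)\) (all function names here are illustrative, not metapack API):

```r
# Sketch: Wishart log-density from the text, checked against the J = 1
# special case, which reduces to Gamma(nu/2, rate = 1/(2 * sigma2)).
lmvgamma <- function(z, J) {
  J * (J - 1) / 4 * log(pi) + sum(lgamma(z + (1 - seq_len(J)) / 2))
}
ldwishart <- function(X, nu, Sigma) {
  J <- nrow(Sigma)
  -(J * nu / 2) * log(2) - (nu / 2) * determinant(Sigma)$modulus -
    lmvgamma(nu / 2, J) +
    ((nu - J - 1) / 2) * determinant(X)$modulus -
    0.5 * sum(diag(solve(Sigma, X)))
}
x <- 3.7; nu <- 6; sigma2 <- 1.5
w <- ldwishart(matrix(x), nu, matrix(sigma2))
g <- dgamma(x, shape = nu / 2, rate = 1 / (2 * sigma2), log = TRUE)
all.equal(as.numeric(w), g)  # TRUE
```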
Stacking the random effects for all response endpoints, \(\boldsymbol{\gamma}_k = (\boldsymbol{\gamma}_{k1}^\top, \ldots, \boldsymbol{\gamma}_{kJ}^\top)^\top\). Since the random effects are assumed to follow a distribution in the location-scale family (either a multivariate \(t\)-distribution or a multivariate normal distribution), i.e. \(\boldsymbol{\gamma}_k \sim \mathcal{LS}(\boldsymbol{\gamma}, \Omega)\), the fixed-effect coefficient vector \(\boldsymbol{\beta}\) absorbs the random effects’ location parameter \(\boldsymbol{\gamma}\), forming \(\boldsymbol{\theta} = (\boldsymbol{\beta}^\top,\boldsymbol{\gamma}^\top)^\top\). The corresponding design matrix \(\boldsymbol{X}_{kt}\) is also expanded to include the random-effect design matrix \(\boldsymbol{W}_{kt}\), written as \(\boldsymbol{X}_{kt}^* = [\boldsymbol{X}_{kt}, \boldsymbol{W}_{kt}]\). With \(\boldsymbol{\gamma}_{k,o} = \boldsymbol{\gamma}_k - \boldsymbol{\gamma}\), the model now becomes \[\label{eq:model-centered} \begin{dcases} \mathbf{y}_{kt} = \mathbf{X}_{kt}^*\mathbf{\theta} + \mathbf{W}_{kt} \mathbf{\gamma}_{k,o} + \mathbf{\epsilon}_{kt},&\text{if J } \geq \text{ 2 (multivariate)}\\ \begin{split} y_{kt} &= {\mathbf{x}_{kt}^*}^\top\mathbf{\theta} + \mathbf{w}_{kt}^\top \mathbf{\gamma}_{k,o} + \epsilon_{kt}\\ &\Rightarrow \boldsymbol{y}_k = \boldsymbol{X}_k^* \boldsymbol{\theta} + \boldsymbol{W}_k \boldsymbol{\gamma}_{k,o} + \boldsymbol{\epsilon}_k \end{split} ,&\text{if J=1 (univariate)} \end{dcases}, \tag{2}\] where \(\boldsymbol{y}_k=(y_{k,t_{k1}},\ldots,y_{k,t_{kT_k}})^\top\), \(\boldsymbol{X}_k^*= ((\boldsymbol{x}^*_{k,t_{k1}})^\top,\ldots,(\boldsymbol{x}^*_{k,t_{kT_k}})^\top)^\top\), \(\boldsymbol{W}_k = (\boldsymbol{w}_{k,t_{k1}}^\top,\ldots,\boldsymbol{w}_{k,t_{kT_k}}^\top)^\top\), and \(\boldsymbol{\epsilon}_k=(\epsilon_{k,t_{k1}},\ldots,\epsilon_{k,t_{kT_k}})^\top\). We briefly restore the correct subscripts (i.e. 
\(t_{kl}\) for \(l=1,\ldots,T_k\)) to demonstrate that \(\boldsymbol{y}_k\), \(\boldsymbol{X}_k^*\), and \(\boldsymbol{W}_k\) may have different lengths and dimensions for different \(k\)'s. Here, \(\{t_{k1},\ldots,t_{kT_k}\}\) denotes the set of treatments compared in the \(k\)-th trial. The random effects are modeled differently in meta-analysis than in network meta-analysis. A major reason for this divergence is that variables explaining the treatment effects are hard to find when different trials compare varying numbers of treatments. The differences are further detailed in Section 4.3 and Section 4.4.
To streamline configuring models via an R formula, it is important to understand the data structure metapack expects. Table 1 shows a typical arm-level data set for (network) meta-analysis, where each row represents a trial arm.
| Outcome (\(\mathbf{y}_{kt}\)) | SD (\(s_{kt}\)) | DesignM1 (\(\mathbf{x}_{kt}\)) | DesignM2 (\(\mathbf{w}_{kt}\)) | Trial (\(k\)) | Treat (\(t\)) | n |
|---|---|---|---|---|---|---|
| \(\mathbf{y}_{14}^\top\) | \(s_{14}\) | \(\mathbf{x}_{14}^\top\) | \(\mathbf{w}_{14}^\top\) | 1 | 4 | 1000 |
| \(\mathbf{y}_{11}^\top\) | \(s_{11}\) | \(\mathbf{x}_{11}^\top\) | \(\mathbf{w}_{11}^\top\) | 1 | 1 | 545 |
| \(\mathbf{y}_{21}^\top\) | \(s_{21}\) | \(\mathbf{x}_{21}^\top\) | \(\mathbf{w}_{21}^\top\) | 2 | 1 | 1200 |
| \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |
`Outcome` is the response (or responses for multivariate cases), `SD` is the standard deviation(s) of the response(s), `DesignM1` and `DesignM2` are design matrices, and `n` is the arm sample size. The pair of trial and treatment indicators is unique to a row. The first design matrix, `DesignM1`, contains the covariates for fixed effects and will be written as \(\mathbf{X}\) henceforth. The second design matrix, `DesignM2` or \(\mathbf{W}\), represents different things depending on the model, which will be explained in Section 4.3 and Section 4.4. It should be noted that there can always be two design matrices, whose configuration will be illustrated in Section 4.
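For concreteness, a toy arm-level data frame in the shape of Table 1 might look as follows (column names and values are illustrative; metapack itself ships real examples such as `cholesterol` and `TNM`):

```r
# A minimal toy arm-level data set in the shape of Table 1.
df <- data.frame(
  y     = c(10.2, 12.5, 11.1, 13.0),  # Outcome
  sd    = c(2.1, 2.4, 1.9, 2.6),      # SD
  x1    = c(0.5, 0.5, 0.7, 0.7),      # a DesignM1 covariate
  w1    = c(1, 1, 1, 1),              # a DesignM2 covariate
  trial = c(1, 1, 2, 2),              # Trial (k)
  treat = c(4, 1, 1, 2),              # Treat (t)
  n     = c(1000, 545, 1200, 800)     # arm sample size
)
# The (trial, treat) pair identifies each row uniquely:
anyDuplicated(df[, c("trial", "treat")])  # 0
```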
A meta-analytic data set is characterized as follows: (1) univariate or multivariate and (2) meta-analysis or network meta-analysis. Here, meta-analysis refers to the case where every trial has exactly two treatments (i.e., \(t=1,2\) for all \(k\)) and all treatments are compared head to head. On the other hand, network meta-analysis includes more than two total treatments, where each trial can have a different set of treatments, allowing indirect comparison between treatments that are not compared head to head. The data structure is unchanged for network meta-analysis except that `Treat` can have more than two unique values. The first characterization (univariate vs. multivariate) is determined by the number of response endpoints, and the second (meta-analysis vs. network meta-analysis) by the number of treatments. All other modeling choices fall into prior specification.
`bmeta_analyze` is the main function in metapack, whose first argument is an R formula. `bmeta_analyze` internally parses the formula to identify a model and ultimately calls a worker function. An extension of R's formula class, Formula (Zeileis and Croissant 2010), accommodates multiple responses and parts, lending itself well to meta-analysis modeling. Once a model is fully identified, the MCMC algorithm is executed in C++, thanks to Rcpp (Eddelbuettel and Balamuta 2017) and RcppArmadillo (Eddelbuettel and Sanderson 2014).
| Name | Functionality | Description |
|---|---|---|
| `bmeta_analyze` | Estimation | Fits (network) meta-analysis models |
| `hpd` | Inference | Computes highest posterior density (HPD) intervals of model parameters |
| `model_comp` | Inference | Computes model comparison measures (DIC or LPML) |
| `print` | | Displays a summary of the output |
| `summary` | | Displays a summary of the output |
| `plot` | | Plots trace plots, density plots, or surface-under-the-cumulative-ranking-curve (SUCRA) plots |
| `fitted` | | Computes posterior means, standard deviations, and HPD intervals |
| `coef` | | Computes posterior means of fixed-effect coefficients |
| `cholesterol` | Data set | Cholesterol data for multivariate meta-analysis |
| `TNM` | Data set | Triglyceride data for network meta-analysis |
Formula
The three characterizations of a meta-analytic data set must be encoded in the formula. Requiring the formula to have two left-hand sides (LHS) and two or three right-hand sides (RHS) is enough to communicate the characterizations for a wide class of meta-analysis models. We invite other R package developers to adopt the following representation for meta-analytic models, the general form of which is given by
```r
y1 + y2 | sd1 + sd2 ~ x1 + x2 + x3 + ns(n) | w1 + w2 + w3 | treat + trial (+ groups)
```
Each part in the LHS or RHS is an R expression where variables (or functions of variables) are chained with a plus sign (`+`), e.g., `x1 + x2`. The tilde (`~`) separates all LHSs from all RHSs, each further separated into parts by vertical bars (`|`). The meaning of each part is syntactically determined by its location within the formula, like an English sentence. Therefore, every part must come in the exact order prescribed for `bmeta_analyze` to correctly identify the model.
- The first LHS (`y1 + y2`), the responses, is required of all models. Depending on the number of variables given in the first LHS, `bmeta_analyze` will determine whether the model is multivariate or univariate. For example, a first LHS with only `y` would flag the model as univariate.
- The second LHS (`sd1 + sd2`) supplies the standard deviations of the endpoints required of an aggregate-data meta-analysis. The function call will break if this part is missing.
- The first RHS (`x1 + x2 + x3 + ns(n)`) defines the fixed-effect covariates. For aggregate-data models, the arm sample sizes must be passed as an argument to `ns()`. In the example code, `n` is the variable containing the arm sample sizes.
- The second RHS (`w1 + w2 + w3`) defines either the random-effect covariates (\(\boldsymbol{w}_{kt}^\top \boldsymbol{\gamma}\)) or the variance-related covariates (\(\log \tau_{kt} = \boldsymbol{w}_{kt}^\top \boldsymbol{\phi}\))—see Sections 4.3 or 4.4 for details. This part is optional. If omitted, `bmeta_analyze` will assume \(\boldsymbol{w}_{kt}=\boldsymbol{1}\) where \(\boldsymbol{1}\) is a vector of ones.
- The last RHS (`treat + trial` or `treat + trial + groups`) corresponds to the treatment and trial indicators, and optionally the grouping variable if it exists. Variables here must be provided in the exact order shown in the example.

The dimension of the response(s) is explicit in the formula, which determines the first characterization. The treatments are coerced to a factor—if not already one—whose number of levels is extracted (i.e., `nlevels(treat)`) to resolve the second characterization, meta-analysis versus network meta-analysis.
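The two checks can be sketched with plain string manipulation (for illustration only; `bmeta_analyze` relies on the Formula class rather than string splitting):

```r
# Sketch of how the two characterizations can be read off the formula.
f <- "y1 + y2 | sd1 + sd2 ~ x1 + x2 + ns(n) | w1 | treat + trial"
sides <- strsplit(f, "~", fixed = TRUE)[[1]]
lhs <- strsplit(sides[1], "|", fixed = TRUE)[[1]]
rhs <- strsplit(sides[2], "|", fixed = TRUE)[[1]]
n_endpoints <- length(strsplit(lhs[1], "+", fixed = TRUE)[[1]])
multivariate <- n_endpoints >= 2           # first characterization
treat <- factor(c(1, 2, 1, 3))             # toy treatment column
network <- nlevels(treat) > 2              # second characterization
c(multivariate = multivariate, network = network)
```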
Aside from the first two arguments, `formula` and `data`, there are four other optional arguments that must be provided as R's `list` class: `prior`, `mcmc`, `control`, and `init`. All hyperparameters for the prior distributions should be included in `prior`—see Section 2 for hyperparameters. `mcmc` only concerns the numbers of MCMC iterations: `ndiscard` for the number of burn-in iterations, `nskip` for thinning, and `nkeep` for the posterior sample size. `control` configures the Metropolis-Hastings algorithm. `*_stepsize`, with the asterisk replaced with one of the parameter names, indicates the step size for determining the sample evaluation points in the localized Metropolis algorithm. Lastly, initial values for the parameters can be provided in `init` in case a user has a priori known high-quality starting points.
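These list-valued arguments follow the usual R pattern in which user-supplied entries override package defaults entry by entry; a minimal sketch of that pattern (the default values shown are illustrative, not metapack's documented defaults):

```r
# User-supplied list entries override defaults, as with base R's modifyList().
default_mcmc <- list(ndiscard = 1000L, nskip = 1L, nkeep = 1000L)
user_mcmc    <- list(ndiscard = 5000L, nkeep = 20000L)
mcmc <- modifyList(default_mcmc, user_mcmc)
str(mcmc)  # ndiscard = 5000, nskip = 1 (default kept), nkeep = 20000
```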
For meta-analysis models, metapack acknowledges the possible existence of first-line and second-line treatment trials. More generally, the trials may be grouped by a factor believed to generate disparate random effects. Although an arbitrary number of groups can exist in theory, we restrict our attention to two groups. Denoting the binary group indicators by \(u_{kt}\in \{0,1\}\) yields \[\label{eq:meta-first-second-line} y_{ktj} = \mathbf{x}_{ktj}^\top \mathbf{\beta} + (1-u_{kt})\mathbf{w}_{ktj}^\top \mathbf{\gamma}_{kj}^0 + u_{kt}\mathbf{w}_{ktj}^\top \mathbf{\gamma}_{kj}^1 + \epsilon_{ktj}. \tag{3}\] The random effects are modeled as \(\mathbf{\gamma}_{kj}^{l} \overset{\text{ind}}{\sim} \mathcal{N}({\mathbf{\gamma}_j^l}^*, \Omega_j^{l})\) and \((\Omega_j^{l})^{-1} \sim \mathcal{W}_{d_{0j}}(\Omega_{0j})\). Stacking the vectors, \(\mathbf{\gamma}_k^l = ((\mathbf{\gamma}_{k1}^l)^\top,\ldots,(\mathbf{\gamma}_{kJ}^l)^\top)^\top \sim \mathcal{N}({\mathbf{\gamma}^l}^*,\Omega^l)\), where \({\mathbf{\gamma}^l}^* = (({\mathbf{\gamma}_1^l}^*)^\top,\ldots,({\mathbf{\gamma}_J^l}^*)^\top)^\top\), \(\Omega_j =\Omega_j^0 \oplus \Omega_j^1\), and \(\Omega = \oplus_{j=1}^J \Omega_j\) for \(l\in \{0,1\}\). Adopting the noncentered parameterization (Bernardo et al. 2003), define \(\mathbf{\gamma}_{k,o}^l = \mathbf{\gamma}_k^l - {\mathbf{\gamma}^l}^*\). Denoting \(\mathbf{W}_{kt}^* = [(1-u_{kt})\mathbf{W}_{kt}, u_{kt}\mathbf{W}_{kt}]\), \(\mathbf{X}_{kt}^* = [\mathbf{X}_{kt}, \mathbf{W}_{kt}^*]\), \(\mathbf{\theta} = (\mathbf{\beta}^\top,{{\mathbf{\gamma}^0}^*}^\top, {{\mathbf{\gamma}^1}^*}^\top)^\top\), and \(\mathbf{\gamma}_{k,o} = ((\mathbf{\gamma}_{k,o}^0)^\top, (\mathbf{\gamma}_{k,o}^1)^\top)^\top\), the model is written as follows: \[\label{eq:final-mvmr} \mathbf{y}_{kt} = \mathbf{X}_{kt}^*\mathbf{\theta} + \mathbf{W}_{kt}^* \mathbf{\gamma}_{k,o} + \mathbf{\epsilon}_{kt}. 
\tag{4}\] If there is no distinction between the first-line and second-line therapies, then setting \(u_{kt} = 0\) for all \((k,t)\) reduces the model back to Equation (1). Finally, we assume \(\boldsymbol{\theta}\sim \mathcal{N}(\boldsymbol{0},c_0\boldsymbol{I})\) where \(\boldsymbol{I}\) is an identity matrix.
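The expanded design matrices in Equation (4) can be sketched for a single arm as follows (illustrative dimensions; `u` plays the role of the binary group indicator \(u_{kt}\)):

```r
# Sketch: W*_kt = [(1 - u) W_kt, u W_kt] and X*_kt = [X_kt, W*_kt].
W_kt <- matrix(c(1, 0.3), nrow = 1)  # 1 x q random-effect covariates
X_kt <- matrix(c(1, 2.5), nrow = 1)  # 1 x p fixed-effect covariates
u <- 1                               # second-line trial
W_star <- cbind((1 - u) * W_kt, u * W_kt)
X_star <- cbind(X_kt, W_star)
W_star  # first block zeroed out when u = 1
```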
A (boilerplate) template for this class of models is as follows:
```r
f <- "y1 + y2 | sd1 + sd2 ~ x1 + x2 | w1 + w2 | treat + trial + groups"
fit <- bmeta_analyze(formula(f), data = df,
  prior = list(c0 = [real], dj0 = [real], Omega0 = [matrix],
               a0 = [real], b0 = [real],
               d0 = [real], nu0 = [real], Sigma0 = [matrix]),
  control = list(sample_Rho = [logical], Rho_stepsize = [real],
                 R_stepsize = [real], delta_stepsize = [real], model = [string]))
```
We use `real` and `string` as aliases for `double` and `character` in R. Every bracketed expression should be replaced with an instance of the enclosed class. The hyperparameters in `prior` and the step sizes in `control` will be clarified in the following modeling options. Note that all parameters with a step size are sampled through the Metropolis-Hastings algorithm.
The covariance matrix between the response endpoints (\(\Sigma_{kt}\)) can be modeled depending on (1) the amount of data available and (2) the assumptions the practitioner is willing to make. The diagonal elements of \(\Sigma_{kt}\) are always identifiable, whereas the off-diagonal elements require additional modeling assumptions. metapack presently offers five options, specifiable through `model` in the `control` argument. For \(\mathcal{M}_2\)–\(\mathcal{M}_5\), the unobserved sample correlation matrices are sampled from their conditional distributions \((R_{kt}\mid V_{kt},\Sigma_{kt})\) given by
\[\label{eq:sample-correlation-density}
p(R_{kt}\mid V_{kt},\Sigma_{kt}) \propto |R_{kt}|^{(n_{kt}-J-2)/2} \exp\left\{-\dfrac{(n_{kt}-1)}{2}\mathrm{tr}\left(V_{kt}^{\frac{1}{2}}\Sigma_{kt}^{-1}V_{kt}^{\frac{1}{2}}R_{kt} \right) \right\}. \tag{5}\]
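For illustration, the unnormalized log-density in Equation (5), as one would evaluate it inside a Metropolis step, can be written as follows (a sketch, not metapack's internal code):

```r
# Sketch: unnormalized log-density of Equation (5) for a candidate
# sample correlation matrix R_kt, given the diagonal matrix V of
# observed sample variances and the current Sigma_kt.
log_post_R <- function(R, V, Sigma, n) {
  J <- nrow(R)
  Vh <- diag(sqrt(diag(V)))          # V^{1/2} for diagonal V
  ((n - J - 2) / 2) * determinant(R)$modulus -
    ((n - 1) / 2) * sum(diag(Vh %*% solve(Sigma) %*% Vh %*% R))
}
R <- matrix(c(1, 0.4, 0.4, 1), 2, 2)
V <- diag(c(2.0, 3.5))
Sigma <- matrix(c(2, 1, 1, 4), 2, 2)
as.numeric(log_post_R(R, V, Sigma, n = 50))
```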
- \(\mathcal{M}_1\) (`model="NoRecovery"`): The simplest and easiest way to model the covariance matrices is to relinquish correlation recovery, in which case Equation (5) is ignored. We assume \(\Sigma_{kt} = \mathrm{diag}(\sigma_{kt,11}^2,\ldots, \sigma_{kt,JJ}^2)\) with \(\sigma_{kt,jj}^2 \sim \mathcal{IG}(a_0,b_0)\) for \(a_0,b_0>0\), where \(\mathcal{IG}(a,b)\) denotes the inverse-gamma distribution whose density function is proportional to \(x^{-(a+1)} \exp(-b/x)\) for \(x>0\). For univariate meta-analyses, this is the only valid option since there are no off-diagonal entries.
- \(\mathcal{M}_2\) (`model="EquiCovariance"`): We can assume the covariance matrix is identical for all pairs of treatments and trials. That is, \(\Sigma_{kt} = \Sigma\) for every combination of \((k,t)\) for \(t=1,\ldots,T\) and \(k=1,\ldots,K\). A Wishart prior distribution is assumed for \(\Sigma^{-1}\), i.e., \(\Sigma^{-1} \sim \mathcal{W}_{s_0}(\Sigma_0)\) for \(s_0>J-1\) and \(\Sigma_0\in \mathcal{S}_{++}^J\).
- \(\mathcal{M}_3\) (`model="EquiWithinTreat"`): If equal covariance is too strong an assumption, we can allow the covariance matrices to be equivalent within a treatment. That is, \(\Sigma_{kt} = \Sigma_t\) for \(t=1,\ldots,T\). Similar to \(\mathcal{M}_2\), a Wishart prior distribution is assumed for \(\Sigma_t^{-1}\), i.e., \(\Sigma_t^{-1} \sim \mathcal{W}_{s_0}(\Sigma_0)\).
- \(\mathcal{M}_4\) (`model="EquiCorrelation"`): To achieve the best of both worlds—variances and correlations—we can let the variances enjoy maximum freedom but attempt to recover the correlations by restricting them to be identical across treatments and trials for identifiability. Performing the decomposition \[\Sigma_{kt} = \mathbf{\delta}_{kt}\mathbf{\rho} \mathbf{\delta}_{kt},\] where \(\mathbf{\delta}_{kt} = \mathrm{diag}(\Sigma_{kt,11}^{1/2},\ldots,\Sigma_{kt,JJ}^{1/2})\) and \(\mathbf{\rho}\) is the correlation matrix, the elements in \(\mathbf{\delta}_{kt}\) and \(\mathbf{\rho}\) are sampled through the Metropolis-Hastings algorithm.
- \(\mathcal{M}_5\) (`model="Hierarchical"`): The hierarchical prior for \(\Sigma_{kt}\) is given by \((\Sigma_{kt}^{-1}\mid \Sigma) \sim \mathcal{W}_{\nu_0}((\nu_0-J-1)^{-1}\Sigma^{-1})\) and \(\Sigma \sim \mathcal{W}_{d_0}(\Sigma_0)\). By allowing the \(\Sigma_{kt}\)'s to differ but having them share information across treatments and trials via \(\Sigma\), this assumption aims to control the between-treatment and between-trial variations simultaneously. Since the amount of information shared between treatments and trials is controlled by \(\nu_0\) \((>J-1)\), it is advised to try multiple values for \(\nu_0\) and perform a model assessment through the deviance information criterion (DIC) or the logarithm of the pseudo-marginal likelihood (LPML). \(\Sigma\) is further decomposed for the Metropolis-Hastings algorithm as \(\Sigma=\Delta \boldsymbol{\rho}\Delta\), where \(\Delta = \mathrm{diag}(\delta_1,\ldots,\delta_J)\) is the diagonal matrix of standard deviations, and \(\boldsymbol{\rho}\) is a correlation matrix with unit diagonal elements.

For univariate network meta-analysis, the design matrix for random effects is restricted to be the selection matrix \(E_k^\top = (e_{t_{k1}}, e_{t_{k2}}, \ldots, e_{t_{k T_k}})^\top\), where \(e_{t_{kl}}=(0,\ldots, 1,\ldots,0)^\top\), \(l=1,\ldots,T_k\), with the \(t_{kl}\)th element set to 1 and 0 otherwise, and \(T_k\) is the number of treatments included in the \(k\)-th trial. Furthermore, we redefine \(\boldsymbol{\gamma}_{k,o}\) to be \(\boldsymbol{\gamma}_{k,o}:= E_k^\top \boldsymbol{\gamma}_k\), a \(T_k\)-dimensional vector of scaled random effects. The random effects follow \(\boldsymbol{\gamma}_k \sim t_T(\boldsymbol{\gamma},\boldsymbol{\rho},\nu)\), where \(t_T(\boldsymbol{\mu},\Sigma,\nu)\) denotes a multivariate \(t\)-distribution with \(\nu\) degrees of freedom, a location parameter vector \(\boldsymbol{\mu}\), and a scale matrix \(\Sigma\). \(T\) indicates the number of distinct treatments in all trials. The random effects \(\boldsymbol{\gamma}_k\) are scaled since \(\boldsymbol{\rho}\) is a correlation matrix with unit diagonal entries, and the variance components can be modeled as a multiplicative term. That is, with \(\mathbf{W}_k(\mathbf{\phi}) = \mathrm{diag}(\exp(\mathbf{w}_{kt_{k1}}^\top \mathbf{\phi}), \ldots, \exp(\mathbf{w}_{kt_{k T_{k}}}^\top \mathbf{\phi}))\), the model is recast as \[\mathbf{y}_{k} = \mathbf{X}_k^*\mathbf{\theta} + \mathbf{W}_k(\mathbf{\phi})\mathbf{\gamma}_{k,o} + \mathbf{\epsilon}_k,\] where \(\mathbf{X}_k^* = (\mathbf{X}_k, E_k^\top)\), \(\mathbf{\theta} = (\mathbf{\beta}^\top, \mathbf{\gamma}^\top)^\top\), \(\mathbf{\epsilon}_k \sim \mathcal{N}_{T_k}(\mathbf{0},\Sigma_k)\), and \(\Sigma_k = \mathrm{diag}\left(\frac{\sigma_{kt_{k1}}^2}{n_{kt_{k1}}}, \ldots, \frac{\sigma_{kt_{k T_{k}}}^2}{n_{kt_{k T_{k}}}}\right)\). This allows \(\exp(\boldsymbol{w}_{kt}^\top \boldsymbol{\phi})\) to be the standard deviation of \(\gamma_{kt}\).
Since the multivariate \(t\)-random effects are not analytically marginalizable, we represent it as a scale mixture of normals as \[\label{eq:nma-hierarchical-re} (\mathbf{\gamma}_{k,o}\mid \lambda_k) \overset{\text{ind}}{\sim}\mathcal{N}_{T_k} \left(\mathbf{0},\lambda_k^{-1}(E_k^\top \mathbf{\rho}E_k) \right),\qquad \lambda_k \overset{\text{iid}}{\sim}\mathcal{G}a\left(\frac{\nu}{2},\frac{\nu}{2} \right), \tag{6}\] where \(\mathcal{G}a(a,b)\) indicates the gamma distribution with mean \(a/b\). Finally, \(\boldsymbol{\theta} \sim \mathcal{N}(\boldsymbol{0},c_{01}\boldsymbol{I})\) and \(\boldsymbol{\phi} \sim \mathcal{N}(\boldsymbol{0},c_{02}\boldsymbol{I})\).
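A draw of the scaled random effects under Equation (6) can be sketched as follows (toy correlation matrix and dimensions; not metapack's sampler):

```r
# Scale mixture of normals: draw lambda_k ~ Gamma(nu/2, nu/2), then
# gamma_k,o | lambda_k ~ N(0, rho / lambda_k); marginally, gamma_k,o
# follows a multivariate t with nu degrees of freedom.
set.seed(1)
nu <- 5
rho <- matrix(c(1, 0.3, 0.3, 1), 2, 2)
lambda_k <- rgamma(1, shape = nu / 2, rate = nu / 2)
L <- chol(rho / lambda_k)                  # upper-triangular Cholesky factor
gamma_ko <- as.vector(t(L) %*% rnorm(2))   # one N(0, rho / lambda_k) draw
gamma_ko
```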
A (boilerplate) template for this class of models is as follows:
```r
f <- "y | sd ~ x1 + x2 | w1 + w2 | treat + trial"
fit <- bmeta_analyze(formula(f), data = df,
  prior = list(c01 = [real], c02 = [real], df = [real],
               a4 = [real], b4 = [real], a5 = [real], b5 = [real]),
  control = list(sample_df = [logical], sample_Rho = [logical],
                 Rho_stepsize = [real], phi_stepsize = [real], lambda_stepsize = [real]))
```
Every bracketed expression should be replaced with an instance of the enclosed class. The hyperparameters in `prior` and the step sizes in `control` will be clarified in the following modeling options. Note that all parameters with a step size are sampled through the Metropolis-Hastings algorithm.
The appeal of considering heavy-tailed random effects and modeling the variance as a deterministic linear function of covariates is that both extend, yet still cover, the common cases (normal random effects and no variance modeling) as either the limiting or a special case. Unlike meta-analysis, there is no single argument (`model`) determining the modeling option. A model is rather specified by a combination of arguments.

- To opt out of variance modeling, drop the second RHS from the formula, e.g., `y | sd ~ x1 + x2 | treat + trial`.
- `bmeta_analyze` treats \(\nu\) (`df`) as part of the prior specification. Thus, `df=Inf` in the `prior` argument corresponds to opting for normal random effects.
- `sample_df` is placed in the `control` argument. If `sample_df=TRUE`, the value for `df` in `prior` will be used as the initial value, and the related hyperparameters (`a4`, `b4`, `a5`, and `b5`) can be set in `prior`. For obvious reasons, `df` cannot be assigned `Inf` if `sample_df=TRUE`.

The object (`fit`) returned from `bmeta_analyze` remembers the function arguments, encapsulates the model specification, and contains the posterior sample from the MCMC algorithm. Therefore, this object alone can be passed to other methods to perform subsequent inferences. These methods include `fitted`, `hpd`, `coef`, `model_comp`, and `sucra`.
The posterior means, standard deviations, and HPD intervals are computed via the `fitted()` function, which has two optional arguments: `level` and `HPD`. `level` determines the credibility level of the interval estimation. `HPD`, a logical parameter, decides whether a highest posterior density or an equal-tailed credible interval will be produced. It is also possible to obtain only the posterior interval estimates, using `hpd()`.
```r
R> est <- fitted(fit, level = 0.99, HPD = TRUE)
R> hpd <- hpd(fit, level = 0.95, HPD = TRUE)
```
`coef(fit)` allows users to extract the posterior means of the fixed-effect coefficients.
It is crucial to determine a suitable model to base the statistical inference on. Section 4.3 introduces five models for \(\Sigma_{kt}\), and Section 4.4 contains infinitely many models since the degrees of freedom \(\nu\) for the random effects can assume any number on the positive real line including infinity. Such a circumstance calls for a principled way of comparing models as well as evaluating goodness of fit.
The deviance information criterion (Spiegelhalter et al. 2002), or DIC, is defined as follows: \[\mathrm{DIC} = \mathrm{Dev}(\bar{\mathbf{\eta}}) +2 p_D,\] where \(\mathbf{\eta}\) indicates all model parameters for which \(\bar{\mathbf{\eta}} = \mathbb{E}[\mathbf{\eta}\mid D_{\text{obs}}]\). \(\mathrm{Dev}(\mathbf{\eta})\) is the deviance function given by \(\mathrm{Dev}(\mathbf{\eta}) = -2 \log L_{o\mathbf{y}}(\mathbf{\eta}\mid D_{\text{obs}})\), for which \(L_{o\mathbf{y}}\) is the observed-data likelihood associated with \(\mathbf{y}\), and \(p_D\) is defined as \(p_D = \overline{\mathrm{Dev}(\mathbf{\eta})} -\mathrm{Dev}(\bar{\mathbf{\eta}})\) where \(\overline{\mathrm{Dev}(\mathbf{\eta})} =\mathbb{E}[\mathrm{Dev}(\mathbf{\eta})\mid D_{\text{obs}}]\).
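The quantities above can be illustrated on a toy normal model with known variance (everything here is illustrative; metapack computes the DIC from its own observed-data likelihood):

```r
# Sketch: DIC = Dev(eta_bar) + 2 * p_D from a posterior sample,
# for a toy normal-mean model with known unit variance.
set.seed(42)
y <- rnorm(30, mean = 1)
mu_draws <- rnorm(2000, mean = mean(y), sd = 1 / sqrt(length(y)))  # toy posterior
dev <- function(mu) -2 * sum(dnorm(y, mean = mu, sd = 1, log = TRUE))
dev_bar <- mean(vapply(mu_draws, dev, numeric(1)))  # E[Dev(eta) | D_obs]
dev_at_mean <- dev(mean(mu_draws))                  # Dev(eta_bar)
p_D <- dev_bar - dev_at_mean                        # effective number of parameters
DIC <- dev_at_mean + 2 * p_D
```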
R> dic <- model_comp(fit, type = "dic", verbose = TRUE, ncores = 3)
Another Bayesian model selection criterion is the logarithm of the pseudo-marginal likelihood (LPML), defined as the summed logarithm of the conditional predictive ordinates (CPO). The CPO has a "leave-one-out" predictive interpretation, which is easy to overlook in meta-analyses, and overlooking it undermines model comparison: the aggregate nature of meta-analysis data calls for a redefinition of what is "left out" and what should be the base unit for prediction. With trials as the base unit, the CPO for the \(k\)-th trial is \[\mathrm{CPO}_k = \int L(\mathbf{\eta}\mid D_{o\mathbf{y}})p(\mathbf{\eta}\mid D_{o\mathbf{y}}^{-k},D_{o\mathbf{s}})\,\mathrm{d}\mathbf{\eta},\] where \(D_{o\mathbf{y}}^{-k}\) is \(D_{o\mathbf{y}}\) with the \(k\)-th trial removed, and \(p(\mathbf{\eta}\mid D_{o\mathbf{y}}^{-k},D_{o\mathbf{s}})\) is the posterior distribution based on the data without the \(k\)-th trial. Then, \(\mathrm{LPML} = \sum_{k=1}^K \log (\mathrm{CPO}_k)\).
R> lpml <- model_comp(fit, type = "lpml", verbose = TRUE, ncores = 3)
Treatments included in only one trial violate the predictive interpretation of the CPO. The corresponding trials thus should be removed from the LPML calculation. `model_comp` returns the logarithm of the CPO of every trial but corrects the LPML should such treatments exist. On those occasions, a naive sum of the logarithms of the \(\mathrm{CPO}_k\)'s will not equal the corrected LPML in the returned object.
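A common Monte Carlo estimator of \(\mathrm{CPO}_k\) is the harmonic mean of the \(k\)-th trial's likelihood over the posterior draws, \(\widehat{\mathrm{CPO}}_k = \big(M^{-1}\sum_{m=1}^M 1/L_k(\mathbf{\eta}^{(m)})\big)^{-1}\). The sketch below (our Python illustration, not metapack's C++ code; the per-trial log-likelihood matrix `loglik` is an assumed input) computes the log-CPOs and the LPML this way, working on the log scale for stability:

```python
import math

def log_cpo_and_lpml(loglik, keep=None):
    """loglik[m][k]: log-likelihood of trial k under posterior draw m.

    Returns the per-trial log(CPO_k) via the harmonic-mean estimator
    and the LPML summed over the trials in `keep` (all by default).
    """
    n_draws, n_trials = len(loglik), len(loglik[0])
    log_cpo = []
    for k in range(n_trials):
        # log CPO_k = -log( mean_m exp(-loglik[m][k]) ), max-stabilized
        neg = [-loglik[m][k] for m in range(n_draws)]
        c = max(neg)
        log_mean_inv = c + math.log(sum(math.exp(v - c) for v in neg) / n_draws)
        log_cpo.append(-log_mean_inv)
    keep = range(n_trials) if keep is None else keep
    return log_cpo, sum(log_cpo[k] for k in keep)
```

The `keep` argument mimics the correction described above: excluding trials whose treatments appear nowhere else makes the returned LPML differ from the naive sum of all log-CPOs.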
The DIC and LPML for the meta-analysis models are relatively straightforward to compute since the observed-data likelihood is a multivariate normal density. The network meta-analysis models, however, pose two technical challenges, slow computation and numerical overflow, because the observed-data likelihood involves an analytically intractable integral when \(t\)-random effects are used:
\[\begin{aligned}
L_{o\mathbf{y}}(\mathbf{\eta}\mid D_\text{obs}) = \prod_{k=1}^K \int_0^\infty &(2\pi)^{-\frac{T_k}{2}}g(\lambda_k)\left| \lambda_k^{-1}(\mathbf{W}_k(\mathbf{\phi})E_k^\top\mathbf{\rho}E_k \mathbf{W}_k(\mathbf{\phi})) + \Sigma_k \right|^{-\frac{1}{2}}\\
&\times \exp\left\{-\dfrac{(\mathbf{y}_k - \mathbf{X}_k^*\mathbf{\theta})^\top\left[\lambda_k^{-1}\mathbf{W}_k(\mathbf{\phi})E_k^\top \mathbf{\rho}E_k \mathbf{W}_k(\mathbf{\phi})+\Sigma_k \right]^{-1}(\mathbf{y}_k - \mathbf{X}_k^*\mathbf{\theta})}{2}\right\}\,\mathrm{d}\lambda_k,
\end{aligned}\]
where \(g(\lambda_k)\) is the gamma density with shape and rate parameters both equal to \(\nu/2\). The integral is evaluated via double exponential (DE) quadrature, or equivalently tanh-sinh quadrature (Takahasi and Mori 1974; Bailey et al. 2005), available in the math header of the BH package (Eddelbuettel et al. 2019). DE quadrature is robust to singularities, terminates fast, and provides high precision (Bailey et al. 2005). We address the slow speed through shared-memory multiprocessing via OpenMP (OpenMP Architecture Review Board 2018). OpenMP is a widely used application programming interface (API) for portable and scalable parallel processing in C, C++, and Fortran across many operating systems. As long as R is configured for OpenMP, metapack will deploy parallelism. Unless the argument `ncores` specifies otherwise, `model_comp` will use two CPU cores for parallel computing by default.
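To illustrate why tanh-sinh quadrature handles endpoint singularities gracefully, the following self-contained Python sketch (an illustration of the method itself, not the BH implementation metapack calls) applies the substitution \(x = \tanh(\tfrac{\pi}{2}\sinh t)\) to \(\int_0^1 u^{-1/2}\,\mathrm{d}u = 2\), whose integrand is unbounded at the origin:

```python
import math

def tanh_sinh_unit_interval(f_of_dist, h=0.05, t_max=3.5):
    """Tanh-sinh quadrature of integral_0^1 f(u) du.

    `f_of_dist(d)` evaluates f at u = d, where d is the distance from
    the left endpoint 0; passing the distance (computed in closed form
    below) avoids catastrophic cancellation near the singularity.
    Nodes u_j = (1 + tanh((pi/2) sinh(j h))) / 2 cluster doubly
    exponentially at both endpoints, so the weights decay faster than
    any integrable endpoint singularity grows.
    """
    total = 0.0
    n = int(t_max / h)
    for j in range(-n, n + 1):
        s = 0.5 * math.pi * math.sinh(j * h)
        # distance of node from 0: (1 + tanh(s))/2 = 1/(1 + exp(-2s))
        d = 1.0 / (1.0 + math.exp(-2.0 * s))
        # weight: du = (pi/4) cosh(jh) / cosh(s)^2 dt, times step h
        w = 0.25 * math.pi * h * math.cosh(j * h) / math.cosh(s) ** 2
        total += w * f_of_dist(d)
    return total
```

Despite the \(u^{-1/2}\) singularity at zero, the rule converges to the exact value 2 with a modest number of nodes, which is the robustness property the text appeals to.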
To prevent overflow, the integrand is evaluated on the log scale and exponentially shifted, i.e., the maximum of the log terms is subtracted before exponentiation and added back afterward. Although the exponential shifting scheme does not guarantee that every occurrence of numerical overflow is prevented, we have observed stable evaluations of the integral across several thousand batches of simulations.
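The idea behind such exponential shifting is the familiar log-sum-exp trick. A minimal Python sketch (our illustration, not metapack's C++ code) applied to a quadrature-style weighted sum:

```python
import math

def log_weighted_sum_exp(log_terms, weights):
    """Computes log( sum_i w_i * exp(log_terms[i]) ) without overflow
    or underflow by shifting every term by the largest one."""
    c = max(log_terms)  # the shift; exp(t - c) <= 1 for every term
    return c + math.log(sum(w * math.exp(t - c)
                            for t, w in zip(log_terms, weights)))
```

With `log_terms = [-1000.0, -1001.0]` and unit weights, a naive `math.log(sum(math.exp(t) for t in log_terms))` underflows to `log(0)`, whereas the shifted version returns \(-1000 + \log(1 + e^{-1})\).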
It is important to diagnose whether the results are consistent with the assumptions and to visualize the findings. metapack provides methods for these: `plot` and `sucra`. The `plot` method is available for both meta-analysis and network meta-analysis, whereas `sucra` is exclusively for network meta-analysis.

The `plot` method will take the `fit` object and generate the density plots and trace plots of \(\mathbf{\theta}\). To see the plots for other parameters, run the following commands using coda (Plummer et al. 2006):
R> library("coda")
R> posterior <- as.mcmc(data.frame(gammaR = fit$mcmc.draws$gamR,
+      sig2 = fit$mcmc.draws$sig2))
R> plot(posterior)
Similarly, boa (Smith 2007) can be used for output analysis. The posterior sample of \(\mathbf{\rho}\) comes as a three-dimensional array, which requires suitable indexing to generate trace plots for the off-diagonal lower-triangular elements. To generate such trace plots, run the following commands:
R> idx <- lower.tri(fit$mcmc.draws$Rho[,,1])
R> n_idx <- ncol(idx) * (ncol(idx) - 1) / 2
R> posterior <- as.mcmc(data.frame(
+      rho = t(vapply(1:fit$mcmc$nkeep, function(ikeep) {
+        rho_i <- fit$mcmc.draws$Rho[,,ikeep]
+        rho_i[idx]
+      }, FUN.VALUE = numeric(n_idx)))))
R> plot(posterior)
The surface under the cumulative ranking (SUCRA) curve is useful when the ranking of the treatments in a network meta-analysis is of interest. Based on the posterior sample, let \(\mathbb{P}(t,r)\) denote the probability that treatment \(t\) is ranked \(r\), for \(t=1,\ldots,T\) and \(1\leq r \leq T\). Let \(\mathbf{P}\) be the \(T\times T\) discrete rank (row-stochastic) probability matrix whose \((t,r)\)th element is \(\mathbb{P}(t,r)\). The cumulative probability is then computed through \(F(t,x) = \sum_{r=1}^x \mathbb{P}(t,r),\) where \(F(t,x)\) is the probability that the \(t\)-th treatment is ranked \(x\) or better. Since \(F(t,T)=1\) for every \(t\), the surface under the cumulative ranking distribution for the \(t\)-th treatment is given by \[\mathrm{SUCRA}(t) = \dfrac{1}{T-1}\sum_{x=1}^{T-1}F(t,x).\]
`sucra` will take the `fit` object and return the SUCRA and the discrete rank probability matrix \(\mathbf{P}\).

R> s <- sucra(fit)
R> s$SUCRA
R> s$rankprob
If only plotting is needed, storing the `sucra` object can be bypassed by running `plot(sucra(fit))`. Note that the SUCRA \(s\) of a treatment is in bijection with its mean rank \(r\) through \(r = 1 + (1-s)(T-1)\), where \(T\) is the number of treatments.
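The computation behind `sucra` follows directly from the definitions: cumulate each row of \(\mathbf{P}\) and average the first \(T-1\) cumulative probabilities. The Python sketch below (ours, for illustration only) also exposes the mean-rank bijection stated above:

```python
def sucra(rank_prob):
    """rank_prob[t][r]: probability that treatment t has rank r+1.
    Returns SUCRA(t) = (1/(T-1)) * sum_{x=1}^{T-1} F(t, x)."""
    T = len(rank_prob)
    scores = []
    for row in rank_prob:
        cum, f_sum = 0.0, 0.0
        for p in row[:-1]:   # x = 1, ..., T-1; F(t, T) = 1 is excluded
            cum += p         # running value of F(t, x)
            f_sum += cum
        scores.append(f_sum / (T - 1))
    return scores

def mean_rank(rank_prob):
    """Posterior mean rank of each treatment, sum_r r * P(t, r)."""
    return [sum((r + 1) * p for r, p in enumerate(row)) for row in rank_prob]
```

For a toy matrix with three treatments, a SUCRA of 0.75 corresponds to a mean rank of \(1 + (1 - 0.75)(3 - 1) = 1.5\).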
metapack includes a data set, `cholesterol`, which consists of 26 double-blind, randomized, active- or placebo-controlled clinical trials on patients with primary hypercholesterolemia, sponsored by Merck & Co., Inc., Kenilworth, NJ, USA (Yao et al. 2015). The data set can be loaded by running `data("cholesterol")`. The `cholesterol` data set has three endpoints: low-density lipoprotein cholesterol (`pldlc`), high-density lipoprotein cholesterol (`phdlc`), and triglycerides (`ptg`). The percent changes from baseline in the endpoints, the variables prefixed by `p`, are the aggregate responses, accompanied by the corresponding standard deviations prefixed by `sd`.
| Variable | Description |
|---|---|
| `study` | study identifier |
| `trial` | trial identifier |
| `treat` | treatment indicator for Statin or Statin+Ezetimibe |
| `n` | the number of participants in the study arms |
| `pldlc` | aggregate percentage change in LDL-C |
| `phdlc` | aggregate percentage change from baseline in HDL-C |
| `ptg` | aggregate percentage change from baseline in triglycerides (TG) |
| `sdldl` | sample standard deviation of percentage change in LDL-C |
| `sdhdl` | sample standard deviation of percentage change in HDL-C |
| `sdtg` | sample standard deviation of percentage change in triglycerides (TG) |
| `onstat` | whether the participants were on Statin prior to the trial |
| `bldlc` | baseline LDL-C |
| `bhdlc` | baseline HDL-C |
| `btg` | baseline triglycerides (TG) |
| `age` | age in years |
| `white` | the proportion of white participants |
| `male` | the proportion of male participants |
| `dm` | the proportion of participants with diabetes mellitus |
| `durat` | duration in weeks |
Variable documentation is also available in `help("cholesterol")`.
R> set.seed(2797542)
R> f_1 <- 'pldlc + phdlc + ptg | sdldl + sdhdl + sdtg ~
+      0 + bldlc + bhdlc + btg + age + durat + white + male + dm + ns(n) | treat |
+      treat + trial + onstat'
R> fit_ma <- bmeta_analyze(formula(f_1), data = cholesterol,
+      prior = list(model = "NoRecovery"),
+      mcmc = list(ndiscard = 1000, nkeep = 1000),
+      control = list(scale_x = TRUE, verbose = TRUE))
`summary` can be used to summarize the posterior sample from the fit object. `summary` will name the variables accordingly in the output, suffixed by the index \(j\) of the corresponding outcome variable \(y_{ktj}\). For example, `bldlc_1` in the output below corresponds to the baseline LDL-C's coefficient associated with the first endpoint (`pldlc`). The summary table consists of the posterior mean, the posterior standard deviation, the HPD lower bound, and the HPD upper bound. Users may choose to compute equal-tailed credible intervals (CI) instead of the HPD intervals by setting `HPD = FALSE`.
R> summary(fit_ma, HPD = TRUE, level = 0.95)
Call:
bmeta_analyze(formula = formula(f_1), data = cholesterol,
    prior = list(model = "NoRecovery"),
    mcmc = list(ndiscard = 1000, nkeep = 1000), control = list(scale_x = TRUE,
    verbose = TRUE))

Fixed-effects:
                      Post.Mean  Std.Dev HPD(Lower) HPD(Upper)
bldlc_1                  0.1394   0.0938    -0.0532     0.3051
bhdlc_1                 -0.5146   0.6329    -1.7641     0.7243
btg_1                    0.0018   0.0818    -0.1693     0.1607
age_1                    0.3785   0.4389    -0.5308     1.1648
durat_1                  0.9699   0.5234    -0.1658     1.9979
white_1                 -5.2449   7.4048   -19.7345     8.9764
male_1                  -1.2652  12.3948   -25.6807    23.0538
dm_1                     0.3987   5.3392   -10.4995     9.4673
bldlc_2                 -0.0128   0.0137    -0.0393     0.0134
bhdlc_2                 -0.0294   0.1280    -0.2887     0.2149
btg_2                    0.0228   0.0212    -0.0203     0.0619
age_2                   -0.0963   0.0735    -0.2422     0.0460
durat_2                 -0.0007   0.0712    -0.1320     0.1382
white_2                  5.5822   1.3044     3.0802     8.1901
male_2                  -2.1449   2.6956    -7.5975     2.6164
dm_2                    -1.0457   1.0923    -3.1816     1.0341
bldlc_3                  0.0141   0.0406    -0.0719     0.0853
bhdlc_3                  0.0082   0.2637    -0.5278     0.4872
btg_3                   -0.0734   0.0467    -0.1598     0.0154
age_3                   -0.1385   0.2043    -0.5077     0.2591
durat_3                  0.0996   0.2546    -0.3718     0.6080
white_3                 -0.7053   5.2345   -11.0570     8.9591
male_3                  14.5482   7.1735     0.0642    28.2959
dm_3                     5.0535   2.9542    -1.4157    10.5545
(Intercept)*(1-2nd)_1  -42.8675   3.2934   -48.8356   -35.8794
treat*(1-2nd)_1        -12.1435   1.1211   -14.4040   -10.0135
(Intercept)*2nd_1       -3.2219   3.0426    -9.3109     2.7419
treat*2nd_1            -20.0843   1.5235   -23.3169   -17.2637
(Intercept)*(1-2nd)_2    5.1425   0.5840     3.8240     6.1036
treat*(1-2nd)_2          2.0787   0.4718     1.0663     3.0222
(Intercept)*2nd_2        0.7346   0.5814    -0.3377     1.8738
treat*2nd_2              1.3482   0.3116     0.7579     1.9880
(Intercept)*(1-2nd)_3  -18.5035   2.1543   -22.3232   -13.7985
treat*(1-2nd)_3         -4.7726   0.9952    -6.7784    -2.9511
(Intercept)*2nd_3       -4.1805   1.2910    -6.7672    -1.7005
treat*2nd_3             -8.8579   0.9443   -10.6208    -7.0530
---------------------------------------------------
*HPD level: 0.95
The suffix `_j`, where `j` can be 1, 2, or 3, corresponds to the response endpoint. Since `scale_x = TRUE` is equivalent to `scale(<var>, center = TRUE, scale = TRUE)`, the covariates have been centered, which affects the interpretation of the intercepts. This allows us to interpret `(Intercept)*(1-2nd)_1 = -42.8675` as the statin effect in the first-line studies, where `2nd` represents the indicator variable for second-line studies, evaluating to one if second-line and zero otherwise. On the other hand, the coefficient estimate -12.1435 for `treat*(1-2nd)_1` is the Statin+Ezetimibe effect, compared to administering statin alone. For the second-line studies, where patients had already been on statin, `(Intercept)*2nd_1 = -3.2219` came out insignificant according to the 95% HPD interval, as anticipated, because the treatment for this group was merely a continuation of taking statin. The coefficient estimate -20.0843 for `treat*2nd_1` shows that ezetimibe on top of statin has a greater cholesterol-lowering effect than statin alone.
`print` is similar to `summary` but additionally prints the model specification. The output from `print(fit_ma)` is given as follows.
R> print(fit_ma, HPD = TRUE, level = 0.95)
Call:
bmeta_analyze(formula = formula(f_1),
    data = cholesterol, prior = list(model = "NoRecovery"),
    mcmc = list(ndiscard = 1000, nkeep = 1000),
    control = list(scale_x = TRUE, verbose = TRUE))

Model:
(Aggregate mean)
    y_kt = X_kt * theta + W_kt * gamma_k + N(0, Sigma_kt / n_kt)
(Sample Variance)
    (n_kt - 1) S_kt ~ Wishart(n_kt - 1, Sigma_kt)
(Random effects)
    [gamma_k | Omega] ~ N(0, Omega)
Priors:
    theta ~ MVN(0, 1e+05 * I_p)
    Omega_j^{-1} ~ Wishart( 2.1 , Omega0)
    Sigma_kt = diag(sig_{tk,11}^2, ..., sig_{tk,JJ}^2)
    where sig_{tk,jj}^2 ~ IG( 0.1 , 3.1 )
---------------------------------------------------
Number of trials: 26
Number of arms: 52
Number of treatments: 2
                      Post.Mean  Std.Dev HPD(Lower) HPD(Upper)
bldlc_1                  0.1394   0.0938    -0.0532     0.3051
bhdlc_1                 -0.5146   0.6329    -1.7641     0.7243
btg_1                    0.0018   0.0818    -0.1693     0.1607
age_1                    0.3785   0.4389    -0.5308     1.1648
durat_1                  0.9699   0.5234    -0.1658     1.9979
white_1                 -5.2449   7.4048   -19.7345     8.9764
male_1                  -1.2652  12.3948   -25.6807    23.0538
dm_1                     0.3987   5.3392   -10.4995     9.4673
bldlc_2                 -0.0128   0.0137    -0.0393     0.0134
bhdlc_2                 -0.0294   0.1280    -0.2887     0.2149
btg_2                    0.0228   0.0212    -0.0203     0.0619
age_2                   -0.0963   0.0735    -0.2422     0.0460
durat_2                 -0.0007   0.0712    -0.1320     0.1382
white_2                  5.5822   1.3044     3.0802     8.1901
male_2                  -2.1449   2.6956    -7.5975     2.6164
dm_2                    -1.0457   1.0923    -3.1816     1.0341
bldlc_3                  0.0141   0.0406    -0.0719     0.0853
bhdlc_3                  0.0082   0.2637    -0.5278     0.4872
btg_3                   -0.0734   0.0467    -0.1598     0.0154
age_3                   -0.1385   0.2043    -0.5077     0.2591
durat_3                  0.0996   0.2546    -0.3718     0.6080
white_3                 -0.7053   5.2345   -11.0570     8.9591
male_3                  14.5482   7.1735     0.0642    28.2959
dm_3                     5.0535   2.9542    -1.4157    10.5545
(Intercept)*(1-2nd)_1  -42.8675   3.2934   -48.8356   -35.8794
treat*(1-2nd)_1        -12.1435   1.1211   -14.4040   -10.0135
(Intercept)*2nd_1       -3.2219   3.0426    -9.3109     2.7419
treat*2nd_1            -20.0843   1.5235   -23.3169   -17.2637
(Intercept)*(1-2nd)_2    5.1425   0.5840     3.8240     6.1036
treat*(1-2nd)_2          2.0787   0.4718     1.0663     3.0222
(Intercept)*2nd_2        0.7346   0.5814    -0.3377     1.8738
treat*2nd_2              1.3482   0.3116     0.7579     1.9880
(Intercept)*(1-2nd)_3  -18.5035   2.1543   -22.3232   -13.7985
treat*(1-2nd)_3         -4.7726   0.9952    -6.7784    -2.9511
(Intercept)*2nd_3       -4.1805   1.2910    -6.7672    -1.7005
treat*2nd_3             -8.8579   0.9443   -10.6208    -7.0530
---------------------------------------------------
*HPD level: 0.95
For model comparison, the deviance information criterion (DIC) and the logarithm of the pseudo-marginal likelihood (LPML) can be computed using the `model_comp` method. The returned DIC object also contains \(\mathrm{Dev}(\bar{\mathbf{\eta}})\) and \(p_D\); similarly, the LPML object contains the logarithm of the CPOs, which we omit here.

R> dic <- model_comp(fit_ma, "dic")
R> lpml <- model_comp(fit_ma, "lpml")
R> c(dic$dic, dic$Dev, dic$pD)
[1] 827.80726 734.50691  46.65018
R> lpml$lpml
[1] -428.1813
metapack includes another data set, `TNM`, which consists of 29 studies, dubbed the Triglycerides Network Meta (TNM) data (Li et al. 2019). The data set has 73 observations and 15 variables and can be loaded via `data("TNM")`. The aggregate response variable is the mean percentage difference in triglycerides (`ptg`), paired with its corresponding standard deviation (`sdtg`).
| Variable | Description |
|---|---|
| `trial` | trial identifier |
| `treat` | treatment indicator: placebo (PBO), atorvastatin (A), lovastatin (L), rosuvastatin (R), pravastatin (P), simvastatin (S), ezetimibe (E), simvastatin+ezetimibe (SE), atorvastatin+ezetimibe (AE), lovastatin+ezetimibe (LE), or pravastatin+ezetimibe (PE) |
| `n` | the number of participants in the study arms |
| `ptg` | percentage change from baseline in triglycerides (TG) |
| `sdtg` | sample standard deviation of percentage change in triglycerides (TG) |
| `bldlc` | baseline LDL-C |
| `bhdlc` | baseline HDL-C |
| `btg` | baseline triglycerides (TG) |
| `age` | age in years |
| `white` | the proportion of white participants |
| `male` | the proportion of male participants |
| `bmi` | body mass index |
| `potencymed` | the proportion of medium statin potency |
| `potencyhigh` | the proportion of high statin potency |
| `durat` | duration in weeks |
Similarly to `cholesterol`, variable descriptions are also available through `help("TNM")`.
The LPML discussed above is not the only quantity affected by treatments included in only a single trial. The variances of the corresponding treatment effects are nonestimable. Li et al. (2019) propose to group those treatments and allow the treatments in a group to share the same variance. This grouping scheme can be easily achieved using `match` in R:

R> TNM$group <- factor(match(TNM$treat, c("PBO", "R"), nomatch = 0))
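The `match`-based grouping assigns each treatment the 1-based position of its first match in the reference vector, with `nomatch = 0` lumping all unmatched treatments into group 0, the group that shares one variance. For readers more familiar with other languages, a Python sketch of the same idea (ours, illustrative only):

```python
def match_group(treatments, reference, nomatch=0):
    """Mimic R's match(x, table, nomatch): 1-based position of each
    treatment in `reference`, or `nomatch` for treatments not listed."""
    return [reference.index(t) + 1 if t in reference else nomatch
            for t in treatments]
```

For example, `match_group(["PBO", "R", "SE", "AE"], ["PBO", "R"])` yields `[1, 2, 0, 0]`: SE and AE fall into the shared-variance group.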
In the following demonstration, we consider the following model:

R> f_2 <- 'ptg | sdtg ~
+      0 + bldlc + bhdlc + btg + age + white + male + bmi +
+      potencymed + potencyhigh + durat + ns(n) |
+      scale(bldlc) + scale(btg) + group | treat + trial'

The model can be fit by running

R> set.seed(2797542)
R> fit_nma <- bmeta_analyze(formula(f_2), data = TNM,
+      mcmc = list(ndiscard = 1000, nskip = 1, nkeep = 1000),
+      control = list(scale_x = TRUE, verbose = TRUE))
Again, the model summary can be obtained using either `summary` or `print`, with minor differences.
R> summary(fit_nma)
Call:
bmeta_analyze(formula = formula(f_2), data = TNM, mcmc = list(ndiscard = 1000,
    nskip = 1, nkeep = 1000), control = list(scale_x = TRUE,
    verbose = TRUE))

Posterior inference in network meta-regression models
Fixed-effects:
            Post.Mean  Std.Dev HPD(Lower) HPD(Upper)
bldlc         -0.0071   0.0171    -0.0409     0.0230
bhdlc          0.2332   0.2490    -0.2802     0.7027
btg            0.0846   0.0366     0.0006     0.1460
age           -0.0068   0.1028    -0.2003     0.2020
white         -7.2148   1.9867   -10.7112    -3.0757
male           1.2323   7.3305   -12.5660    15.6644
bmi           -0.4505   0.3534    -1.1544     0.2184
potencymed     6.8278   7.9203    -8.5359    22.4551
potencyhigh   -0.6474   7.9330   -16.9216    14.4191
durat          0.1880   0.1879    -0.1638     0.5652
A            -24.3175   1.7820   -27.9255   -20.9635
AE           -30.3604   2.9591   -35.7474   -24.2297
E             -5.2737   6.1207   -17.4431     6.4066
L            -11.6521   4.3747   -20.1106    -2.7431
LE           -26.9507   3.4730   -33.3236   -19.9223
P             -8.3357   4.5065   -16.9311     0.7870
PBO            1.6170   6.0629    -9.9366    13.6298
PE           -22.7722   3.6333   -29.6129   -15.7540
R            -17.7669   2.0010   -21.3989   -13.3839
S            -19.1616   1.2175   -21.7647   -16.9279
SE           -21.8271   2.0509   -25.8031   -17.6347
---------------------------------------------------
*HPD level: 0.95
We observe that with covariate adjustment, all active treatments (A, AE, E, L, LE, P, PE, R, S, SE) reduce triglyceride (TG) more effectively than the placebo (PBO), although E and P have 95% HPD intervals including zero.
The output from `print(fit_nma)` further includes the model specification and summary statistics for \(\boldsymbol{\phi}\).
R> print(fit_nma)
Call:
bmeta_analyze(formula = formula(f_2), data = TNM, mcmc = list(ndiscard = 1000,
    nskip = 1, nkeep = 1000), control = list(scale_x = TRUE,
    verbose = TRUE))

Model:
(Aggregate mean)
    y_kt = x_kt'theta + tau_kt * gamma_kt + N(0, sigma_kt^2 / n_kt)
(Sample Variance)
    (n_kt - 1) S^2 / sigma_kt^2 ~ chi^2(n_kt - 1)
(Random effects)
    [gam | Rho,nu] ~ MVT(0, E_k' Rho E_k, nu)
Priors:
    theta ~ MVN(0, c01 * I_p), c01 = 1e+05
    phi ~ MVN(0, c02 * I_q), c02 = 4
    p(sigma^2) ~ 1/sigma^2 * I(sigma^2 > 0)
    p(Rho) ~ 1
---------------------------------------------------
Number of studies: 29
Number of arms: 73
Number of treatments: 11
            Post.Mean  Std.Dev HPD(Lower) HPD(Upper)
bldlc         -0.0071   0.0171    -0.0409     0.0230
bhdlc          0.2332   0.2490    -0.2802     0.7027
btg            0.0846   0.0366     0.0006     0.1460
age           -0.0068   0.1028    -0.2003     0.2020
white         -7.2148   1.9867   -10.7112    -3.0757
male           1.2323   7.3305   -12.5660    15.6644
bmi           -0.4505   0.3534    -1.1544     0.2184
potencymed     6.8278   7.9203    -8.5359    22.4551
potencyhigh   -0.6474   7.9330   -16.9216    14.4191
durat          0.1880   0.1879    -0.1638     0.5652
A            -24.3175   1.7820   -27.9255   -20.9635
AE           -30.3604   2.9591   -35.7474   -24.2297
E             -5.2737   6.1207   -17.4431     6.4066
L            -11.6521   4.3747   -20.1106    -2.7431
LE           -26.9507   3.4730   -33.3236   -19.9223
P             -8.3357   4.5065   -16.9311     0.7870
PBO            1.6170   6.0629    -9.9366    13.6298
PE           -22.7722   3.6333   -29.6129   -15.7540
R            -17.7669   2.0010   -21.3989   -13.3839
S            -19.1616   1.2175   -21.7647   -16.9279
SE           -21.8271   2.0509   -25.8031   -17.6347
phi1           0.4088   0.2716    -0.1610     0.8882
phi2          -0.3248   0.3869    -1.1721     0.2902
phi3           0.2692   0.2239    -0.1859     0.6835
phi4          -1.0862   1.2024    -3.2686     0.9419
phi5           0.5973   0.3407    -0.0341     1.2829
---------------------------------------------------
*HPD level: 0.95
The model comparison measures are computed using `model_comp`. For example,

R> dic <- model_comp(fit_nma, "dic")
R> c(dic$dic, dic$Dev, dic$pD)
[1] 386.41450 334.72118  25.84666
R> lpml <- model_comp(fit_nma, "lpml")
R> lpml$lpml
[1] -161.518
The `plot` method will generate trace plots and density plots of the fixed-effect coefficients. Figure 2 shows the trace plots and density plots of treatments R, S, and SE from the network meta-analysis model, generated by `plot(fit_nma)`.

In addition to the MCMC diagnostics, network meta-analysis models can be visualized using SUCRA plots, i.e., `plot(sucra(fit_nma))`. Figure 3 shows the SUCRA plot from the `fit_nma` object. The plot function for SUCRA uses ggplot2 (Wickham 2016) and gridExtra (Auguie 2017) to generate and combine plots.