The metaplus package is described with examples of its use for fitting meta-analysis and meta-regression. For either meta-analysis or meta-regression it is possible to fit one of three models: standard normal random effect,
Meta-analysis is a method of combining the results of different studies to produce one overall result (Jones et al. 2000). Meta-regression is an extension to meta-analysis which allows study-specific effect sizes to change depending on study-specific covariates. For example there may be studies comparing a drug to placebo, with varying doses of the drug used in the different studies. It is possible that the effectiveness of the drug will vary with dose, in a linear or nonlinear relationship, and by including this in the model the unexplained variation is reduced.
One of the difficulties in combining studies is that the differences between studies may be greater than would be indicated by the variation within each study. This is allowed for by the random effect model where the effect for each study has two components: an overall effect and a random component specific to each study, with the random component traditionally assumed to have a normal distribution. The model without a random effect is known as the fixed effect model, which is equivalent to a random effect model with zero variance for the random effect.
One difficulty is that the assumption of a normally distributed random
effect may be unrealistic, with a particular violation that the tails
are heavier than would be expected. While it has been shown that results
are robust to moderate violations of the normality assumption
(Kontopantelis and Reeves 2012), this does not apply to more extreme cases. One
solution to this is to use an alternative to the normal distribution for
the random effect, for example the
The purpose of the
metaplus package
(Beath et al. 2016) is to fit the two robust models with random effects based on
the
The random effect meta-analysis model assumes that the observed
treatment effect
where
An extension, known as meta-regression, to the random effect meta-analysis model is to include covariates to explain the heterogeneity (Jones et al. 2000 51). Incorporating this into the meta-analysis model we obtain
where
In metaplus there are three available random effect distributions:
The probability density function for study
This distribution was introduced as one of a number of distributions
for robust meta-analysis by Lee and Thompson (2008) and Baker and Jackson (2008). This approach
replaces the normal random effect distribution with a
where
This assumes that a study can belong to one of two classes, where each class is a standard random effect model with the same mean but different random effect variance, which is higher for the outlier class (Beath 2014). The robust meta-regression model takes the form
where
with the constraints that
A difficulty with the use of standard maximum likelihood techniques for
random effect models is that they produce biased estimates for the
variance of the random effect, which results in biased estimates of the
standard errors for the parameters of interest, and therefore poor
coverage using Wald-type confidence intervals. The solution for
meta-analysis has been the use of Wald-type confidence intervals
obtained from models fitted using residual maximum likelihood (REML),
but this is difficult for the robust models. However, profile likelihood
based confidence intervals (Pawitan 2001 61) have been found to be
superior (Hardy and Thompson 1996), and these are used for all fitted models. The
profile likelihood based confidence intervals are obtained from routines
based on the mle2
function in the package
bbmle (Bolker and R Core Team 2014) which
provides an extended version of mle
. The
Testing for the need for the robust distributions requires a test of
For both robust models the starting values are important, as the
optimisation used to obtain the maximum likelihood may converge to a
local minimum. For the
For the
For the robust mixture model a generalized EM (GEM) algorithm is used. The usual method for generating starting values for a mixture model using the EM algorithm, as described in McLachlan and Peel (2000 55), is to randomly allocate subjects to each group in the initial E step. This is repeated for a number of random allocations and the resulting model fit with the highest maximum likelihood used as the fitted model. For the outlier models this usually requires a large number of random allocations, and therefore model fits, due to the small number in the outlier class.
The method used in metaplus is to systematically generate the initial outliers in the E step with an increasing number of initial outliers, starting with no outliers. For a given number of outliers in the selected initial set all possible initial sets are fitted with the restriction that each set of initial outliers builds on the best set of initial outliers found for the previous number of outliers. For example if, when considering single initial outliers, study 10 as the initial outlier produces the highest maximum likelihood then study 10 would be included in all pairs of studies when considering models with two initial outliers. When the maximum likelihood does not increase the process is stopped.
The main function available in metaplus is metaplus
, with associated
methods outlierProbs
and testOutliers
specific to metaplus, with
the arguments for each shown in Table 1. The function
metaplus
fits a meta-analysis model to the studies, with results
extracted using summary
, and plotted using plot
. The plot
method
makes use of the forest
method in metafor allowing the same
customisations of the plots. An additional argument specific to plot
in metaplus is extrameta
, which allows for extra meta-analysis
results to be plotted. This allows for different models (i.e. standard
and robust) to be compared, or for meta-regression to show the overall
effect at different values of the covariates. An alternative method of
plotting is to use
forestplot
(Gordon and Lumley 2015) which allows some other customisations, but will require
combining the data from the studies and summaries. The method
testOutliers
tests for the presence of outliers for the robust models
using the parametric bootstrap. The method outlierProbs
determines the
posterior probability of each study being an outlier for the normal
mixture model. The returned object has an associated plot
method to
plot the outlier probabilities. The returned results are shown in
Table 2.
metaplus() arguments |
|
---|---|
yi |
Vector of observed effect sizes corresponding to each study. |
sei |
Vector of observed standard errors corresponding to each study. |
mods |
Data frame of covariates corresponding to each study (only required for a meta-regression model). |
random |
The type of random effect distribution. One of "normal" , "t-dist" , "mixture" , for standard normal, |
label |
The label to be used for this model when producing the summary line on the forest plot. This allows for identification of the model when comparing multiple models. |
plotci |
Should a diagnostic plot for the profile likelihood be made? See the package bbmle documentation for further details. |
justfit |
Should the model only be fitted? If only the model is fitted then profiling and likelihood ratio test statistics are not calculated. This is useful for bootstrapping to reduce computation time. |
slab |
Vector of character strings corresponding to each study. This is used only to label the plots. |
useAGQ |
Should adaptive Gauss-Hermite quadrature be used with the |
quadpoints |
Number of quadrature points for the adaptive Gauss-Hermite quadrature. |
data |
Optional data frame in which to search for other variables. |
outlierProbs() arguments |
|
object |
“metaplus” object. |
testOutliers() arguments |
|
object |
“metaplus” object. |
R |
Number of simulations used in the parametric bootstrap. |
metaplus() |
|
---|---|
results |
Matrix containing columns for estimate, lower and upper 95% confidence interval and justfit = TRUE then only the parameter estimates are returned. |
yi |
Vector of observed effect sizes. |
sei |
Vector of observed standard errors corresponding to each effect size. |
mods |
Data frame of covariates corresponding to each study (only returned from a meta-regression model). |
fittedmodel |
Final model returned from bbmle. |
justfit |
Value of justfit passed to metaplus . |
random |
Type of random effect. |
slab |
Vector of character strings corresponding to each study. This is used to label the forest plot. |
outlierProbs() |
|
outlier.prob |
Vector of posterior probabilities that the study is an outlier corresponding to each study. |
slab |
Vector of labels for the studies. |
testOutliers() |
|
pvalue |
|
observed |
Observed value of the likelihood ratio test statistic. |
sims |
Vector of simulated values of the test statistic under the null hypothesis. |
In the following examples, both robust options are used to demonstrate
the capabilities of the package. In practice it will be required to
choose which model to use when determining the final result. This should
be the better fitting model, which can be determined using either AIC or
BIC. Where the outliers are extreme the
A number of studies have been performed to determine the effectiveness
of intravenous magnesium in acute myocardial infarction, and a
meta-analysis is performed in Sterne et al. (2001). The studies have caused
considerable controversy, as the results of a single large study ISIS-4
(ISIS-4: Collabarative Group 1995) contradicts the results of a meta-analysis. Higgins and Spiegelhalter (2002)
discuss some of the history and some suggested methods from a Bayesian
perspective, Woods (2002) comments on the variability between studies due
to timing of infusion, and Downing (1999) on the higher level of dose used
in ISIS-4, with a more recent meta-analysis by Li et al. (2009). Of interest is
whether, given the heterogeneity between studies, the ISIS-4 study is
unusual. The data have been obtained in the form of log odds ratios for
mortality where negative values correspond to treatment benefit, but if
raw data in the form of number of events per number of patients is
available, then these can be converted using, for example, the escalc
function in the metafor package. The standard random effect
meta-analysis can be performed, and the parameter estimates obtained as
follows:
> mag.meta <- metaplus(yi, sei, slab = study, data = mag)
> summary(mag.meta)
Est. 95% ci.lb 95% ci.ub pvalue
muhat -0.7463 -1.2583 -0.3428 0.000501
tau2 0.2540
logLik AIC BIC
-19.68459 43.36918 44.91436
Adding the argument plotci = TRUE
will produce a plot giving details
of the profile confidence intervals, as shown in
Figure 1. The basis of the plot is that the profile log
likelihood in the region of the maximum likelihood estimate should be
asymptotically quadratic. As differences from a quadratic are difficult
to determine by eye, a transformation is performed to the muhat
. An important variation from the “V”
occurs when either half of the curve may not be monotonic, indicating
that the profile likelihood is multi-modal and if this occurs in a
region affecting the confidence interval then the calculated confidence
interval may be incorrect. It may also be an indication that the model
used is incorrect or that there is insufficient data for the fitted
model.
The forest plot showing the studies and overall effect can be obtained
using plot(mag.meta)
. The metaplus package uses the forest plot
capabilities of the metafor package which allows the arguments for the
forest
plot in metafor to be used when plotting. As the results for
the magnesium studies are log odds ratios it is more useful to produce
plots with units of odds ratios. This can be obtained by annotating the
horizontal axis with odds ratios corresponding to the log odds, and
requesting an exponential transformation for the coefficients, as shown
in the following code, and the plot is shown in
Figure 2. The documentation for the metafor package
should be investigated for further modifications. Under some systems the
characters will not be properly spaced. This can be solved by using the
extrafont (Chang 2014)
package and a fixed width font, for example Courier New
.
> plot(mag.meta, atransf = exp, at = log(c(.01, .1, 1, 10, 100)),
+ xlab = "Odds Ratio", cex = 0.75)
The meta-analysis is repeated using a random = "t-dist"
argument. From the summary the
estimate of vinv
, the inverse degrees of freedom, is zero
corresponding to infinite degrees of freedom, or a normal distribution.
The BIC is also a guide, with an increase for the
> mag.tdist <- metaplus(yi, sei, slab = study, random = "t-dist", data = mag)
> summary(mag.tdist)
Est. 95% ci.lb 95% ci.ub pvalue
muhat -0.7463 -1.2583 -0.3430 0.000501
tau2 0.2540
vinv 0.0000
logLik AIC BIC
-19.68459 45.36918 47.68695
This can be confirmed with the testOutliers
command, which performs a
parametric bootstrap to obtain the null distribution of the likelihood
ratio test statistic for the test that
> summary(testOutliers(mag.tdist))
Observed LRT statistic 0.0 p value 1
The analysis can be repeated using the robust mixture distribution for
the random effect. The variance of both the random effect for standard
studies (tau2
) and for outlier studies (tau2out
) are very close
indicating that there are no outlier studies and this is confirmed by
the outlier test.
> mag.mix <- metaplus(yi, sei, slab = study, random = "mixture", data = mag)
> summary(mag.mix)
Est. 95% ci.lb 95% ci.ub pvalue
muhat -0.7463147 -1.2593989 -0.3427085 0.000777
tau2 0.2539981
tau2out 0.2540892
Outlier prob. 0.0001904
logLik AIC BIC
-19.68459 47.36918 50.45954
> summary(testOutliers(mag.mix))
Observed LRT statistic 0.0 p value 1
This meta-analysis evaluates the effect of CDP choline for cognitive and behavioural disturbances associated with chronic cerebral disorders in the elderly (Fioravanti and Yanagi 2005) using standardised mean differences of memory measures as the outcome. A study (Bonavita 1983) was previously determined to be an outlier by Gumedze and Jackson (2011). A standard random effect meta-analysis will be fitted first, as previously.
> cdp.meta <- metaplus(yi, sei, slab = study, data = cdp)
> summary(cdp.meta)
Est. 95% ci.lb 95% ci.ub pvalue
muhat 0.38944 0.07269 0.76634 0.0218
tau2 0.14666
logLik AIC BIC
-8.198544 20.39709 21.00226
A robust model using the
> cdp.tdist <- metaplus(yi, sei, slab = study, random = "t-dist", data = cdp)
> summary(cdp.tdist)
Est. 95% ci.lb 95% ci.ub pvalue
muhat 1.946e-01 5.296e-02 3.610e-01 0.00899
tau2 4.478e-05
vinv 2.024e+00
logLik AIC BIC
-4.057683 14.11537 15.02312
> summary(testOutliers(cdp.tdist))
Observed LRT statistic 8.3 p value 0.001
As a rough guide, the decrease in AIC and BIC demonstrates that the model is an improvement, and this is confirmed with the outlier test. The fit is repeated using the robust mixture.
> cdp.mix <- metaplus(yi, sei, slab = study, random = "mixture", data = cdp)
> summary(cdp.mix)
Est. 95% ci.lb 95% ci.ub pvalue
muhat 0.1910 0.0563 0.3479 0.00711
tau2 0.0000
tau2out 3.1558
Outlier prob. 0.1237
logLik AIC BIC
-3.007145 14.01429 15.22463
> summary(testOutliers(cdp.mix))
Observed LRT statistic 10.4 p value 0.001
The output from the robust mixture model has an interesting feature. For standard studies the estimated random effect variance is zero, indicating that only the outlier studies are contributing to the heterogeneity. The posterior probability of each study being an outlier can be obtained as:
> cdp.mix.outlierProbs <- outlierProbs(cdp.mix)
and plotted using plot(cdp.mix.outlierProbs)
in
Figure 3. This shows clearly that Bonavita 1983 has a
posterior probability of nearly 1.0 of being an outlier. The other
studies have a non-zero posterior probability of being outliers, as
there is an overlap between the distribution of the standard and outlier
studies, but are relatively close to zero.
Lastly, a forest plot with the results of all three models is generated,
using the extrameta
parameter to add the robust models,
i.e. plot(cdp.meta, extrameta = list(cdp.tdist, cdp.mix))
, and these
are shown in Figure 4, where it can be noted that
Bonavita 1983 has an unusually high value. The effect of the robust
models is to down-weight the Bonavita 1983 study, which has the
consequence of both reducing the overall effect estimate and its
standard error.
This example is a meta-analysis of trials of exercise in the management of depression (Lawlor and Hopker 2001). Higgins and Thompson (2004) used the data as an example of meta-regression using a number of covariates, which will be limited here to a single covariate, the duration of trial. The outcome is effect size calculated using Cohen’s method. First the meta-analysis using standard normal random effect and the robust mixture model are performed. The data will be ordered by duration to assist in identifying a variation from the linear relationship.
> exercise <- exercise[order(exercise$duration), ]
> exercise.meta <- metaplus(smd, sqrt(varsmd), mods = duration, slab = study,
+ data = exercise)
> summary(exercise.meta)
Est. 95% ci.lb 95% ci.ub pvalue
muhat -2.8994 -4.3006 -1.5222 0.000884
tau2 0.1171
duration 0.2078 0.0584 0.3632 0.011570
logLik AIC BIC
-8.133435 22.26687 23.17462
> exercise.mix <- metaplus(smd, sqrt(varsmd), mods = duration, slab = study,
+ random = "mixture", data = exercise)
> summary(exercise.mix)
Est. 95% ci.lb 95% ci.ub pvalue
muhat -2.88472 -4.11082 -1.48262 0.000649
tau2 0.00000
tau2out 0.59398
Outlier prob. 0.25169
duration 0.21086 0.07808 0.34586 0.007052
logLik AIC BIC
-7.69139 25.38278 26.8957
> exercise.testOutliers <- testOutliers(exercise.mix)
> summary(exercise.testOutliers)
Observed LRT statistic 0.9 p value 0.075
> exercise.outlierProbs <- outlierProbs(exercise.mix)
The test for outliers was close to being significant (plot(exercise.outlierProbs)
the outlier probabilities are shown in
Figure 5 where the study by Reuter is an obvious
outlier with a posterior probability greater than 0.9. This study is a
dissertation and was not published in a peer-reviewed journal, and was
not included in a later meta-analysis by Krogh et al. (2011). There is also
strong evidence of the effect of trial duration.
As metaplus does not currently have a predict
method, the
alternative to calculate the effect at each of Weeks 4, 8 and 12 is to
centre the data at those times and fit a meta-regression for each
(Johnson and Huedo-Medina 2011). The intercept for each meta-regression will then be the
estimated mean effect at that time. A model without including the
covariate for study duration is also fitted. The forest plot is shown in
Figure 6. This shows that the effect of exercise
decreases rapidly the longer the trial runs, possibly indicating a
placebo effect that rapidly wears off. It would also be possible to
include the results from the standard random effect models on the plot.
> exercise$duration4 <- exercise$duration - 4
> exercise$duration8 <- exercise$duration - 8
> exercise$duration12 <- exercise$duration - 12
> exercise.nodurn <- metaplus(smd, sqrt(varsmd),
+ label = "Random Mixture (No Duration)", slab = study,
+ random = "mixture", data = exercise)
> exercise.wk4 <- metaplus(smd, sqrt(varsmd),
+ mods = duration4, label = "Random Mixture (Week 4)",
+ slab = study, random = "mixture", data = exercise)
> exercise.wk8 <- metaplus(smd, sqrt(varsmd),
+ mods = duration8, label = "Random Mixture (Week 8)",
+ slab = study, random = "mixture", data = exercise)
> exercise.wk12 <- metaplus(smd, sqrt(varsmd),
+ mods = duration12, label = "Random Mixture (Week 12)",
+ slab = study, random = "mixture", data = exercise)
> plot(exercise.nodurn, extrameta = list(exercise.wk4, exercise.wk8,
+ exercise.wk12), xlab = "Effect size")
The capabilities of the metaplus package have been presented for
fitting both standard normal random effect and robust random effect
models. Using three examples it has been shown how it can test for the
presence of outliers and compare the results of the robust and standard
methods for both meta-analysis and meta-regression. The package has also
been successfully applied to meta-analyses with larger number of
studies, for example Marinho et al. (2009) with 70 studies and 3 definite
outliers, and simulated data with 200 studies. One difficulty with large
number of studies is the increasing computation time, especially for
testOutliers
. This will be improved by the use of parallel processing
as a future enhancement.
The design of the package allows for expansion in other areas. A planned
future functionality is to fit binary data, using likelihood methods
based on the distribution of the binomial responses, rather than the log
odds ratios fitted using a normal distribution which is the method
currently used. The robust methods can then be applied in a similar way
to the current models. A possible future expansion is to allow for other
robust distributions although this doesn’t seem necessary given the
similarity of the results obtained in Baker and Jackson (2008) to those using the
The author would like to thank the two anonymous reviewers for their comments and suggestions that have greatly improved the quality of the manuscript and the metaplus package, and provided suggestions for future enhancements.
metaplus, metafor, bbmle, forestplot, extrafont
ClinicalTrials, MetaAnalysis, Phylogenetics, Robust
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Beath, "metaplus: An R Package for the Analysis of Robust Meta-Analysis and Meta-Regression", The R Journal, 2016
BibTeX citation
@article{RJ-2016-001, author = {Beath, Ken J.}, title = {metaplus: An R Package for the Analysis of Robust Meta-Analysis and Meta-Regression}, journal = {The R Journal}, year = {2016}, note = {https://rjournal.github.io/}, volume = {8}, issue = {1}, issn = {2073-4859}, pages = {5-16} }