Approximate Bayesian computation (ABC) is a popular family of
algorithms which perform approximate parameter inference when
numerical evaluation of the likelihood function is not possible but
data can be simulated from the model. They return a sample of
parameter values which produce simulations close to the observed
dataset. A standard approach is to reduce the simulated and observed
datasets to vectors of summary statistics and accept when the
difference between these is below a specified threshold. ABC can also
be adapted to perform model choice.
In this article, we present a new software package for R, abctools
which provides methods for tuning ABC algorithms. This includes recent
dimension reduction algorithms to tune the choice of summary
statistics, and coverage methods to tune the choice of threshold. We
provide several illustrations of these routines on applications taken
from the ABC literature.
Approximate Bayesian computation (ABC) refers to a family of statistical techniques for inference in cases where numerical evaluation of the likelihood is difficult or intractable, ruling out standard maximum likelihood and Bayesian techniques. It has been successfully applied in a wide range of scientific fields which encounter complex data and models, such as population genetics (Fagundes et al. 2007; Beaumont 2010), ecology (Csilléry et al. 2010), infectious disease modelling (Luciani et al. 2009; Brooks-Pollock et al. 2014), systems biology (Ratmann et al. 2007; Toni et al. 2009) and astronomy (Cameron and Pettitt 2012; Weyant et al. 2013).
ABC performs inference based on simulation of datasets rather than
likelihood evaluation. For this reason it is known as a
likelihood-free method. The simplest ABC algorithm is rejection-ABC.
This simulates parameter values from the prior and corresponding
datasets from the model of interest. Parameters are accepted if the
distance between summary statistics of the simulated and the observed
data is below a threshold,
The remainder of the article is organised as follows. First a review of
relevant ABC algorithms, theory and software is given. Then two data
examples are introduced which will be used for illustration throughout
the paper. The following section describes the summary statistic
selection methods provided by abctools. The final section considers
choice of
The following algorithms perform ABC for parameter inference or model
choice. This is done in a Bayesian framework. Observed data is
represented by
The ABC algorithms require that it is possible to sample from the priors
and models. They also require various tuning choices: a distance
function
Initialisation: For the observed dataset
Main loop:
Draw a parameter vector
Generate data from the model
If
Initialisation: For the observed dataset
Main loop:
Draw a model
Draw a parameter vector
Generate data from the model
If
Both algorithms output a sample from an approximation to the posterior
distribution. That is, for parameter inference the output is
If
Two crucial tuning choices in rejection-ABC are the tolerance
There are several ABC algorithms which are more efficient than
rejection-ABC. These concentrate on simulating from models and parameter
values close to previously successful values. These include Markov chain
Monte Carlo (Marjoram et al. 2003; Sisson and Fan 2011) and sequential Monte Carlo
(SMC) techniques
(Sisson et al. 2007; Beaumont et al. 2009; Toni et al. 2009; Del Moral et al. 2012).
A complementary approach is to post-process ABC output to reduce the
approximation in using
This section details existing software available for ABC, then outlines how abctools provides previously unavailable methodology and how it can be used alongside other software. Existing software is detailed in Table 1.
Name | References | Stand-alone | Platform | Models |
---|---|---|---|---|
abc | (Csilléry et al. 2012) | No (R package) | All | General |
ABCreg | (Thornton 2009) | Yes | Linux, OS X | General |
easyABC | (Jabot et al. 2013) | No (R package) | All | General |
ABCtoolbox | (Wegmann et al. 2010) | Yes | Linux, Windows | Genetics |
Bayes-SSC | (Anderson et al. 2005) | Yes | All | Genetics |
DIY-ABC | (Cornuet et al. 2008, 2010; Cornuet et al. 2014) | Yes | All | Genetics |
msBayes | (Hickerson et al. 2007) | Yes | Linux, OS X | Genetics |
MTML-msBayes | (Huang et al. 2011) | Yes | Linux, OS X | Genetics |
onesamp | (Tallmon et al. 2008) | Yes (web interface) | All | Genetics |
PopABC | (Lopes et al. 2009) | Yes | All | Genetics |
REJECTOR | (Jobin and Mountain 2008) | Yes | All | Genetics |
EP-ABC | (Barthelmé and Chopin 2014) | No (MATLAB toolbox) | All | State space models (and related) |
ABC-SDE | (Picchini 2013) | No (MATLAB toolbox) | All | Stochastic differential equations |
ABC-SysBio | (Liepe et al. 2010) | Yes (Python scripts) | All | Systems biology |
The software varies widely in which ABC algorithms are implemented. Of the two R packages, abc implements ABC-rejection with many methods of regression post-processing, while easyABC implements a wider suite of ABC algorithms but not post-processing. For full details of the other software see the references in Table 1.
Some of the available software packages provide methods for selecting
summary statistics. A projection method based on partial least squares
(Wegmann et al. 2009) is available in ABCtoolbox, and one for model
choice based on linear discriminant analysis (Estoup et al. 2012) in
DIY-ABC. Another category of methods is regularisation techniques,
for example via ridge regression (Blum and François 2010; Blum et al. 2013). Ridge
regression regularisation is implemented in the R package abc; see
(Csilléry et al. 2012) for more details. The abc package also provide a
method to choose
The abctools package has been designed to complement the existing software provision of ABC algorithms by focusing on tools for tuning them. It implements many previously unavailable methods from the literature and makes them easily available to the research community. The software has been structured to work easily in conjunction with the abc package, but the package also has the flexibility to be used with other ABC software. This is discussed below (under “4.4”), along with details of how the package framework can be used to implement further emerging methodology for summary statistic selection and construction.
The first dataset represents data generated from a commonly used model
in population genetics. Specifically, the abctools package contains
the two datasets coal
and coalobs
. The dataset coal
is a matrix of
dimension 100000 x 9
, representing parameters and summaries generated
from an infinite-sites coalescent model for genetic variation (see
Nordborg (2007) for more details). In particular, the parameters of interest
are the scaled mutation rate,
The data coalobs
is a matrix of dimension 100 x 9
, representing
similar instances of summary statistics from the model and associated
parameters; these can be treated as observed data. Similar data were
analysed in simulations in (Joyce and Marjoram 2008) and (Nunes and Balding 2010).
The datasets can be loaded with data(coal)
and data(coalobs)
respectively.
A bigger dataset with
> mycon <- url("http://www.maths.lancs.ac.uk/~nunes/ABC/coaloracle.rda")
> load(mycon)
> close(mycon)
The g-and-k distribution, used in various applications such as finance
and environmental modelling, is a family of distributions which is
specified by its quantile distribution, but does not have a closed form
expression for its density (Rayner and MacGillivray 2002). Data can easily be
simulated by the inversion method. The dataset included in the
abctools package is a matrix of dimension 100000 x 11
consisting of
n = 100000
simulations of 4 parameters (A, B, g and k), together with
7 summary statistics representing the octiles of 1000 independent draws
given the corresponding parameters. Such quantiles have been used for
inference in an ABC context by (Drovandi and Pettitt 2011) and
(Fearnhead and Prangle 2012), amongst others.
The dataset can be loaded using the code:
> mycon <- url("http://www.maths.lancs.ac.uk/~nunes/ABC/gkdata.rda")
> load(mycon)
> close(mycon)
The code used to generate these simulations is available at http://www.maths.lancs.ac.uk/~nunes/ABC/gksim.R.
Identifying an informative and low-dimensional set of summaries to
represent high dimensional data for use in ABC methods is of high
importance for meaningful posterior inference; a number of methods to
achieve this have been proposed in the statistical literature. We assume
there is a prespecified set of input statistics of the data
In what follows we describe the implementations of a number of methods
for choosing summary statistics in the abctools package, namely the
approximate sufficiency algorithm of (Joyce and Marjoram 2008); the
entropy criterion and two-stage methods of (Nunes and Balding 2010), and the
semi-automatic ABC projection technique of
(Fearnhead and Prangle 2012). For summary statistics selection the user
must simulate parameters and data and supply these to the package. The
resulting summary statistics can then be passed to another package to
perform ABC. This form of operation makes abctools particularly suited
to rejection-ABC. Note however, that many of the main routines in this
section have similar arguments, indicative of the flexible and modular
nature of the package. Indeed, the final part of this section discusses
the selectsumm
wrapper function which can be used to implement any of
the methods, as well as using abctools with other user-defined ABC
routines.
As outlined above, the principle of summary subset selection methods is
to select a subset of informative statistics
(Joyce and Marjoram 2008) introduced a method of summary selection based
on a measure of approximate sufficiency. The idea of the sufficiency
criterion is that, if a (sub)set of summaries is sufficient for
The hypothesis test is performed by the abctools function AS.test
.
The function has inputs x1
and x2
, representing approximate
posterior samples for the density without or including the statistic
being tested, respectively. The test returns a Boolean variable (TRUE
or FALSE
) indicating whether the second posterior sample (as
represented by x2
) is sufficiently different from the first posterior
sample x1
.
As an example of this, running the code
> unif.sample <- runif(10000); norm.sample <- rnorm(10000)
> AS.test(x1 = unif.sample, x2 = norm.sample)
[1] TRUE
results in a statement that the two posterior samples x1
and x2
are
judged to be statistically different.
To decide on the final set of summaries, the test is performed as a
sequential search, testing candidate statistics from AS.select
. The main arguments of the function are:
Input statistics corresponding to observed data,
ndatasets x k
.
Simulated parameters (drawn from a prior) which were used to
generate simulated data under the model; a matrix of dimension
nsims x p
.
Input statistics param
; a matrix of dimension nsims x k
.
After performing the summary search procedure, the AS.select
function
returns the final subset of statistics best
component of
the output. if the optional trace
argument is set to TRUE
(the
default), the function will print messages to inform the user about the
summary statistics search.
An example of using the AS.select
function using the coalescent data
described above is shown below.
> data(coal); data(coalobs)
> param <- coal[, 2]
> simstats <- coal[, 3:9]
> obsstats <- matrix(coalobs[1, 3:9], nrow = 1)
> set.seed(1)
> ASchoice <- AS.select(obsstats, param, simstats)
Sumstat order for testing is: 2 3 6 4 1 7 5
Current subset is: empty Test adding: 2
Empty subset not allowed - add
Current subset is: 2 Test adding: 3
No significant change to ABC posterior - don't add
Current subset is: 2 Test adding: 6
No significant change to ABC posterior - don't add
Current subset is: 2 Test adding: 4
No significant change to ABC posterior - don't add
Current subset is: 2 Test adding: 1
No significant change to ABC posterior - don't add
Current subset is: 2 Test adding: 7
Significant change to ABC posterior - add
Consider removing previous summaries
Current subset is: 2 7 Test removing: 2
No significant change to ABC posterior - remove
Current subset is: 7 Test adding: 5
No significant change to ABC posterior - don't add
Selected summaries: 7
> ASchoice$best
[1] 7
The result of the sequential search is that out of the summary subsets
tested, the single summary subset
Another sequential search algorithm for summary statistics in the
abctools package is the flexible minimum criterion function
mincrit
. Essentially, this function cycles through each subset of
summaries in turn, and computes a specified criterion on the ABC
posterior sample produced with that particular set of summaries. The
best subset nn.ent
function. For example, for
the 4th nearest neighbour entropy calculation for a posterior sample
psample
, one would use the command
> nn.ent(psample, k = 4)
The mincrit
function has many of the same arguments as the AS.select
function above, including obs
, param
and sumstats
, see the
mincrit
function documentation in the package abctools for a full
list. Other function arguments include crit
, which specifies the
criterion to minimise. The default for this is nn.ent
. The heuristic
for this criterion as suggested by (Nunes and Balding 2010) is that the entropy
measures how concentrated the posterior is, and thus how much
information is contained within the sample. However, other measures of
spread or informativeness could be used in the crit
argument instead
of nn.ent
.
Since mincrit
performs an exhaustive search of all subsets of AS.select
function, the user can limit the
search to subsets of a maximum size, using the limit
argument.
Internally, this calls the function combmat
to produce subsets on
which to perform the criterion. For example combmat(4)
produces a
matrix of all subsets of size combmat(4, limit = 2)
computes a matrix of all
C1 C2 C3 C4
[1,] 1 0 0 0
[2,] 0 1 0 0
[3,] 0 0 1 0
[4,] 0 0 0 1
[5,] 1 1 0 0
[6,] 1 0 1 0
[7,] 1 0 0 1
[8,] 0 1 1 0
[9,] 0 1 0 1
[10,] 0 0 1 1
In addition, the search can be limited by setting the argument sumsubs
to a particular subset of initial summaries. This has the effect of only
considering subsets containing those statistics. Alternatively, with the
argument do.only
, the user can specify certain summary subsets to
consider. This can either be in matrix format like the output from
combmat
, or a vector of indices indicating rows of combmat(k)
for
which to compute the crit
criterion.
To run the minimum criterion search algorithm, one could do:
> entchoice <- mincrit(obsstats, param, simstats, crit = nn.ent,
+ do.only = 1:30)
This would only consider the first 30 subsets as specified in
combmat(ncol(obsstats))
.
The mincrit
function returns a list object with the following
components:
If do.crit = TRUE
, a matrix representing the computed crit
criterion values.
A matrix representing the best subset (which minimises crit
).
A matrix (or vector) of subsets considered by the search algorithm.
This component reflects the choice of input do.only
.
The index of the initial pool of statistics considered in the
search. By default, this is set to 1:ncol(obsstats)
.
The best subset is judged to be the best
component of the output:
> entchoice$best
[,1] [,2]
20 3 5
As a refinement of the entropy-based summary selection procedure,
(Nunes and Balding 2010) propose running a second summary search based on the best
subset found by minimum entropy. The closest simulated datasets to
dsets
. The second stage selects a subset of summaries which minimises
a measure of true posterior loss when ABC is performed on these
datasets. This is done by comparing the ABC output to the true
generating parameter values by some criterion. The default is
calculating relative sum of squares error (RSSE). Since this second
stage is effectively a search similar in form to that performed by
mincrit
, the functionality of mincrit
is exploited by calling it
internally within stage2
. By default, the posterior loss minimisation
is computed with the function rsse
. The argument init.best
specifies
which subset to use as a basis to perform the second ABC analysis, e.g.,
the best subset chosen by the minimum entropy criterion. Other arguments
to this function mimic those of mincrit
.
An example call for this function is
> twostchoice <- stage2(obsstats, param, simstats, dsets = 25,
+ init.best = 20, do.only = 1:30)
> twostchoice$best
[,1] [,2]
21 3 6
The output object is the same as that of mincrit
, with the exception
that in addition, stage2
also returns the dsets
simulated datasets
deemed closest to the observed data
When the set of input statistics
Linear regression is a crude tool to estimate
Perform an ABC pilot run using summary statistics chosen subjectively or using another method. Use this to determine the region of main posterior mass, referred to as the training region.
Simulate parameters
Fit regressions as detailed above for various choices of
Choose the best fitting regression (e.g., using BIC) and run ABC using the corresponding summaries. For robustness it is necessary to truncate the prior to the training region; our experience is that without such truncation artefact posterior modes may appear outside the training region.
Note that in rejection-ABC the same simulations can be used for the pilot ABC, training and main ABC steps, if desired. Also, step 1 can be omitted and the entire parameter space used as the training region. This is simpler, but empirical evidence shows that in some situations the training step is crucial to good performance (Fearnhead and Prangle 2012; Blum et al. 2013).
abctools provides two functions for semi-automatic ABC. To facilitate
a quick analysis, semiauto.abc
performs a simple complete analysis;
this uses rejection-ABC, avoids selecting a training region (i.e., it
uses the full parameter space instead), and uses a single prespecified
choice of saABC
implements step 3 only. We describe only the former here as the latter
is a very straightforward function. The main arguments of semiauto.abc
are:
Input statistics corresponding to observed data. This is a matrix of
dimension ndatasets x k
. In fact only a subset
satr
.
Simulated parameters (drawn from a prior) which were used to
generate simulated data under the model; a matrix of dimension
nsims x p
.
Input statistics param
; a matrix of dimension nsims x k
.
A list
of functions, representing the vector of transformations
to perform on the features sumstats
, with which to estimate the
relationship to the parameters
Other arguments to the function are the same as mincrit
; see the
saABC
documentation for more details.
To perform semi-automatic ABC using the vector of elementwise
transformations
> saabc <- semiauto.abc(obsstats, param, simstats,
+ satr = list(function(x) {
+ outer(x, Y = 1:4, "^")}))
Alternatively, the same transformations could be specified by setting
satr
to list(function(x) cbind(
x, x^2, x^3, x^4))
. This
alternative way of choosing this argument uses a single function
which
outputs all four transformations as a vector.
The output from the semiauto.abc
function is similar to that of
mincrit
, except that the output object also has a component sainfo
,
containing relevant choices of arguments pertaining to the ABC runs in
steps 1 and 4 above. More specifically, the sainfo
component is a list
with information about the simulations used to perform each of the ABC
runs, as well as the vector of transformations satr
.
An example of semiauto.abc
on the g-and-k dataset is as follows. The
corresponding results and an analysis on another dataset are shown in
Figure 2.
> mycon <- url("http://www.maths.lancs.ac.uk/~nunes/ABC/gkdata.rda")
> load(mycon)
> close(mycon)
> params <- gkdata[, 1:4]
> octiles <- gkdata[, 5:11]
> obs <- octiles[9, ]
> tfs <- list(function(x){cbind(x, x^2, x^3, x^4)})
> saabc <- semiauto.abc(obs = obs, param = params, sumstats = octiles,
+ satr = tfs, overlap = TRUE, saprop = 1,
+ abcprop = 1, tol = 0.001, method = "rejection",
+ final.dens = TRUE)
> dens <- kde2d(saabc$post.sample[, 1], saabc$post.sample[, 2])
> filled.contour(dens, xlab = "A", ylab = "B")
An example on the coal
data is as follows. Results are shown in Figure
3.
> data(coal)
> data(coalobs)
> coalparams <- coal[, 1:2]
> coaldata <- coal[, 3:9]
> coalobs <- coal[1, 3:9]
> mytf <- list(function(x){cbind(x, x^2, x^3, x^4)})
> saabc.coal <- semiauto.abc(obs = coalobs, param = coalparams,
+ sumstats = coaldata, satr = mytf,
+ tol = 0.001, overlap = TRUE, saprop = 1,
+ abcprop = 1, method = "rejection",
+ final.dens = TRUE)
> dens.coal <- kde2d(saabc.coal$post.sample[, 1],
+ saabc.coal$post.sample[, 2])
> filled.contour(dens.coal, xlab = "theta", ylab = "rho")
coal
example, based on
summary statistics chosen by semi-automatic ABC. The true parameter
values are indicated by crosses. selectsumm
convenience wrapperThe summary selection methods described in this section can be used with
the individual functions as described above. Alternatively, the
abctools package contains a convenient generic function selectsumm
,
with which any of the summary statistics choice algorithms can be
performed. The argument ssmethod
can be any of the functions described
above, for example mincrit
. Note that any other arguments to the
ssmethod
function can be passed to selectsumm
easily. In particular,
many of the summary selection routines have common optional arguments,
for example
An optional matrix of true parameters corresponding to the observed
summaries obs
. This is useful if the function is used to test
summary selection techniques on fake observed data (for which you
know the generating parameters).
A function which performs an ABC algorithm, for example the abc
function from the abc R package. Other user-defined functions can
also be supplied; see below for more details. By default, the
ssmethod
function uses the abc
rejection-ABC algorithm, with a
tolerance of tol = 0.01
.
An (optional) integer value indicating whether to limit the search
to subsets of a particular maximum size. For example, limit = 3
would only consider potential subsets of statistics
A logical variable indicating whether the simulation error should be
computed to assess the performance of the selection algorithm. This
is only relevant if obspar
is supplied.
A logical variable. If final.dens = TRUE
, then the final
approximate posterior sample is returned, resulting from the ABC
algorithm (abcmethod
) using the final subset of summaries
A function used to compute the simulation error between the
posterior sample and the generating parameter values obspar
. An
example of such a function included in the abctools package is the
relative sum of squares error (RSSE), computed using the function
rsse
.
Note that the selectsumm
function can perform summary selection for
any number of observed summary vectors; the function implements the
ssmethod
on each row of the obsstats
argument. Examples of the
selectsumm
function call are
> ASchoice <- selectsumm(obsstats, param, simstats, ssmethod = AS.select)
or
> mycon <- url("http://www.maths.lancs.ac.uk/~nunes/ABC/gkdata.rda")
> load(mycon)
> close(mycon)
> param <- gkdata[, 1:2] # A and B parameters
> simstats <- gkdata[, 5:11]
> obsstats <- gkdata[9:10, 5:11] # treated as real data
> entchoicegk <- selectsumm(obsstats, param, simstats, ssmethod = mincrit,
+ crit = nn.ent, limit = 3, final.dens = TRUE,
+ do.err = TRUE, obspar = gkdata[9:10, 1:2])
> entchoicegk$best
S1 S2 S3 S4 S5 S6 S7
23 0 0 0 1 1 0 0
23 0 0 0 1 1 0 0
If do.err = TRUE
, then the inference error (as compared with the truth
in obspar
) is computed using the errfn
function and is also returned
in the err
component of the function output. In addition, if
final.dens = TRUE
the output list element post.sample
will contain
the approximate posterior sample from the ABC inference corresponding to
using abcmethod
ABC inference function. For example, for
the entchoicegk
object, the approximate posterior sample corresponds
to the algorithm abcmethod
using the subset (of size kde2d
from package
MASS (Venables and Ripley 2002):
> dens1 <- kde2d(entchoicegk$post.sample[, 1, 1],
+ entchoicegk$post.sample[, 2, 1])
> dens2 <- kde2d(entchoicegk$post.sample[, 1, 2],
+ entchoicegk$post.sample[, 2, 2])
> filled.contour(dens1, xlab = "A", ylab = "B")
> filled.contour(dens2, xlab = "A", ylab = "B")
The resulting posterior densities are shown in Figure 4.
Any other arguments to be passed to the function specified by the
abcmethod
argument can also be included. For more details on the
optional arguments for the abc
function see (Csilléry et al. 2012).
The flexibility of the abctools package can be exploited by using
user-defined ABC algorithm implementations through the abcmethod
argument to all of the ABC summary choice methods, namely AS.select
,
mincrit
, stage2
and semiauto.abc
, or the convenience wrapper
selectsumm
, described above. The only constraint on the user’s code
for the ABC method is that it must return an object with a component
named either adj.values
or unadj.values
containing the approximate
posterior sample, to (minimally) mimic a return object of class "abc"
.
For example, if one had written a function likefreemcmc
to perform
likelihood-free Markov chain Monte Carlo, one could use this in
combination with a minimal criterion computed on the resulting (MCMC)
posterior samples using the code:
> mcmcabc <- mincrit(obsstats, param, simstats, abcmethod = likefreemcmc)
To use abctools within ABC inference methods implemented by generic
software, simply supply an appropriate R wrapper function to the
abcmethod
argument.
User-defined ABC summary selection methods can be accommodated with the
abctools package. A new projection method, projABC
say, could be
implemented using the wrapper selectsumm
as follows:
> projchoice <- selectsumm(obsstats, param, simstats, ssmethod = projABC)
For the implementation to work, the summary choice function must have
arguments named obsstats
, param
, simstats
for the observed data,
simulated parameters and simulated summaries respectively, as well as
the logical argument final.dens
indicating whether the approximate
posterior sample is to be returned. Optional arguments could also be
passed to projABC
through the selectsumm
wrapper.
The abctools package can also test the accuracy of an ABC analysis, in
particular to help choose the
(Monahan and Boos 1992) and (Cook et al. 2006) showed that an equivalent condition to
the coverage property is that the distribution of
The following code illustrates a typical analysis using the cov.pi
(parameter inference) and cov.mc
(model choice) functions. For the
choice of tolerance
> library(abctools); library(ggplot2)
> data(human)
> ## Summary statistics for bottleneck model:
> stat.italy.sim <- subset(stat.3pops.sim, subset = (models == "bott"))
> ## Interesting epsilon values:
> myeps <- exp(seq(log(0.5), log(10), length.out = 15))
> set.seed(1)
> mytestsets <- sample(1:nrow(stat.italy.sim), 200)
> covout.pi <- cov.pi(param = par.italy.sim, sumstat = stat.italy.sim,
+ testsets = mytestsets, eps = myeps,
+ multicore = TRUE, diagnostics = c("KS", "CGR"),
+ cores = 4)
> qplot(x = eps, y = pvalue, colour = test,
+ data = subset(cabc.out$diag, parameter == "Ne"), log = "y")
> mytestsets <- sample(nrow(stat.3pops.sim), 200)
> covout.mc <- cov.mc(index = models, sumstat = stat.3pops.sim,
+ testsets = mytestsets, eps = myeps,
+ diagnostics = c("freq", "loglik.binary"),
+ multicore = TRUE, cores = 4)
> qplot(x = eps, y = pvalue, colour = test, data = covout.mc$diag,
+ log = "y")
The code analyses the human
dataset supplied in the abc package
(Csilléry et al. 2012), which contains simulated parameter values and summary
statistics for a population genetic model. The cov.pi
function
estimates 200 diag
component of the
output. The left panel of Figure 7 plots bottleneck
model satisfies coverage. Again, coverage holds for large
and small
human
dataset example.
The left-hand graph is for the bottleneck
model. The right-hand graph is for model choice considering the adequacy
of the bottleneck
model
predictions.(Prangle et al. 2014) argue that raw
component of
the output and can be plotted as follows, giving Figure
8. The mc.ci
command is part of abctools.
> par(mfrow = c(1, 2))
> ## nb myeps[8] is 2.2
> hist(subset(covout.pi$raw, eps == myeps[8])$Ne, xlab = "p0",
+ main = "Parameter inference")
> mc.ci(covout.mc$raw, eps = myeps[8], modname = "bott",
+ modtrue = models, main = "Model choice")
human
dataset
example with The left-hand side of Figure 8 shows that for
bottleneck
model.
This article has described the R package abctools. This implements
several techniques for tuning approximate Bayesian inference algorithms.
In particular, the package contains summary statistic selection routines
for the approximate sufficiency method of (Joyce and Marjoram 2008); the
entropy minimisation and two-stage error algorithm proposed by
(Nunes and Balding 2010); and the regression method of (Fearnhead and Prangle 2012).
It also contains methods to choose the acceptance threshold
The authors would like to thank Scott Sisson and Michael Blum for some helpful suggestions when preparing the abctools R package accompanying this manuscript.
Bayesian, Distributions, Econometrics, Environmetrics, MixedModels, NumericalMathematics, Psychometrics, Robust, TeachingStatistics
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Nunes & Prangle, "abctools: An R Package for Tuning Approximate Bayesian Computation Analyses", The R Journal, 2015
BibTeX citation
@article{RJ-2015-030, author = {Nunes, Matthew A. and Prangle, Dennis}, title = {abctools: An R Package for Tuning Approximate Bayesian Computation Analyses}, journal = {The R Journal}, year = {2015}, note = {https://rjournal.github.io/}, volume = {7}, issue = {2}, issn = {2073-4859}, pages = {189-205} }