gk: An R Package for the g-and-k and Generalised g-and-h Distributions
Abstract:
The g-and-k and (generalised) g-and-h distributions are flexible univariate distributions which can model highly skewed or heavy tailed data through only four parameters: location and scale, and two shape parameters influencing the skewness and kurtosis. These distributions have the unusual property that they are defined through their quantile function (inverse cumulative distribution function) and their density is unavailable in closed form, which makes parameter inference complicated. This paper presents the gk R package to work with these distributions. It provides the usual distribution functions and several algorithms for inference of independent identically distributed data, including the finite difference stochastic approximation method, which has not been used before for this problem.
Statisticians have long sought a simple extension to the normal distribution which can model data subject to skew, heavy tails or both. One approach is to transform a standard normal random variable $Z \sim N(0,1)$ to

$$X = A + B\,G(Z)\,H(Z), \qquad (1)$$

where $A$ and $B > 0$ are location and scale parameters, the factor $G(Z)$ introduces asymmetry, and the factor $H(Z)$ elongates the tails of the distribution while having little effect near the mode. This paper considers two such distributions, the g-and-k and generalised g-and-h distributions. These distributions can model many types of behaviour through just a small number of parameters.
Defining random variables as transformations of $Z$ is equivalent to specifying the distribution's quantile function (defined in the next section), and distributions of this type are known as quantile distributions. Work on quantile distributions goes back at least to Hastings et al. (1947). See Gilchrist (2000) for a book-length treatment of their history and use in statistics. Tukey (1977) proposed the form (1) and a distribution using it: the original g-and-h distribution. Haynes et al. (1997) were the first to use the two distributions considered in this paper: the g-and-k distribution and a generalised form of the g-and-h distribution. For brevity, henceforth "g-and-h distribution" will refer to their generalised form. See Peters et al. (2016) for a thorough review of these and other distributions based on (1).
This paper presents the gk R package to work with the g-and-k and g-and-h distributions. The remaining sections cover the following:
A mathematical definition of the distributions.
A description of the package’s functions to perform standard
distributional tasks and how they are implemented.
An exploration of the range of valid parameters for these
distributions, as this has a complicated form. We propose a novel
rule giving "safe" parameter values for the g-and-k distribution.
A description of several methods for parameter inference and corresponding functions supplied by the package.
An illustrative analysis of a real dataset.
A summary.
2 Definitions
The cumulative distribution function (cdf) of a univariate random variable $X$, $F_X(x)$, is defined as $\Pr(X \leq x)$. (Later we will often drop subscripts where they are clear from the context.) The cdf suffices to completely specify the probability distribution of $X$. It is often the case that the cdf is not available in closed form but is implicitly defined through its derivative, the probability density function (pdf). An example of this is the normal distribution.

For quantile distributions, the cdf is implicitly defined through its inverse, the quantile function $F_X^{-1}(u)$, where $u \in (0,1)$. The g-and-k and g-and-h distributions use a quantile function of the form $F^{-1}(u;\theta) = A + B\,G(z(u))\,H(z(u))$, where $z(\cdot)$ is the $N(0,1)$ quantile function and $\theta$ is a vector of parameters. The functions are:

$$F^{-1}_{gk}(u; A, B, g, k, c) = A + B\left(1 + c\tanh\left[\frac{g z(u)}{2}\right]\right) z(u)\left(1 + z(u)^2\right)^k, \qquad (2)$$

$$F^{-1}_{gh}(u; A, B, g, h, c) = A + B\left(1 + c\tanh\left[\frac{g z(u)}{2}\right]\right) z(u)\exp\left[\frac{h z(u)^2}{2}\right]. \qquad (3)$$
It is possible to sample from the distributions using the inversion method, that is, by simulating $U \sim U(0,1)$ and substituting it into the quantile function. Equivalently one can sample $Z \sim N(0,1)$ and substitute it into (2) or (3) with $z(u)$ replaced by $Z$, i.e. the process described in the introduction based on Equation (1). In terms of (1), $G(z) = 1 + c\tanh(gz/2)$ produces asymmetry and $H(z) = z(1+z^2)^k$ or $H(z) = z\exp(hz^2/2)$ elongates tails.
Each distribution has four main parameters: $A$ (location), $B$ (scale), $g$ (shape parameter mainly affecting skewness), and $k$ or $h$ (shape parameter mainly affecting kurtosis). The remaining parameter $c$ is discussed below. When both shape parameters are zero the distribution is simply $N(A, B^2)$. An illustration of the flexible shapes that the g-and-k density can take is given in Figure 1. The g-and-h can produce similar shapes, with the following exception. The g-and-k distribution allows negative values of $k$, which can produce lighter tails than a normal distribution, but also bimodal distributions of potentially limited usefulness.
Well-defined continuous distributions result from parameter values producing strictly increasing quantile functions. Determining when this is true is complicated, so discussion is postponed to a later section. For now note that it is standard to take $B > 0$ and fix $c = 0.8$ (which will be assumed throughout unless mentioned otherwise), and in this case $k \geq 0$ or $h \geq 0$ guarantees a valid distribution.
Figure 1: Example g-and-k densities. The first panel fixes $g = 0$ and varies $k$, mainly altering kurtosis. The second fixes $k = 0$ and varies $g$, mainly altering skewness. The third shows two examples with $k < 0$.
3 Distribution functions
The gk package provides the standard suite of R functions for the g-and-k and g-and-h distributions, i.e. random sampling and calculation of the pdf, cdf and quantile functions. This section describes how these functions are implemented. It is assumed that parameters have been chosen such that the quantile function is strictly increasing. No warning is given when this is not the case, as checking validity is time consuming (see next section).
Quantile function
The qgk and qgh functions calculate the quantile function $F^{-1}(u)$. Their implementation is straightforward. First $z(u)$ is calculated using qnorm, then this is passed to an internal function, z2gk or z2gh, which computes the corresponding value of (2) or (3).
Random sampling
The rgk and rgh functions perform random sampling. This is done by the method described earlier of sampling $N(0,1)$ draws and substituting them into (2) or (3), via the function z2gk or z2gh.
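For example, a minimal sketch exercising both operations is as follows. The parameter values are arbitrary illustrative choices, passed positionally as location $A$, scale $B$, then $g$ and $k$, matching the usage rgk(N,1,2,3,4) shown in Table 1.

    library(gk)

    # Median of a g-and-k distribution with A = 0, B = 1, g = 2, k = 0.5
    qgk(0.5, 0, 1, 2, 0.5)

    # 10^5 random draws from the same distribution
    x <- rgk(1e5, 0, 1, 2, 0.5)

    # The empirical 90% quantile should be close to the exact one
    quantile(x, 0.9)
    qgk(0.9, 0, 1, 2, 0.5)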
Cumulative distribution function
The pgk and pgh functions calculate the cdf $F(x)$ given input $x$. They numerically solve $q(z) = x$ for $z$, where $q(z)$ denotes (2) or (3) written as a function of $z = z(u)$; this equation is guaranteed to have a unique root for valid parameter values. The required final output is then $u = \Phi(z)$, where $\Phi$ is the $N(0,1)$ cdf. An alternative approach would be to directly solve $F^{-1}(u) = x$ for $u$. However we found this was less numerically stable for $u$ close to 0 or 1.
Our code finds the root for $z$ using R's uniroot command, with z2gk or z2gh providing the $q(z)$ evaluations. The need to run a root finding algorithm means this function is slow relative to cdf calculations of standard distributions; see Table 1.
The functions include an argument zscale. Setting this to TRUE outputs the $z$ value which is found rather than $u$. This is used in the density functions below, and more generally is also useful to retain numerical precision when $z$ has large magnitude.
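A short sketch of these functions in use (parameter values arbitrary; pgk is assumed here to accept a vector input, like the p-functions of standard distributions):

    library(gk)

    x <- rgk(10, 0, 1, 2, 0.5)          # A = 0, B = 1, g = 2, k = 0.5
    u <- pgk(x, 0, 1, 2, 0.5)           # cdf values in (0,1)

    # Round trip: the quantile function should invert the cdf
    max(abs(qgk(u, 0, 1, 2, 0.5) - x))

    # zscale = TRUE returns z = qnorm(u) rather than u, which
    # retains precision when x lies far in the tails
    z <- pgk(x, 0, 1, 2, 0.5, zscale = TRUE)
    max(abs(pnorm(z) - u))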
Probability density function
The dgk and dgh functions calculate the pdf $f(x)$, or the log pdf if the argument log=TRUE is supplied. The method is based on the standard probability result that if $Z$ has density $f_Z(z)$ and $t(\cdot)$ is a differentiable 1-1 transformation, then the density of $X = t(Z)$ is

$$f_X(x) = \frac{f_Z(z)}{t'(z)}, \quad \text{where } z = t^{-1}(x),$$

and $t'$ denotes the first derivative of $t$. For quantile distributions we have $X = q(Z)$ with $Z \sim N(0,1)$, where $q$ is (2) or (3) written as a function of $z$. So the pdf of $X$ is

$$f(x) = \frac{\phi(z)}{q'(z)},$$

where $\phi$ is the $N(0,1)$ pdf.

Our code to calculate the pdf first finds $z$ using pgk or pgh with zscale=TRUE. Then the pdf or its log is calculated using formulae (4) and (5) (see Appendix A) for $q'(z)$.
The reliance on performing root finding within pgk and pgh means that dgk and dgh are slow relative to pdf calculations for standard distributions; see Table 1.
Note that an alternative representation of the pdf is $f(x) = 1/Q'(u)$, where $Q(u)$ represents $F^{-1}(u)$ and $u = F(x)$. Density calculations based on this approach are described in Rayner and MacGillivray (2002). However we found that calculating the $u$ values required for this approach was occasionally numerically unstable, as mentioned above.
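For example (arbitrary parameters; this assumes dgk is vectorised over its first argument, as the d-functions of standard distributions are, and each evaluation performs root finding, so the integration check is slow but instructive):

    library(gk)

    # Log density at a few points
    dgk(c(-1, 0, 1), 0, 1, 2, 0.5, log = TRUE)

    # Sanity check: the density should integrate to approximately 1
    integrate(function(x) dgk(x, 0, 1, 2, 0.5), -Inf, Inf)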
Cost
Table 1 compares the time to execute gk's distributional functions to those for the normal distribution. It illustrates that random sampling and quantile function calculation are reasonably efficient, but calculating the cdf and pdf is expensive.

Table 1: Mean times to perform various distributional operations, evaluated by the microbenchmark package (Mersmann 2015). For example the random sampling row compares rnorm(N), rgk(N,1,2,3,4) and rgh(N,1,2,3,4) for a fixed sample size N. We also tried a different choice of N, which gave qualitatively similar results but slightly better relative efficiency of the gk functions.
                           Time (microseconds)        Ratio vs normal
                        Normal   g-and-k   g-and-h   g-and-k   g-and-h
    Quantile function      175       972       445      5.56      2.55
    Random sampling        150       921       436      6.15      2.91
    cdf                    313    143151    116928       457       374
    pdf                    369    138381    111279       375       302
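Timings of this kind can be reproduced along the following lines (the value of N is an arbitrary choice here, and results will vary by machine):

    library(gk)
    library(microbenchmark)

    N <- 100
    u <- runif(N)
    x <- rnorm(N)

    microbenchmark(
      qnorm(u), qgk(u, 1, 2, 3, 4),   # quantile function
      rnorm(N), rgk(N, 1, 2, 3, 4),   # random sampling
      pnorm(x), pgk(x, 1, 2, 3, 4),   # cdf (slow: root finding)
      dnorm(x), dgk(x, 1, 2, 3, 4)    # pdf (slow: root finding)
    )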
4 Range of valid parameters
Recall that a valid continuous distribution requires the quantile function to be strictly increasing. Clearly this property is unaffected by the choice of $A$ and $B > 0$. This section discusses the effects of the remaining parameters: $g$, $c$, and $k$ or $h$.
Several theoretical results on valid parameters can be derived. It is convenient to concentrate on $c \geq 0$. In this case $k < -1/2$ or $h < 0$ is invalid. Taking $k \geq 0$ or $h \geq 0$ produces valid distributions when $0 \leq c < c^* \approx 0.83$. This is the reason for taking $c = 0.8$ as standard: it maintains this property while allowing the skewness factor in (2) and (3) to have a large effect. For justification of all these results, see Appendix B.
When $c = 0.8$, the above results completely characterise the range of valid parameters for the g-and-h distribution. For the g-and-k distribution, there is still some uncertainty for $-1/2 \leq k < 0$, which, as mentioned earlier, corresponds to light tails. For both distributions, the case where $c > c^*$ is less clear: even positive values of $k$ or $h$ do not guarantee validity. Therefore we provide the function isValid to test parameter validity numerically.
Validity can be checked by testing whether the minimum derivative of (2) or (3) is positive. Appendix A shows that this is equivalent to testing whether the functions (6) or (7) are positive. isValid uses numerical optimisation to minimise these and returns whether the minimum value is positive. To reduce the possibility of finding local minima, multiple optimisation starting points can be supplied as a vector to the argument initial_z. However it is still not guaranteed that the global minimum is found, so there remains a possibility that the function may produce false positives.
The function can be used as follows to illustrate the region of valid g-and-k parameter values for $c = 0.8$. The results are plotted as Figure 2.
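The code was along the following lines (a sketch: the grid resolution is an arbitrary choice, isValid is not assumed to be vectorised, and its exact argument conventions should be checked in its help file):

    library(gk)

    g_grid <- seq(-10, 10, length.out = 50)
    k_grid <- seq(-0.6, 0.1, length.out = 50)
    grid <- expand.grid(g = g_grid, k = k_grid)

    # Loop over the grid; each call runs a numerical minimisation,
    # so a coarse grid keeps the run time manageable
    valid <- mapply(isValid, grid$g, grid$k)

    plot(grid$g, grid$k, pch = 15, cex = 0.5,
         col = ifelse(valid, "grey70", "red"),
         xlab = "g", ylab = "k")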
Figure 2: Validity of $(g, k)$ parameter values for the g-and-k distribution when $c = 0.8$, calculated using isValid. Also shown is a quadratic function of $g$ near the curved part of the boundary between the regions.
We do not test validity automatically within the package's other functions. This is because isValid is relatively computationally expensive and not guaranteed to be correct. Therefore particular care should be taken when using $k < 0$ or non-standard $c$ values, as the distribution functions will not provide warnings when invalid parameters are used. A reasonable region of $g$ and $k$ values to use in practice with $c = 0.8$ can be derived from Figure 2. It shows that some parameter values with $k < 0$ are invalid. Apart from a narrow strip near $g = 0$, this invalid region's boundary is roughly quadratic in $g$, as illustrated by the curve on the figure. Based on this analysis, requiring $k$ to lie on or above this quadratic boundary (in particular, $k \geq 0$ always suffices) seems a reasonable sufficient condition for parameter validity to use in practice.
5 Inference functions
The package provides three inference methods for data $x = (x_1, x_2, \ldots, x_n)$ which are assumed to be independent and identically distributed (IID) draws from a g-and-k or g-and-h distribution with unknown parameters. This section describes these methods. An illustration of their use is provided in the next section. See the gk help files for a full description of all the arguments available.
MCMC inference
The mcmc function implements inference using Markov chain Monte Carlo (MCMC). This samples from a Markov chain whose stationary distribution is the Bayesian posterior of interest for the parameters $\theta$. We use a Metropolis-Hastings algorithm, in which a proposed new state of the chain $\theta'$ is sampled by adding a $N(0, \Sigma)$ increment to the current state $\theta_{t-1}$. A decision to accept or reject $\theta'$ is made based on the prior and likelihood values at $\theta'$ and $\theta_{t-1}$ and a $U(0,1)$ random variable (see steps 3-4 of Algorithm 1).
Tuning the proposal variance $\Sigma$ can be difficult. Haynes and Mengersen (2005), who first used MCMC for the g-and-k distribution, did this manually. Instead we use the adaptive Metropolis (AM) algorithm of Haario et al. (2001), which tunes $\Sigma$ automatically during its operation. The resulting $\theta_t$s no longer form a Markov chain, but it has been proved (Saksman and Vihola 2010) that, under suitable conditions, calculations using them still converge to posterior quantities as the length of the chain increases. The AM algorithm is presented as Algorithm 1. Step 1 states the proposal matrix used in terms of the empirical variance of the past MCMC states. To calculate this empirical variance efficiently, the code updates it each time a new state is observed. Default values are provided for the tuning choices $\epsilon$ and $t_0$.
Like other Bayesian methods, MCMC requires a prior density for the parameters, $\pi(\theta)$, to be specified. This must be supplied by the user. For computational convenience this should be supplied in the form of a function get_log_prior which takes a vector of parameters as input and returns the log prior density. We allow the user to reparameterise $B$, using $\log B$ rather than $B$, via the logB argument. This can improve MCMC efficiency when the posterior for $B$ is concentrated on values close to zero.
For IID data the likelihood is $L(\theta) = \prod_{i=1}^{n} f(x_i; \theta)$, the product of each observation's pdf. Evaluating this for the g-and-k or g-and-h distributions requires a numerical root-finding operation for each observation, as in the pgk or pgh functions. Therefore MCMC becomes computationally expensive for even moderately large datasets.
Input: observations $x$, prior density $\pi(\theta)$, number of iterations to perform $N$, initial state $\theta_0$, initial variance matrix $\Sigma_0$, pre-tuning period $t_0$, tuning parameter $\epsilon > 0$.

Loop over $1 \leq t \leq N$:

1. If $t \leq t_0$ let $\Sigma_t = \Sigma_0$. Otherwise let $\Sigma_t = \frac{2.4^2}{d}\left(\hat{\Sigma}_{t-1} + \epsilon I\right)$, where $\hat{\Sigma}_{t-1}$ is the variance of $\theta_0, \theta_1, \ldots, \theta_{t-1}$ and $d$ is the dimension of $\theta$.
2. Sample $\theta' \sim N(\theta_{t-1}, \Sigma_t)$.
3. Sample $u \sim U(0,1)$ and let $r = \frac{\pi(\theta')\,L(\theta')}{\pi(\theta_{t-1})\,L(\theta_{t-1})}$.
4. If $u \leq r$ let $\theta_t = \theta'$. Otherwise let $\theta_t = \theta_{t-1}$.

Output: sample $\theta_0, \theta_1, \ldots, \theta_N$.
Algorithm 1: The Adaptive Metropolis MCMC
algorithm
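A minimal usage sketch follows. Only get_log_prior and logB are argument names documented above; the others (N, theta0, Sigma0) are assumptions to be checked against the mcmc help file.

    library(gk)

    set.seed(1)
    x <- rgk(100, 0, 1, 2, 0.5)   # synthetic data

    # Improper flat prior subject to B > 0 and k > 0, on the log scale
    log_prior <- function(theta) {
      if (theta[2] > 0 && theta[4] > 0) 0 else -Inf
    }

    out <- mcmc(x, N = 1e4, theta0 = c(0, 1, 2, 0.5),
                Sigma0 = 0.1 * diag(4), get_log_prior = log_prior)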
ABC inference
The abc function implements inference by approximate Bayesian computation (ABC). This is a method for approximate Bayesian inference which avoids evaluating the likelihood function. It is especially useful when the likelihood function is unavailable or, as for quantile distributions, is expensive to compute. ABC is based instead on finding parameter values which produce simulated data similar to the observations. The abc function implements a simple version of ABC, Algorithm 2. Here a simulation is accepted if it has one of the $M$ smallest distances to the observations. Distance refers to a weighted version of Euclidean distance between vectors of simulated and observed summary statistics. Details of the weighting are given in the algorithm's description. (For $N$ large, abc avoids high memory requirements by running several batches of Algorithm 2. Each batch runs a subset of the simulations and returns its best $M$; the overall best $M$ simulations are then found and returned. The weights calculated in the first batch are reused in the others.)
Like mcmc, abc is a Bayesian method and requires a prior distribution for $\theta$ to be provided. It is convenient for this to be provided in a different form to the mcmc case. A function rprior should be supplied which takes a single numeric input, $n$ say, and returns $n$ samples from the prior distribution as rows of a matrix.
Input: observations $x$, prior distribution $\pi(\theta)$, summary statistic function $s(\cdot)$, number of simulations to perform $N$, number of samples to output $M$.

1. Calculate observed summaries $s_0 = s(x)$.
2. For $1 \leq i \leq N$ sample parameters $\theta_i$ from the prior.
3. For $1 \leq i \leq N$ simulate summary statistics $s_i$ given parameters $\theta_i$. Let $s_{ij}$ denote the $j$th component of $s_i$.
4. For $1 \leq j \leq q$ (where $q$ is the number of summary statistics) calculate the empirical variance $v_j$ of the $s_{ij}$ values.
5. For $1 \leq i \leq N$ let $d_i = \sum_{j=1}^{q} (s_{ij} - s_{0j})^2 / v_j$.
6. Find the $M$ smallest $d_i$ values and return the corresponding $\theta_i$s.

Algorithm 2: Rejection sampling ABC
ABC produces samples from an approximation to the Bayesian posterior distribution. The quality of the approximation depends in a complex way on the choice of summary statistics and the tuning parameters $N$ and $M$. For more background on ABC see the review paper by Marin et al. (2012) and the handbook of Sisson et al. (2017). Two general R packages for ABC which implement more advanced methods are abc (Csilléry et al. 2012) and EasyABC (Jabot et al. 2013).
Using ABC for the g-and-k and g-and-h distributions was proposed by Allingham et al. (2009) and has been investigated in many subsequent papers. Following Drovandi and Pettitt (2011) we offer three choices of summary statistics, which can be selected through the sumstats argument: (1) the full order statistics; (2) octiles of the observations, $E_1, E_2, \ldots, E_7$; (3) robust estimates of the moments based on the octiles:

$$S_A = E_4, \quad S_B = E_6 - E_2, \quad S_g = \frac{E_6 + E_2 - 2E_4}{S_B}, \quad S_k = \frac{E_7 - E_5 + E_3 - E_1}{S_B}.$$

Many more sophisticated approaches to choosing ABC summary statistics have been proposed (Blum et al. 2013), but these are a simple starting point.
For summaries (2) or (3) we follow Fearnhead and Prangle (2012) and speed up step 3 of Algorithm 2 by using the fact that the octiles (or close approximations) can be simulated quickly without the need to simulate a full dataset. Suppose $x_1, x_2, \ldots, x_n$ are IID g-and-k or g-and-h variables, and let $x_{(1)} \leq x_{(2)} \leq \ldots \leq x_{(n)}$ denote the order statistics. We replace the octile $E_i$ with $x_{(r(in/8))}$, where $r(\cdot)$ rounds to the nearest integer. Now we need only simulate 7 order statistics from the g-and-k or g-and-h distribution. To do so we simulate the corresponding order statistics of the $U(0,1)$ distribution using the exponential spacings method (Ripley 1987). This is implemented by the orderstats function. The uniform order statistics are then substituted into the quantile function (2) or (3).
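A usage sketch on synthetic data follows. The prior bounds are arbitrary, and the argument names other than rprior and sumstats, as well as the exact label for the moment-estimate summaries, are assumptions to check against the abc help file.

    library(gk)

    set.seed(1)
    x <- rgk(1e4, 0, 1, 2, 0.5)   # synthetic data

    # Prior sampler in the form abc expects: n draws, one per row
    rprior <- function(n) {
      cbind(runif(n, -5, 5),   # A
            runif(n, 0, 5),    # B
            runif(n, -5, 5),   # g
            runif(n, 0, 5))    # k
    }

    out <- abc(x, N = 1e5, rprior = rprior, M = 200,
               sumstats = "moment estimates")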
FDSA inference
The fdsa function performs inference using finite difference stochastic approximation (FDSA). FDSA, originally due to Kiefer and Wolfowitz (1952), attempts to find $\theta^*$ minimising a loss function $\mathcal{L}(\theta)$ by iteratively calculating estimates $\theta_t$. Each iteration moves the estimate in the opposite direction to an estimate of the loss gradient, based on finite difference calculations.

We use FDSA for maximum likelihood estimation of IID observations. In this setting $\mathcal{L}(\theta)$ can be taken to be the negative log likelihood,

$$\mathcal{L}(\theta) = -\log L(\theta) = -\sum_{i=1}^{n} \log f(x_i; \theta).$$

The gradient of $\mathcal{L}(\theta)$ can be estimated using only a small subset of the data, so FDSA has the potential to scale up to large datasets better than MCMC, while avoiding the approximation error of ABC. Unlike ABC and MCMC, we are not aware of FDSA having previously been used for the g-and-k and g-and-h distributions.

The g-and-k and g-and-h distributions have some parameter constraints (e.g. $B > 0$, $k \geq 0$). Also we found that setting further constraints from preliminary analyses sometimes helps FDSA behave well. Therefore we use a version of FDSA for bounded minimisation from L'Ecuyer and Glynn (1994), presented as Algorithm 3.
Input: initial state $\theta_0$, choice of gain sequences $a_t$ and $c_t$, a function $\hat{\mathcal{L}}(\theta)$ which calculates an unbiased estimate of $\mathcal{L}(\theta)$, number of iterations to perform $N$, vectors of (possibly infinite) upper and lower parameter bounds $\theta^{+}$ and $\theta^{-}$.

Loop over $0 \leq t \leq N-1$:

1. Calculate $\hat{g}_t$, an estimate of the gradient of $\mathcal{L}$ at $\theta_t$, by performing the following steps for $1 \leq i \leq 4$:
   (a) Let $\Delta_i$ be a 4-dimensional vector whose $i$th component is 1 and others are zero.
   (b) Let $\phi_i^{+} = P(\theta_t + c_t \Delta_i)$ and $\phi_i^{-} = P(\theta_t - c_t \Delta_i)$.
   (c) Let $\hat{g}_{ti} = \left[\hat{\mathcal{L}}(\phi_i^{+}) - \hat{\mathcal{L}}(\phi_i^{-})\right] / (2 c_t)$.
2. Let $\theta_{t+1} = P(\theta_t - a_t \hat{g}_t)$.

(Here $P$ is a projection operator. Its output is $\theta^{*}$ such that $\theta^{*}_i$ is the closest value to $\theta_i$ in $[\theta^{-}_i, \theta^{+}_i]$. The subscripts represent $i$th components.)

Output: final state $\theta_N$.

Algorithm 3: FDSA for bounded minimisation
The unbiased estimate of $\mathcal{L}(\theta)$ required by Algorithm 3, $\hat{\mathcal{L}}(\theta)$, can be taken to be the sum of a random sample of $m$ negative log likelihood terms multiplied by $n/m$. Hence for a vector y containing a random subsample of $m$ observations (sometimes referred to as a batch), $\hat{\mathcal{L}}(\theta)$ can be calculated using -sum(dgk(y,A,B,g,k,log=TRUE))*n/m (or similar for the g-and-h distribution). Variance reduction in step 1c of Algorithm 3 is possible by coupling the two estimates (Kushner and Yin 2003). Hence we use the same random subsample of data for all $\hat{\mathcal{L}}$ calculations in an iteration of step 1.
FDSA convergence requires that the gain sequences $a_t$ and $c_t$ satisfy certain conditions. Following Spall (1998) we take $a_t$ and $c_t$ to be slowly decaying power laws in $t$, each with a multiplicative constant ($a_0$ and $c_0$) and a decay exponent. This leaves several tuning choices, which can be selected by the user, or left at default values which we provide. Following Kleinman et al. (1999) we use default values for the decay exponents. Following Spall (1998) our default for $c_0$ is an estimate of the standard deviation of $\hat{\mathcal{L}}(\theta_0)$ using some preliminary simulations. We provide defaults for the remaining constants, but it is recommended to manually tune these to produce rapid convergence. This may require several short pilot runs of the algorithm. The fdsa function allows $a_0$ and $c_0$ to be vectors, in which case operations in Algorithm 3 are interpreted as elementwise where necessary. This allows the user to tune gain sequences differently for each parameter. As for mcmc, we allow the user to reparameterise $B$, using $\log B$ rather than $B$, via the logB argument, which can improve FDSA efficiency when the MLE value of $B$ is close to zero.
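A minimal usage sketch on synthetic data follows. Apart from logB, all argument names here, including the bounds and batch size, are assumptions to check against the fdsa help file.

    library(gk)

    set.seed(1)
    x <- rgk(1e5, 0, 1, 2, 0.5)   # large synthetic dataset

    out <- fdsa(x, N = 5000,
                theta0 = c(mean(x), sd(x), 0, 0.1),   # rough start
                batch_size = 100,                     # subsample size m
                theta_min = c(-Inf, 1e-5, -Inf, 0),   # keep B > 0, k >= 0
                theta_max = c(Inf, Inf, Inf, Inf))

    # The final state in the output approximates the MLE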
Under weak assumptions, FDSA converges to a local minimum of $\mathcal{L}(\theta)$ (Kushner and Yin 2003). In our experience the likelihood for the g-and-k and g-and-h distributions is usually unimodal, so there is little danger of converging to an incorrect mode. Nonetheless it may be a useful check on the results to rerun the algorithm from various starting points or compare with the output of another algorithm.
An alternative to FDSA is simultaneous perturbation stochastic approximation (SPSA) (Spall 1998). Here each iteration makes a finite difference estimate of the derivative of the loss function when moving in a random direction from $\theta_t$. An update moves a distance (negatively) proportional to this estimated derivative in the selected direction. Each SPSA iteration requires fewer likelihood estimates than FDSA, and it is asymptotically more efficient (Kushner and Yin 2003). However we found in exploratory work that for our application the SPSA updates were dominated by improving the location and scale estimates, and the remaining parameters were learned very slowly.
6 Illustration
We illustrate gk's inference methods on the Garch exchange rate dataset from the Ecdat package (Croissant 2016). This consists of 1967 daily US dollar exchange rates against other currencies from 1980 to 1987. We concentrate on the exchange rate with Canadian Dollars. Let $x_t$ denote the exchange rate on day $t$. The log return is defined as $\log(x_{t+1}/x_t)$. Figure 3 is a time series plot of the log returns. Figure 7 shows a histogram and a quantile-quantile plot, indicating that the tails are heavier than those of a normal density.
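The data can be loaded and transformed along these lines (the Canadian dollar column name is an assumption; check the Garch help file):

    library(Ecdat)

    data(Garch)
    x <- Garch$cd                       # Canadian dollar rates (assumed column)
    log_returns <- log(x[-1] / x[-length(x)])
    plot(log_returns, type = "l")       # cf. Figure 3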
We focus on using the g-and-k distribution to model the log returns under an IID assumption. For models also including time series structure see for example Drovandi and Pettitt (2011). The full code for the analysis below can be run via the fx function.
The ABC and MCMC analyses which follow are Bayesian and require specification of a prior. We use a uniform prior for ease of comparison to the maximum likelihood results from FDSA. For MCMC we are able to use an improper uniform prior. For ABC a proper prior is required, so we bound each parameter to a finite interval. We restrict $A$ and $B$ to magnitude 1 at most, as we believe log returns of this magnitude are highly unlikely. The $g$ and $k$ parameters are given wider support, which can capture a broad range of distributional shapes.
Figure 3: Log returns for US dollar / Canadian Dollar exchange
rates.
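The ABC analysis can be run roughly as follows. This is a sketch: the bounds on g and k, the number of simulations N, and the number of acceptances M are illustrative guesses rather than the values behind the reported results.

    library(gk)

    rprior <- function(n) {
      cbind(runif(n, -1, 1),    # A, magnitude at most 1
            runif(n, 0, 1),     # B, magnitude at most 1
            runif(n, -5, 5),    # g, wider support (illustrative)
            runif(n, 0, 10))    # k, wider support (illustrative)
    }

    abc_out <- abc(log_returns, N = 1e6, rprior = rprior, M = 1000,
                   sumstats = "moment estimates")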
This simulated a large number of parameter vectors and accepted the small fraction whose summaries were closest to the observed ones. We used the moment estimator summary statistics, described earlier, which can be simulated quickly without the need to simulate an entire dataset. As a result this analysis took only 6 minutes.
The resulting approximate posterior samples are shown in Figure
6. Figure 7 shows density and
quantile-quantile plots under the mean parameter values. These reveal a
very poor fit to the data. However this short ABC analysis does provide
reasonable tuning choices for the other methods.
We found that using the original parameterisation caused high variance in our gradient estimates. This is because the log-likelihood surface becomes extremely steep for $B$ close to 0. Therefore we reparameterised $B$ to $\log B$. The initial FDSA state was set to equal the ABC means. The FDSA step sizes were tuned by trial-and-error.
Figure 4 shows a trace plot of the FDSA algorithm output. A pilot run is shown in black. Some of the parameters do not converge over these iterations. However they have smooth traces, indicating that there is relatively little noise in their gradient estimates and so larger steps could be taken. In contrast, the remaining parameters converge quickly and then oscillate noisily. This indicates that a smaller step size could be used to average out this noise more effectively without endangering convergence. Therefore for the final run we adjusted the gain sequences accordingly, separately for each parameter.
The final FDSA analysis took 17 minutes. The final state provides maximum likelihood estimates of the four parameters, shown as crosses in Figure 6. Figure 7 shows density and quantile-quantile plots under these parameter values. These are a much better fit to the data than the ABC results.
Next we use the FDSA results to help tune an MCMC algorithm, which
quantifies the uncertainty in the parameter values.
Figure 4: Output from the FDSA algorithm to infer g-and-k parameters for exchange rate log returns. Black shows output from a pilot run. Red shows output from the final run, which uses retuned gain sequences.
Again we used a log reparameterisation for $B$. To achieve an improper uniform prior on the original parameterisation, we used a prior density for the transformed parameters proportional to $B\,\mathbb{1}(\theta \in \Theta)$, where $\mathbb{1}$ represents an indicator function, $\Theta$ is the set of allowed parameter values, and the factor $B$ is the Jacobian of the log reparameterisation. Our initial parameter vector was the final FDSA state. We use the variance matrix of the last 1000 FDSA states to select the initial MCMC proposal variance.
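One way to encode such a prior is sketched below. The specific constraint $k > 0$, the argument names other than get_log_prior and logB, and the input objects theta_fdsa and var_fdsa are hypothetical assumptions.

    # theta = (A, log B, g, k) when logB = TRUE.
    # Flat prior on the original scale: log density = log B, i.e.
    # theta[2], plus an indicator restricting k (assumed constraint).
    log_prior <- function(theta) {
      if (theta[4] > 0) theta[2] else -Inf
    }

    mcmc_out <- mcmc(log_returns, N = 1e5, logB = TRUE,
                     theta0 = theta_fdsa,   # final FDSA state (hypothetical name)
                     Sigma0 = var_fdsa,     # variance of last 1000 FDSA states
                     get_log_prior = log_prior)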
Figure 5 shows a trace plot of the MCMC algorithm output. For the first few hundred iterations small proposals are made, at least for some of the parameters, but the proposal variance quickly adapts and the remainder of the output appears to have converged. Exploratory work showed that taking a poor initial state meant MCMC was very slow to converge, because the variance matrix adapted to the transient state of the algorithm. Hence tuning based on FDSA output is very useful.
The MCMC analysis took 39 minutes. Figure 6 shows parameter histograms and Figure 7 shows density and quantile-quantile plots. These are similar to the FDSA fit. Note that the density plot is based on mean parameter values from the MCMC output (after discarding the first half of the output as burn-in and transforming values back to the original parameterisation).
Figure 5: States of an MCMC algorithm to infer g-and-k parameters for exchange rate log returns.

Figure 6: Parameter inference for fitting the g-and-k distribution to exchange rate log returns. The top row shows the ABC posterior sample and the bottom row the MCMC posterior sample, which requires much more concentrated parameter scales. FDSA estimates of the MLEs are shown by crosses on the x-axis.

Figure 7: (Left) Histogram of exchange rate log returns, and fitted g-and-k densities. (Right) Quantile-quantile (QQ) plots of fitted g-and-k densities. QQ plots are shown for 30 vectors of parameters sampled from the second half of the MCMC output.
Summary
The ABC analysis is quick but produces a poor fit. However it helps tune the FDSA method, which finds a good estimate of the MLE in a reasonable time. Further computational effort using MCMC provides a Bayesian fit. Figure 7 shows that the g-and-k distribution fits the data better than a normal distribution, but still does not fit the most extreme observations. Further improvements might be possible by using more flexible distributions, for example allowing different parameters for the upper and lower tails (Peters et al. 2016).
7 Discussion
This paper has reviewed the g-and-k and g-and-h distributions, and introduced the gk package to work with them. The package includes the usual distributional functions, although the pdf and cdf functions are slow due to relying on numerical root-finding. Another function tests the validity of different parameter combinations, and this was used to produce a novel result on which parameters are valid for the g-and-k distribution (i.e. it appears to be sufficient that $k$ lies on or above a simple quadratic function of $g$; see Section 4). The package also provides several methods for inference of IID data under these distributions, and their use has been illustrated above. The methods include an FDSA algorithm, which can find MLEs for large datasets in a reasonable time and has not been applied to this problem before.
8 Appendix A: Formulae
Write the quantile functions (2) and (3) as $F^{-1}(u) = q(z(u))$, where $q$ is a function of $z$. The derivatives of the $q$ functions are as follows:

$$q'_{gk}(z) = B\,(1+z^2)^k\, D_{gk}(z), \qquad (4)$$

$$q'_{gh}(z) = B\,\exp(h z^2/2)\, D_{gh}(z), \qquad (5)$$

where

$$D_{gk}(z) = \left(1 + c\tanh\frac{gz}{2}\right)\frac{1 + (2k+1)z^2}{1+z^2} + \frac{cgz}{2\cosh^2(gz/2)}, \qquad (6)$$

$$D_{gh}(z) = \left(1 + c\tanh\frac{gz}{2}\right)\left(1 + h z^2\right) + \frac{cgz}{2\cosh^2(gz/2)}. \qquad (7)$$

Observe that each $D$ function has the same sign and roots as the corresponding $q'$ function, since the other factors in (4) and (5) are strictly positive.
9 Appendix B: Range of valid parameters - theory
This appendix proves theoretical results quoted earlier about which parameter values produce valid g-and-k and g-and-h distributions.

First note that the defining functions in (2) and (3) both have the property that replacing $(g, c)$ with $(-g, -c)$ leaves them unchanged, as $\tanh$ is an odd function. Therefore any behaviour produced by $c < 0$ can be replicated with $c > 0$ and a different choice of $g$. So for simplicity it suffices to concentrate on $c \geq 0$.

For the remainder of this appendix, distributional validity will correspond to a strictly increasing quantile function. This property is generally violated if $c > 1$, as there are then two solutions to $F^{-1}(u) = A$: $z(u) = 0$ and a solution to $1 + c\tanh(g z(u)/2) = 0$. (The only exception is the special case of $g = 0$.) Also taking $k < -1/2$ or $h < 0$ is invalid, as in either case the tail factor $z(1+z^2)^k$ or $z\exp(h z^2/2)$, which is continuous, has a positive gradient at $z = 0$ but limits of zero as $z \to \pm\infty$.

Finally it is shown that non-negative values of $k$ or $h$ produce valid distributions provided that $0 \leq c < c^* \approx 0.83$ (see also Rayner and MacGillivray 2002). From Appendix A it suffices to derive the values of $c$ such that $D(z)$, representing either $D_{gk}(z)$ or $D_{gh}(z)$, is guaranteed to be positive when $k \geq 0$ or $h \geq 0$. Note that $D$ is a continuous function of $z$ with $D(0) = 1 > 0$. So a sufficient condition for validity is that no solution to $D(z) = 0$ exists. Rearranging $D(z) = 0$ using (6) and (7) gives

$$c = \frac{-a(z)}{a(z)\tanh(gz/2) + (gz/2)\operatorname{sech}^2(gz/2)}, \qquad (8)$$

where $a(z) = \frac{1+(2k+1)z^2}{1+z^2}$ for the g-and-k distribution and $a(z) = 1 + h z^2$ for the g-and-h distribution. For $k \geq 0$ or $h \geq 0$, $a(z)$ can only take values in $[1, \infty)$, with 1 attained by $z = 0$ (and, for the g-and-k distribution with $k = 0$, by every $z$). The right hand side of (8) is positive if and only if $gz < 0$, and we concentrate on this case from now on. We wish to find the minimum positive solution of (8) for $c$. The right hand side is increasing in $a(z)$, so it suffices to concentrate on its smallest value, $a(z) = 1$. Writing $x = gz/2$, the problem reduces to minimising

$$c(x) = \frac{-1}{\tanh(x) + x\operatorname{sech}^2(x)}$$

over $x < 0$. Numerically this gives $c^* \approx 0.83$, as shown in Figure 8.
References
D. Allingham, R. A. R. King and K. L. Mengersen. Bayesian estimation of quantile distributions. Statistics and Computing, 19(2): 189–201, 2009. URL https://doi.org/10.1007/s11222-008-9083-x.
M. G. B. Blum, M. A. Nunes, D. Prangle and S. A. Sisson. A comparative review of dimension reduction methods in approximate Bayesian computation. Statistical Science, 28(2): 189–208, 2013. URL https://doi.org/10.1214/12-sts406.
K. Csilléry, O. François and M. G. B. Blum. abc: An R package for approximate Bayesian computation (ABC). Methods in Ecology and Evolution, 3(3): 475–479, 2012. URL https://doi.org/10.1111/j.2041-210X.2011.00179.x.
Y. Croissant. Ecdat: Data sets for econometrics. R package, 2016. URL https://CRAN.R-project.org/package=Ecdat.
C. C. Drovandi and A. N. Pettitt. Likelihood-free Bayesian estimation of multivariate quantile distributions. Computational Statistics & Data Analysis, 55(9): 2541–2556, 2011. URL https://doi.org/10.1016/j.csda.2011.03.019.
P. Fearnhead and D. Prangle. Constructing summary statistics for approximate Bayesian computation: Semi-automatic ABC. Journal of the Royal Statistical Society, Series B, 74: 419–474, 2012. URL https://doi.org/10.1111/j.1467-9868.2011.01010.x.
W. Gilchrist. Statistical modelling with quantile functions. Chapman & Hall/CRC, 2000.
H. Haario, E. Saksman and J. Tamminen. An adaptive Metropolis algorithm. Bernoulli, 223–242, 2001. URL https://doi.org/10.2307/3318737.
C. Hastings Jr., F. Mosteller, J. W. Tukey and C. P. Winsor. Low moments for small samples: A comparative study of order statistics. The Annals of Mathematical Statistics, 413–426, 1947. URL https://doi.org/10.1214/aoms/1177730388.
M. A. Haynes, H. L. MacGillivray and K. L. Mengersen. Robustness of ranking and selection rules using generalised g-and-k distributions. Journal of Statistical Planning and Inference, 65(1): 45–66, 1997. URL https://doi.org/10.1016/s0378-3758(97)00050-5.
M. Haynes and K. Mengersen. Bayesian estimation of g-and-k distributions using MCMC. Computational Statistics, 20(1): 7–30, 2005. URL https://doi.org/10.1007/BF02736120.
F. Jabot, T. Faure and N. Dumoulin. EasyABC: Performing efficient approximate Bayesian computation sampling schemes using R. Methods in Ecology and Evolution, 4(7): 684–687, 2013. URL https://doi.org/10.1111/2041-210x.12050.
J. Kiefer and J. Wolfowitz. Stochastic estimation of the maximum of a regression function. The Annals of Mathematical Statistics, 23(3): 462–466, 1952. URL https://doi.org/10.1007/978-1-4613-8505-9_4.
N. L. Kleinman, J. C. Spall and D. Q. Naiman. Simulation-based optimization with stochastic approximation using common random numbers. Management Science, 45(11): 1570–1578, 1999. URL https://doi.org/10.1287/mnsc.45.11.1570.
H. J. Kushner and G. G. Yin. Stochastic approximation and recursive algorithms and applications. Springer, 2003. URL https://doi.org/10.1007/b97441.
P. L’Ecuyer and P. W. Glynn. Stochastic optimization by simulation: Convergence proofs for the GI/G/1 queue in steady-state. Management Science, 40(11): 1562–1578, 1994. URL https://doi.org/10.1287/mnsc.40.11.1562.
J.-M. Marin, P. Pudlo, C. P. Robert and R. J. Ryder. Approximate Bayesian computational methods. Statistics and Computing, 22(6): 1167–1180, 2012. URL https://doi.org/10.1007/s11222-011-9288-2.
O. Mersmann. microbenchmark: Accurate timing functions. R package, 2015. URL https://CRAN.R-project.org/package=microbenchmark.
G. W. Peters, W. Y. Chen and R. H. Gerlach. Estimating quantile families of loss distributions for non-life insurance modelling via L-moments. Risks, 4(2): 14, 2016. URL https://doi.org/10.2139/ssrn.2739417.
G. D. Rayner and H. L. MacGillivray. Numerical maximum likelihood estimation for the g-and-k and generalized g-and-h distributions. Statistics and Computing, 12(1): 57–75, 2002. URL https://doi.org/10.1023/A:1013120305780.
B. D. Ripley. Stochastic simulation. Wiley, 1987.
E. Saksman and M. Vihola. On the ergodicity of the adaptive Metropolis algorithm on unbounded domains. The Annals of Applied Probability, 20(6): 2178–2203, 2010. URL https://doi.org/10.1214/10-aap682.
S. A. Sisson, Y. Fan and M. Beaumont, eds. Handbook of Approximate Bayesian Computation. Chapman & Hall/CRC, 2017. URL https://doi.org/10.1201/9781315117195.
J. C. Spall. Implementation of the simultaneous perturbation algorithm for stochastic optimization. IEEE Transactions on Aerospace and Electronic Systems, 34(3): 817–823, 1998. URL https://doi.org/10.1109/7.705889.
J. W. Tukey. Modern techniques in data analysis. In Proceedings of the NSF-sponsored regional research conference, 1977. Southern Massachusetts University.
Citation
For attribution, please cite this work as
Prangle, "gk: An R Package for the g-and-k and Generalised g-and-h Distributions", The R Journal, 2020
BibTeX citation
@article{RJ-2020-010,
author = {Prangle, Dennis},
title = {gk: An R Package for the g-and-k and Generalised g-and-h Distributions},
journal = {The R Journal},
year = {2020},
note = {https://rjournal.github.io/},
volume = {12},
issue = {1},
issn = {2073-4859},
pages = {7-20}
}