The OneStep package principally provides an eponymous function that numerically computes Le Cam’s one-step estimator, which is asymptotically efficient and can be computed faster than the maximum likelihood estimator for large datasets. Monte Carlo simulations are carried out for several examples (discrete and continuous probability distributions) in order to exhibit the performance of Le Cam’s one-step estimation procedure in terms of efficiency and computational cost on observation samples of finite size.
In the statistical experiments generated by i.i.d. observation samples, the sequence of maximum likelihood estimators (MLE) is known to be asymptotically efficient under very general assumptions and consequently presents the fastest convergence rate and the lowest possible asymptotic variance.
Although the sequence of MLE is asymptotically efficient, it is generally not expressed in a closed form and requires time-consuming numerical computations. On the other hand, the other generic estimation procedures which can sometimes be computed faster, such as the method of moments, do not generally reach the optimal asymptotic variance.
In R, the package fitdistrplus (see (Delignette-Muller and Dutang 2015)) is commonly used to infer the parameters of univariate probability distributions. For non-censored datasets, fitdistrplus allows four different estimation methods: maximum likelihood, moment matching, quantile matching, and maximum goodness-of-fit estimation. The unified approach is provided by fitdist(), which returns an S3 object having usual generic functions (plot, summary, logLik, …) as well as dedicated fitdist functions (gofstat, bootdist, …).
Other packages with similar purposes are EstimationTools and DistributionFitR, which provide MLE for non-censored data. Many other packages also provide estimation procedures on the basis of one function per probability distribution; see, e.g., the packages univariateML and propagate. A final package worth mentioning is fitteR, which sequentially fits a set of probability distributions to a given dataset.
However, as soon as the Fisher information matrix is sufficiently regular with respect to the parameter to be estimated, Le Cam’s one-step estimation procedures can be applied (see (Le Cam 1956)). They are based on an initial sequence of guess estimators $\widetilde{\vartheta}_n$ and a single Fisher scoring step or Newton step on the loglikelihood function $\ell_n(\vartheta) = \sum_{i=1}^n \log f(X_i, \vartheta)$. Namely, they write

$$\overline{\vartheta}_n = \widetilde{\vartheta}_n + \mathcal{I}(\widetilde{\vartheta}_n)^{-1} \, \frac{1}{n} \sum_{i=1}^n \frac{\partial}{\partial \vartheta} \log f\bigl(X_i, \widetilde{\vartheta}_n\bigr) \qquad (1)$$

for the Fisher scoring step, where $\mathcal{I}$ denotes the Fisher information matrix, and

$$\overline{\vartheta}_n = \widetilde{\vartheta}_n - \left[ \nabla^2_\vartheta \, \ell_n\bigl(\widetilde{\vartheta}_n\bigr) \right]^{-1} \nabla_\vartheta \, \ell_n\bigl(\widetilde{\vartheta}_n\bigr) \qquad (2)$$

for the Newton step on the loglikelihood.
The sequence of Le Cam’s one-step estimators presents certain advantages over the sequence of MLE and over the initial sequence of estimators (method of moments, quantile matching method, etc.) in terms of computational cost and asymptotic variance. It is much less computationally expensive than the MLE, while it has the same rate and the same asymptotic variance. Since there is no full numerical optimization (but only one computation of the Newton step or the Fisher scoring step), the procedure is faster and appropriate for very large datasets. On the other hand, it is asymptotically optimal in terms of asymptotic variance, which is generally not the case for the initial sequence of guess estimators.
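To make the procedure concrete, here is a minimal sketch of the Fisher scoring step for the Gamma distribution with shape alpha and rate beta, starting from the closed-form moment estimators. This is an illustrative reconstruction, not the package's internal code.

```r
set.seed(1)
n <- 1e5
x <- rgamma(n, shape = 2, rate = 3)

# Initial guess: moment estimators (mean = alpha/beta, variance = alpha/beta^2)
m <- mean(x); v <- var(x)
theta <- c(alpha = m^2 / v, beta = m / v)

# Score of the Gamma loglikelihood at the initial guess, averaged over the sample
score <- c(log(theta[2]) - digamma(theta[1]) + mean(log(x)),
           theta[1] / theta[2] - m)

# Fisher information matrix (per observation) of the Gamma model
I <- matrix(c(trigamma(theta[1]), -1 / theta[2],
              -1 / theta[2], theta[1] / theta[2]^2), 2, 2)

# One Fisher scoring step: theta_bar = theta + I(theta)^{-1} * score
theta_bar <- theta + solve(I, score)
theta_bar
```

A single linear solve replaces the iterative optimization that a numerical MLE would require, which is where the computational gain comes from.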
For several probability distributions (Gaussian, Exponential, Lognormal, Poisson, Geometric), a fast computable sequence of estimators is already asymptotically efficient, and no correction is needed. When the Fisher information matrix is given in a closed form (Gamma, Beta, Chi-squared, Cauchy, Weibull distributions), a closed-form initial sequence of guess estimators can be corrected with an explicit Fisher scoring step.
Le Cam’s one-step estimation procedures are implemented in the OneStep package, whose eponymous function onestep is described in this paper and used on different examples. Monte Carlo simulations are performed for several examples in order to exhibit the performance of Le Cam’s one-step procedure on samples of finite size. They exhibit its asymptotic efficiency together with its lower computational cost for several probability distributions.
onestep
Let $X_1, \ldots, X_n$ be a sample of i.i.d. random variables with common density $f(\cdot, \vartheta)$ depending on an unknown parameter $\vartheta \in \Theta$.
Le Cam’s one-step estimation procedure is based on an initial sequence of guess estimators and a Fisher scoring step or a single Newton step on the loglikelihood function. For the novice user, the function onestep automatically chooses the best procedure to use. Consequently, there is a single command for the user, which is
onestep(data, distr)
The function onestep internally selects among several procedures depending on whether the initial sequence of guess estimators is in closed form and whether the score and the Fisher information matrix can be elicited in closed form.
Here is the list of the procedures taken into account in the onestep function:
- Distributions for which the MLE is already explicit: norm, exp, lnorm, invgauss for continuous probability distributions, and pois, geom for discrete probability distributions. For this class, the explicit MLE is returned.
- Distributions for which the initial sequence of guess estimators, the score, and the Fisher information matrix have been elicited in closed form: gamma, beta, chisq with an initial sequence of moment estimators, cauchy with an initial sequence of quantile matching estimators, and weibull with an initial sequence of graphical plot estimators. For this class, Le Cam’s one-step procedure (1) is applied.
- Distributions for which the initial sequence of guess estimators, the score, and the Hessian have been elicited in closed form: nbinom. For this class, Le Cam’s one-step procedure (2) is applied.
- Distributions for which the initial sequence is numerically computed on a subsample, but the score and the Fisher information matrix have a closed form: pareto. For this class, Le Cam’s one-step procedure (1) is executed.
- For all other distributions, if the density function is well defined, the Newton step in Le Cam’s one-step procedure (2) is computed numerically, with an initial sequence of guess estimators given by the maximum likelihood estimators computed on a subsample.
As described, all explicit distributions of the mmedist function from fitdistrplus have been corrected in the onestep function, except unif and logis. The unif case is a famous example of the singular behavior of the MLE. For logis, the gain from the method of moments estimator to the MLE is very small. For these two distributions, the method of moments estimator is returned.
However, the package also offers several new explicit computations, such as cauchy, chisq and weibull, and applies to distributions coming from the actuar package (see (Dutang et al. 2008)) such as invgauss and pareto.
The function onestep allows the user to propose their own initial guess estimator in the one-step procedure by specifying the parameter init in the command onestep(data, distr, init).
The user can consequently use different initial guess estimators for the
aforementioned classical distributions (moment estimators, estimators
based on the characteristic function, quantile matching estimators,
Bayesian estimators, mode-type estimators, graphical methods, etc.).
Several examples can be found in the documentation of the onestep
function.
Monte Carlo simulations are done for several examples (discrete and continuous probability distributions) in order to exhibit the performance of Le Cam’s one-step estimation procedure in terms of efficiency and computational cost on observation samples of finite size.
For the assessment of the efficiency, the proximity of the renormalized statistical errors to their asymptotic Gaussian distribution is measured with the Cramér-von Mises (CvM) statistic computed with the goftest package. Note that a value of the statistic below 0.7434 corresponds to accepting the null hypothesis (equivalence to the theoretical Gaussian asymptotic distribution) with a type I error of 1%.
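For illustration, the Cramér-von Mises statistic against the standard normal can be computed by hand as follows. This is a sketch of the classical statistic; the goftest package provides the same statistic together with p-values.

```r
# W^2 = 1/(12 n) + sum_i ( u_(i) - (2 i - 1) / (2 n) )^2,
# where u_(i) are the ordered probability transforms of the errors
cvm_stat <- function(z) {
  n <- length(z)
  u <- pnorm(sort(z))
  1 / (12 * n) + sum((u - (2 * seq_len(n) - 1) / (2 * n))^2)
}

set.seed(1)
z <- rnorm(500)   # renormalized errors that are indeed standard Gaussian
cvm_stat(z)       # typically below the 1% threshold 0.7434
```

A sample drawn from a shifted distribution would instead produce a large statistic, leading to rejection of the Gaussian limit.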
Timing performance (given in seconds) is measured with the proc.time function ("elapsed" time) on a laptop with an Intel Core i7 2.7 GHz processor and 8 GB RAM. The onestep function is compared to the MLE computed with mledist of the fitdistrplus package and to the sequence of initial guess estimators (for instance, the moment estimators are computed with mmedist for the gamma, beta, and nbinom distributions).
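This kind of comparison can be reproduced in spirit with system.time, for instance by timing a fully numerical MLE against a closed-form computation on the same sample. The snippet below is an illustrative sketch on a Gamma sample, not the paper's benchmark code (which uses benchonestep).

```r
set.seed(1)
x <- rgamma(1e5, shape = 2, rate = 3)

# Numerical MLE: optimize the negative loglikelihood (log-parameterized
# so that the shape and rate stay positive during the search)
nll <- function(p) -sum(dgamma(x, shape = exp(p[1]), rate = exp(p[2]), log = TRUE))
t_mle <- system.time(fit <- optim(c(0, 0), nll))["elapsed"]
theta_mle <- exp(fit$par)

# Closed-form moment estimators for comparison
t_me <- system.time({
  m <- mean(x); v <- var(x)
  theta_me <- c(m^2 / v, m / v)
})["elapsed"]

c(mle_time = unname(t_mle), me_time = unname(t_me))
```

Even on a moderate sample, the iterative optimization dominates the total cost, while the closed-form computation is essentially free.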
The functions benchonestep and benchonestep.replicate were used to compare the performance of estimators (see the documentation of the OneStep package).
For the majority of the "closed formula" cases, the initial sequence of guess estimators of the unknown parameter is $\sqrt{n}$-consistent. It had been shown in (Le Cam 1956) that, for a $\sqrt{n}$-consistent initial sequence of guess estimators, the corresponding sequence of one-step estimators is consistent, asymptotically normal and efficient in the Fisher sense.
The first example is the joint estimation of the shape parameter $\alpha$ and the rate parameter $\beta$ of the Gamma distribution.
Let us denote by ME the initial sequence of moment estimators.
We can see in Figure 1 (and in Table 1 with the Cramér-von Mises statistics) that the sequence of Le Cam’s one-step estimators (LCE) reaches efficiency and naturally outperforms the initial sequence of ME in terms of asymptotic variance. Moreover, this sequence is faster to compute, as shown in Table 1.
|  | MLE | ME | LCE |
|---|---|---|---|
| Computation time (s) | 1559.74 | 29.72 | 37.46 |
| CvM statistic alpha | 0.63 | 1.04 | 0.23 |
| CvM statistic beta | 0.47 | 0.86 | 0.19 |
As mentioned in the introduction, the major advantage of the sequence of one-step estimators is that it is computed faster than the maximum likelihood estimator for large datasets. We illustrate this fact in Table 2, where the average computation times (over 10 Monte Carlo simulations) are given for different sample sizes.
|  | 10^3 | 10^4 | 10^5 | 10^6 | 10^7 | 10^8 | 10^9 |
|---|---|---|---|---|---|---|---|
| MLE | 0.0121 | 0.1193 | 1.0056 | 10.4735 | 99.4203 | 2339.0437 | 31600.8313 |
| LCE | 0.0007 | 0.0053 | 0.0252 | 0.2361 | 2.3152 | 40.0440 | 540.7393 |
The second example is the joint estimation of the first shape parameter $\alpha$ and the second shape parameter $\beta$ of the Beta distribution.
Let us denote by ME the initial sequence of moment estimators.
It is worth mentioning that for the Beta distribution, the MLE is not in a closed form and must be computed numerically.
Here again, the sequence of Le Cam’s one-step estimators naturally outperforms the initial sequence of ME in terms of asymptotic variance (see Figure 2 and Table 3 for the CvM statistics).
It is computed faster than the sequence of MLE, as shown in Table 3.
|  | MLE | ME | LCE |
|---|---|---|---|
| Computation time (s) | 2324.99 | 43.26 | 52.02 |
| CvM statistic alpha | 0.22 | 0.06 | 0.05 |
| CvM statistic beta | 0.34 | 0.11 | 0.20 |
The third example is the joint estimation of the location parameter $m$ and the scale parameter $d$ of the Cauchy distribution. It is worth mentioning that the Cauchy distribution has no first moment, and consequently its unknown parameters cannot be estimated with the classical method of moments; an initial sequence of quantile matching estimators is used instead.
As mentioned previously, the sequence of Le Cam’s one-step estimators outperforms the initial sequence of quantile matching estimators (QME) in terms of asymptotic variance (see Figure 3 and Table 4 for the CvM statistics).
It is computed faster than the sequence of MLE, as shown in Table 4.
|  | MLE | QME | LCE |
|---|---|---|---|
| Computation time (s) | 225.32 | 7.40 | 27.59 |
| CvM statistic m | 0.08 | 0.42 | 0.08 |
| CvM statistic d | 0.04 | 0.10 | 0.04 |
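A minimal sketch of this scheme for the Cauchy distribution combines quantile matching initial estimators with one Fisher scoring step, using the closed-form Fisher information $\mathcal{I}(m, d) = I_2 / (2 d^2)$. This is an illustration of the method, not the package's internal code.

```r
set.seed(1)
n <- 1e5
x <- rcauchy(n, location = 1, scale = 2)

# Quantile matching: the median estimates m, half the interquartile range estimates d
m_qme <- median(x)
d_qme <- unname(diff(quantile(x, c(0.25, 0.75)))) / 2

# Averaged score of the Cauchy loglikelihood at the initial guess
r <- x - m_qme
score <- c(mean(2 * r / (d_qme^2 + r^2)),
           mean(1 / d_qme - 2 * d_qme / (d_qme^2 + r^2)))

# One Fisher scoring step: I(m, d)^{-1} = 2 d^2 * I_2
theta_bar <- c(m_qme, d_qme) + 2 * d_qme^2 * score
theta_bar
```

Because the Fisher information of the Cauchy model is diagonal and explicit, no matrix inversion is even needed here.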
The fourth example is the joint estimation of the size parameter $r$ and the mean parameter $\mu$ of the negative binomial distribution. For this distribution, the Fisher information matrix cannot be elicited in a closed form, but the score and the Hessian can.
In this discrete case, the sequence of Le Cam’s one-step estimators (2) defined with the Hessian still outperforms the initial sequence of ME in terms of asymptotic variance (see Figure 4 and Table 5 for the CvM statistics). Let us mention that the CvM statistics are computed with the estimated variances for the MLE and the LCE.
It also outperforms the sequence of MLE in terms of computation time, as shown in Table 5.
|  | MLE | ME | LCE |
|---|---|---|---|
| Computation time (s) | 2038.33 | 39.59 | 104.75 |
| CvM statistic r | 0.18 | 0.71 | 0.19 |
| CvM statistic mu | 0.05 | 0.05 | 0.05 |
The fifth example is the joint estimation of the shape parameter $\tau$ and the scale parameter $\beta$ of the Weibull distribution. Moreover, it is possible to define a closed-form initial sequence of graphical plot estimators (based on the explicit form of the cumulative distribution function). Let us denote by OLS this initial sequence of ordinary least squares estimators.
The sequence of Le Cam’s one-step estimators outperforms the initial sequence of OLS-based estimators in terms of asymptotic variance (see Figure 5 and Table 6). Since the OLS estimator presents a small bias for samples of finite size, its CvM statistics are larger in Table 6.
It also outperforms the sequence of MLE in terms of computation time for large datasets, as shown in Table 6.
|  | MLE | OLS | LCE |
|---|---|---|---|
| Computation time (s) | 568.83 | 32.64 | 82.64 |
| CvM statistic tau | 0.13 | 17.26 | 0.06 |
| CvM statistic beta | 0.59 | 2.33 | 0.59 |
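The graphical initial estimator can be sketched as a Weibull-plot regression: since the Weibull cdf satisfies $\log(-\log(1 - F(x))) = \tau \log x - \tau \log \beta$, an ordinary least squares fit of the transformed empirical cdf against $\log x$ yields closed-form initial estimates. This is an illustrative reconstruction (the exact plotting positions used by the package are an assumption here).

```r
set.seed(1)
x <- sort(rweibull(1e4, shape = 2, scale = 3))
n <- length(x)

# Empirical cdf evaluated at midpoint plotting positions
p <- (seq_len(n) - 0.5) / n

# OLS fit of log(-log(1 - p)) on log(x): slope = tau, intercept = -tau * log(beta)
fit <- lm(log(-log(1 - p)) ~ log(x))
tau_ols  <- unname(coef(fit)[2])
beta_ols <- exp(-unname(coef(fit)[1]) / tau_ols)
c(tau = tau_ols, beta = beta_ols)
```

The small finite-sample bias of this OLS estimator, mentioned above, is precisely what the one-step correction removes.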
As soon as the ME is not in a closed form and no other closed-form estimators can be elicited, the use of a numerical ME as the initial sequence of guess estimators in Le Cam’s one-step procedure is difficult to justify. Indeed, no gain is obtained in terms of computation time, and the use of the MLE is finally equivalent. We therefore propose to use in the onestep function an improvement of the classical Le Cam procedure.
Indeed, it can be shown (see (Kamatani and Uchida 2015) or (Kutoyants and Motrunich 2016)) that, for an initial sequence of guess estimators whose rate of convergence is slower than $\sqrt{n}$ but faster than $n^{1/4}$, the one-step procedure still produces an asymptotically efficient sequence of estimators. This result allows one to use the numerical computation of the MLE on a subsample (of size $n^{\delta}$, $\frac{1}{2} < \delta \leq 1$) as the initial sequence of guess estimators. Then the sequence of Le Cam’s one-step estimators given by (2) is also consistent, asymptotically normal and efficient (in the Fisher sense), with rate $\sqrt{n}$.
The choice of the exponent $\delta$ is set with the parameter delta of the control argument, for instance in the call onestep(data, distr, control = list(delta = 0.7)).
The example of the Pareto II distribution is interesting: it shows the gain in terms of variance (with respect to the initial sequence of guess estimators) and in terms of computation time in comparison with the MLE.
This improved method is also used when the distribution does not belong to the closed-formula family. The initial sequence of maximum likelihood estimators is computed on a subsample with the mledist function of the fitdistrplus package. The score and the Hessian in the Newton step of the Le Cam procedure (2) are numerically computed with the functions grad and hessian of the numDeriv package (see (Gilbert and Varadhan 2019)).
The sixth example is the joint estimation of the shape parameter $\alpha$ and the scale parameter $\sigma$ of the Pareto distribution. In this example, neither the ME (when it exists, for $\alpha$ large enough) nor the MLE can be elicited in a closed form. The initial sequence of guess estimators is therefore the MLE computed on a subsample (denoted MLEsub in the following).
Here again, the sequence of Le Cam’s one-step estimators naturally outperforms the initial sequence of MLE computed on a subsample in terms of asymptotic variance (see Figure 6 and Table 7 for the CvM statistics).
It is also computed faster than the sequence of MLE, as shown in Table 7.
|  | MLE | MLEsub | LCE |
|---|---|---|---|
| Computation time (s) | 1013.00 | 393.42 | 414.08 |
| CvM statistic alpha | 0.25 | 4.72 | 1.44 |
| CvM statistic sigma | 0.43 | 5.54 | 1.42 |
For the generic example, we study the Weibull distribution again and force it to be numerically computed with the function parameter method="numeric" in order not to use the closed-form processing.
We recall that, in the generic procedure, the score and the Hessian in the Newton step of the Le Cam procedure (2) are numerically computed with the functions grad and hessian of the numDeriv package. By default, the initial sequence of maximum likelihood estimators is computed on a subsample of size $n^{\delta}$.
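This generic scheme can be sketched in base R, with stats::optimHess and a small finite-difference gradient standing in for numDeriv's hessian and grad; the subsample exponent delta = 0.7 is an illustrative choice, and this is not the package's exact code.

```r
set.seed(1)
n <- 1e5
x <- rweibull(n, shape = 2, scale = 3)

loglik <- function(p) sum(dweibull(x, shape = p[1], scale = p[2], log = TRUE))

# Step 1: numerical MLE on a subsample of size n^delta
# (log-parameterized so that both parameters stay positive during the search)
delta <- 0.7
sub <- x[seq_len(floor(n^delta))]
nll_sub <- function(p) -sum(dweibull(sub, shape = exp(p[1]), scale = exp(p[2]), log = TRUE))
init <- exp(optim(c(0, 0), nll_sub)$par)

# Step 2: one Newton step (procedure (2)) on the full-sample loglikelihood
grad_num <- function(f, p, h = 1e-5) {   # central finite-difference gradient
  sapply(seq_along(p), function(i) {
    e <- replace(numeric(length(p)), i, h)
    (f(p + e) - f(p - e)) / (2 * h)
  })
}
H <- optimHess(init, loglik)             # numeric Hessian of the loglikelihood
theta_bar <- init - solve(H, grad_num(loglik, init))
theta_bar
```

The expensive optimization only ever sees the subsample; the full sample enters through a single gradient and Hessian evaluation.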
The numerically computed sequence of Le Cam’s one-step estimators (GLCE) reaches asymptotic efficiency in this simulation, as shown in the next Table.
|  | MLE | LCE | GLCE |
|---|---|---|---|
| Computation time (s) | 568.83 | 82.64 | 405.10 |
| CvM statistic tau | 0.13 | 0.06 | 0.25 |
| CvM statistic beta | 0.59 | 0.59 | 1.08 |
We would like to thank Yury Kutoyants and Marius Soltane for fruitful discussions on the topic. We also thank the referees for their valuable comments that improved the paper.
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Brouste, et al., "OneStep : Le Cam's One-step Estimation Procedure", The R Journal, 2021
BibTeX citation
@article{RJ-2021-044, author = {Brouste, Alexandre and Dutang, Christophe and Mieniedou, Darel Noutsa}, title = {OneStep : Le Cam's One-step Estimation Procedure}, journal = {The R Journal}, year = {2021}, note = {https://rjournal.github.io/}, volume = {13}, issue = {1}, issn = {2073-4859}, pages = {383-394} }