In recent years, composite models based on the lognormal distribution have become popular in actuarial sciences and related areas. In this short note, we present a new R package for computing the probability density function, cumulative density function, and quantile function, and for generating random numbers of any composite model based on the lognormal distribution. The use of the package is illustrated using a real data set.
Two-piece composite distributions arise in many areas of the sciences. The first two-piece composite distribution with each piece described by a normal distribution appears to have been used by (Gibbons and Mylroie 1973). Recently, two-piece composite distributions with the first piece described a lognormal distribution (which we refer to as composite lognormal distributions) have proved popular in insurance and related areas. (Cooray and Ananda 2005) introduced such distributions recently, but one of the referees has pointed out that composite lognormal distributions have been known before at least as early as (Barford and Crovella 1998) in areas like modeling workloads of web servers.
(Cooray and Ananda 2005) showed that composite lognormal distributions can give better fits than standard univariate distributions. They illustrate this fact for the Danish fire insurance data by introducing the composite lognormal-Pareto distribution, a distribution obtained by piecing together lognormal and Pareto probability density functions (pdfs). (Scollnik 2007) improved the composite lognormal-Pareto distribution by using mixing weights as coefficients for each pdf replacing the constant weights applied earlier by (Cooray and Ananda 2005). (Scollnik 2007) also employed generalized Pareto distribution in place of Pareto distribution used by (Cooray and Ananda 2005). (Pigeon and Denuit 2011) provided further extensions of composite lognormal distributions by choosing the cutoff point (the point at which the two pdfs are pieced together) as a random variable. The most recent extensions of composite lognormal distributions are provided in (Nadarajah and Bakar 2012).
Composite lognormal distributions have attracted considerable attention in spite of being introduced only in 2005. Some applications in the last three years include: estimation of insurance claim cost distributions (Bolancé et al. 2010), inflation and excess insurance (Fackler 2010), large insurance claims in case reserves (Lindblad 2011), statistical mechanics (Eliazar and Cohen 2012), and modeling of mixed traffic conditions (Dubey et al. 2013).
The aim of this short note is to present a new contributed package CompLognormal for R that computes basic properties for any composite lognormal distribution. The properties considered include the pdf, cumulative distribution function (cdf), quantile function, and random numbers. Explicit expressions for these properties are given in Section “Properties”. Two illustrations of the practical use of the new R package are given in Section “Illustrations”.
Let
The cdf corresponding to ((1)) is
We are interested in a two-piece composite distribution, where the first
piece is specified by the lognormal distribution. So, we take
respectively. We shall refer to the distribution given by ((4)), ((5)), and ((6)) as the composite lognormal distribution. Random numbers from the composite lognormal distribution can be generated as
The pdf of two-piece composite distributions are in general not
continuous or differentiable at the cutoff point
\tag{8} $$ The smoothness conditions in ((8)) are imposed for technical reasons—the respective parametric models then become more tractable.
Maximum likelihood estimation is a common method for estimation. Suppose
we have a random sample
The new package CompLognormal, available from CRAN, computes ((4)), ((5)), and ((6)) for any given composite lognormal distribution. It also generates random numbers from the specified composite lognormal distribution. Table 1 summarizes the functions implemented in the package CompLognormal along with their arguments.
Quantity | Calling sequence |
---|---|
dcomplnorm(x, spec, sigma=1, theta=1, ...) |
|
pcomplnorm(x, spec, sigma=1, theta=1, ...) |
|
qcomplnorm(p, spec, sigma=1, theta=1, ...) |
|
rcomplnorm(n, spec, sigma=1, theta=1, ...) |
The input argument spec
(a character string) specifies the
distribution of the second piece of the composite lognormal
distribution. The distribution should be one that is recognized by R. It
could be one of the distributions implemented in the R base package or
one of the distributions implemented in an R contributed package or one
freshly written by a user. In any case, there should be functions
dspec
, pspec
, qspec
and rspec
, computing the pdf, cdf, qf and
random numbers of the distribution.
Some examples of spec
are: spec = "norm"
, meaning that
spec = "lnorm"
, meaning that
spec = "exp"
, meaning
that
dcomplnorm
, pcomplnorm
, qcomplnorm
and rcomplnorm
can also take
additional arguments in the form of ...
. These arguments could give
inputs (for example, parameter values) for the distribution of the
second piece of the composite lognormal distribution. For example, if
spec = "norm"
then ...
can include mean = 1, sd = 1
to mean that
spec = "lnorm"
then ...
can include meanlog = 1, sdlog = 1
to mean
that spec = "exp"
then ...
can include rate = 1
to mean that
The code for dcomplnorm
, pcomplnorm
and qcomplnorm
currently uses
constructs of the form
do.call(paste("d", spec, sep = ""), list(z}, \code{...))
This technique was necessary as in the R base package distributions are
no data type but rather are given by the respective four constituent
functions <prefix><name>
, where <prefix>
is one of r
, d
, p
,
q
and <name>
is the name of the distribution. A way to circumvent
such constructions would be a data type “distribution”. This has been
available on CRAN for quite a while within the distr
family of
packages, with corresponding S4 classes for distributions. Recently,
another approach based on S5-classes has been pursued in the package
poweRlaw (Gillespie 2013). We
leave these as future work.
We describe two simple illustrations of the new package CompLognormal. The following packages should be loaded (after installing them if necessary) in advance:
library(CompLognormal)
library(actuar)
library(SMPracticals)
library(evd)
library(fitdistrplus)
library(stats4)
The illustration presented here plots the pdf and the cdf of the composite lognormal-loglogistic distribution for varying parameter values.
curve(dcomplnorm(x, "llogis", 0.4, 0.5, shape = 1, scale = 0.8), xlim = c(0, 5),
ylim = c(0, 0.7), xlab = "x", ylab = "f(x)", n = 250, col = "black", lty = 1)
d1 <- dcomplnorm(0.5, "llogis", 0.4, 0.5, shape = 1, scale = 0.8)
segments(0.5, 0, 0.5, d1, col = "black", lty = 2)
curve(dcomplnorm(x, "llogis", 0.3, 0.2, shape = 0.2, scale = 0.5), add = TRUE,
col = "red", lty = 1)
d2 <- dcomplnorm(0.2, "llogis", 0.3, 0.2, shape = 0.2, scale = 0.5)
segments(0.2, 0, 0.2, d2, col = "red", lty = 2)
curve(dcomplnorm(x, "llogis", 0.5, 0.4, shape = 0.5, scale = 0.5), add = TRUE,
col = "blue", lty = 1)
d3 <- dcomplnorm(0.4, "llogis", 0.5, 0.4, shape = 0.5, scale = 0.5)
segments(0.4, 0, 0.4, d3, col = "blue", lty = 2)
legend(1.5, 0.5, legend =
c(expression(paste(sigma==0.4, ",", theta==0.5, ",", shape==1, ",", scale==0.8)),
expression(paste(sigma==0.3, ",", theta==0.2, ",", shape==0.2, ",", scale==0.5)),
expression(paste(sigma==0.5, ",", theta==0.4, ",", shape==0.5, ",", scale==0.8))),
col = c("black", "red", "blue"), lty = 1)
curve(pcomplnorm(x, "llogis", 0.4, 0.5, shape = 1, scale = 0.8), xlim = c(0, 5),
ylim = c(0, 1), xlab = "x", ylab = "F(x)", n = 250, col = "black", lty = 1)
d1 <- pcomplnorm(0.5, "llogis", 0.4, 0.5, shape = 1, scale = 0.8)
segments(0.5, 0, 0.5, d1, col = "black", lty = 2)
curve(pcomplnorm(x, "llogis", 0.3, 0.2, shape = 0.2, scale = 0.5), add = TRUE,
col = "red", lty = 1)
d2 <- pcomplnorm(0.2, "llogis", 0.3, 0.2, shape = 0.2, scale = 0.5)
segments(0.2, 0, 0.2, d2, col = "red", lty = 2)
curve(pcomplnorm(x, "llogis", 0.5, 0.4, shape = 0.5, scale = 0.5), add = TRUE,
col = "blue", lty = 1)
d3 <- pcomplnorm(0.4, "llogis", 0.5, 0.4, shape = 0.5, scale = 0.5)
segments(0.4, 0, 0.4, d3, col = "blue", lty = 2)
Figure 1 shows the pdfs and cdfs of the composite
lognormal-loglogistic distribution. The parameters, shape
and scale
, are the shape and scale parameters of
the loglogistic distribution, as defined in the R base package. The
vertical lines in the figures correspond to the values for
The illustration presented here fits the composite lognormal-Fréchet distribution to the Danish fire insurance data by the method of maximum likelihood, see ((9)). The data were obtained from the R package SMPracticals (Davison 2012).
x <- danish[1:2492]
nlm(function(p) {
-sum(dcomplnorm(x, "frechet", sigma = exp(p[1]),
theta = exp(p[2]), scale = exp(p[3]), shape = exp(p[4]), log = TRUE))
}, p = c(0, 0, 0, 0))
The output will be
$minimum
[1] 3859.293
$estimate
[1] -1.7178497 0.1176224 -0.2872701 0.4130180
$gradient
[1] -0.0008301586 0.0031436684 0.0001900844 -0.0004574758
$code
[1] 1
$iterations
[1] 23
The number hessian = TRUE
to the nlm
command.
We have chosen the composite lognormal-Fréchet distribution for
simplicity of illustration. The purpose of this illustration was not to
find the “best fitting” model for the Danish data. Of course, many other
composite lognormal distributions can be expected to give better fits
than the composite lognormal-Fréchet distribution. (Nadarajah and Bakar 2012) modeled the
Danish data using a large class of composite lognormal distributions,
including the composite lognormal-Burr, the composite lognormal-inverse
Burr, the composite lognormal-
In Illustration 2, we used nlm
for optimization of the likelihood
function. R has many other model fitting routines like fitdistr
from
the package MASS (Venables and Ripley 2002),
fitdist
from the package
fitdistrplus
(Delignette-Muller et al. 2013), mle
from the package stats4, and MLEstimator
,
MDEstimator
from the package
distrMod (Kohl and Ruckdeschel 2013). For
instance, fitdist
can be used to estimate the parameters of the
composite lognormal-Fréchet distribution as follows:
dclnormf <- function(x, logsigma, logtheta, logscale, logshape) {
dcomplnorm(x, spec = "frechet", sigma = exp(logsigma), theta = exp(logtheta),
scale = exp(logscale), shape = exp(logshape))
}
pclnormf <- function(q, logsigma, logtheta, logscale, logshape) {
pcomplnorm(q, spec = "frechet", sigma = exp(logsigma), theta = exp(logtheta),
scale = exp(logscale), shape = exp(logshape))
}
qclnormf <- function(p, logsigma, logtheta, logscale, logshape) {
qcomplnorm(p, spec = "frechet", sigma = exp(logsigma), theta = exp(logtheta),
scale = exp(logscale), shape = exp(logshape))
}
fitdist(danish[1:2492], "clnormf", start = list(logsigma = -1.718,
logtheta = 0.118, logscale = -0.287, logshape = 0.413))
The output will be
Fitting of the distribution ' clnormf ' by maximum likelihood
Parameters:
estimate Std. Error
logsigma -1.7180280 0.05737326
logtheta 0.1176365 0.02413085
logscale -0.2877565 0.16564834
logshape 0.4130318 0.03704017
Also mle
can be used to estimate the parameters of the composite
lognormal-Fréchet distribution as follows:
nllh <- function(p1, p2, p3, p4) {
-sum(dcomplnorm(danish[1:2492], spec = "frechet",
sigma = exp(p1), theta = exp(p2), scale = exp(p3), shape = exp(p4), log = TRUE))
}
mle(nllh, start = list(p1 = -1.718, p2 = 0.118, p3 = -0.287, p4 = 0.413))
The output will be
Call:
mle(minuslogl = nllh, start = list(p1 = -1.718, p2 = 0.118, p3 = -0.287,
p4 = 0.413))
Coefficients:
p1 p2 p3 p4
-1.7178735 0.1176081 -0.2870383 0.4130586
We have developed a new R package for computing quantities of interest for any composite lognormal distribution. The computed quantities include the pdf, cdf, qf, and random numbers. Although the package is specifically designed for composite lognormal distributions, it can be easily altered for any other composite distribution.
The authors would like to thank the Editor and the two referees for careful reading and for their comments which greatly improved the paper.
CompLognormal, CRAN, poweRlaw, SMPracticals, MASS, fitdistrplus, distrMod
ActuarialScience, Distributions, Econometrics, Environmetrics, MixedModels, NumericalMathematics, Psychometrics, Robust, Survival, TeachingStatistics
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Nadarajah & Bakar, "CompLognormal: An R Package for Composite Lognormal Distributions", The R Journal, 2013
BibTeX citation
@article{RJ-2013-030, author = {Nadarajah, S. and Bakar, S. A. A.}, title = {CompLognormal: An R Package for Composite Lognormal Distributions}, journal = {The R Journal}, year = {2013}, note = {https://rjournal.github.io/}, volume = {5}, issue = {2}, issn = {2073-4859}, pages = {97-103} }