This article describes the R package RNGforGPD, which is designed for the generation of univariate and multivariate generalized Poisson data. Some illustrative examples are given, the utility and functionality of the package are demonstrated; and its performance is assessed via simulations that are devised around both artificial and real data.
It is well known that the variance of a Poisson variable equals to its
mean. However, over- and under- dispersion in count data could make this
mean-variance equality assumption unrealistically simplistic. This
situation is often caused by the heterogeneity in the population, while
we implicitly assume that the weights assigned to each event are equal
when we employ the regular Poisson distribution. This problem can be
addressed by modeling count data using the generalized Poisson
distribution (GPD), which enables us to assign varying weights to events
(Satterthwaite 1942). The GPD includes a dispersion parameter
As discussed in the book of (Consul 1989), the GPD has two parameters, rate and dispersion. It can be regarded as a mixture of Poisson distributions according to (Joe and Zhu 2005).
The GPD can be described mathematically as follows: Let X be a discrete
random variable defined over non-negative integers, and let
and zero otherwise, where
Besides, if we regard the weights of each event in a time period as independent and formulate the summation of the weights as a characteristic function, the distribution function of the sum of weights has all the properties of the GPD after the application of the Fourier transformation (Satterthwaite 1942).
According to (Vernic 2000), the pmf for the multivariate GPD can be
derived using the multivariate reduction method. In her derivation, an
The applications of the GPD in science and business vary in a wide range that spans life insurance, physics, genetic biology, and public health. (Satterthwaite 1942) mentioned a case where insurance companies model the average financial cost per claim with the GPD, which allows the weights (costs of different claims) to be heterogeneous. (Vernic 1997) modeled the joint distribution of the yearly frequencies of hurricanes affecting the first and the third zones of the north Atlantic coastal states in the USA as a bivariate GPD.
(Consul and Famoye 2006) gave an example regarding the induction and
restitution process of chromosomes. Chromosomes can be damaged in the
induction process, and repaired in the restitution process. The
dispersion parameter
Although the importance of generating the multivariate generalized Poisson data is evident, there has not been a comprehensive computational tool specifically targeted for this particular distribution. (Demirtas 2017) compared a few random variate generation techniques for univariate GPD, and mentioned the potential for the generation of multivariate GPD variates via correlation mapping procedure in a similar fashion to the method of (Yahav and Shmueli 2012), which is concerned with correlated regular Poisson data generation. The R package RNGforGPD (Li et al. 2020) is developed to provide the accommodating tools for the expanded versions of the methods that appear in (Demirtas 2017) and (Yahav and Shmueli 2012), where the augmentation is mostly about allowing over- and under-dispersion for count data in a correlated setting.
The rest of the article is organized as follows: In Section 2, we outline the algorithms for the generation of univariate and multivariate generalized Poisson data. In Section 3, we describe the technical details of the R package RNGforGPD. In Section 4, we present the results of simulation studies that are designed around both artificial and real data. Finally, we conclude the paper with a brief discussion in Section 5.
First, we describe the prerequisites for the generation of GPD data. Then, we discuss the five algorithms for generating univariate GPD data, and the algorithm for generating multivariate GPD data, which is based on an adaptation of (Yahav and Shmueli 2012)’s method for generating multivariate regular Poisson data. In addition, we provide guidance on how to choose an appropriate data generation function for generating univariate GPD data at the end of this section.
In general, an appropriate choice from the five algorithms in generating
univariate GPD data depends on the values of the rate (
Algorithms | Steps | Notes |
Inversion |
|
This algorithm is a
general purpose
univariate random
number generation
method that depends
on the recursive
relationship between
consecutive GPD
probabilities:
for |
Branching |
|
This algorithm is a
distribution
specific algorithm
and it only works
for positive
|
Normal Approximation |
|
This algorithm uses the first two moments and a continuity correction in generating univariate GPD data. |
Build-Up |
|
The cumulative distribution function (cdf) is built up by the recursive computation of the mass probabilities (Kemp 1981). |
Chop-Down |
|
The generated uniform variate is decreased by an amount equal to the cdf (Kemp 1981). |
(Yahav and Shmueli 2012) developed an algorithm for generating
multivariate Poisson data using an improved version of the NORTA method
(NORmal To Anything). The NORTA method can be used for generating
multivariate regular Poisson data by simulating a
In our package, we keep the corrections of correlation matrix and adapt
the calculations of
(1) Compute the intermediate correlation matrix
(2) Generate a
(3) For each value
(4) For each
where
The resulting vector
In designing our package, we give our users the freedom to choose their preferred methods for generating univariate GPD data. However, users who are not familiar with the mechanisms of generating univariate GPD might not know how to choose the best method for their simulation scenarios. (Demirtas 2017) evaluated the relative advantages and disadvantages of the five methods for generating univariate GPD data in terms of unbiasedness, variability, and speed in simulation studies. We find the results he found are instructive on choosing the appropriate methods for package users, so we summarize his findings as follows:
The RNGforGPD package provides functions for generating univariate and multivariate data that follow the generalized Poisson distribution. The package is available via the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=RNGforGPD. Once the package has been appropriately installed on a local machine, the results presented in this paper can be reproduced. The R code used in this manuscript can be accessed at https://demirtas.people.uic.edu/RNGforGPD_paper_LDC_R_Journal.R.
This package includes two data generating functions: GenUniGpois
and
GenMVGpois
. The data generating functions are supported by five core
functions: CmatStarGpois
, ComputeCorrGpois
, Corr-
NNGpois
, QuantileGpois
, and ValidCorrGpois
, that provide essential
support to the two data generating functions. Throughout this article,
we use the following input arguments as given in Table 2:
Input Argument | Description |
---|---|
sample.size |
Number of rows to be generated in multivariate generalized Poisson data. |
no.gpois |
Dimension of the multivariate generalized Poisson distribution. |
n |
Number of data points to be generated in univariate generalized Poisson data. |
p |
Percentile of the generalized Poisson distribution. |
cmat.star |
Intermediate correlation matrix to be used in generating multivariate generalized Poisson data. |
corMat |
A positive definite target correlation matrix whose entries are within the valid limits. |
theta.vec |
A vector of rate parameters in the multivariate generalized Poisson distribution. |
lambda.vec |
A vector of dispersion parameters in the multivariate generalized Poisson distribution. |
theta |
Rate parameter in the univariate generalized Poisson distribution. |
lambda |
Dispersion parameter in the univariate generalized Poisson distribution. |
method |
Method to be used in generating univariate generalized Poisson data. |
details |
Boolean parameter for users to decide whether to display the specified and empirical values of parameters. |
verbose |
Boolean parameter for users to decide whether to display traces or not. |
Table 3 summarizes each function in the RNGforGPD package:
Function type | Function name | Description |
---|---|---|
GenUniGpois |
Generates univariate GPD variables | |
GenMVGpois |
Generates univariate GPD variables | |
CmatStarGpois |
Computes the intermediate correlation matrix | |
ComputeCorrGpois |
Computes correlation bounds | |
CorrNNGpois |
Adjusts the target correlation | |
QuantileGpois |
Computes the quantile for GPD | |
ValidCorrGpois |
Validates the correlation matrix |
Their functionality, in the context of generalized Poisson data generation, is described in the next several subsections:
GenUniGpois
GenUniGpois
generates univariate data that follow the GPD with
pre-specified rate and dispersion parameters using appropriate methods
according to different values of theta
, lambda
, n
, details
, and
method
as input arguments. A warning will be displayed if the method
chosen by user is inappropriate considering the characteristics of each
method. For example, the Normal Approximation method does not work
well for
GenMVGpois
GenMVGpois
, also referred to as the "engine" function in our
package, generates multivariate GPD data. It generates multivariate data
that follow the GPD with pre-specified rate parameter vector, dispersion
parameter vector, and an intermediate correlation matrix. Its
functionality depends on all the other functions in the package (except
for GenUniGpois
). Besides, it requires the rmvnorm
function from the
mvtnorm (Genz et al. 2020)
package, the is.positive.definite
function from the
corpcor (Schafer et al. 2017)
package, and the nearPD
function from the
Matrix (Bates and Maechler 2019) package.
It takes sample.size
, no.gpois
, cmat.star
, theta.vec
,
lambda.vec
, and details
as input arguments. The cmat.star
argument
is the intermediate correlation matrix, and is later used to obtain the
target correlation matrix using the inverse cdf transformation method in
GenMVGpois
. This argument needs to be executed using the
CmatStarGpois
function before it can be used by the GenMVGpois
function.
Generation of the multivariate GPD data is more complex than that of the univariate GPD data due to the restrictions on the correlation matrix. These requirements can also be verified by the core functions as explained below.
CmatStarGpois
and CorrNNGpois
CmatStarGpois
function computes an intermediate correlation matrix,
that will be used to obtain the target correlation matrix, using the
inverse cdf transformation method in GenMVGpois
. Because the target
correlation matrix has to be positive definite and its entries must be
within the correlation bounds, therefore CmatStarGpois
requires the
functionality of both ValidCorrGpois
and CorrNNGpois
.
ValidCorrGpois
checks the validity of the values of pairwise
correlations including positive definiteness, symmetry, and correctness
of the dimensions. CorrNNGpois
adjusts the realized correlation to the
target correlation bounds.
The following example shows the use of CorrNNGpois
for adjusting the
realized correlation to the targeted correlation bounds, and
CmatStarGpois
for computing intermediate values of pairwise
correlations between three GPD variates.
set.seed(3406)
CorrNNGpois(c(0.1, 10), c(0.1, 0.2), 0.5)
#> [1] 0.8016437
lambda.vec <- c(-0.2, 0.2, -0.3)
theta.vec <- c(1, 3, 4)
M <- c(0.352, 0.265, 0.342)
N <- diag(3)
N[lower.tri(N)] <- M
TV <- N + t(N)
diag(TV) <- 1
cstar <- CmatStarGpois(TV, theta.vec, lambda.vec, verbose = FALSE)
cstar
#> [,1] [,2] [,3]
#> [1,] 1.0000000 0.3943785 0.2946171
#> [2,] 0.3943785 1.0000000 0.3601862
#> [3,] 0.2946171 0.3601862 1.0000000
If the intermediate correlation matrix is not positive definite, the nearest positive definite matrix will be used.
QuantileGpois
QuantileGpois
function computes the quantile for the generalized
Poisson distribution for specified values of percentile, QuantileGpois
for computing the quantile of a generalized
Poisson distribution given
QuantileGpois(0.98, 1, -0.2, details = TRUE)
#> x = 0, P(X = x) = 0.3678794, P(X <= x) = 0.3678794
#> x = 1, P(X = x) = 0.449329, P(X <= x) = 0.8172084
#> x = 2, P(X = x) = 0.1646435, P(X <= x) = 0.9818519
#> When lambda is negative, we need to account for truncation error
#> The adjusted CDF are: 0.3746792 0.8323133 1
#> [1] 2
The corresponding cdf are adjusted to account for truncation error when
ComputeCorrGpois
and ValidCorrGpois
Theoretically, correlation bounds (both Pearson and Spearman
correlations) for pairwise random variables are between -1 and 1.
However, correlation bounds in practice are often narrower than their
theoretical limits due to the restrictions imposed by the marginal
distributions. Given vectors of ComputeCorrGpois
computes the pairwise correlation bounds between any
pair of generalized Poisson variables using the Generate, Sort, and
Correlate (GSC) algorithm described in (Demirtas and Hedeker 2011). It is
also an integral part of ValidCorrGpois
function that examines whether
values of pairwise correlation matrix fall within the limits imposed by
the marginal distributions. Besides, ValidCorrGpois
checks positive
definiteness, symmetry, correctness of the dimensions of the input
correlation matrix. The following example demonstrates the use of both
functions:
set.seed(86634)
ComputeCorrGpois(c(3, 2, 5,4), c(0.3, 0.2, 0.5, 0.6), verbose = FALSE)
#> $min
#> [,1] [,2] [,3] [,4]
#> [1,] NA -0.8441959 -0.8523301 -0.8040863
#> [2,] -0.8441959 NA -0.8364747 -0.7861681
#> [3,] -0.8523301 -0.8364747 NA -0.7966635
#> [4,] -0.8040863 -0.7861681 -0.7966635 NA
#> $max
#> [,1] [,2] [,3] [,4]
#> [1,] NA 0.9838969 0.9937024 0.9869316
#> [2,] 0.9838969 NA 0.9872343 0.9819031
#> [3,] 0.9937024 0.9872343 NA 0.9941324
#> [4,] 0.9869316 0.9819031 0.9941324 NA
ValidCorrGpois(matrix(c(1, 0.9, 0.9, 1), byrow = TRUE, nrow = 2),
c(0.5, 0.5), c(0.1, 0.105), verbose = FALSE)
#>[1] TRUE
The following diagram shows the dependencies of the functions in the RNGforGPD package, the arrows suggest that the function on the tail of each arrow depends on the function on its head:
In this section, we present three examples that hinge upon one
artificial and two real data-based scenarios. We demonstrate the
functionality of the package through simulating GPD data based on the
parameter estimates (rate parameter
The most intuitive way to check the legitimacy of the simulation results
is to compare the estimated rate and dispersion parameters to the
specified quantities. Moreover, to further verify the simulation results
for multivariate GPD, we calculate the first four moments via simulated
data and compare them with the theoretical (specified) moments since
most real life distributions are typically characterized by their first
four moments. The expressions for the mean (
As can be seen from the variance expression,
In this example, we generate a four-dimensional Poisson data of size 2,000 based on 1,000 replications.
One of its marginal random variables is distributed as a regular Poisson distribution, and the other three follow the GPD with different rate and dispersion parameters. The specifications on the rate and dispersion parameters of the four Poisson distributions are listed below:
As we are generating multivariate GPD data, we need to specify a positive definite correlation matrix whose entries are within the feasible lower and upper bounds. The specified correlations and the empirical results that are obtained through averaging the correlation matrix across 1,000 replications are shown below:
Specified | |||||
---|---|---|---|---|---|
1.0000 | 0.1521 | 0.2652 | 0.2428 | ||
0.1521 | 1.0000 | -0.6475 | 0.1645 | ||
0.2652 | -0.6475 | 1.0000 | -0.2522 | ||
0.2428 | 0.1645 | -0.2522 | 1.0000 | ||
Empirical | |||||
1.0000 | 0.1520 | 0.2676 | 0.2445 | ||
0.1520 | 1.0000 | -0.6499 | 0.1654 | ||
0.2676 | -0.6499 | 1.0000 | -0.2517 | ||
0.2445 | 0.1654 | -0.2517 | 1.0000 |
Below is the table of empirical
Parameter | Comparison | Variable 1 | Variable 2 | Variable 3 | Variable 4 |
---|---|---|---|---|---|
Specified | 2.0000 | 3.0000 | 5.0000 | 55.0000 | |
Rate ( |
Empirical | 1.9994 | 3.0041 | 4.9930 | 55.1155 |
Specified | 0.0000 | 0.4000 | 0.5000 | -0.2500 | |
Dispersion ( |
Empirical | 0.0004 | 0.3996 | 0.5004 | -0.2527 |
We can see that the empirical GenMVGpois
function are very
close to the specified true parameters. The table below compares their
first four moments:
Moments | Comparison | Variable 1 | Variable 2 | Variable 3 | Variable 4 | |
Specified | 2.0000 | 5.0000 | 10.0000 | 44.0000 | ||
Mean |
Empirical | 2.0001 | 5.0036 | 9.9949 | 43.9975 | |
Specified | 2.0000 | 13.8889 | 40.0000 | 28.1600 | ||
Variance |
Empirical | 2.0016 | 13.8809 | 40.0505 | 28.0374 | |
Specified | 0.7071 | 1.3416 | 1.2649 | 0.0603 | ||
Skewness |
Empirical | 0.7079 | 1.3397 | 1.2669 | 0.0595 | |
Specified | 0.5000 | 2.8667 | 2.6000 | -0.0091 | ||
Kurtosis |
Empirical | 0.5018 | 2.8581 | 2.6085 | -0.0092 |
Moments | Variable 1 | Variable 2 | Variable 3 | Variable 4 |
---|---|---|---|---|
Mean |
0.01% | 0.07% | 0.05% | 0.01% |
Variance |
0.08% | 0.06% | 0.13% | 0.44% |
Skewness |
0.11% | 0.14% | 0.16% | 1.29% |
Kurtosis |
0.36% | 0.30% | 0.33% | 1.72% |
Tables 6 and 7 show that the empirical first four moments of the generated GPD data are very close to the specified ones, indicating the algorithm of generating the data properly captures the true parameter values with negligible deviations.
To further illustrate the precision of the algorithm, we use the entries (1, 2), (1, 3), (2, 3), (3, 4) of the empirical correlation matrices to generate the plot as shown in Figure 1 using the R package ggplot2 (Wickham 2016). The dashed lines represent the specified correlation values, the figure shows that the empirical correlation values across 1,000 iterations fluctuate around the specified ones within reasonably small ranges.
The epilepsy data (Thall and Vail 1990) available from the R package robustbase (Maechler et al. 2020) were collected from a randomized clinical trial investigating the treatment effect of an anti-epileptic drug called Progabide, which was originally conducted by (Leppik 1985). In this clinical trial study, 59 patients suffering from simple or complex partial seizures were randomized to groups receiving either the anti-epileptic drug Progabide or a placebo in addition to standard chemotherapy. The baseline number of seizures occurred was measured for each patient followed by four successive post-randomization clinic visits with the number of seizures occurring over the previous two weeks reported. Although all patients were crossed over to the other treatment, we are only interested in modeling the number of seizures at baseline and the four pre-crossover follow-up responses for each patient as a realization of GPD using our package. The number of seizures that occur on a patient follows a Poisson distribution by assuming that each patient is independent of each other, and we regard them as fixed unit "intervals". In this scenario, the GPD is more appropriate than the ordinary Poisson distribution in modeling the count data since the patients generally do not exhibit the characteristics of homogeneity in real life.
The data set has a relatively small sample size of 59, and we set the
number of replications as 1,000. First, we simulate univariate GPD data
based on the baseline seizure counts measured. The true rate
Parameters | Specified | Branching | Inversion | Build-Up | Chop-Down | Normal |
---|---|---|---|---|---|---|
Rate |
6.4904 | 6.7701 | 6.7548 | 6.8478 | 6.8594 | 7.9017 |
Dispersion |
0.7921 | 0.7817 | 0.7821 | 0.7773 | 0.7770 | 0.7597 |
The first column lists the true values of rate and dispersion parameters calculated by the method of moments. Overall, the empirical rate parameters overestimate the true rate parameters, and the empirical dispersion parameters underestimate the true dispersion parameters. We note that the Normal Approximation method performs badly, perhaps due to the limitation of its approximation to the Poisson distribution with a small rate parameter.
We can infer from the above simulation results that
Since under a small sample size, the sample variance
To alleviate this over/under estimation problem and demonstrate the
performance of our package, we use the true
Parameters | Specified | Branching | Inversion | Build-Up | Chop-Down | Normal |
---|---|---|---|---|---|---|
Rate |
6.4904 | 6.4973 | 6.4953 | 6.6879 | 6.7043 | 7.8181 |
Dispersion |
0.7921 | 0.7919 | 0.7920 | 0.7845 | 0.7841 | 0.7619 |
We note that the Build-up and Chop-Down methods do not work as well as the Branching and Inversion methods, which capture the true rate and dispersion parameters of the data. The Normal Approximation method still does not perform well in the large sample scenario.
(Deb and Trivedi 1997) conducted research on 4,406 individuals, aged 66 and
over, who are covered by Medicare (a public insurance program). The data
are available as DebTrivedi.rda
in the R package
MixAll (Iovleff 2019). For
this data set, we consider the multivariate generation of two mutually
exclusive measures of utilization variables, one pathologic variable,
and one demographic variable: visits to a physician in an office setting
(OFP), visits to a physician in a hospital outpatient setting (OPP)
adjusted by adding 1 to avoid computational complexities, the number of
chronic diseases and conditions (NUMCHRON), and the years of education
received (SCHOOL). The simulation results based on 1,000 replications
are shown below:
Parameter | Comparison | OFP | SCHOOL | OPP + 1 | NUMCHRON |
---|---|---|---|---|---|
Specified | 2.0529 | 8.8291 | 0.6342 | 1.4188 | |
Rate ( |
Empirical | 2.0545 | 8.8345 | 0.6357 | 1.4191 |
Specified | 0.6445 | 0.1420 | 0.6378 | 0.0799 | |
Dispersion ( |
Empirical | 0.6442 | 0.1416 | 0.6373 | 0.0801 |
Specified | |||||
---|---|---|---|---|---|
1.0000 | 0.0644 | 0.0681 | 0.2619 | ||
0.0644 | 1.0000 | -0.0122 | -0.0658 | ||
0.0681 | -0.0122 | 1.0000 | 0.1008 | ||
0.2619 | -0.0658 | 0.1008 | 1.0000 | ||
Empirical | |||||
1.0000 | 0.0646 | 0.0743 | 0.2688 | ||
0.0646 | 1.0000 | -0.0120 | -0.0660 | ||
0.0743 | -0.0120 | 1.0000 | 0.1066 | ||
0.2688 | -0.0660 | 0.1066 | 1.0000 |
Moments | Comparison | OFP | SCHOOL | OPP + 1 | NUMCHRON |
Specified | 5.7744 | 10.2903 | 1.7508 | 1.5420 | |
Mean |
Empirical | 5.7745 | 10.2922 | 1.7528 | 1.5427 |
Specified | 45.6871 | 13.9781 | 13.3426 | 1.8215 | |
Variance |
Empirical | 45.6190 | 13.9688 | 13.3244 | 1.8231 |
Specified | 2.6794 | 0.4665 | 4.7475 | 1.0152 | |
Skewness |
Empirical | 2.6767 | 0.4660 | 4.7371 | 1.0155 |
Specified | 11.8495 | 0.2979 | 37.1841 | 1.2852 | |
Kurtosis |
Empirical | 11.8255 | 0.2972 | 37.0193 | 1.2865 |
Moments | OFP | SCHOOL | OPP + 1 | NUMCHRON | |
---|---|---|---|---|---|
Mean |
0.00% | 0.02% | 0.11% | 0.05% | |
Variance |
0.15% | 0.07% | 0.14% | 0.09% | |
Skewness |
0.10% | 0.11% | 0.22% | 0.03% | |
Kurtosis |
0.20% | 0.26% | 0.44% | 0.09% |
Graphical tools further verify the simulation results: In Figure 2, we see that a few randomly selected empirical correlations (entries (1, 2), (1, 4), (2, 4)) across 1,000 replications fluctuate around the specified quantities within reasonably small ranges, suggesting that the algorithm is consistent and efficient in capturing the desired values.
We evaluate the performance of the package through a comparison between
the theoretical and empirical first four moments and parameters, as we
reported earlier in the manuscript. As correctly indicated by a
reviewer, an additional assessment that involves identifying the
proximity between theoretical and empirical pmf’s may provide further
support for the software tool under consideration. In this spirit, we
show two plots (Figure 3 and Figure 4) that
visually validate the feasibility of our multivariate GPD generating
algorithm. In Figure 3, we compare the theoretical and
empirical marginal probabilities using each of the four random variables
specified in the artificial data example. The green and red dots
represent the theoretical and empirical probabilities, respectively. The
theoretical probabilities are calculated based on the pmf of each
univariate GPD when its rate and dispersion parameters are specified.
The empirical probabilities are marginally extracted from the
multivariate data simulated in the artificial data example. We observe
that the empirical and theoretical probabilities align closely, which
suggests that our algorithm can accurately simulate data that follow the
specified GPD marginally. In Figure 4, we show a comparison
plot between specified and empirical correlations in an attempt to
examine if our multivariate GPD generating algorithm can reasonably
simulate the specified correlations. We specify a bivariate GPD using
the variable 2 and variable 4, as previously defined in the artificial
data example. The specified correlations range from -0.89 to 0.86 with
an increasing step size of 0.05, which are within the lower and upper
correlation bounds, as verified using the ComputeCorrGpois
function.
For each specified correlation, we generate a sample of 100 observations
with 50 replications, and calculate the average empirical correlation.
The plot shows that empirical correlations closely approximate the
specified correlations for all scenarios.
Throughout this paper, we have demonstrated the functionality and performance of the RNGforGPD package. This package is an accurate and computationally efficient tool in random generation of both univariate and multivariate data that follow the generalized Poisson distribution. Overall, the performance of the algorithm is decent. Deviations between the specified and average empirical parameter values are within tolerable limits in both the univariate and multivariate data generation cases. Our simulation studies suggest that this package successfully implements the algorithms in both artificial and real life scenarios as long as there are no specification errors and the correlations are within the feasible limits (Demirtas and Hedeker 2011). In situations such complications occur, appropriate warning or error messages will be generated to alert the user. The simulation results we present can be regarded as a compelling evidence for capturing the characteristics of both rate and dispersion parameters (naturally the first four moments that are functions of these quantities) as well as the true association structure in the multivariate cases with only minor differences. In summary, the RNGforGPD package provides a valuable tool for investigators who need a generalized Poisson data generation mechanism in their research.
RNGforGPD, mvtnorm, corpcor, Matrix, ggplot2, robustbase, MixAll
Cluster, Distributions, Econometrics, Finance, NumericalMathematics, Phylogenetics, Robust, Spatial, TeachingStatistics
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Li, et al., "RNGforGPD: An R Package for Generation of Univariate and Multivariate Generalized Poisson Data", The R Journal, 2021
BibTeX citation
@article{RJ-2021-007, author = {Li, Hesen and Demirtas, Dr. Hakan and Chen, Ruizhe}, title = {RNGforGPD: An R Package for Generation of Univariate and Multivariate Generalized Poisson Data}, journal = {The R Journal}, year = {2021}, note = {https://rjournal.github.io/}, volume = {12}, issue = {2}, issn = {2073-4859}, pages = {173-188} }