Simulating Correlated Binary and Multinomial Responses under Marginal Model Speciﬁcation: The SimCorMultRes Package

We developed the R package SimCorMultRes to facilitate simulation of correlated categorical (binary and multinomial) responses under a desired marginal model speciﬁcation. The simulated correlated categorical responses are obtained by applying threshold approaches to correlated continuous responses of underlying regression models and the dependence structure is parametrized in terms of the correlation matrix of the latent continuous responses. This article provides an elaborate introduction to the SimCorMultRes package demonstrating its design and usage via three examples. The package can be obtained via CRAN.


Introduction
Fitting marginal models with correlated binary or multinomial responses is required in many applications in which the responses are assumed to be correlated.The obvious instance of such studies is longitudinal studies (Diggle et al., 2002) where the categorical responses for each subject are collected across time points and form a cluster.For each cluster, the associated covariates are also recorded as they might influence the true marginal probabilities.Ordinary regression models designed for independent responses might not lead to consistent estimators of the marginal regression parameters or of their standard errors.For this reason, many authors have developed and proposed procedures for estimating the regression parameters of a marginal model with categorical responses that are robust to misspecification of the dependence structure, including maximum likelihood methods (Fitzmaurice and Laird, 1993;Glonek and McCullagh, 1995), copula approaches (Masarotto and Varin, 2012), quasi-least squares approaches (Shults and Chaganty, 1998), generalized quasi-likelihood methods (Sutradhar and Das, 1999;Sutradhar, 2003) and generalized estimating equations (GEE) approaches (Lipsitz et al., 1991;Chaganty and Joe, 2004;Touloumis et al., 2013).Although the asymptotic properties of these methods are well-established, the evaluation of their performance in finite samples under misspecification of the correlation structure relies on simulations.The crucial step of these empirical studies is to simulate correlated categorical responses that satisfy a desired marginal model and dependence structure specification.
Motivated by this, we present the R package SimCorMultRes (Touloumis, 2016) which makes it easy to simulate correlated categorical responses under a given marginal model and dependence structure configuration.The package implements marginal models for correlated binary responses (two response categories) as well as for correlated multinomial responses (three or more response categories) while taking into account the nature of the response categories (ordinal or nominal).In summary, the correlated binary/multinomial responses are obtained as realizations of an underlying continuum.This means that latent regression models with correlated continuous responses are utilized so as to generate the correlated categorical responses that satisfy the desired marginal model specification.The categorical responses are obtained by applying threshold approaches to the correlated continuous responses.In order to avoid theoretical pitfalls outlined in the next paragraph, the desired dependence structure is expressed in terms of the correlation matrix of the latent responses.To the best of our knowledge, SimCorMultRes is the first package in R that allows direct simulation of correlated categorical responses under a marginal model specification with categorical and/or continuous covariates.
To fully appreciate the features of SimCorMultRes, we briefly compare it with two R packages: i) GenOrd (Barbiero and Ferrari, 2015), that implements the methods presented by Ferrari and Barbiero (2012) and its features being discussed in greater detail in Barbiero and Ferrari (in press), and ii) Mul-tiOrd (Amatya and Demirtas, 2016), that is described in Amatya and Demirtas (2015) and relies on the simulation techniques proposed by Demirtas (2006).These packages are designed to simulate random vectors of correlated binary or ordinal responses subject to fixed but common marginal probabilities across all subjects and a predefined correlation matrix for the correlated categorical responses.Therefore, unlike SimCorMultRes, it is not straightforward to utilize GenOrd or MultiOrd for simulating categorical responses conditional on a regression model specification for the marginal probabilities, especially when the marginal probabilities vary across subjects.In addition, SimCorMultRes has the The R Journal Vol.8/2, December 2016 ISSN 2073-4859 unique feature (to the best of our knowledge) to simulate correlated nominal responses.Another difference between SimCorMultRes and the R packages GenOrd and MultiOrd is that the former requires the association among the categorical responses to be directly expressed via their correlation matrix and that the joint specification of the marginal probabilities and of the correlation matrix leads to a valid joint distribution for the correlated categorical responses.A necessary condition for this is that the so-called Frétchet-Hoeffding bounds are satisfied, which can be verified by employing the method of Demirtas and Hedeker (2011).As noted by one of the reviewers, both GenOrd and Multi-Ord have built-in mechanisms to check the restrictions imposed by the Frétchet-Hoeffding bounds.
Unfortunately, even if these restrictions are met, it is still theoretically possible that a legitimate joint distribution does not exist for the correlated categorical responses (Bergsma and Rudas, 2002).To circumvent this difficulty, the methodology implemented in SimCorMultRes always defines the joint distribution of the correlated categorical responses in terms of the joint distribution of correlated latent random variables and thus, it allows the user to generate correlated categorical responses under any configuration of the marginal probabilities provided that the user-defined correlation matrix of the latent continuous responses is positive definite, a condition that can be more easily verified.
The remainder of this paper is organized as follows.First we present the theoretical background of the threshold approaches implemented in SimCorMultRes.In particular, we introduce the general two-stage algorithm for simulating correlated categorical responses, focusing on the threshold approaches that give rise to the marginal models with correlated categorical responses and on the modified version of the NORmal To Anything (NORTA) method (Cario and Nelson, 1997), the default simulation method of correlated latent random variables in SimCorMultRes.Next, we describe the core and utility functions of the package.Then, we demonstrate the use of SimCorMultRes by considering the problems of evaluating two estimation methods for marginal models with correlated nominal multinomial responses, of assessing the quality of an approximation that links the uniform local odds ratios structure with the correlation parameter of an underlying bivariate normal distribution, and of simulating correlated categorical random variables under no marginal model specification.Finally, we summarize the features of SimCorMultRes and discuss future extensions.

Theoretical background
In this section, we introduce the threshold approaches that give rise to marginal models with correlated binary, ordinal or nominal responses.Since the thresholds are applied to correlated continuous responses, simulation of correlated continuous responses is required.This step can be performed in various ways, e ġ˙, directly from an appropriate multivariate distribution, by utilizing distributional properties about the sum or the difference of random vectors or by employing copula approaches.Herein we discuss a simple and straightforward simulation method that is based on the NORTA method, and we present a general algorithm that combines the threshold approaches with the modified NORTA method, enabling us to generate correlated categorical responses subject to a marginal model specification in a unified manner.However, we underline that the use of the NORTA method is optional in the general algorithm and that it can be replaced with another simulation method/technique as long as the distributional restrictions regarding the correlated continuous variables that are imposed by the thresholds are met.
For notational ease, adopt a longitudinal set-up for generating the correlated binary or multinomial variables.Let Y it be the random variable of subject i (i = 1, . . ., N) at time t (t = 1, . . ., T) and let x it denote the associated covariates vector.To be consistent with the notation in the majority of the literature, let Y it ∈ {0, 1} when there are two response categories and let Y it ∈ {1, 2, . . ., J ≥ 3} for at least three categories.

Binary responses
Suppose the aim is to simulate correlated binary variables such that the marginal probabilities satisfy the model where β t0 is the intercept and β t is the covariates parameter vector at time t, respectively, and where F is a cumulative distribution function (c.d.f.).Now, consider the multivariate latent regression model The R Journal Vol.8/2, December 2016 ISSN 2073-4859 where µ B it = β t x it and {e B i : i = 1, . . ., N} are independent random vectors such that e B it ∼ F for all i and t.Under these assumptions, generation of binary responses under the threshold gives rise to the marginal model (1), where I (A) denotes the indicator function of the event A. This approach is a straightforward extension of the gold-standard simulation method proposed by Emrich and Piedmonte (1991), in the sense that it also permits marginal modeling of the univariate probabilities through covariates.Implementation of the method of Emrich and Piedmonte (1991) can be found in the orphaned R package mvtBinaryEP (By and Qaqish, 2011).

Ordinal responses
Options for marginal modelling of correlated ordinal responses include the marginal cumulative link model and the marginal continuation-ratio model In both models, F is a c.d.f. and β t is the parameter vector at time t when the corresponding (J − 1) category-specific intercepts β t10 , β t20 , . . ., β t(J−1)0 are excluded.
First, consider the marginal cumulative link model (2) and suppose the multivariate latent regression model holds, where µ O1 it = −β t x it , and {e O1 i : i = 1, . . ., N} are independent random vectors such that e O1 it ∼ F for all i and t.To generate an ordinal response Y it that satisfies model (2), one can categorize U B it by using the corresponding category-specific intercepts according to the threshold This threshold approach extends the approach discussed in McCullagh (1980) from cumulative link models with independent ordinal responses to marginal cumulative link models with correlated ordinal responses.
Next, consider the marginal continuation-ratio model (3) and suppose the following multivariate latent regression model holds where itJ for all i and t, and {e O2 i : i = 1, . . ., N} are independent random vectors such that: 1. e O2 itj ∼ F for all i, t and j, 2. e O2 itj and e O2 itj are independent for all j = j (local independence assumption).
The marginal continuation-ratio model (3) arises by applying the threshold to the components of U it 's in a sequential order.This approach extends the latent variable representation described in Tutz (1991) which gives rise to the continuation-ratio model for independent ordinal responses (see Agresti, 2013).

Nominal responses
Consider the marginal baseline-category logit model where β tj0 and β tj is the j-th category-specific intercept and parameter vector at time t, respectively.For identifiability reasons, restrictions such as β tJ0 = 0 and β tJ = 0 for all t are required, which imply that β * tj0 = β tj0 and β * tj = β tj for all t and for all j = 1, . . ., J − 1.Note that model ( 4) relates with the baseline-category logit model (see Agresti, 2013) and hence it is appropriate for marginal modelling of correlated nominal responses.
To connect the marginal baseline-category logit model ( 4) with underlying regression models, consider the multivariate latent regression model it and e NO it = e NO it1 , . . ., e NO itJ for all i and t, and {e NO i : i = 1, . . ., N} are independent random vectors such that: 1. e NO itj follow a standard extreme distribution for all i, t and j, 2. the assumption of choice independence is met at each measurement occasion, that is e NO itj and e NO itj are independent for all j = j .
The threshold extends the principle of maximum random utility (McFadden, 1974) and it generates correlated nominal responses that give rise to the marginal baseline-category logit model ( 4).
Simple version of the NORTA method Li and Hammond (1975) proposed a simple method for generating continuous random vectors with given marginal distributions and a prescribed correlation matrix.Cario and Nelson (1997) introduced the NORTA method which essentially modifies the approach of Li and Hammond (1975) to account for any type of marginal distributions (discrete, continuous or mixed).Here, we describe a simple version of the NORTA method in which the desired marginal distributions are continuous and identical which is required by all the threshold approaches implemented in SimCorMultRes.
Let F be the c.d.f. of the target marginal distribution.To generate a p-variate random vector W = W 1 , . . ., W p with correlation matrix cor (W) = R W such that W k ∼ F for all k = 1, . . ., p, the following NORTA transformation can be utilized: 1. Generate a random vector Z = Z 1 , . . ., Z p from a standard multivariate normal distribution with correlation matrix cor (Z) = R Z .The elements of R Z are calculated by solving numerically p (p − 1) /2 equations, such that each equation relates cor (Z k , Z k ) with cor (W k , W k ) for all k < k .The exact formulae are given by Li and Hammond (1975).

Apply the transformation
where Φ is the cumulative distribution of the standard normal distribution.
If F = Φ, then the second step of the above modified NORTA algorithm is not needed.Otherwise, the correlation matrices R Z and R W are expected to differ.In fact, Cario and Nelson (1997) showed that under mild conditions it is possible to have R Z ≈ R W .For example, if F is the cumulative distribution function of the standard logistic distribution (which might be the case in the marginal models for correlated binary and ordinal responses), then R Z ≈ R W due to the well-known approximation Φ (x) = F (xπ/3) for all x ∈ .This simplifies the computational task as the p (p − 1) /2 equations are not needed to be solved and issues regarding non-existence of a valid correlation matrix R Z for a given choice of the correlation matrix R W (Li and Hammond, 1975) are avoided provided that R Z is positive-definite under mild conditions (Cario and Nelson, 1997).

General generative process
We propose a simple and efficient two-staged general algorithm for generating correlated categorical responses: Stage 1. Marginal model specification: Provide the covariates, the regression parameters and the link function (if required) of the desired marginal model ( 1), ( 2), (3), or (4).
Stage 2. Simulation of continuous random vectors via the NORTA method and threshold approach: Define the desired dependence structure by fixing R Z to generate the continuous random vectors U i 's from the multivariate latent regression model implied by the marginal model specification selected in Stage 1. Apply the corresponding threshold approach to obtain the correlated binary, ordinal or nominal responses.
In the marginal models described above, we usually choose F to be the c.d.f. of a standard normal, logistic or extreme value distribution.In either of these cases, it can be shown that the simulated categorical responses are independent if and only if R Z is the identity matrix, which is true if and only if the random variables in the latent random vectors e i 's are independent (Cario and Nelson, 1997).
For all other forms of R Z , correlated categorical responses will be generated.
Expressing the association structure in terms of R Z ensures the existence of a joint distribution for the correlated categorical responses regardless of the marginal model specification which is not the case when the association is expressed directly via the correlation matrix of the correlated categorical responses.This well-known fact has been mentioned by Bergsma and Rudas (2002) among others, and it has been exemplified in the case of correlated binary and multinomial responses by Chaganty and Joe (2004), Chaganty and Joe (2006) and Touloumis et al. (2013), respectively.The simplest scenario where adopting a common correlation matrix for the correlated categories responses across subjects is problematic is when the linear predictor in the marginal model is allowed to vary freely on the real line.In this case, only the identity matrix is a feasible value for the correlation matrix.
As mentioned before, the proposed version of the NORTA method is not the only option to simulate continuous random vectors in Stage 2 and instead, alternative simulation techniques can be easily employed.However, the user must be cautious in order to respect the corresponding marginal distributional assumptions and the assumption of local independence or choice independence whenever the marginal models (3) or (4) are used, respectively.
We emphasize that the proposed algorithm can also handle the situation in which no marginal model specification is provided.For more details, please refer to the third example below.

Description of SimCorMultRes
SimCorMultRes contains four core functions (rbin, rmult.bcl,rmult.clmand rmult.crm)that enable the user to generate correlated categorical responses and two utility functions (rnorta and rsmvnorm) initially designed for internal use in the core functions.We describe in detail the arguments and the output of the core and utility functions.

Core functions
Each core function in SimCorMultRes simulates correlated categorical responses under a marginal model specification.In particular, rbin simulates correlated binary responses that satisfy the marginal model (1), rmult.clmsimulates correlated ordinal responses that satisfy the marginal cumulative link model (2), rmult.crmsimulates correlated ordinal responses that satisfy the marginal continuationratio model (3) and rmult.bclsimulates correlated nominal responses that satisfy the marginal baseline-category logit model (4).
The common cluster size (clsize) of the subjects is required in all core functions.The ncategories argument in rmult.bclindicates the number of nominal response categories.The number of ordinal response categories in rmult.clmand rmult.crm is indirectly defined by the intercepts argument.It contains the values of the threshold parameters which can be provided either as a T × (J − 1) matrix or as a vector of length J − 1.In the first case, the (t, j)-th element of intercepts corresponds to β tj0 and in the second case, it is assumed that β tj0 = β j0 for all t in the marginal models (2) or (3).The intercepts argument is also employed in rbin to specify whether the intercepts in the marginal model (1) are time-dependent.If β t0 = β 0 for all t, then intercepts should be a single number that reflects the value of β 0 .Otherwise, it should be a vector of size T with the t-th element equal to the value of β t0 .
The values for the marginal regression parameters (betas) should be provided as a numeric vector whenever β t = β for all t in models (1), ( 2) or (3), and whenever β tj = β j and β tj = β j for all t in The R Journal Vol.8/2, December 2016 ISSN 2073-4859 model ( 4).In all other cases, betas should be provided as a matrix with T rows such that the t-th row contains the value of the marginal parameter vector at time t.It is important to emphasize that (category-specific) intercept values should not be included in betas unless the function rmult.bcl is used.
The functional relationship of the covariates in the marginal model (xformula) is specified similarly as in other regression models with the single difference that no response variable should be provided.The covariates defined in xformula can be imported via the xdata argument in "long" format, meaning that each row contains all the subject-specific covariates information at a given time.When xdata is missing, then the covariates are extracted from the environment that the core function is called.
The link argument in rbin, rmult.clmor rmult.crmdetermines the c.d.f.F in the marginal models (1), ( 2) or (3) respectively, i. e., the link function.Options for the link function include the probit ("probit"), the logit ("logit"), the complimentary log-log ("cloglog") and the cauchit ("cauchit").It is worth mentioning that there is no link argument in the function rmult.bclbecause the marginal distribution of the latent continuous random variables e NO itj 's is always the standard extreme value distribution.
In all core functions, the latent random vectors e i 's can be either simulated using the proposed NORTA approach or provided by the user via the rlatent argument.In the first case, the correlation matrix R Z of the multivariate normal distribution (cor.matrix) in the modified NORTA method and the link argument, wherever present, are required.Checks are carried out to ensure that cor.matrix is a positive-definite correlation matrix and whenever rmult.crmor rmult.bcl is employed, cor.matrix is forced to satisfy the restrictions of the latent dependence structure that are implied by the threshold approach associated with models (3) or (4), respectively.In the case where the preferred simulation method is not the NORTA method, rlatent should contain the values of the latent random vectors while cor.matrix and link are ignored.Examples of using the rlatent argument can be found in the help files and the vignette of SimCorMultRes.
The output of any core function is displayed as a list with three items: (i) a matrix with the simulated responses such that the (i, t)-th element corresponds to the realization of Y it (Ysim), (ii) a data frame (simdata) that contains the simulated responses (y), the covariates specified by xformula, subjects' identities (id) and the measurement occasions (time), and (iii) the NORTA generated or user-defined latent random vectors (rlatent).

Utility functions
The utility function rnorta offers a more general implementation of the NORTA method described earlier.The user needs to specify the number of random vectors (R), the correlation matrix R Z of the multivariate normal distribution (cor.matrix) and the names of the quantile functions of the desired marginal distributions (distr).The optional qparameters argument permits users to consider parameter values for the marginal distributions other than the default (obtained when qparameters = NULL).The function returns R random vectors with marginal distributions specified by distr (and qparameters) when cor.matrix is the correlation matrix of the multivariate normal distribution in the NORTA method.We highlight that rnorta has been extended to handle situations that are beyond the scope of simulation of correlated categorical responses subject to a marginal model specification.Unlike the simple version of the NORTA method needed for our purposes, rnorta does not require marginal distributions to be identical.In fact, any univariate discrete or continuous distribution whose quantile function is available in R can be employed in distr provided that the required R package is available.
The function rsmvnorm generates R random vectors from a multivariate normal distribution with mean vector the zero vector and covariance matrix cor.matrix.
Note that an error message is returned whenever cor.matrix in functions rnorta or rsmvnorm is not a positive-definite correlation matrix.

Empirical illustration
We now illustrate the use of SimCorMultRes to: i) evaluate the performance of GEE approaches for estimating the regression parameters of a marginal baseline-category logit model, ii) to verify approximations that relate a uniform local odds ratios structure to the correlation coefficient of a bivariate normal distribution (Goodman, 1979) and, iii) to simulate correlated categorical random variables with fixed arbitrary univariate probabilities that are not subject to a marginal model specification.

Parameter estimation of marginal models
The motivation behind the creation of SimCorMultRes lies on evaluating statistical methods that estimate the regression coefficients of marginal models with correlated binary or multinomial responses.
To exemplify this, we employ two GEE models for estimating a marginal model with correlated nominal responses: i) the local odds ratios GEE approach (Touloumis et al., 2013) and ii) the independence "working" model, which treats all observations as independent when solving the estimating equations.
Although the two competing GEE models are asymptotically equally efficient, in the sense that they both produce consistent estimators for the marginal regression parameters and of their standard errors, the regression coefficient estimators of the independence "working" model are expected to be slightly less precise than those of the local odds ratios GEE approach in small and moderate sample sizes due to the fact that the independence "working" model does not account for the dependence among the correlated responses (Touloumis et al., 2013).
Table 1 summarizes the simulation results by displaying the simulated mean and standard error of the regression estimates from the two competing GEE models and the simulated relative efficiency (SRE) for each regression parameter of model ( 5).For a given coefficient of model ( 5), the SRE criterion was defined as the ratio of the simulated mean square error of the corresponding Monte Carlo estimate based on the local odds ratios GEE approach to that based on the independence "working" model.Values of the SRE criterion greater (less) than 1.0 imply that the local odds ratios GEE approach is more (less) efficient than the independence "working" model in estimating this specific regression parameter.As expected, the two GEE models seem to estimate consistently the marginal model ( 5), with the local odds ratios GEE approach being 4.05%-9.27%more efficient in estimating each regression coefficient.

Uniform association model and bivariate normal distribution
Let f ab denote the observed frequency of the cell (a, b) in a two-way contingency table and let F ab be the corresponding expected frequency under some model, for a = 1, . . ., A and b = 1, . . ., B. Goodman (1979) proposed the uniform association model where the parameters ν, {κ a : a = 1, . . ., A}, {λ b : b = 1, . . ., B} and φ are identifiable once restrictions, such as sum to zero constraints (Agresti, 2013), are applied to {κ a : a = 1, . . ., A} and {λ b : b = 1, . . ., B}.The association between the row and column variables is modelled parsimoniously by assuming a common value φ for the (A − 1) × (B − 1) log local odds ratios.The key property of the uniform association model is that φ relates to the correlation parameter ρ of an underlying bivariate normal distribution (Goodman, 1979)  where η = (2φ) −1 .The validity of these approximations has been explored only for ρ = 0.5 by Goodman (1979).Here, we perform a more detailed empirical investigation by considering a grid of values for ρ, namely ρ = 0.05, 0.10, . . ., 0.95.
For each value of ρ, we simulated 1000 random vectors from a bivariate normal distribution with mean vector the zero vector and covariance matrix the correlation matrix 1 ρ ρ 1 .
In a similar fashion as in Goodman (1979), correlated ordinal responses were generated by applying the threshold approach linked to model ( 2), with F = Φ and equi-distanced category-specific intercepts (β 10 , β 20 , β 30 , β 40 , β 50 , β 60 , β 70 ) = (−3, −2, −1, 0, 1, 2, 3).The sampling scheme does not involve any covariates, that is β t = 0 and x it = 0 for all t.Next, we cross-classified the correlated simulated responses to obtain a 8 × 8 contingency table and we estimated φ by fitting the uniform association model ( 6).We repeated this procedure 10000 times: > library("SimCorMultRes") > set.seed(123 The produced warnings() reflect the fact that we have added 0.001 to each cell of the two-way contingency table whenever an observed zero count occurred to ensure the existence of the maximum likelihood estimator of φ (Birch, 1963).We estimated the underlying correlation parameter ρ with ρ φ obtained by replacing φ in (8) with its Monte Carlo counterpart φ.The following R commands were run to obtain Figure 1.

Simulating correlated categorical responses under no marginal model specification
For completeness' sake, we illustrate how to utilize SimCorMultRes in order to generate correlated categorical random variables conditional on a desired dependence structure and known marginal probabilities that are not determined by a regression model.
Suppose the goal is to simulate 5000 trivariate vectors and a common uniform local odds ratio structure holds for all i = 1, . . ., 5000, t < t and j, j = 1, 2, 3.The above sampling scheme can be reparametrized in terms of the threshold approach related to the marginal cumulative link model (2) while utilizing the conclusions of the previous example to obtain the desired dependence structure.
To this direction, first define ] for all t (t = 1, 2, 3) and j (j = 1, 2, 3) as the category-specific intercepts of a marginal cumulative probit model with no covariates: The R Journal Vol.8/2, December 2016 ISSN 2073-4859 Next, for the pairwise dependence structure, approximate the desired pairwise uniform local odds ratios φ 12 , φ 13 , φ 23 via the correlation parameters of an underlying trivariate normal distribution with mean vector the zero vector.Since the desired pairwise local odds ratios are all equal (φ 12 = φ 13 = φ 23 = 2), we may assume that the corresponding correlation parameters are all equal.This common correlation parameter ρ can be sufficiently approximated by equation ( 8), which suggests that ρ ≈ 0.5543: > CommomLOR <-log(2) > eta <-1/(2 * CommomLOR) > rhophi <-(sqrt(1 + eta^2) -eta) * 13/12 > rhophi [1] 0.5543136 To sum up, the desired correlated multinomial responses can be simulated under a cumulative probit model with no covariates and an exchangeable correlation matrix for the underlying trivariate normal distribution with correlation parameter equal to 0.5543: > cor.matrix <-toeplitz(c(1, rhophi, rhophi)) > simdata <-rmult.clm(clsize= clsize, intercepts = intercepts, betas = 0, + xformula = ~x, link = "probit", cor.matrix = cor.matrix)This approach can also be employed to generate correlated binary random variables with known marginal probabilities provided that the desired correlation structure of the binary responses can be expressed in terms of a correlation matrix in the NORTA method.In this case SimCorMultRes is essentially implementing the simulation method of Emrich and Piedmonte (1991) without performing the first step of their algorithm.

Summary
We have presented the R package SimCorMultRes that simulates correlated binary or multinomial random variables conditional on a marginal model specification while expressing the dependence structure via the correlation structure of latent random variables.We outlined the underlying theory that SimCorMultRes is based on and illustrated the use of the package with three examples.To the best of our knowledge, SimCorMultRes is the first R package that targets specifically on the generation of correlated binary, nominal or ordinal responses under marginal model specification.
In some instances, it could also be used to simulate correlated categorical responses even when no model specification is provided for the marginal probabilities by exploiting the relationship of association measures for discrete variables and the bivariate normal distribution.This can be achieved by following a similar approach as the one adopted in the third example herein.The results in this paper were obtained using SimCorMultRes version 1.4.1 and R 3.3.1.
Although the NORTA method is the default tool for simulating the latent random vectors denoted by e i 's, it is extremely important to emphasize that these can be provided by the user via the rlatent argument in the core functions.For example, generating correlated binary responses under a marginal logit model specification and with an exchangeable correlation matrix, can be accomplished by taking the difference of two independent random vectors from the multivariate Gumbel distribution each with correlation matrix the desired correlation matrix.This approach can be found in standard textbooks, such as Balakrishnan (1992).A working example, can be found in the vignette of this package.
The R Journal Vol.8/2, December 2016 ISSN 2073-4859 A future direction is to increase the scope of marginal regression models for nominal and ordinal responses, e.g., by including threshold approaches that give rise to a marginal adjacent-categories logit model and allowing category-specific regression parameters in the marginal models with ordinal responses.

Figure 1 Figure 1 :
Figure1displays the absolute difference between the true correlation parameter ρ and ρ φ.In general, approximation (8) seems to work well for weak correlation patterns, that is when ρ ≤ 0.20.The The simulated category-specific probabilities satisfy the desired marginal configuration