Two-Tier Latent Class IRT Models in R by

In analyzing data deriving from the administration of a questionnaire to a group of individuals, Item Response Theory (IRT) models provide a flexible framework to account for several aspects involved in the response process, such as the existence of multiple latent traits. In this paper, we focus on a class of semi-parametric multidimensional IRT models, in which these traits are represented through one or more discrete latent variables; these models allow us to cluster individuals into homogeneous latent classes and, at the same time, to properly study item characteristics. In particular, we follow a within-item multidimensional formulation similar to that adopted in the two-tier models, with each item measuring one or two latent traits. The proposed class of models may be estimated through the package MLCIRTwithin, whose functioning is illustrated in this paper with examples based on data about quality-of-life measurement and about the propensity to commit a crime.


Introduction
Several fields of human knowledge require the measurement of unobservable constructs (or latent traits) through ad hoc methods based on questionnaires consisting of multiple items having dichotomously or ordered politomously scored response categories. This is the case of measurement of customer satisfaction, quality-of-life, level of physical and/or psychological disabilities, ability in certain subjects, and so on.
Item Response Theory (IRT) models (Hambleton and Swaminathan, 1985; Van der Linden and Hambleton, 1997; are well-known statistical models to deal with these data. In their original formulation, these models are characterized by: (i) unidimensionality (i.e., only one latent trait is assumed to be measured by all items); (ii) a parametric (usually normal) distribution for the latent variables used to represent the trait of interest; and (iii) no effect of individual covariates on this latent trait. These elements often turn out to be restrictive in modern applications and, therefore, several extensions of IRT models have been proposed in the literature. Among the possible extensions, in this paper we consider the class of multidimensional Latent Class (LC) IRT models proposed by Bartolucci (2007) and von Davier (2008); see also . Models of this type are characterized by: (i) multidimensionality, in the sense that more latent traits may be measured by the set of items (Reckase, 2009); (ii) discreteness of the latent variables, so that homogeneous subpopulations (or latent classes; Lazarsfeld and Henry, 1968;Goodman, 1974) of individuals are detected with respect to the constructs measured by the questionnaire; and (iii) possible presence of individual covariates affecting the probabilities to belong to each latent class.
In particular, we focus on a specific extension of IRT models based on within-item multidimensionality (Adams et al., 1997), which is characterized by items affected by more than one latent variable. This is opposed to the more common between-item multidimensionality, where each item may measure only one latent variable as in the original approach of Bartolucci (2007). More in detail, the model here proposed represents a discrete version of the item bifactor model and of the more general two-tier IRT model (Bock et al., 1988;Gibbons and Hedeker, 1992;Gibbons et al., 2007;Cai, 2010;Cai et al., 2011;Reise, 2012;Bonifay, 2015), based on a particular within-item multidimensional formulation with each item loading on at most two latent variables that are mutually uncorrelated. With respect to traditional item bifactor and two-tier models, which assume the normality of the latent variables, the discreteness assumption increases the flexibility of the approach and allows us to cluster individuals in homogeneous latent classes. Formann and Kohlmann (2002) propose a general approach based on latent classes that includes the discrete two-tier model here proposed as special case. However, different from the proposal of these authors, we let the class membership probability depend on individual covariates and we also allow for more flexibility in terms of specification of model link function. Limited to binary items, a recent example of application of the proposed two-tier LC-IRT model is provided in  to jointly study certain students' abilities and the propensity to skipping item responses.
The procedures to estimate the proposed class of two-tier LC-IRT models are implemented in the R package MLCIRTwithin , downloadable from http://CRAN.Rproject.org/package=MLCIRTwithin, whose illustration is the primary focus of the present paper. In particular, we are interested in providing a detailed description of the main functions of this package, named est_multi_poly_within and search.model_within, also through some applications.
The remainder of the paper is organized as follows. In the next section we provide the formulation of the proposed class of two-tier LC-IRT models and, then, some details about likelihood inference for these models. Furthermore, we describe the main functions implemented in the R package MLCIRTwithin for model estimation. In the following, the functioning of the package is illustrated through two applications: the first one concerns the measurement of Heath-related Quality Of Life (HQOL) on cancer patients and the second one is about the measurement of propensity to commit crimes. Some final remarks conclude the work.

The class of models
The proposed class of models is formulated on the basis of two independent vectors of latent variables representing the unobservable individual characteristics measured by the test items. For each unit i = 1, . . . , n, these vectors are denoted by U i = (U i1 , . . . , U iD 1 ) and V i = (V i1 , . . . , V iD 2 ) and are of dimension D 1 and D 2 , respectively. Similarly to the item bifactor model, we assume that each item response Y ij , with i = 1, . . . , n and j = 1, . . . , r, where r is the number of items, may depend on (and then measures) at most two latent variables, under the constraint that these two variables do not belong to the same vector. This is formalized by introducing the disjoint subsets U 1 , . . . , U D 1 and V 1 , . . . , V D 2 of J = {1, . . . , r}, where U d 1 contains the indices of the items depending on latent variable U id 1 and V d 2 is the set of those depending on latent variable V id 2 . Equivalently, Y ij depends on U id 1 if and only if j ∈ U id 1 and on V id 2 if and only if j ∈ V id 2 . Note that, even if the subsets U 1 , . . . , U D 1 cannot overlap, and the same is assumed for V 1 , . . . , V D 2 , the same item j may belong both to a set of the first type and to a set of the second type (within-item multidimensionality); more formally, there may exist d 1 and d 2 such that j ∈ U d 1 and j ∈ V d 2 . In practice, some items belonging to U d 1 , d 1 = 1, . . . , D 1 , will be present also in V d 2 , d 2 = 1, . . . , D 2 . With respect to the specification commonly encountered in the literature on item bifactor and two-tier models, our proposal is more general, as any value of D 1 and D 2 is allowed, whereas D 1 = 1 (or, alternatively, D 2 = 1) in the item bifactor model and D 1 = 2 (or, alternatively, D 2 = 2) in the two-tier model. Moreover, components of U i are allowed to be correlated; the same holds for components of V i .
An illustrative example of the above assumptions is provided in Figure 1, where D 1 = 2, D 2 = 1, and four items out of r = 7 measure two latent traits (item 2 measures dimensions U i1 and V i1 ; items 3, 5, and 6 measure dimensions U i2 and V i1 ); the two dimensions U i1 and U i2 do not share any item.
Adopting a semi-parametric approach for the latent distribution, the first latent vector U i is assumed to have a discrete distribution based on k 1 support points u 1 , . . . , u k 1 and, in absence of individual covariates, common mass probabilities λ 1 , . . . , λ k 1 . Similarly, the distribution of the second latent vector V i has k 2 support points v 1 , . . . , v k 2 and, again in absence of individual covariates, common mass probabilities π 1 , . . . , π k 2 . In both cases, the support points identify classes of individuals that are homogeneous with respect to the latent traits represented by U i and V i . Note that cases with k 1 = 1 or k 2 = 1 detect a special situation in which vector of latent variables U i or V i , respectively, has no role in explaining the observed item responses.
For binary response variables, the measurement model assumes that, for i = 1, . . . , n, j = 1, . . . , r, h 1 = 1, . . . , k 1 , and h 2 = 1, . . . , k 2 , where 1{·} is the indicator function and γ 1j , γ 2j , and β j are suitable item parameters. As usual for IRT models, γ 1j and γ 2j represent the discrimination power of item j with respect to the latent variables in U i and V i , respectively, whereas β j denotes the difficulty level of item j. In the previous expression, u h 1 d 1 denotes the d 1 -th element of u h 1 , whereas v h 2 d 2 denotes the d 2 -th element of v h 2 .
Different from traditional LC models characterized by constant mass probabilities, a more general approach is based on assuming that the probabilities to belong to every latent class defined by the The R Journal Vol. 8/2, December 2016 ISSN 2073-4859 distribution of U i and V i depend on individual covariates, when such covariates are observed. For this aim, we denote the vector of covariates for individual i = 1, . . . , n by X i and we assume that U i and V i are conditionally independent given X i . Moreover, we adopt the following multinomial logit parametrization (Formann, 2007) for each latent vector: , where x i contains the constant term. The vectors of coefficients δ 1h 1 and δ 2h 2 measure the effect of the covariates on the logit to belong to class h 1 = 2, . . . , k 1 and h 2 = 2, . . . , k 2 , with respect to class h 1 = 1 and h 2 = 1, respectively. Alternatively, a global logit formulation may be adopted. This is related to a cumulative logit formulation (Agresti, 2013), where the logits in equations (2) and (3) are substituted with respectively. The main advantage of the global logit parametrization is the easier interpretation of the regression coefficients that now refer to the effect of the covariates on the logit to belong to a specific class (or higher) with respect to a lower class. However, this parameterization requires the latent classes to be ordered according to a specific criterion (e.g., requiring an increasing trend of the support points for a given dimension).
In the case of polytomously scored items with ordered categories indexed from 0 to l j − 1, the model based on Equation (1) may be extended according to a global logit link function, so that a graded response model (Samejima, 1969) Alternatively, using a local logit link function, we may assume a partial credit model (Masters, 1982): In the above expressions, y = 1, . . . , l j − 1 and the difficulty parameter β jy is now specific of item j and response category y. A more parsimonious model is obtained by expressing β jy as the sum of two components (rating scale parametrization; Andrich, 1978), that is, β jy = β j + τ y , j = 1, . . . , r; y = 1, . . . , l j − 1, so that the distance in terms of difficulty from category to category (i.e., τ y ) is the same for all items. Note that the rating scale parametrization is allowed only when items have the same number of response categories (i.e., l j = l, j = 1, . . . , r). For more details about the possible item parametrizations in the presence of ordinal items see  and .
In order to ensure the identification of the proposed class of models, two necessary conditions must hold. First, as usual in the IRT modeling, we must constrain one discriminant index to be equal to 1 and one difficulty parameter to be equal to 0 for each dimension. More in detail, let j d 1 be a specific element of U d 1 and j d 2 a specific element of V d 2 for d 1 = 1, . . . , D 1 and d 2 = 1, . . . , D 2 . Then we assume γ 1j d 1 = 1, γ 2j d 2 = 1, and, when item difficulties are free, β j d 1 1 = 0 and β j d 2 1 = 0, whereas in the presence of a rating scale parametrization we assume β j d 1 = 0, β j d 2 = 0, and τ 1 = 0. In the case of binary items, constraints on difficulties simplify to β j d 1 = 0 and β j d 2 = 0. Generally speaking, j d 1 and j d 2 may be chosen in an arbitrary way, paying attention to select a different item for each dimension. So, in the example illustrated in Figure 1, if we constrain item j = 1 for dimension U i1 and item j = 3 for dimension U i2 , then for dimension V i1 we may constrain any one of the items in the subset V 1 with the only exception of item j = 3. As an alternative to constraining item parameters, we may fix the support points, as in the general diagnostic model of von Davier (2008).
A further identification condition requires that at least one item belongs to one of the subsets U d 1 or The R Journal Vol. 8/2, December 2016 ISSN 2073-4859 to one of the subsets V d 2 ; more formally, the union of U 1 , . . . , U D 1 must be different from the union of V 1 , . . . , V D 2 . In other words, we restrict γ 1j = 0 or γ 2j = 0 for at least one j and the maximum number of items shared by U i and V i is equal to r − 1. Alternatively, we may skip this restrictive condition by specifying in a suitable way linear constraints (e.g., equality restrictions) on some discriminant parameters (for some examples see Cai, 2010;Cai et al., 2011).
To specify in a flexible way constraints on the support points and item parameters, we denote the complete vectors of support points by u = (u 11 , u 12 , . . . , u k 1 D 1 ) for latent variable U i and v = (v 11 , v 12 , . . . , v k 2 D 2 ) for latent variable V i , the complete vectors of item discriminating indices as γ 1 = (γ 11 , . . . , γ 1r ) for items affected by U i (then γ 1j is missing if item j does not belong to U 1 , . . . , U D 1 ) and γ 2 = (γ 21 , . . . , γ 2r ) for items affected by V i (then γ 2j is missing if item j does not belong to V 1 , . . . , V D 2 ), and the complete vector of item difficulties as β = (β 11 , . . . , β r,l r −1 ) (or β = (β 1 , . . . , β r ) in the binary case). The corresponding vectors of free support points and free item parameters are denoted byũ,ṽ,γ 1 ,γ 2 , andβ, respectively. A wide range of linear constraints and fixed values of the parameters are specified through a suitable definition of matrices Z u , Z v , Z γ 1 , Z γ 2 , and Z β and vectors z u , z v , z γ 1 , z γ 2 , and z β , as follows: For instance, according to the usual IRT parametrization with free support points and constraints on the item parameters (i.e., one discriminant index equal to 1 and one difficulty parameter equal to 0 for each dimension), Z u and Z v are identity matrices of dimensions k 1 D 1 × k 1 D 1 and k 2 D 2 × k 2 D 2 , respectively, and z u and z v are null vectors. Moreover, matrices Z γ 1 , Z γ 2 , and Z β are defined as identity matrices without those columns corresponding to the constrained item parameters, whereas z γ 1 and z γ 2 are vectors with ones in correspondence of constrained item discrimination parameters and zeros otherwise; z β is a vector of zeros. Further examples of specification of constraints on model parameters are provided in the sequel, when the functioning of the estimation functions of the proposed R package and an example on criminal data (Example 2) are illustrated.

Likelihood inference
The proposed two-tier LC-IRT model can be estimated by maximizing the marginal log-likelihood where η is the vector of free model parameters, that is, support points of U i and V i , item difficulty and discrimination parameters, and regression coefficients for the covariates; in the previous expression, y i = (y i1 , . . . , y ir ) is the vector of observed item responses for subject i. Due to the local independence assumption, the marginal likelihood L i (y i |x i ) for subject i (or manifest probability of y i ) used in equation (12) is given by: , which depends on (1) in the case of binary items and on (4) or (5) in the case of ordinal items, if a global logit or a local logit parametrization is adopted.
We maximize (η) through the Expectation Maximization (EM) algorithm (Dempster et al., 1977), which is based on alternating two steps until convergence: E-step: the expected value of the complete data log-likelihood (i.e., the log-likelihood for the observed and latent variables) is computed, given the current parameter vector η. In practice, this consists in computing the posterior probability q h 1 h 2 i for h 1 = 1, . . . , k 1 , h 2 = 1, . . . , k 2 , and i = 1, . . . , n; this is the probability that unit i belongs to latent class h 1 , according to the first vector of latent variables, and to latent class h 2 , according to the second vector of latent variables, given the observed data, that is, p(U i = u h 1 , V i = v h 2 |x i , y i ). By the Bayes' theorem we have that The resulting complete data log-likelihood is, in expected value, equal to * (η) = M-step: the parameter vector η is updated by maximizing function (13) obtained at the previous step. Note that single parameter subvectors of η may be updated separately, as this function factorizes in three components involving the mass probabilities λ h 1 (x i ), the mass probabilities π h 2 (x i ), and the conditional response probabilities p h 1 h 2 (y ij ), respectively. Iterative algorithms of Newton-Raphson type are necessary to maximize all components (for details see , and references therein) with the exception of the first two in the case of absence of individual covariates. In fact, in this case we have the following explicit expressions to update the class weights: Similar to the other iterative algorithms, the first iteration of the EM algorithm needs to be initialized through suitable values for the model parameters that can be chosen according to certain deterministic or random rules. A common problem with finite mixture models, and then with the proposed model, is due to the presence of several local maximum points of the log-likelihood function. Therefore, in order to avoid a solution that does not correspond to the global maximum, a good practice consists in repeating the estimation process for a specific model a certain number of times using random starting values and, in the presence of different values of the log-likelihood at convergence, the solution corresponding to the highest log-likelihood value is selected.
A crucial point is that of model selection, mainly as concerns the choice of the number of support points (or latent classes) for both latent vectors (i.e., k 1 and k 2 ). For this aim, a likelihood-ratio test cannot be directly used, as the regularity conditions for having an asymptotic null distribution of χ 2 -type are not satisfied for this type of test when it is applied to compare two models with different values of k 1 and k 2 . We then suggest to rely on suitable forms of penalization of the maximum log-likelihood, such as the Akaike Information Criterion (AIC; Akaike, 1973), which is related to Kullback-Leibler distance between the true density and the estimated density of a model. This criterion is based on the following index: AIC = −2ˆ (η) + 2#par, withˆ (·) denoting the estimated maximum log-likelihood and #par the number of free parameters. Alternatively, we suggest the use of the Bayesian Information Criterion (BIC; Schwarz, 1978) based on the index BIC = −2ˆ (η) + log(n)#par.
According to both these criteria, one should select the model with the minimum value of AIC or BIC.
Other selection criteria may be based on the entropy, whose computation involves the individual posterior probabilities. Entropy is a measure of the capability of the model to provide a neat partition of the sample units, which is computed as based on the posterior probabilities q h 1 h 2 i . If the components are well separated, the posterior probabilities tend to define a clear partition of the units, assuming values close to one, and, as a consequence, the entropy will be close to zero. Usually, the entropy is not directly used to assess the number of support points and, to also account for the goodness of fit of the model, a normalized version of entropy is used by Celeux and Soromenho (1996). This is defined as whereˆ k 1 k 2 is the maximum log-likelihood of the model with k 1 and k 2 support points andˆ 11 is the maximum log-likelihood value of the model with just one component for both latent variables. According to this criterion, the optimal number of components is the one that minimizes the NEC index. Note that NEC is not defined when k 1 = k 2 = 1, in which case NEC = 1 by convention.
In practice, we propose to fit a series of models with similar specifications that distinguish one other for values assigned to k 1 and k 2 and, then, to make comparisons through one or more of the mentioned criteria. In more detail, given k 1 , we consider increasing values of k 2 and, similarly, given k 2 we consider increasing values of k 1 until AIC, BIC, or NEC do not start to increase and, then, the previous value of support points is taken as the optimal one. More in general, for certain values of k 1 and k 2 we suggest that the choice between two competing models is driven by the likelihood-ratio test in the presence of nested models (i.e., when one model is obtained by the other one through constraints on the parameters), whereas the selection criteria above mentioned are suitable in the presence of non-nested models. The likelihood-ratio test is also used to evaluate the global fit of a model, when this is compared with the saturated model, that is, the largest model one can fit. Note that in the context at issue the saturated model is defined only for model specifications without covariates. Other proposals, coming from the literature on logistic regression models and on IRT models, consist of parametric and non-parametric tests that allow us to verify specific hypotheses concerning, among others, the unidimensionality of the questionnaire, the validity of the Rasch paradigm, the validity of the local independence assumption. In addition to the global fit of a model, also item-specific fit statistics, which are usually based on the comparison between observed and expected item responses, are useful to evaluate the goodness of each item and the need of removing it from the questionnaire. For a wide review of the mentioned methods see , Chap. 5.7, and the references therein.
Finally, in order to facilitate the interpretation of the results, we suggest to standardize the estimated support pointsû d 1 h 1 andv d 2 h 2 , so as to obtain latent variables that have mean 0 and variance equal to 1, and coherently transform the estimated item parametersγ 1j ,γ 2j , andβ jy .

The R package MLCIRTwithin
The class of two-tier LC-IRT models previously described may be estimated through the R package MLCIRTwithin; for technical details see the official documentation provided in CRAN .
Before illustrating the main functions in the package at issue, it is worth mentioning some alternative R packages, which estimate models with a formulation resembling the one proposed. A first example is provided by the R package MultiLCIRT , whose functions are similar, in terms of input and output, to those of MLCIRTwithin; however, MultiLCIRT is limited to the estimation of LC-IRT models under between-item multidimensionality, in the sense that items loading on more than one latent trait are not allowed and then a single vector of latent variables U i is used. Moreover, constraints on the item parameters or fixed values for the support points cannot be specified. Package CDM  performs the estimation of the class of cognitive diagnostic models (Tatsuoka, 1983;Jang, 2008;Rupp and Templin, 2008), in which the proposed, discrete, two-tier model may be included. The class of models estimated through CDM may be characterized, among the main options, by normally distributed latent traits or, alternatively, discrete latent traits whose support points may be freely estimated or may be specified as fixed values. As concerns item parameters, item-by-category specific slopes as well as linear constraints are allowed. However, different from our proposal, individual covariates affecting the class membership as well as the specification of a global logit link are not allowed. Moreover, attention must be paid to the interpretation of the latent classes, which are defined in a quite different and general way with respect to our proposal. The estimation of within-item multidimensional IRT models and item bifactor models is also performed through packages mirt (Chalmers, 2012;Chalmers et al., 2016) and flirt (Jeon et al., 2014), under the assumption of normally distributed latent variables. Package mirt also allows for discrete latent variables; however, in such a case just the multidimensional LC model without a classical IRT parametrization (mainly, without item difficulties) and without covariates is estimated. A major flexibility with respect to mirt is provided by package covLCA (Bertrand and Hafner, 2013), which is focused on multidimensional LC models with covariates affecting both the class membership and the manifest variables. Other two packages to mention are lavaan (Rosseel et al., 2015) and OpenMx (Neale et al., 2016) that perform the estimation of the wide class of structural equation models, in which unidimensional and multidimensional IRT models are included, under the assumption of normality of the latent variables. Finally, we mention two general and flexible softwares that may accommodate the estimation of the model here proposed, that is, Mplus (Muthén and Muthén, 2012) and LatentGold (Vermunt and Magidson, 2005): the former is tailored to the estimation of latent variable models under the assumption of normal or discrete latent variables, whereas the latter is focused on LC models. In both cases, the user may formulate IRT models with a variety of features, among which multidimensionality and presence of covariates.

Functions est_multi_poly_within and est_multi_poly_between
The main function of MLCIRTwithin is est_multi_poly_within, which performs the maximum likelihood estimation of the model specified through equations (1) to (5), allowing for several options.
Function est_multi_poly_within requires the following main input arguments: • S: matrix of item response configurations listed row-by-row; items with a different number of categories and missing responses are allowed.
• yv: vector of the frequencies of every row in S; by default, yv is a vector of ones.
• k1: number of latent classes for latent variable U i .
• k2: number of latent classes for latent variable V i .
• X: matrix of covariates affecting the class weights; by default, X is NULL.
• start: method of initialization of the algorithm: "deterministic" (default value) for values chosen according to a deterministic rule, "random" for values randomly drawn from suitable distributions (continuous uniform between 0 and 1 for the class weights and standard normal for the other parameters), and "external" for values provided by the researcher through inputs Phi, ga1t, ga2t, De1, and De2.
• link: type of link function: "global" for global logits as in Equation (4) and "local" for local logits as in Equation (5). With binary items, any type of link function may be specified, resulting in a Rasch (Rasch, 1960) or a two-parameter logit (2PL; Birnbaum, 1968) type model depending on the value assigned to input disc.
• disc: constraints on the discriminating item parameters: FALSE (default value) for parameters γ 1j and γ 2j all equal to one and TRUE for free values. With binary items, option disc = FALSE results in a Rasch model, whereas a 2PL model is obtained when option disc = TRUE.
• difl: constraints on the difficulty item parameters, in the case of ordinal polytomously scored items: FALSE (default value) for unconstrained parameters β jy and TRUE for a rating scale parametrization as in (6). This option is not allowed in the presence of items with a different number of response categories.
• multi1: matrix with one row for each component of U i and elements in each cell indicating the indices of the items measuring the dimension corresponding to that row; the number of rows is D 1 and that of columns equals the number of items in the largest dimension. If dimensions differ in the number of items, zeros are inserted in the empty cells. Each item corresponding to the first column of each row has discriminating index constrained to 1 and difficulty parameter constrained to 0 to ensure model identifiability. For instance, in the presence of 6 items, with items 1 and 2 measuring the first dimension of latent variable U i and the remaining items 3 to 6 measuring another dimension of U i , as in Figure 1, matrix multi1 is specified as (multi1 <-rbind(c(1,2,0,0), c(3,4,5,6))) • multi2: same as multi1 for latent variable V i . For model identifiability, attention must be payed on the item indices in the first column of multi2 that cannot be the same as the indices in the first column of multi1. For instance, in the situation described in Figure 1, the matrix multi2 specified as (multi2 <-c(7, 2:3, 5:6)) [1] 7 2 3 5 6 implies γ 27 = 1 and β 7 = 0. A particular case is when the intersection between matrices multi1 and multi2 is empty: in such a case a between-item multidimensional LC-IRT model is specified, based on two completely independent latent vectors U i and V i .
• fort: if TRUE, Fortran routines are used whenever possible to speed up computation.
• tol: level of tolerance of the algorithm in terms of relative difference between the log-likelihood corresponding to two consecutive algorithm iterations (default value is 10 −10 ).
• disp: if TRUE, the log-likelihood evolution is displayed step-by-step.
• output: if TRUE, additional output arguments are returned.
• out_se: if TRUE, standard errors and variance-covariance matrix for the parameter estimates are returned.
• glob: type of parametrization for the sub-model assumed on the individual-specific latent class weights: FALSE (default value) for a multinomial logit model as in (2)-(3) and TRUE for a global logit model.
• Zth1, Zth2: matrices for the specification of linear constraints on the support points, according to (7) and (8), respectively; by default these are identity matrices with a number of rows (and columns) equal to the total number of support points, that is, k 1 D 1 and k 2 D 2 , respectively.
• zth1, zth2: vectors of length k 1 D 1 and k 2 D 2 , respectively, for the specification of linear constraints and fixed support points, according to (7) and (8), respectively; by default they are null vectors.
Under the default specifications of Zth1, Zth2, zth1, and zth2, the support points are freely estimated and, for the model identification, certain constraints are assumed on the item parameters. On the contrary, to fix the values of the support points, Zth1, Zth2, zth1, and zth2 must be supplied by the user. For instance, in the situation described in Figure 1 under the assumption k 1 = k 2 = 2, we define u = (−1, −1, 1, 1) and v = (−0.5, 0.5) as follows: Zth1 <-matrix(0,2*2,0) zth1 <-c(rep(-1, times=2), rep(1, times = 2)) Zth2 <-matrix(0,2,0) zth2 <-c(-0.5,0.5) • Zga1, Zga2, Zbe: matrices for the specification of linear constraints on the vectors of item parameters γ 1 , γ 2 , and β, as in (9), (10) and (11), respectively. In more detail, the number of rows of Zga1 and Zga2 is equal to the number of non-null entries in multi1 and multi2, respectively, and coincides with the length of vectors γ 1 and γ 2 ; whereas the number of rows of Zbe corresponds to the total number of item difficulties and coincides with the length of vector β. The number of columns of Zga1, Zga2, Zbe is equal to the total number of free parameters, corresponding to the length of vectorsγ 1 ,γ 2 , andβ, respectively. By default these are identity matrices without those columns corresponding to the constrained parameters. For instance, in the situation described in Figure 1 with the usual IRT constraints γ 11 = γ 13 = 1 and γ 27 = 1 resulting from matrices multi1 and multi2 defined above, and β 1 = β 3 = β 7 = 0 in the case of binary items, the following matrices are used by default in function est_multi_poly_within: (Zga1 <-diag (6) • zga1, zga2, and zbe: vectors whose length is equal to the number of rows of Zga1, Zga2, and Zbe, respectively. In other words, length of zga1, zga2, and zbe is given by the number of elements in γ 1 , γ 2 , and β. The suitable specification of these vectors, combined with that of matrices Zga1, Zga2, and Zbe, allows for linear constraints and fixed values of the item parameters, as in (9), (10), and (11). By default, zga1 and zga2 are vectors with elements 1 for each constrained item and 0 otherwise; by default zbe is a null vector. For instance, in the situation depicted in Figure  1 and matrices multi1 and multi2, default values assumed for vectors at issue are zga1 <-c(1, 0, 1, 0, 0, 0) zga2 <-c(0, 0, 0, 0, 1) zbe <-rep(0, times = 7) Any other constraint may be defined by modifying in a suitable way these three vectors and matrices Zga1, Zga2, Zbe. For instance, if we are interested in fixing the difficulty of (binary) item 4 to be equal to 2 (i.e., β 4 = 2), then we define Zbe and zbe as follows: Zbe <-diag (7) zbe <-c(0, 0, 0, 2, 0, 0, 0) Function est_multi_poly_within supplies the following output: • piv1 and piv2: vectors of the estimated weights of latent classes for U i and V i , respectively; in the presence of individual covariates, these are averages of the individual-specific mass probabilities.
• Th1 and Th2: matrices of estimated and constrained support points for each dimension (by row) and each latent class (by column) for U i and V i , respectively.
• Bec: matrix of estimated and constrained item difficulty parameters; exact zeros correspond to identifiability constraints.
• ga1c and ga2c: vectors of estimated and constrained item discriminating parameters for U i and V i , respectively; exact ones correspond to identifiability constraints and NA to items that do not load on the latent variable.
• Th1s, Th2s, Becs, ga1cs, and ga2cs: standardized values of Th1, Th2, Bec, ga1c, and ga2c, respectively; in Th1s and Th2s classes are re-ordered according to the increasing values of the support points for the first dimension.
• piv1s and piv2s: the same as piv1 and piv2, but re-ordered according to Th1s and Th2s.
• fv1 and fv2: vectors indicating the reference items for each dimension of U i and V i , respectively.
• Phi: conditional response probabilities for every item and each pair of latent classes of U i and V i .
• De1 and De2: matrices of estimated regression coefficients for the model on the class weights for U i and V i , respectively, in the presence of individual covariates; for each covariate and the constant term, the number of estimated coefficients is equal to the number of latent classes minus one.
• Piv1, Piv2, Pp1, Pp2, and lkv: optional output (obtained if output = TRUE) referred to the matrices of weights for every covariate configuration for latent variables U i and V i , the matrices of the posterior probabilities for each response configuration and latent class for latent variables U i and V i , and the values of the log-likelihood during the estimation process, respectively.
• XX1dis and XX2dis: design matrices for the covariates affecting the first and the second vector of latent variables, respectively (optional output obtained if output = TRUE).
• lk: value of the log-likelihood at convergence.
• np: number of estimated model parameters.
Some relevant commands to display output from function est_multi_poly_within are based on the S3 methods summary for the main estimates; coef and confint for the point estimates and confidence intervals (at a specified level of confidence) of support points, item parameters, and regression coefficients; logLik for the value of log-likelihood at convergence; and vcov for the estimated variance-and-covariance matrix.
Another relevant function of package MLCIRTwithin is est_multi_poly_between, which performs the maximum likelihood estimation of an LC-IRT model under between-item multidimensionality. The main differences with respect to function est_multi_poly of the R package MultiLCIRT are that the latter does not allow for items with a different number of response categories and refers to a slightly different specification of item difficulties (for details see . Input arguments required by est_multi_poly_between are very similar to those of function est_multi_poly_within. The main difference is that only one vector of latent variables is involved in the model specification. Consequently, the number of latent classes (input k) is common to all the dimensions and the multidimensional structure of the items is specified through one matrix (input multi), having one row for each dimension. Constraints on model parameters are also possible through a suitable definition of arguments Zth, zth, Zbe, zbe, Zga, and zga, whose functioning is the same as the corresponding arguments in function est_multi_poly_within. The function provides as its main output argument a vector of estimated average weights of the latent classes (output piv) and a matrix of estimated support points for each dimension and each latent class of the latent trait before (output Th) and after the standardization (output Ths). Besides, a matrix of difficulty item parameters (outputs Bec and, in the case of standardization, Becs), a vector of discriminating indices (output gac and, in the case of standardization, gacs), and a matrix of regression coefficients (output De) are provided, other than other output arguments similar to those above described for est_multi_poly_within, included the S3 methods.
In fact, the model corresponding to out2 involves two completely independent latent variables, having incidentally the same number of latent classes: thus, an individual belonging to a specific class (say, class 1) according to the first latent variable may belong to any latent class under the second latent variable (say, class 2). On the contrary, model out1 involves only one latent variable decomposed in two dimensions: thus, belonging to a given latent class under one dimension implies belonging to the same class under the other dimension. Overall, model out2 has k0−1 free parameters more than model out1.

Functions search.model_within and search.model_between
As outlined in Section "Likelihood inference," the selection of a two-tier LC-IRT model may be a quite demanding procedure, requiring the choice of the number of support points for the latent variables and a check for the possible presence of local maxima. Function search.model_within allows us to search for the global maximum of the log-likelihood of a model with a specific formulation (in terms of multidimensional structure, link function, and constraints on the item parameters) given a vector of possible number of latent classes to try for.
In practice, function search.model_within applies function est_multi_poly_within a given number of times for each pair of values for k 1 and k 2 , initializing the estimation algorithm with deterministic and random values of the model parameters and holding, for each pair of k 1 and k 2 , that model with the highest value of the log-likelihood at convergence. To make the entire process computationally less demanding, the search of the global maximum may be performed with a relatively large tolerance level for checking convergence of the estimation algorithm. Then, in order to improve the precision of parameters estimates, the estimates provided by the model with the best value of the log-likelihood are used as starting values in the last step of the model selection process, using an augmented tolerance level. Note that when k 1 = 1 or k 2 = 1 the model estimation is actually performed by the function est_multi_poly_between, which is automatically retrieved by search.model_within.
• kv1 and kv2: vectors of number of latent classes to try for latent variable U i and V i , respectively; single values are also allowed for.
• tol1 and tol2: tolerance levels (default value are 10 −6 and 10 −10 , respectively) for checking convergence of the algorithm as relative difference between consecutive log-likelihoods. The value of tol1 is used for checks based on deterministic and random starting values, whereas the value of tol2 is used for improving the precision of estimates for the model with the best log-likelihood level.
• nrep: constant value that drives the number of estimations of each model with random starting values, given by nrep(k 1 k 2 − 1); the default value for nrep is 2. In the case of nrep equal to 0, only the estimation with deterministic starting values is performed.
Note that if single values for kv1 and kv2 are specified and nrep equals 0, function search.model_within performs just one call of function est_multi_poly_within (or function est_multi_poly_between if kv1 or kv2 equal 1) with option start = "deterministic".
Function search.model_within supplies the following output: • aicv, bicv, and necv: vectors of AIC, BIC, and NEC indices, respectively, for each of the estimated models.
• errv: trace of any error occurred during the estimation process.
• lkv: values of log-likelihood at convergence for each of the estimated models.
• out.single: output of each single model, similar to the output of est_multi_poly_within, with the addition of values of k 1 (output k1); k 2 (output k2); and the sequence of log-likelihoods (output lktrace) for the deterministic start, for each random start, and for the final estimation provided by a tolerance level equal to tol2 (if tol2 > tol1).
Finally, we outline that a function with input and output arguments similar to those of the function search.model_within is available to perform the model selection in the case of between-item multidimensional LC-IRT models. This function is named search.model_between and it relies on est_multi_poly_between.

Examples in R
In the following we illustrate package MLCIRTwithin through two data analysis examples. In Example 1 we describe the model selection procedure, as well as the interpretation of the output, considering a set of ordered items measuring two latent variables. In Example 2, the specification of constraints on the support points and on the item parameters is illustrated through the analysis of data concerning repeated item responses along two time occasions. The detailed software scripts to implement the two examples, named Example1.R and Example2.R, are available in the Supplementary Online Material at https://sites.google.com/site/bartstatistics/sm_mlcirtwithin.zip.

Example 1: analysis of multidimensionality
Data set SF12_nomiss, already provided in the R package MLCIRTwithin, refer to a sample of 493 oncological Italian patients who were asked to fill in the Italian validated Short Form 12 version 2 questionnaire (SF-12; Stewart and Ware, 1992;Ware et al., 2002) concerning the assessment of HQOL. The questionnaire is comprised by 12 items having five ordered response modalities, except items 2 and 3 having only three modalities; a high score means a worse level of HQOL and vice-versa (note that, in the original scoring system, modalities of items 9 and 10 are reversed). Also the age is available for each patient. In the following we show the first few records of the data set and the related summaries.

# For the description of each item see the online documentation ?SF_nomiss
According to the main current literature (see, mainly, Ware et al., 2002), the SF-12 questionnaire may be used to properly evaluate two main aspects of HQOL: physical and emotional. The standard scoring algorithm for summarizing these two latent dimensions is based on an orthogonal factor analysis, on the basis of which positive and negative weights are assigned to each item. More in detail, items 1 to 5 and item 8 have positive weights for physical HQOL and negative weights for emotional HQOL, whereas items 6, 7, 9, 11, and 12 have negative weights for physical HQOL and positive weights for emotional HQOL; item 10 has positive weights for both components. According to this scoring system, the scores of physical and emotional HQOL result by a suitable weighted average of the item responses.
A low score on physical HQOL has the following meaning (Ware and Gandek, 1998): substantial limitations in self-care, physical, social and role activities; severe bodily pain; frequent tiredness; health rated as poor. On the contrary, a high level of physical component corresponds to: no physical limitations, disabilities or decrements in well-being; high energy level; health rated as excellent. As regards the emotional component, a low score implies: frequent psychological distress, social and role disability due to emotional problems; health rated poor. On the other hand, a high level of emotional component corresponds to: frequent positive affect; absence of psychological distress and limitations in usual social activities due to emotional problems; health rated excellent.
In order to select the optimal model, values of BIC index are displayed in the following 36-by-4 matrix, having one row for each model and one column for each multidimensional structure: # BIC indices BIC <-cbind(out1$bicv, out2$bicv, out3$bicv, out4$bicv) colnames(BIC) <-c("Type 1", "Type 2", "Type 3", "Type 4") k1 <-rep(1:6, times = 1, each = 6) k2 <-rep(1:6, times = 6) BIC <-cbind(k1, k2, BIC) BIC On the basis of the above matrix we observe that the minimum value of the BIC index, that is, 13196.32, is displayed in column 2, denoting the multidimensional structure of type 2, and row 28, which refers to models with k 1 = 5 and k 2 = 4 latent classes. The output of the selected model is contained in the object out2$out.single [[28]]. We also observe that the same number of support points is selected for the other multidimensional structures with shared items (i.e., structures of types 3 and 4). [1] 28 After selecting the number of latent classes and the type of within-item multidimensional structure, we estimate again the selected model out with option out_se = T for the computation of standard errors. To reduce the computational time, we use function est_multi_poly_within with option start = "external" and specified as input for options Phi, ga1c, ga2c, De1, and De2 the corresponding output from the selected model.
### Display output of the estimated model # summary(out) # coef(out) # confint(out) # Estimates of support points and average mass probabilities for physical HQOL lv1 <-rbind(out$Th1, t(out$piv1)) rownames(lv1) <-c("Physical HQOL", "Average prob.") round(lv1, 3) According to the estimated model, patients are clustered in 5 latent classes denoting different levels of physical HQOL (output out$Th1) and in 4 latent classes denoting different levels of emotional HQOL (output out$Th2). To simplify the interpretation of the latent classes, it is useful to re-order and standardize the estimated support points. For this aim, function est_multi_poly_within provides the values of support points, which are standardized according to equations (14)-(18) and re-ordered according to increasing values of the first dimension of the corresponding latent variable (outputs out$Th1s and out$Th2s with related weights out$piv1s and out$piv2s, respectively).
# Standardized support points of physical HQOL lv1s <-rbind(out$Th1s, t(out$piv1s)) rownames(lv1s) <-c("Stand. Physical HQOL", "Average prob.") round(lv1s, 3) We observe that classes 5, 4, and 1 of physical HQOL and classes 1 and 4 of emotional HQOL collect patients with negative levels of the related latent trait, whereas patients with levels of HQOL above the mean belong to the remaining classes (i.e., classes 2 and 3 for physical HQOL and classes 3 and 2 for emotional HQOL). We also observe that the distribution of physical HQOL is strongly skewed to negative values, whereas that of emotional HQOL is symmetric (Figure 2).  As concerns item parameters, the discriminating indices and the corresponding standard errors and confidence intervals are displayed as follows: # Item discriminating parameters and related standard errors gamma1 <-cbind(out$ga1c, out$sega1c) colnames(gamma1) <-c("gamma1", "st.err.") round ( In the previous output, the first two tables show the estimates of item discriminating parametersγ 1j andγ 2j , respectively, and the related standard errors, whereas the last table shows the corresponding inferior and superior limits of the confidence intervals at 95% level. Entries denoted by NA refer to those items that do not load on the corresponding latent variable. We observe that discriminating parameters are generally highly significant, with two notable exceptions: discriminating indices for items 9 and 11 referred to physical HQOL (i.e., parameters γ 19 and γ 1,11 ). This result suggests that these two items do not contribute in a significant way to the measurement of physical HQOL.
We also highlight relevant differences among items in terms of difficulty, as results by the related standardized item parameters:  11 -5.514 -2.704 0.827 3.667 12 -6.066 -3.124 0.258 2.707 In particular, positive responses to items related only (items 4 and 5) or mainly (item 1) to physical HQOL require a level of the latent trait in average higher than items related only (items 6 and 7) or mainly (items 9, 11, and 12) to emotional HQOL. In other words, the emotional component of HQOL interferes less than physical component of HQOL on work, daily activities, and social life.

Example 2: analysis of repeated item responses
An interesting example of a real problem that can be solved through the discrete two-tier models is encountered when a questionnaire is used on multiple waves to measure the same latent variable at each time occasion. In such a situation, following Cai (2010) we distinguish a "specific" dimension U id 1 for each item within the questionnaire to capture the dependence between an individual's responses to the same item d 1 (d 1 = 1, . . . , D 1 , D 1 = r) at the different occasions, and we introduce one "primary" dimension V id 2 for each wave d 2 (d 2 = 1, . . . , D 2 ), representing the latent variable of interest at a given occasion. Through a suitable choice of identifiability and equality constraints on the item parameters, it is possible to estimate the support points of V i and, then, obtain a class-specific time trajectory of the latent variable measured by the questionnaire. The resulting multidimensional structure is illustrated in Figure 3 for the case D 2 = 2. To illustrate specification and estimation of the discrete two-tier model for longitudinal data we refer to a simulated data set about 10 different types of crime committed by a cohort of 10,000 hypothetical subjects on 6 waves; each crime corresponds to a binary item, which assumes value 1 if the crime was committed and 0 otherwise. The latent variable of interest denotes the propensity to commit a crime at each time occasion. The data set is available from the R package LMest (Bartolucci and Pandolfi, 2016).

library(MLCIRTwithin)
### Load and prepare data library(LMest) data(data_criminal_sim) data_criminal_sim [1:12,] id sex time y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 In order to simplify the illustration of the application, we keep the first two waves and five types of crime, as follows: Moreover, in order to estimate the model at issue through the R package MLCIRTwithin, we need to reshape the data from "long" to "wide" format; we also aggregate records corresponding to the same patterns. Note that, after reshaping the data set, the total amount of items is 10, that is, 5 items observed at time 1 (j = 1, . . . , 5) and the same 5 items observed again at time 2 (j = 6, . . . , 10).
### Define the multidimensional structure multi1 <-matrix(0, nrow=5, ncol=2) multi2 <-matrix ( # Number of latent classes k1 <-2 k2 <-2 The next step of the model specification consists in fixing some support points to identify the model and adding equality constraints on difficulties and discriminating indices to properly account for the longitudinal data structure, similarly to Cai (2010). In more detail, we fix the support points of the latent variable U i , that is, u = (−1, −1, −1, −1, −1, 1, 1, 1, 1, 1) , and on the first dimension of the latent variable V i , whereas the support points on the second dimension of V i are freely estimated, that is, v = (−1, v 12 , 1, v 22 ) : in such a way, we estimate how the propensity to commit a crime changes over the time. Moreover, we constrain difficulty and discriminating parameters of each item at time 1 to be equal to the parameters of the same item at time 2, that is, β j = β 5+j , γ 1j = γ 1,5+j , γ 2j = γ 2,5+j , with j = 1, . . . , 5. (zth1 <-c(rep(-1, times = nrow(multi1)), rep(1, times = nrow(multi1)))) [1] -1 -1 -1 -1 -1 1 1 1 1 1 The final step consists in estimating the model according to the structure specified above. The main difference with respect to the model presented in the previous section (Example 1) is that now all items are shared by both the latent variables U i and V i and suitable constraints on the model parameters are introduced through arguments Zth1, zth1, Zth2, zth2, Zbe, Zga1, and Zga2, according to equations (7)- (11).
out <-est_multi_poly_within(S = S, yv = yv, k1 = k1, k2 = k2, X = X, link = "global", disc = TRUE, multi1 = multi1, multi2 = multi2, disp = TRUE, output = TRUE, out_se = TRUE, Zth1 = Zth1, zth1 = zth1, Zth2 = Zth2, zth2 = zth2, Zbe = Zbe, Zga1 = Zga1, Zga2 = Zga2) Estimates of support points and item parameters may be displayed through the usual methods  We observe that 91.8% of the individuals belong to class 1, which is characterized by the smallest propensity to commit a crime, whereas the remaining 8.24% of individuals are allocated in class 2. Both classes present a tendency to increase the propensity to commit a crime from time 1 to time 2. Indeed, the estimated support points of the propensity to commit a crime at time 2 (i.e., estimates of v 12 and v 22 ) are higher than the corresponding values at time 1, for both the latent classes (−0.693 vs −1 for class 1 and 1.679 vs 1 for class 2). Moreover, the conditional probabilities of observing a given crime is higher for items observed at time 2, that is, items 6 to 10, with respect to the same items observed at time 1, that is, items 1 to 5, mainly in the case of latent class 2. For instance, for an individual belonging to class 2 the conditional probability of observing a crime of type 1 equals 10.3% at time 1 and it increases to 24.2% at time 2.

Summary
In this paper we illustrate the R package MLCIRTwithin, whose main function est_multi_poly_within implements an Expectation Maximization based approach to estimate the parameters of two-tier latent class item response theory (IRT) models. This class of models, which extends in a flexible way the class of basic IRT models, is based on two independent vectors of latent variables. Moreover, two main assumptions hold: (i) items are allowed to measure one or two latent variables (within-item multidimensionality) and (ii) latent variables are assumed to have a discrete distribution with a finite number of support points, which identify homogeneous latent classes of individuals, and related mass probabilities that may depend on individual covariates.
We illustrate the R package through two examples based on data about the measurement of healthrelated quality of life in cancer patients (Example 1) and about the measurement of the propensity to commit a crime in two time occasions (Example 2). The first example investigates the multidimensional structure of the questionnaire and is focused on the interpretation of the estimated model parameters. The second example illustrates how to treat longitudinal item responses through the specification of suitable constraints on support points and item parameters.