Item response theory (IRT) models for unfolding processes use the responses of individuals to attitudinal tests or questionnaires in order to infer item and person parameters located on a latent continuum. Parametric models in this class use parametric functions to model the response process, which in practice can be restrictive. MUDFOLD (Multiple UniDimensional unFOLDing) can be used to obtain estimates of person and item ranks without imposing strict parametric assumptions on the item response functions (IRFs). This paper describes the implementation of the MUDFOLD method for binary preferential-choice data in the R package mudfold. The latter incorporates estimation, visualization, and simulation methods in order to provide R users with utilities for nonparametric analysis of attitudinal questionnaire data. After a brief introduction to IRT, we describe the methodological framework implemented in the package. A description of the available functions is followed by practical examples and suggestions on how this method can be used even outside the field of psychometrics.
In this paper we introduce the R package mudfold (Balafas et al. 2019), which implements MUDFOLD, a nonparametric IRT model for unfolding processes. The latter was developed by Van Schuur (1984) and later extended by Post (1992) and Post and Snijders (1993). IRT models have been designed to measure mental properties, also called latent traits. These models are used in the statistical analysis of categorical data obtained from the direct responses of individuals to tests and questionnaires. Two response processes that result in different classes of IRT models can be distinguished. The cumulative (also called monotone) processes and the unfolding (also called proximity) processes differ in the way that they model the probability of a positive response to a question as a function of the latent trait, which is termed the item response function (IRF).
Cumulative IRT models, also known as Rasch models (Rasch 1961), assume that the IRF is a monotonically increasing function. That is, the higher the latent trait value for a person, the higher the probability of a positive response to an item (Sijtsma and Junker 2006). This assumption makes cumulative models suitable for testing purposes, where latent traits such as knowledge or abilities need to be measured. Unfolding models consider nonmonotone IRFs. These models originate from the work of Thurstone (1927, 1928) and were formalized by Coombs (1964) in his deterministic unfolding model. In unfolding IRT the IRF is assumed to be a unimodal (single ‘peak’) function of the distance between the person and item locations on a hypothesized latent continuum. Unimodal IRFs imply that the closer an individual is located to an item, the more likely it is that he or she responds positively to this item (Hoijtink 2005). Unfolding models can be used when one is interested in measuring bipolar latent traits such as preferences, choices, or political ideology, which are generally termed attitudes (Andrich 1997). When such latent traits are analyzed using monotone IRT models, the analysis usually results in a multidimensional solution. In this sense, unfolding models are more general than cumulative IRT models (Stark et al. 2006; Chernyshenko et al. 2007) and can be seen as a form of quadratic factor analysis (Maraun and Rossi 2001).
Parametric IRT (PIRT) models for unfolding processes exist for dichotomous items (Hoijtink 1991; Andrich and Luo 1993; Maydeu-Olivares et al. 2006), polytomous items (Roberts and Laughlin 1996; Luo 2001), as well as for bounded continuously scored items (Noel 2014). Typically, estimation in PIRT models exploits maximum likelihood methods such as marginal likelihood (e.g. Roberts et al. 2000) or joint likelihood (e.g. Luo et al. 1998), which are optimized using expectation-maximization (EM) or Newton-type algorithms. Unfolding PIRT models that infer model parameters by adopting Bayesian Markov chain Monte Carlo (MCMC) algorithms (Johnson and Junker 2003; Roberts and Thompson 2011; Lee et al. 2019; Liu and Wang 2019) are also available. PIRT models, however, make explicit parametric assumptions about the IRFs, which in practice can restrict measurement by eliminating items with different functional properties.
Nonparametric IRT (NIRT) models do not assume any parametric form for the IRFs but instead introduce order restrictions (Sijtsma 2005). These models have been used to construct or evaluate scales that measure, among others, internet gaming disorder (Finserås et al. 2019), pedal sensory loss (Rinkel et al. 2019), partisan political preferences (Hänggli 2020), and relative exposure to soft versus hard news (Boukes and Boomgaarden 2015). The first NIRT model was proposed by Mokken (1971) for monotone processes. His ideas were used for the unfolding paradigm by Van Schuur (1984), who designed MUDFOLD as the unfolding variant of Mokken’s model. MUDFOLD was extended to polytomous items by Van Schuur (1992), and Post (1992) and Post and Snijders (1993) derived testable properties for nonparametric unfolding models that were adopted in MUDFOLD. Usually, NIRT methods employ heuristic item selection algorithms that first rank the items on the latent scale and then use these ranks to estimate individual locations on the latent continuum. Such estimates for individuals’ ideal points in unfolding NIRT have been introduced by Van Schuur (1988) and later by Johnson (2006). NIRT approaches can be used for exploratory purposes, preliminary to PIRT models, or in cases where parametric functions do not fit the data.
IRT models can be fitted by means of psychometric software implemented in R (Choi and Asilkalkan 2019), which can be downloaded from the Comprehensive R Archive Network (CRAN). However, to the best of our knowledge, no R package has been available for nonparametric IRT analysis of unfolding processes. In order to fill this gap, we have developed the R package mudfold.
The main function of the package implements the item selection algorithm of Van Schuur (1984) for scaling the items on a unidimensional scale. Scale quality is assessed using several diagnostics, such as scalability coefficients similar to the homogeneity coefficients of Loevinger (1948), statistics proposed by Post (1992), and newly developed tests. Uncertainty in the goodness-of-fit measures is quantified using the nonparametric bootstrap (Efron et al. 1979) from the R package boot (Canty and Ripley 2017).
Missing values can be treated using multiple multivariate imputation by
chained equations (MICE, Buuren et al. 2006), which is implemented in
the R package mice (van Buuren and Groothuis-Oudshoorn 2011).
Estimates for the person locations derived from Van Schuur (1988) and
Johnson (2006) are available to the user of the package.
Generally, the MUDFOLD algorithm is suitable for studies where there are no restrictions on the number of items that a person can “pick". Besides such pick-any-out-of-N data, data from other response formats can be dichotomized and analyzed as well (see the pick() function described below).
Consider a sample of persons who respond to a set of dichotomously scored items that are intended to measure a unidimensional latent trait. Subsequently, we can define the IRF for an item as the probability of a positive response to that item, viewed as a function of the latent trait value.
In unidimensional IRT models, unidimensionality of the latent trait and local independence of the responses are common assumptions. However, the usual assumption of monotonicity that we meet in cumulative IRT models needs modification in unfolding IRT, where unimodal IRFs are considered. For obtaining diagnostic properties for the nonparametric unfolding model, Post and Snijders (1993) proposed two additional assumptions for the IRFs. The assumptions of the nonparametric unfolding model are:
Assumption A1 implies that there exists only one latent trait that explains the responses of the persons to the items. Assumption A2 is mathematically convenient since it reduces the likelihood to a simple product, and it implies that, given the latent trait value, no information on the other items is relevant for predicting the response to a particular item. The next assumption concerns the conditional distribution of each item given the latent trait. The unimodality assumption described in A3 restricts the IRFs to have a single-peak shape without imposing any explicit functional form. If A3 holds for all the IRFs, then we can order the items on the unidimensional continuum based on their location parameters. Additionally, two assumptions (A4 and A5) are needed about the individuals' response probabilities; these concern stochastic ordering properties that yield testable consequences for the observed data (Post and Snijders 1993).
PIRT approaches use well-defined IRFs that explicitly parametrize persons and items on a known parameter space. Estimates of the parameters can be obtained using suitable frequentist or Bayesian methods, and the fit of the model to the data is assessed using goodness-of-fit indices. By contrast, in NIRT modelling the functional form of the IRF is unknown and alternative estimation methods are needed (Mokken 1997).
Models in the NIRT class typically employ item selection algorithms that construct ordinal measurement scales for persons by iteratively maximizing some scalability measure over the items. The resulting scales are then used to locate the individuals on the latent continuum based on their responses. Usually, these item selection algorithms are bottom-up methods that are divided into two parts. In the first part the algorithm seeks the best minimal scale, that is, a minimal set of items that meets certain scalability requirements. The best minimal scale is the starting point for the second part of the scaling procedure, where it is extended iteratively by adding in each step the item that best fulfills the prespecified scalability criteria.
As in other NIRT models, MUDFOLD adopts a two-step item selection algorithm that identifies the unique rank order for a maximal (sub)set of items. In this algorithm, scalability coefficients analogous to the ones defined by Mokken (1971) are used as goodness-of-fit measures. Mokken’s coefficients are similar to the H coefficients proposed by Loevinger (1948), which were defined on the basis of violation probabilities of the deterministic cumulative model (see Guttman 1944) for ordered item pairs. Along the same lines, the scalability coefficients in MUDFOLD are defined on the basis of violation probabilities of the deterministic unfolding model of Coombs (1964) for triples of items. MUDFOLD’s scalability coefficients compare, for every ordered triple of items, the number of errors observed in the data (i.e. the number of response patterns that are not allowed by the deterministic unfolding model for that ordering) with the number of errors expected when the responses to the three items are statistically independent.
Observed errors (O) in an ordered triple of items are the number of respondents who give a positive response to the two outer items but not to the middle item, since this response pattern violates the deterministic unfolding model for that ordering.
Expected errors (EO) in an ordered item triple are the number of such error patterns that would be expected if the responses to the three items were statistically independent, estimated from the observed item marginals.
Scalability coefficient (H) for any ordered item triple is then defined as one minus the ratio of observed to expected errors, H = 1 - O/EO.
Given that the observed errors are nonnegative and the expected errors are positive, the H coefficient can be at most equal to one. Perfect fit of the scale to the data yields a scalability coefficient value of 1, since in that case no errors are observed.
We will demonstrate how the H coefficients for triples are calculated using the dataset ANDRICH that comes with the mudfold package in R data format. The dataset contains the binary responses of a sample of individuals to eight statements expressing attitudes towards capital punishment. We can install and subsequently load the package and the data into the R environment.
## Install and load the mudfold package and the ANDRICH data
install.packages("mudfold")
library(mudfold)
data("ANDRICH")
N <- ncol(ANDRICH) # number of items
n <- nrow(ANDRICH) # number of persons
item_names <- colnames(ANDRICH) # item names
Functions for calculating the observed errors, expected errors, and H
coefficients for each possible item triple are available internally in
the mudfold package. These functions can be accessed by the :::
operator. For the ANDRICH
data the H coefficients for triples can be
calculated as follows.
experr <- mudfold:::Err_exp(ANDRICH) # errors expected
obserr <- mudfold:::Err_obs(ANDRICH) # errors observed
hcoeft <- 1 - (obserr / experr) # H coefficients
Generally, for N items there exist N(N-1)(N-2) ordered triples of distinct items. In the ANDRICH data example, the scalability coefficients for the item permutations of length three are stored in a three-way array with dimension 8 x 8 x 8, whose dimnames are the item names. For instance, consider the ordered triple (HIDEOUS, DONTBELIEV, DETERRENT) and its reverse.
triple_HDODE <- matrix(c("HIDEOUS", "DONTBELIEV", "DETERRENT"), ncol = 3)
triple_DEDOH <- matrix(rev(triple_HDODE), ncol = 3)
If we compare the H coefficients of these two (symmetric) triples we will see that they coincide.
## Compare H coefficients
hcoeft[triple_HDODE] == hcoeft[triple_DEDOH]
This symmetry holds because reversing an ordered triple leaves its observed and expected errors, and hence its H coefficient, unchanged.
In the first step of the item selection algorithm, a search is conducted in order to find the best triple of items. A lower bound on the scalability coefficient of this triple can be set by the user through the argument lambda1 of the mudfold() function. In its second step, the item selection algorithm extends the best elementary scale repeatedly until no more items fulfill its scalability criteria. A second threshold, set through the argument lambda2, is used in the scalability criteria of this scale extension step.
The search for the optimal item triple in the first step requires the calculation of the scalability coefficients for every possible permutation of length three, i.e. for every ordered triple of items.
Among the set of all permutations of length three we seek to find those that fulfill certain scalability criteria, and we call this set of permutations unique triples. Unique triples form a finite set containing all item triples for which exactly one of the possible orderings shows a positive scalability coefficient while the remaining orderings show negative coefficients.
If the set of unique triples is empty, the algorithm stops automatically without proceeding to the second step. The same holds in the case in which unique triples exist but their scalability coefficients are lower than the bound specified by the user.
Here we describe how the main function of the mudfold package searches for the best minimal unfolding scale in the first step of the implemented algorithm. After we have calculated the observed errors, the expected errors, and the scalability coefficients for each triple of items in the ANDRICH dataset, we need to determine the optimal triple for the first step of MUDFOLD’s item selection algorithm. The triples of items in the order (h, l, k) for the ANDRICH data can be obtained with the combinations() function from the R package gtools (Warnes et al. 2015). These combinations are then permuted twice to yield the orderings (h, k, l) and (l, h, k).
## Install and load the library "gtools"
install.packages("gtools")
library(gtools)
## Obtain item permutations (h,l,k), (h,k,l), and (l,h,k)
perm1 <- combinations(N, 3, item_names, set = FALSE)
perm2 <- perm1[, c(1,3,2)]
perm3 <- perm1[, c(2,1,3)]
The set of unique triples can then be obtained.
## Find the set of unique triples.
unq <- rbind(perm1[(hcoeft[perm1] > 0 & hcoeft[perm2] < 0 & hcoeft[perm3] < 0), ],
perm2[(hcoeft[perm1] < 0 & hcoeft[perm2] > 0 & hcoeft[perm3] < 0), ],
perm3[(hcoeft[perm1] < 0 & hcoeft[perm2] < 0 & hcoeft[perm3] > 0), ])
The set of unique triples in the ANDRICH data example contains sixteen item triples. With the command hcoeft[unq] we can inspect their scalability coefficients and see that all except one of the triples show a high H coefficient.
Given the best unique triple obtained in the first step of the algorithm, in the second step of the item selection process the algorithm repeatedly investigates whether any of the remaining items can be added to the scale. For a scale consisting of m items, a new item can be inserted at any of m + 1 possible positions, and every insertion yields a candidate scale whose scalability properties are assessed. In order to determine which item is added, and at which position, the candidate scales are evaluated against a set of scalability criteria. All the item triples of a candidate scale that contain the new item must fit the unfolding model in the order implied by that scale. If more than one item fulfills the first criterion, then the item with the minimum number of possible scale positions is chosen. The scalability coefficient of the new item in the candidate scale must, in addition, exceed the user-specified lower bound. It can be the case that more than one scale fulfills these criteria. In such instances, the algorithm continues by choosing the scale that includes the most uniquely represented item and shows the minimum number of expected errors. The scale extension process continues as long as the scalability criteria described above are fulfilled.
For the ANDRICH data, after the first step of the item selection process where we obtained the best unique triple, the remaining five items can still be added to the scale.
BestUnique <- unq[which.max(hcoeft[unq]), ] # Best unique triple
ALLitems <- colnames(ANDRICH)
Remaining <- ALLitems[!ALLitems %in% BestUnique] # Remaining items
Next, an iterative procedure needs to be defined for the second, scale extension step of the MUDFOLD algorithm. Adding one item in each repetition implies that a maximum of five iterations is needed for the ANDRICH data.
For example, in the first iteration of the scale extension step for the
ANDRICH
dataset, all the scales that need to be assessed can be
constructed as follows. First we need to consider all the possible
positions where a new item can be added. The possible positions depend
on the length of the existing scale. At this point, since the scale
consists of three items there exist four possible positions where a new
item can be added.
## Create indices to be used in constructing scales
lb <- length(BestUnique) # length of best unique triple
lr <- length(Remaining) # number of remaining items to add in the scale
## create all possible positions where each new item from Remaining
## can be added in the scale
index_rep <- rep(seq(1, (lb+1)) ,lr) - 1 # possible positions
index_irep <- rep(Remaining, each = lb+1) # item for each position
After we define all the possible positions for new items, each item is added in every position and results in a different scale to be assessed.
## Create all possible scales by adding each item in Remaining
## to every possible position of BestUnique
ALLscales <- lapply(1:length(index_rep),
function(i) append(BestUnique, index_irep[i], after = index_rep[i] ))
Each of these scales will be judged in terms of its scalability
properties. For instance, let us consider the first scale that is
constructed in the first iteration of the scale extension step in the
ANDRICH
data.
Examplescale <- ALLscales[[1]]
Examplescale
# "HIDEOUS" "INEFFECTIV" "DONTBELIEV" "DETERRENT"
This scale has been constructed after inserting the item HIDEOUS at the first position of the best elementary scale. All ordered triples of items in this candidate scale can be obtained with the combinations() function.
les <- length(Examplescale)
ExamplescaleTRIPLES <- combinations(n = les, r = 3, v = Examplescale, set = FALSE)
From the four triples in total, only the first three contain the new item HIDEOUS; the fourth is the best elementary triple itself. Their H coefficients can be inspected with
hcoeft[ExamplescaleTRIPLES[1:3, ]]
From these coefficients we can see whether the candidate scale satisfies the scalability criteria for the new item.
The second scale assessment determines which scale or scales contain the item that is the most “uniquely” represented. Let us assume that the number of scales that fulfill the first criterion is six. Moreover, assume that five out of these six scales contain the same new item in five different positions, whereas the sixth scale contains a different new item that fits in only one position. The latter item appears in the fewest candidate scales and is therefore the most uniquely represented.
The scales that contain the least frequently observed item are checked
according to a third criterion. The third and last criterion in the
iterative scale extension phase concerns the scalability properties of
the new item. The scale that contains the new item with the highest item
scalability coefficient will be chosen as the best MUDFOLD scale if and only if this coefficient exceeds the user-specified lower bound.
In the ANDRICH example the algorithm completes five iterations in the second step, which means that all the items are included in the MUDFOLD scale. The latter consists of eight items and shows a scale scalability coefficient that indicates a good fit.
After a MUDFOLD scale with a good fit is obtained, one can assess its unfolding quality. This is done by scale diagnostics described by Post (1992) and Post and Snijders (1993). These diagnostics are based on sample proportions from which the unimodality assumption of the scale is evaluated and nonparametric estimates of the item response functions are obtained.
In this section, we discuss diagnostics implemented in the mudfold package, which can be used to assess whether an obtained scale satisfies the assumptions of the nonparametric unfolding model. The assumption of local independence (A2) is evaluated with the help of regularized logistic regression models that are fitted with the cross-validation function cv.glmnet(), which is available in the package glmnet (Friedman et al. 2010).
The condition A3 required by MUDFOLD is the assumption of unimodality of the IRFs, which are unknown nonlinear functions of the latent trait. In order to obtain estimates of these functions, we use a nonlinear generalized additive model (GAM, Wood 2011) that is implemented in the R package mgcv (Wood 2017). Specifically, for each item the probability of a positive response is modelled as a smooth function (a thin plate regression spline) of the estimated person locations, and the resulting curve is inspected for unimodality.
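As an illustration of this idea, the following minimal sketch fits such a binomial GAM with a thin plate spline basis (k = 4) for a single ANDRICH item. The crude person score used here (the mean column position of the endorsed items) is only a stand-in for the package's nonparametric person estimates, so this is not the package's internal implementation.
## Hedged sketch: estimate one IRF with a binomial GAM (mgcv).
## The person score below is a simple stand-in for the package's person estimates;
## persons who endorse no items get NaN and are dropped by gam().
library(mgcv)
score <- apply(ANDRICH, 1, function(r) mean(which(r == 1)))  # crude person score
y <- ANDRICH[, "DETERRENT"]                                  # responses to one item
irf_gam <- gam(y ~ s(score, k = 4), family = binomial)       # thin plate spline, k = 4
plot(irf_gam, shift = coef(irf_gam)[1], trans = plogis,      # fitted IRF on the
     shade = TRUE, xlab = "person score",                    # probability scale
     ylab = "P(positive response)")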
For the assumptions A4-A5, diagnostic statistics that quantify the extent to which the scale agrees with these assumptions have been proposed by Post (1992). These statistics are based on conditional IRF probabilities, which are estimated by their corresponding sample proportions and collected into a matrix that is called the conditional adjacency matrix (CAM).
The CAM can be calculated in the mudfold package with the function CAM(), which takes as input either a fitted MUDFOLD object or a dataset with the complete responses of the persons to the items. In the ANDRICH dataset example, the CAM of the original data can be calculated using the command CAM(ANDRICH).
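To make the conditional proportions behind the CAM concrete, the following sketch computes them by hand, under the assumption that entry (j, k) of the CAM is the proportion of respondents endorsing row item j among those who endorsed column item k; the result can be compared with the output of CAM(ANDRICH).
## Hedged sketch of the CAM as conditional proportions:
## assumed definition: cam[j, k] = P(item j endorsed | item k endorsed)
X <- as.matrix(ANDRICH)
cam_manual <- sweep(crossprod(X), 2, colSums(X), "/")  # joint counts / column margins
diag(cam_manual) <- NA                                 # the diagonal is not used
round(cam_manual, 2)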
Each row of the CAM is regarded as an empirical estimate of the corresponding IRF. Hence, if the ordering of the items is correct, and if assumptions A1 to A5 hold, then (i) the observed maxima of the different rows of the CAM are expected to appear around the principal diagonal (moving maxima property), and (ii) the rows of the CAM are expected to show a weakly unimodal pattern. One can potentially evaluate the unfolding model by checking how strongly the observed row patterns of the CAM deviate from the expected patterns described above.
The moving maxima property of the CAM corresponds to condition A4,
which assumes stochastic ordering of the items by their location
parameters on the latent continuum.
Calculation of the MAX can be done in two ways, namely with a top-down or a bottom-up method. The sum of the MAX values over all rows yields the total MAX statistic of the scale.
The MAX statistic can be calculated using the function MAX() from the R package mudfold, which takes as input either a fitted MUDFOLD object obtained from the main mudfold() function, or an object of class "cam.mdf" calculated from the function CAM(). The argument 'type' of the MAX() function controls whether the MAX for the items or for the whole scale will be returned to the user. Visual inspection of the observed maxima pattern can also be useful: if the maximum values of the CAM rows are close to the diagonal, the data are in agreement with the unfolding model. The diagnostics() function will return and plot a matrix with a star at the maximum of each CAM row for visual inspection of their distribution.
In order to quantify whether the rows of the CAM show a weakly unimodal pattern, the iso statistic (proposed by I. Molenaar, personal communication) was introduced. The iso statistic (ISO) is a measure for the degree of unimodality violation in the rows of the CAM. ISO can be obtained for each item separately as well as for the scale as a whole, where the scale value is the sum of the item values. To come up with an ISO value for an item, the deviations from a weakly unimodal pattern in the corresponding row of the CAM are accumulated. The user can calculate the ISO statistic using the function ISO(), which takes as input output either from the mudfold() function or from the function CAM(), and returns a vector with the item ISO values, or the total ISO of the scale when type = 'scale'.
All the diagnostic tests discussed in this section are implemented in
the function diagnostics()
of the mudfold package. The function
diagnostics()
can be used with fitted objects from the main
mudfold()
function.
Since the sampling distributions of MUDFOLD's goodness-of-fit and diagnostic statistics are non-standard, calculating their standard errors is not straightforward. Instead, for providing uncertainty estimates of the MUDFOLD statistics both at the item and the scale level, nonparametric bootstrap is used (Efron et al. 1979). Bootstrap is a resampling technique that can be used for assessing uncertainty in instances when statistical inference is based on complex procedures. With bootstrapping we resample persons with replacement from the observed data a large number of times and recompute the statistics of interest in each bootstrap sample.
Given a MUDFOLD scale, the bootstrap replicates are generated with the function boot() from the R package boot (Canty and Ripley 2017). Using the boot package allows the user of the mudfold package to obtain different types of confidence intervals for assessing uncertainty using the function boot.ci().
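The following minimal sketch shows the bootstrap principle for a single statistic, namely the H coefficient of one ordered triple; it is not the package's internal bootstrap, which resamples the complete scaling procedure.
## Hedged sketch: bootstrap the H coefficient of one ordered triple by
## resampling persons and recomputing the coefficient in each replicate.
library(boot)
triple <- matrix(c("HIDEOUS", "DONTBELIEV", "DETERRENT"), ncol = 3)
H_triple <- function(data, idx) {
  d <- data[idx, , drop = FALSE]                          # resampled persons
  H <- 1 - mudfold:::Err_obs(d) / mudfold:::Err_exp(d)    # H for all triples
  H[triple]                                               # H of the chosen triple
}
set.seed(1)
bootH <- boot(ANDRICH, statistic = H_triple, R = 100)
boot.ci(bootH, type = "perc")                             # percentile interval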
In addition to the uncertainty estimates, a bootstrap estimate of the unfolding scale can also be calculated. This estimate corresponds to the most frequently obtained MUDFOLD scale across the bootstrap iterations.
With MUDFOLD, after obtaining an item ordering (scale) that consists of a (sub)set of the items, nonparametric estimates of the person locations on the latent continuum can be obtained. Originally, the Thurstone estimator was used for this purpose; it estimates a person's location by averaging the locations of the items that the person endorses. Van Schuur's person parameter estimator uses the item ranks obtained from MUDFOLD's item selection algorithm as estimates for the vector of item locations.
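As a rough illustration of the rank-based idea (not necessarily the exact definition of Van Schuur's estimator), a person's location can be summarized by the average rank of the items he or she endorses; the estimates actually used by the package are available through the coef() method described later.
## Illustrative rank-based ideal points: mean rank of the endorsed items.
## The column order of ANDRICH is used here as a stand-in for the MUDFOLD item ranks.
item_rank <- setNames(seq_len(ncol(ANDRICH)), colnames(ANDRICH))
theta_rank <- apply(ANDRICH, 1, function(r) mean(item_rank[r == 1]))
head(theta_rank)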
Missing data occur when intended responses from one or multiple persons are not provided. Handling missing values is critical since ignoring them can bias inferences or lead to wrong conclusions. One way to proceed is to ignore the missing observations by applying list-wise deletion. This, however, can lead to a great loss of information, especially if the number of missing values is large. The other approach is to replace the missing values with plausible values, which is called imputation.
In the case of random missing value mechanisms such as missing
completely at random (MCAR) and missing at random (MAR)
(Rubin 1976; Little and Rubin 1987), different approaches can
be used in order to impute the missing observations. Imputation within
IRT is in general associated with more accurate estimates of item
location and discrimination parameters under several missing data
generating mechanisms (Sulis and Porcu 2017). In the package mudfold missing
values can be imputed using the logistic regression version of multiple
multivariate imputation by chained equations (MICE). The latter is
available from the R package mice. MICE imputation within mudfold can be used on its own or in combination with the bootstrap uncertainty estimates. In the latter case, each bootstrap sample is imputed before fitting a MUDFOLD scale, while in the former the data are imputed nmice times before the analysis.
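For users who prefer to impute outside the main function, a hedged sketch of the same idea is given below; data_with_NA is a hypothetical incomplete binary dataset, and the package achieves the equivalent internally through missings = "impute" and nmice.
## Hedged sketch: logistic-regression MICE followed by MUDFOLD on one completed dataset.
## data_with_NA is a hypothetical data.frame of 0/1 responses with missing values.
library(mice)
imp <- mice(data_with_NA, m = 5, method = "logreg", printFlag = FALSE)
completed <- complete(imp, 1)      # first completed dataset
fit_imp <- mudfold(completed)      # MUDFOLD scale on the imputed data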
The R package mudfold contains a collection of functions related to the MUDFOLD item selection algorithm. In the following we describe the functionality of the package; the ANDRICH dataset is used for demonstration purposes.
mudfold()
and as.mudfold()
The main function of this package, called mudfold()
, fits Van Schuur’s
item selection algorithm to binary data in order to obtain a
unidimensional ordinal scale for the persons. The mudfold()
function
can be called with,
mudfold(data, estimation, lambda1, lambda2, start.scale,
nboot, missings, nmice, seed, mincor, ...)
The function has ten main arguments, of which only the first is obligatory. These are:
data: The input data, i.e. a data.frame or matrix with persons in the rows and items in the columns. It contains the binary responses of the persons to the items.
estimation
: This argument handles the nonparametric estimation of the
person parameters. The default, estimation = "rank"
uses a rank based
estimator (Van Schuur 1988). Alternatively, person parameters are
obtained by a quantile estimator (Johnson 2006), which is
accessible by setting estimation = "quantile"
.
lambda1: The lower bound for the scalability coefficient of the best elementary triple in the first step of the item selection algorithm.
lambda2: The lower bound used in the scalability criteria of the second, scale extension step of the algorithm.
start.scale
: The user can pass to this argument a character vector of
length greater than or equal to three, containing ordered item names
from colnames(data)
that are used as the best elementary scale for the
second step of the item selection algorithm. If start.scale = NULL
(default), the first step of the item selection algorithm determines the
best elementary triple of items that is extended in the second step.
nboot
: Argument that controls the number of bootstrap iterations. If
nboot = NULL
(default) no bootstrap is applied.
missings
: Argument that controls treatment of missing values. If
missings = "omit"
(default) list-wise deletion is applied to data
.
If missings = "impute"
then the mice function is applied to data
in
order to impute the missings nmice
times.
nmice: Argument that controls the number of mice imputations (this argument is used only when missings = "impute" and nboot = NULL).
seed
: Argument that is used for reproducibility of bootstrap results.
mincor
: This can be scalar, numeric vector (of size ncol(data)
) or
numeric matrix (square, of size ncol(data)) specifying the minimum
threshold(s) against which the absolute correlation in the data is
compared. See ?mice:::quickpred
for more details. To be used when mice
becomes problematic due to co-linear terms.
...
: Additional arguments to be passed into the boot()
function
(see ?boot
in R ).
The function mudfold()
internally has four main steps. A data checking
step, the first step of the item selection process, the second step of
the item selection process, and the bootstrap step if the user chooses
this option. The output of mudfold() is a list() of class "mdf" that contains information for each internal step of the function. The first element of the output list contains information on the function call. The second element contains results of the data checking step. The next element of the output contains descriptive statistics obtained from the observed data, and the last element of the output has all the information from the fitting process (triple statistics, first step, second step). If bootstrap is applied to estimate uncertainty, an additional element that contains the bootstrap information is added to the output.
For example, to fit a MUDFOLD scale to the ANDRICH data and run a nonparametric bootstrap with 100 iterations, the mudfold() function can be called as follows.
fitANDRICH <- mudfold(ANDRICH, nboot = 100, parallel = "multicore", seed = 1)
In the example above, the first two arguments are core in the
mudfold()
function. The third argument parallel
is an argument of
the boot()
function that runs bootstrapping in parallel fashion in
order to reduce computational time. The last argument seed
is used to
ensure reproducibility of the bootstrap results.
In some cases the unfolding scale could be known. In these instances,
the user is interested in obtaining the MUDFOLD goodness-of-fit and
diagnostic statistics for the given scale. The function as.mudfold()
can be used for treating the given rank order of the items as a MUDFOLD
scale. The function uses only the first two arguments of the mudfold()
function. In principle, this function transforms a given scale into an
S3 class "mdf"
object.
For "mdf"
objects from the mudfold()
or as.mudfold()
functions,
generic functions for print(), summary(), plot(), and coef() are available. The generic function print.mdf()
can be accessed with,
print(x)
where x
is an "mdf"
class object. This function prints information
for x
, such as time elapsed for fitting, warnings from the data
checking step, convergence for each step of the algorithm and statistics
with bootstrap confidence intervals if nboot
is not equal to NULL
.
In the ANDRICH
data example, the command print(fitANDRICH)
is used
to print information from the fitANDRICH
object to the console. The
function call, together with the elapsed time to fit the model, the number of individuals, and the number of items used in the analysis, forms the first part of the output. Next, the values of the mudfold() arguments are given, followed by convergence indicators for each step of the item selection algorithm. Scale statistics such as the scalability coefficient and the ISO statistic are also printed, together with their percentile confidence intervals obtained from the bootstrap iterations stored in the fitANDRICH object.
The function summary is a generic function that summarizes information from model fitting functions. In our case the output of
summary.mdf()
is a list object summarizing results from the
mudfold()
function. The function can be called via
summary(object, boot, type = "perc", ...)
and consists of three arguments:
object
: a list of class "mdf"
, output of the mudfold()
function.
boot
: logical argument that controls if bootstrap confidence intervals
and bootstrap summary for each coefficient will be returned. If
boot = FALSE
(default) no information for bootstrap is returned. When
boot = TRUE, confidence intervals, standard errors, and biases calculated from the bootstrap iterations are given with the output for each parameter.
type: The type of bootstrap confidence intervals to be calculated if the argument boot = TRUE. Available options are "norm", "basic", "perc" (default), and "bca". See the argument type of the boot.ci() function for details.
The output of the summary.mdf()
is a list with two main components.
The first component of the list is a data.frame
with scale statistics
and the second component is a list with item statistics.
Typing summary(fitANDRICH, boot = TRUE)
into the R console will return
the summary of the fitted scale to the ANDRICH
data. The output
consists of six distinct data.frame
objects. The first data.frame
contains information on scale statistics with their bootstraped
statistics. The next four data.frame
objects correspond to the H
coefficients, the ISO statistics, the observed errors, and the expected
errors for each item in the scale together with their bootstrap summary
statistics. The last data.frame
gives descriptive statistics for the
items in the scales.
A generic function for plotting S3 class "mdf"
objects is also
available to the user. The function plot.mdf() returns empirical estimates of the IRFs, the order of the items on the latent continuum, or a histogram of the person parameters. You can plot "mdf" class objects with the following R syntax.
plot(x, select = NULL, plot.type = "IRF")
This function consists of three arguments from which the first is the
usual argument x
which stands for the "mdf"
object to be plotted.
The argument plot.type
controls the type of plot that is returned, and
three types of plots are available. If plot.type = "scale"
, a
unidimensional continuum with the items in the obtained rank order is
returned. In the default settings of this function (i.e.
plot.type = "IRF"
), the corresponding plot has the items on the x-axis
indicating their order on the latent continuum and the probability of a
positive response on the y-axis. The IRF of each item along the latent scale is plotted in a different colour. When plot.type = "persons", the function returns a plot with the distribution of the person parameters on the latent continuum. The argument select is optional and provides the
is optional and provides the
possibility for the user to plot a subset of items. The user can provide
in this argument a vector of item names to be plotted. If
select = NULL
, the function returns the estimated IRFs for all items
in the obtained MUDFOLD scale. For plotting S3 class "mdf"
objects, we
use the functions na.approx()
, melt()
and ggplot()
from the R
packages zoo (Zeileis and Grothendieck 2005),
reshape2 (Wickham 2007),
and ggplot2 (Wickham 2009),
respectively.
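For instance, using the arguments described above, the item order and the IRFs of two selected items of the fitted ANDRICH scale can be plotted as follows.
## Item order on the latent continuum, and the IRFs of two selected items
plot(fitANDRICH, plot.type = "scale")
plot(fitANDRICH, select = c("HIDEOUS", "DETERRENT"), plot.type = "IRF")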
A generic coef.mdf()
function for S3 class "mdf"
objects can also be
used. This function is a simple wrapper that uses a single argument
named 'type'. The coef.mdf() function will extract nonparametric estimates of person ranks when type = "persons", item ranks when type = "items", or both when type = "all", from a fitted MUDFOLD object.
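For example, the nonparametric estimates stored in the fitANDRICH object can be extracted as follows.
coef(fitANDRICH, type = "items")          # nonparametric item ranks
head(coef(fitANDRICH, type = "persons"))  # nonparametric person ranks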
The diagnostics() function
After a scale has been obtained, scale diagnostics need to be applied in order to assess its unfolding properties. The MUDFOLD diagnostics
described in section 2.4 of this paper are implemented
into a function named diagnostics()
that can calculate all of them
simultaneously. The function syntax is,
diagnostics(x, boot, nlambda, lambda.crit, type, k, which, plot)
and uses eight arguments described below.
x
: a list of class "mdf"
, output of the mudfold()
function.
boot
: logical argument that controls if bootstrap confidence intervals
and summary for the H coefficients and the ISO and MAX statistics will
be returned. If boot = FALSE
(default) no information for bootstrap is
returned. When boot = TRUE
, confidence intervals, standard errors,
biases, calculated from the bootstrap iterations for each diagnostic are
given with the output.
nlambda
: The number of regularization parameters to be used in
cv.glmnet()
function when testing local independence.
lambda.crit
: String that specifies the criterion to be used by
cross-validation for choosing the optimal regularization parameter.
Available options are "class" (default), "deviance", "auc",
"mse", "mae". See the argument ’type.measure’
in the cv.glmnet()
function for more details.
type: The type of bootstrap confidence intervals to be calculated if the argument boot = TRUE. Available options are "norm", "basic", "perc" (default), and "bca". See the argument type of the boot.ci() function for details.
k
: The dimension of the basis in the thin plate spline that is used
when testing for IRF unimodality. The default value is k = 4
.
which
: Which diagnostic should be returned by the function. Available
options are "H"
, "LI"
, "UM"
, "ISO"
, "MAX"
, "STAR"
, "all"
(default).
plot
: Logical. Should plots be returned for the diagnostics that can
be plotted? Default value is plot = TRUE
.
For the ANDRICH
data example the command diagnostics(fitANDRICH)
will calculate and plot the scale diagnostics for the fitANDRICH
object.
The mudfoldsim() function
In order to provide the user with the flexibility of simulating unfolding data, the function mudfoldsim() is available from the mudfold
is available from the mudfold
package. The responses of subjects on distinct items are simulated with
the use of a flexible parametric IRF that generalizes proximity
relations between item and person parameters.
Assume that we want to simulate a test dataset with the responses of a number of persons to a set of items. Unfolding models are also known as distance models, since they model the probability of positively endorsing an item as a function of the proximity between the person and the item locations on the latent continuum. Using the mudfoldsim() function, the model parameters that govern this response probability can be set by the user. We note that the IRF proposed by Andrich (1988) is a special case of the one implemented in the mudfoldsim() function for particular choices of its parameters. The function can be called with
mudfoldsim(N, n, gamma1 = 5, gamma2 = -10, zeros = FALSE, parameters = "normal",
seed = NULL)
and makes use of seven user-specified arguments:
N
: An integer corresponding to the number of items to be simulated.
n
: The number of persons to be simulated.
gamma1: This argument is passed to the IRF and controls its shape.
gamma2: The deterministic parameter of the IRF (i.e. the parameter that controls how closely the simulated response process approaches a deterministic unfolding model).
zeros
: A logical argument that controls if individuals who endorse no
items will be simulated. If zeros=TRUE
the function allows for
individuals that are not endorsing any of the items. On the other hand,
if zeros=FALSE
(default) only individuals who endorse at least one
item will be part of the simulated data.
parameters
: Argument for the person parameters with two options
available. In the default option parameters="normal"
and in this case
the person parameters are drawn from a standard normal distribution. On
the other hand, the user can set this argument equal to "uniform"
which implies that subject parameters will be drawn uniformly in the
range of the item parameters.
seed
: An integer to be used in the set.seed()
function. If
seed=NULL
(default), then the seed is not set.
The output of the mudfoldsim()
function is a list containing the
simulated data (in a random item order), the parameters used in the IRF,
and the matrix of probabilities under which the binary data has been
sampled.
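A short usage example is given below; the returned components are inspected with str() rather than assumed by name.
## Simulate the responses of 500 persons to 10 items and inspect the output
sim <- mudfoldsim(N = 10, n = 500, seed = 123)
str(sim, max.level = 1)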
The pick() function
Since the main mudfold() function is designed for dichotomous (binary) items, we provide the user with the function pick(). The latter is used to transform quantitative or ordinal variables into a binary form. The underlying idea of this function is that the individual selects those items with the highest preference. This transformation can be done in two different ways, either by user-specified cut-off value(s) or by assuming a pick k out of N response process in which every person endorses exactly k items.
The R function pick()
can be utilized with the following code,
pick(x, k = NULL, cutoff = NULL, byItem = FALSE)
and makes use of four parameters. These are,
x
: A data.frame
or matrix
with persons in the rows and items in
the columns containing quantitative or ordinal responses from the persons.
k: This integer controls the number of items that each person "picks" (default k = NULL). This argument is used if one wants to transform the data into pick k out of N form. If k is provided by the user, then cutoff should be NULL and vice versa.
cutoff: The numeric value(s) that will be used as threshold(s) for the transformation (default cutoff=NULL). Any value greater than or equal to the cutoff is coded as 1 and any smaller value as 0. The length of this argument should be either one or equal to the number of rows of x when byItem=FALSE, which imposes an explicit cut-off value for each individual in x. If byItem=TRUE, then the length of this parameter should be either one or equal to the number of columns of x.
byItem
: This is a logical argument. If byItem=TRUE
, the
transformation is applied on the columns of x
. In the default
byItem=FALSE
, the function "picks" items row-wise.
In the default parameter settings of the function pick(), the parameters k and cutoff are both equal to NULL. In this case, the mean of each individual's responses is used as a person-specific cut-off value (byItem=FALSE). When byItem=TRUE (with k and cutoff equal to NULL), the item mean over all individuals is used as an item-specific cut-off value. The parameters k and cutoff are responsible for different dichotomization processes and they cannot be used simultaneously, which means that only one of the two arguments can be different from NULL.
In the case in which the user chooses to transform the data assuming that persons are asked to pick exactly k items, the k items with the highest ratings of each person are coded as 1 and the remaining items as 0, as illustrated below.
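The following small example illustrates the different dichotomization modes on hypothetical 5-point ratings; the ratings are simulated here purely for illustration.
set.seed(1)
ratings <- matrix(sample(1:5, 40, replace = TRUE), nrow = 8,
                  dimnames = list(NULL, paste0("item", 1:5)))
pick(ratings)              # default: dichotomize at each person's own mean rating
pick(ratings, k = 2)       # pick the k = 2 most preferred items per person
pick(ratings, cutoff = 4)  # endorse every rating greater than or equal to 4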
Generally, dichotomization should be avoided since it can distort the data structure and lead to information loss. Models that take into account the information in the different response categories should be preferred over dichotomization for polytomous data.
In this section we provide examples of how to use the MUDFOLD method on two datasets, which are provided with the mudfold package. The first application is from the field of psychometrics, while the second example is a linguistic application.
The commands install.packages("mudfold")
and library(mudfold)
will
download, install and load the mudfold package so it can be used. The
command set.seed(1)
will set the seed for reproducibility.
In order to demonstrate the functionality of the mudfold package we re-analyze questionnaire data following the strategy suggested by Post et al. (2001). For this purpose, we use a unidimensional measurement scale for loneliness that follows the definitions of a Rasch scale and has been constructed by Jong-Gierveld and Kamphuls (1985). The De Jong Gierveld loneliness scale consists of eleven items, five of which are positively formulated and six negatively formulated. The items in the loneliness scale are given below, where the sign next to each item corresponds to its content.
A: There is always someone I can talk to about my day to day problems (+)
B: I miss having a really close friend (-)
C: I experience a general sense of emptiness (-)
D: There are plenty of people I can lean on in case of trouble (+)
E: I miss the pleasure of company of others (-)
F: I find my circle of friends and acquaintances too limited (-)
G: There are many people that I can count on completely (+)
H: There are enough people that I feel close to (+)
I: I miss having people around (-)
J: Often I feel rejected (-)
K: I can call on my friends whenever I need them (+)
Each item in the scale has three possible response categories, i.e. "no", "more or less", and "yes", and dichotomization methods that involve item reverse coding have been proposed by De Jong and Tilburg (1999). These methods, as well as the determination of the dimensionality of this scale, have been under critical discussion. Following this discussion, Post et al. (2001) reanalyzed the loneliness scale data obtained from the NESTOR study (Knipscheer et al. 1995) using MUDFOLD in a three-step analysis routine.
Persons with missing responses are removed from the data prior to the analysis.
The threshold that is used for the main analysis has been determined on the basis of MUDFOLD scale analysis on datasets with different thresholds. Specifically, the data has been dichotomized using as thresholds the response, (i) “yes”, (ii) “more or less”, (iii) different thresholds per item where the response category “more or less” is collapsed with the smaller category between “yes” and “no”. The results from this analysis showed that dichotomizing the data at the higher preference will yield the best unfolding measurement scale for loneliness.
Dichotomizing the data at “yes” is straightforward with the pick()
function.
data("Loneliness")
dat <- pick(Loneliness, cutoff = 3)
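The threshold comparison described above can be sketched along the following lines, assuming that the response categories "no", "more or less", and "yes" are coded 1, 2, and 3; this is only a rough sketch of the sensitivity check, not the full analysis of Post et al. (2001).
## Hedged sketch: dichotomize at "yes" (3) and at "more or less" (2),
## fit MUDFOLD to each version, and compare the resulting scale statistics.
fit_yes  <- mudfold(pick(Loneliness, cutoff = 3))
fit_more <- mudfold(pick(Loneliness, cutoff = 2))
summary(fit_yes)$SCALE_STATS    # scale statistics at the "yes" threshold
summary(fit_more)$SCALE_STATS   # scale statistics at the "more or less" threshold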
In the first step of the analysis, we conduct a MUDFOLD scale search on the transformed binary responses. The lower bound lambda1 of the mudfold() function is set to 0.1 and 100 bootstrap samples are requested.
Lonelifit <- mudfold(dat, lambda1 = 0.1, nboot = 100, seed = 1)
The function takes about five minutes to run 100 bootstrap iterations.
The resulting scale and its associated statistics can be obtained by
summarizing the Lonelifit
object.
loneliSummary <- summary(Lonelifit, boot = TRUE)
The MUDFOLD scale for the Loneliness
data in its estimated rank order
is:
loneliScale <- loneliSummary$ITEM_STATS$ITEM_DESCRIPTIVES$items
loneliScale
## "G" "H" "D" "K" "C" "E" "I" "F"
The scale has length eight, with the first four items positively
formulated and the last four negatively formulated. Items A, B, and J are excluded from the scale. This is because some triples (with respect to the item rank order) that include these items have a scalability coefficient below the specified lower bound; the details are stored in the loneliSummary object. Scale statistics with their bootstrap uncertainty estimates can be obtained with the following command.
loneliSummary$SCALE_STATS[1:3, ]
## value perc_lower95CI perc_upper95CI boot(mean) boot(bias) boot(se) boot(iter)
## H(scale) 0.536 0.436 0.571 0.511 -0.025 0.031 100
## ISO(scale) 0.078 0.001 1.753 0.384 0.306 0.459 100
## MAX(scale) 0.000 0.000 2.400 0.381 0.381 0.683 100
Each row of the output above shows a scale statistic, and the columns correspond to the bootstrap properties of this statistic. The H coefficient for the scale shows strong evidence towards unidimensionality (H(scale) = 0.536, with a 95% percentile confidence interval from 0.436 to 0.571). Scale diagnostics are given in Figures 1 and 2.
Visual inspection of whether the maxima of the CAM rows form a nondecreasing function of the item ranks, of violations of the local independence assumption, and of the estimated IRF of each item in the Loneliness unfolding scale can be obtained by using the diagnostics() function as shown below.
par(mfrow = c(1, 2))
# testing for local independence
diagnostics(Lonelifit, which = "LI")
# visual inspection of moving maxima
diagnostics(Lonelifit, which = "STAR")
par(mfrow=c(2,4))
# visual inspection for IRF unimodality
diagnostics(Lonelifit, which = "UM")
par(mfrow = c(1, 1))
Figures 1 and 2: Diagnostics for the Loneliness unfolding scale.
The H coefficients for each item in the scale are also available in the summary object and can be accessed by:
loneliSummary$ITEM_STATS$H_MUDFOLD_items
value perc_lower95CI perc_upper95CI boot(mean) boot(bias) boot(se) boot(iter)
H(G) 0.54 0.444 0.573 0.510 -0.034 0.035 96
H(H) 0.52 0.440 0.543 0.495 -0.027 0.025 72
H(D) 0.51 0.400 0.553 0.498 -0.015 0.032 65
H(K) 0.51 0.440 0.554 0.495 -0.016 0.025 60
H(C) 0.55 0.404 0.590 0.513 -0.041 0.049 76
H(E) 0.57 0.491 0.610 0.555 -0.016 0.029 78
H(I) 0.55 0.493 0.586 0.541 -0.011 0.022 47
H(F) 0.52 0.349 0.546 0.464 -0.057 0.058 84
From the item fit we can see that the H coefficient for each item in the scale is above 0.5. From the column boot(iter) of the output above you can get information on the number of times each item was included in a MUDFOLD scale out of the 100 bootstrap iterations. Item G was the most frequently included item (96 times), whereas items K and I were included less frequently in a MUDFOLD scale compared to the other items (60 and 47 times, respectively). Typing loneliSummary$ITEM_STATS$ISO_MUDFOLD_items into the R console will return a summary of the ISO statistic for each item in the scale. The latter shows that only small violations of unimodality occur for the items in the scale. The same holds for the MAX statistic (it can be accessed by loneliSummary$ITEM_STATS$MAX_MUDFOLD_items), which shows zero values for all the items in the scale.
After the scale is obtained and checked for its conformity to the unfolding principles we can visualize the estimated empirical IRFs and the distribution of the estimated person parameters. Plots for the IRFs and the person parameters can be obtained by:
plot(Lonelifit, plot.type = "IRF")
plot(Lonelifit, plot.type = "persons")
Figure 3: Empirical IRF estimates for the Loneliness unfolding scale. Figure 4: Distribution of the person parameter estimates for the Loneliness unfolding scale.
Figures 3 and 4 show the empirical estimates of the IRFs and the distribution of the person parameters, respectively. In Figure 3 you can see that the scale clearly consists of four positively formulated items in its beginning, for which the IRF decreases as one moves from the left to the right of the scale, and four negatively formulated items at the end, for which the IRF increases as one moves from the left to the right of the scale. In Figure 4 we can see that the sample under consideration tends to feel less lonely, since the distribution of the person parameters is skewed to the right. In such an example, any parametric model that assumes a normal distribution of the latent person parameters would clearly be inappropriate.
In this section, we present an application of the MUDFOLD method to the
Plato7
data set. This dataset is available from the R package
smacof (de Leeuw and Mair 2009) and has been
also included in the mudfold package. The data can be loaded into the
R environment with the command data("Plato7")
.
Plato7 contains information on the distribution of syllable quantity in the sentence endings of seven works of Plato (D. R. Cox 1959). Specifically, the last five syllables of each sentence in the seven works are extracted and categorized as short or long. This produces 32 possible ending patterns, called clausulas, and the data give the percentage of sentence endings of each type in each work.
The question is whether it is possible using these data to assign a
chronological order to the works of Plato. Particularly, it is known
that Plato wrote first the Republic
and last the Laws
. In between
Republic
and Laws
, Plato wrote the Critias
, Philebus
,
Politicus
, Sophist
and Timaeus
. However, the exact order of these
five works is unknown. Assuming that the change in Plato’s literary
style was monotone in time, we might be able to assign a time order to his works by analyzing the distribution of clausulas in each of Plato's works.
We consider the development of Plato's literary style as a unidimensional scale, on which clausulas and works are ordered. In this analysis the seven works play the role of the items and the clausulas play the role of the respondents, where the relative frequency of a clausula in a work expresses the degree to which this clausula is preferred in that work.
Since the data are given in continuous form, we transform the percentages into binary format in order to apply MUDFOLD. We consider the mean quantity of each clausula as an explicit cut-off value for the transformation. The latter can be seen as a pick-any-out-of-N design, and the transformation can be carried out with the function pick() from the mudfold package in its default settings as follows.
dat.Plato <- pick(Plato7)
After the transformation, we end up with a matrix containing the binary preferences of the clausulas over the seven works of Plato, to which we fit a MUDFOLD scale with 100 bootstrap samples.
fitPlato <- mudfold(dat.Plato, nboot = 100, seed = 1)
summaryPlato <- summary(fitPlato, boot = TRUE)
We can check the MUDFOLD scale from the summary object.
summaryPlato$SCALE_STATS[1:3, ]
## value perc_lower95CI perc_upper95CI boot(mean) boot(bias) boot(se) boot(iter)
## H(scale) 0.558 0.457 1.000 0.714 0.156 0.146 100
## ISO(scale) 0.146 0.000 1.129 0.141 -0.005 0.254 100
## MAX(scale) 0.000 0.000 0.850 0.035 0.035 0.191 100
The scale shows strong scalability properties, with a scale H coefficient of 0.558. The scalability coefficients of the individual items can be inspected as well.
summaryPlato$ITEM_STATS$H_MUDFOLD_items
## value perc_lower95CI perc_upper95CI boot(mean) boot(bias) boot(se) boot(iter)
## H(Republic) 0.66 0.362 1 0.761 0.097 0.179 70
## H(Sophist) 0.41 0.381 1 0.656 0.241 0.157 70
## H(Politicus) 0.58 0.392 1 0.662 0.078 0.161 62
## H(Philebus) 0.63 0.429 1 0.726 0.094 0.141 83
## H(Laws) 0.51 0.394 1 0.688 0.176 0.148 77
The results show that the MUDFOLD scale has length five and that the items Critias and Timaeus have been excluded from the measurement process. Republic is correctly ordered first and Laws is correctly ordered last among Plato's works. Almost all the items are strong unfolding items. Sophist shows moderate unfolding strength with the lowest item scalability coefficient (0.41), whereas Republic, with an H coefficient of 0.66, is the strongest unfolding item in the scale.
Since the ISO statistic for the scale is positive, one may want to check which items are responsible for the small amount of manifest unimodality violations that are observed. Assessing these violations for each item involves checking their ISO statistics.
summaryPlato$ITEM_STATS$ISO_MUDFOLD_items
## value perc_lower95CI perc_upper95CI boot(mean) boot(bias) boot(se) boot(iter)
## ISO(Republic) 0.104 0 0.365 0.036 -0.068 0.076 74
## ISO(Sophist) 0.042 0 0.326 0.039 -0.003 0.078 70
## ISO(Politicus) 0.000 0 0.082 0.005 0.005 0.018 65
## ISO(Philebus) 0.000 0 0.576 0.027 0.027 0.117 87
## ISO(Laws) 0.000 0 0.186 0.019 0.019 0.067 78
The obtained summary output for the ISO statistics of the items in the MUDFOLD scale shows that Republic is the item with the largest manifest unimodality violations in its estimated IRF, with an ISO value of 0.104. The largest bootstrap standard error for the ISO statistic is observed for Philebus, which shows a bootstrap standard error of 0.117.
The empirical IRF estimates and the smooth (GAM-based) IRF estimates for the items in the Plato7 unfolding scale can be visualized with
plot(fitPlato, plot.type = "IRF")
par(mfrow = c(2, 3))
diagnostics(fitPlato, which = "UM")
par(mfrow = c(1, 1))
and the output is shown in figures 5 and 6
respectively. From figure 5 it can be seen that the scale
consists of two items in the first positions (i.e. Republic
and
Sophist
) with decreasing empirical IRFs as one moves from the left to
the right hand side of the latent scale. These two items show a small amount of manifest unimodality violations, which can be seen at the end of their IRFs, where the value of the curves is larger for the item Laws than for the item Philebus. Third in the scale is the item
Politicus
for which the empirical IRF shows a single-peak shape.
Politicus
is followed by the items Philebus
and Laws
with
increasing empirical IRFs at positions four and five of the scale. The
smooth estimates of the IRFs are shown in Figure 6 and reveal no obvious violations of IRF unimodality.
Figure 5: Empirical IRF estimates for the Plato7 unfolding scale. Figure 6: Smooth IRF estimates for the Plato7 unfolding scale.
Other diagnostics can be obtained by the diagnostics() function. In
function. In
this example the bootstrap estimate of the scale with the estimated
MUDFOLD scale are slightly different. In such instances an additional
element with a summary of the scale estimated by the bootstrap is
included in the output. Accessing the summary of the bootstrap scale is
straightforward with summaryPlato$BOOT_SCALE
.
In this paper we introduced an R package named mudfold (Balafas et al. 2019). The latter is available under the General Public License (GPL) from CRAN.
The mudfold package is an addition to a broad family of R packages that fit IRT models. The approach described here provides an additional exploratory and validation tool when fitting such models. Moreover, it complements the package mokken for the case in which proximity (unfolding) item response data need to be analysed.
Looking to the future, our focus will be on extending the functionality of this package. In particular, we aim at implementing a more efficient item selection algorithm, which can reduce the computational cost implied by the iterative algorithm presented here when the sample size and the number of items increase substantially. Methodologies for handling multicategory items (Van Schuur 1984) are not yet implemented in the package; however, we plan to extend its applicability in the future. Last but not least, a parametric version of the MUDFOLD method based on the IRF implemented in the mudfoldsim() function will offer a complete framework for the analysis of data that have been generated under an unfolding response process.
CRAN packages used: mudfold, GGUM, mirt, mokken, boot, mice, gtools, glmnet, mgcv, zoo, reshape2, ggplot2, smacof
CRAN Task Views implied by cited packages: Bayesian, Econometrics, Environmetrics, Finance, MachineLearning, MissingData, MixedModels, Optimization, Phylogenetics, Psychometrics, Spatial, Survival, TeachingStatistics, TimeSeries
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Balafas, et al., "mudfold: An R Package for Nonparametric IRT Modelling of Unfolding Processes", The R Journal, 2020
BibTeX citation
@article{RJ-2020-002, author = {Balafas, Spyros E. and Krijnen, Wim P. and Post, Wendy J. and Wit, Ernst C.}, title = {mudfold: An R Package for Nonparametric IRT Modelling of Unfolding Processes}, journal = {The R Journal}, year = {2020}, note = {https://rjournal.github.io/}, volume = {12}, issue = {1}, issn = {2073-4859}, pages = {49-75} }