Optimal design ideas are increasingly used in different disciplines to rein in experimental costs. Given a nonlinear statistical model and a design criterion, optimal designs determine the number of experimental points at which to observe the responses, the design points themselves and the number of replications at each design point. Currently, there are very few free and effective computing tools for finding different types of optimal designs for a general nonlinear model, especially when the criterion is not differentiable. We introduce the R package ICAOD for finding various types of optimal designs, including locally, minimax and Bayesian optimal designs, for different nonlinear statistical models. Our main computational tool is a novel metaheuristic algorithm called the imperialist competitive algorithm (ICA), inspired by the sociopolitical behavior of humans and colonialism. We demonstrate its capability and effectiveness through several applications. The package also includes several theory-based tools to assess the optimality of a generated design when the criterion is a convex function of the design.
Optimal designs have been extensively applied in many research studies to reduce the cost of experimentation. For instance, Holling and Schwabe (2013) provided examples in psychology and Dette et al. (2010) gave examples in dose-response studies. Further applications of optimal designs in engineering and epidemiology are described in Berger and Wong (2009), which also contains applications of optimal design ideas in other disciplines. Given a statistical model and an optimality criterion, optimal designs determine the optimal number of design points required, their locations for observing the responses and the number of replications required at each location. The optimality criterion should reflect the objective of the study as accurately as possible and is usually formulated as a scalar function of the Fisher information matrix (FIM) that measures the worth of the design (Lehmann and Casella 1998). For example, if the objective of a study is to estimate the model parameters as accurately as possible, \(D\)-optimality is often used. Such an optimal design maximizes the determinant of the FIM and is called \(D\)-optimal. When errors are independent and normally distributed, \(D\)-optimal designs minimize the volume of the confidence ellipsoid of the model parameters by minimizing the generalized variance, i.e., the determinant of the variance-covariance matrix (Abdelbasit and Plackett 1983).
For nonlinear models, the FIM depends on the unknown model parameters to be estimated, and so the design criterion cannot be directly optimized. There are different approaches to deal with this parameter dependency: a) locally optimal designs: These are found by replacing the unknown parameters with estimates obtained from a pilot or previous study (Chernoff 1953). Locally optimal designs usually become inefficient when the replaced estimates are far from their true unknown values. b) minimax optimal designs: These minimize the maximum inefficiency over a user-selected parameter space (Sitter 1992). Such designs are conservative in that they protect the experiment from the worst-case scenario that may arise from a poor choice of parameter values over a user-specified space of plausible values for the unknown parameters. Finding minimax optimal designs is complicated because it involves solving multilevel nested optimization problems and the objective function (the minimax criterion) is not differentiable. c) Bayesian optimal designs: These are found by optimizing an optimality criterion averaged over a user-specified (continuous) prior distribution for the unknown parameters (Chaloner and Larntz 1989; Chaloner and Verdinelli 1995; Atkinson 1996). Strictly speaking, the latter are not fully Bayesian because they do not involve computing a posterior distribution. Instead, they borrow the concept of a prior distribution to find robust designs for frequentists (Graßhoff et al. 2012; Bürkner et al. 2019). Accordingly, they are sometimes referred to as "pseudo" Bayesian designs (Firth and Hinde 1997). In the optimal design literature, Bayesian optimal designs found under a discrete prior distribution are usually referred to as robust or optimum-on-average designs (Fedorov and Hackl 2012). For an overview of optimal designs for nonlinear models, see Fedorov and Leonov (2013).
There are several software packages to create and analyze designs of experiments (DoE) for different purposes. For a review of R packages for design of experiments, see https://cran.r-project.org/web/views/ExperimentalDesign.html. Only a few of them are able to find different types of optimal designs that deal with the parameter dependency for various nonlinear models. To the best of our knowledge, none of the available software packages, commercial or otherwise, provides an option to find minimax optimal designs for nonlinear models. For example, the R package LDOD (Masoudi et al. 2013) finds locally \(D\)-optimal approximate designs for a large class of nonlinear models, and the acebayes R package (Overstall et al. 2017) determines a more general class of fully Bayesian exact designs using the approximate coordinate exchange algorithm (Overstall and Woods 2017). Likewise, the recently available VNM R package finds multiple-objective locally optimal designs for a specific model, namely the four-parameter Hill model commonly used in dose-response studies (Hyun et al. 2018). Among commercial software, JMP (SAS Institute Inc. 2016) can also find Bayesian \(D\)-optimal exact designs for nonlinear models.
This paper introduces the R package ICAOD (Masoudi et al. 2020) for finding a variety of optimal designs for nonlinear models using a novel metaheuristic algorithm called the imperialist competitive algorithm (ICA). This algorithm is inspired by the sociopolitical behavior of humans (Atashpaz-Gargari and Lucas 2007; Hosseini and Al Khaled 2014) and was modified by Masoudi et al. (2017) and Masoudi et al. (2019) to find optimal designs for nonlinear models. We believe that the ICAOD package is the first single self-contained statistical package that provides a framework to find locally, minimax and Bayesian optimal designs for nonlinear models. Like many popular nature-inspired metaheuristic algorithms, such as the particle swarm optimization (PSO) algorithm (Kennedy and Eberhart 1995), ICA does not have a rigorous proof of convergence (Yang 2011). When the criterion is a convex function on the set of design measures, equivalence theorems are available and the ICAOD package includes tools to confirm the optimality of a design. More generally, the proximity of any design to the optimum, without knowing the latter, can be evaluated in terms of an efficiency lower bound; in particular, if this bound is unity, the optimality of the design is confirmed. This feature is useful for recognizing premature convergence in optimal design problems.
The next section reviews the statistical setup and theory for finding optimal designs for nonlinear models. The fourth section describes the imperialist competitive algorithm (ICA) and the fifth section provides implementation details for the ICAOD package. In the sixth section, we provide two examples to show the functionality of the ICAOD package. The seventh section finds locally and minimax \(D\)-optimal designs for a logistic model with an application in educational testing, and the eighth section presents optimum-on-average and Bayesian \(D\)-optimal designs for a sigmoid Emax model for dose-response studies. The ICAOD package was first written to find locally \(D\)-optimal designs, but it now also finds user-defined optimal designs. The ninth section illustrates how to use this feature to find \(c\)-optimal designs for a two-parameter logistic model in dose-response studies. The last section concludes with a summary.
Let \(E(Y)=f(\boldsymbol{x},\boldsymbol{\theta})\) be the mean of the response \(Y\) at the values of the independent variables \(\boldsymbol{x}\) defined on a user-selected design space \(\chi\), and let \(f\) be a known function, apart from the model parameters \(\boldsymbol{\theta}= (\theta_1,..., \theta_p)^T\). Throughout we assume that there are resources to take \(N\) observations for the study and, given an optimality criterion, we want to find the best choices for the levels of the independent variables at which to observe the outcome \(Y\). There are two types of designs: exact and approximate. An exact design \(\xi_N\) on \(\chi\) is defined by a set of \(k\) distinct levels \(\boldsymbol{x}_i\), \[\label{eqexactdesign} \xi_N = \left\{ \begin{array}{cccc} \boldsymbol{x}_1 & \boldsymbol{x}_2 & ... & \boldsymbol{x}_k\\ n_1/N & n_2/N & ...& n_k/N \end{array} \right\}, \tag{1}\] where \(\boldsymbol{x}_j \in \chi\), \(n_j\) is the number of replications of \(\boldsymbol{x}_j\) in the sample of observations and \(N = \sum_{j=1}^kn_j\). Here, \(\boldsymbol{x}_j\), \(j = 1,..., k\), are referred to as the support points or design points of \(\xi_N\). Given \(N\) and a specific design criterion, finding an optimal exact design requires determining the best value of \(k\) and the best values of \(\boldsymbol{x}_1,\ldots,\boldsymbol{x}_k\) and \(n_1,\ldots,n_k\). Such optimization problems are notoriously difficult and, in practice, we find optimal approximate designs instead. These are probability measures on \(\chi\) and are found independently of the sample size \(N\). An approximate design \(\xi\) with \(k\) support points has the form \[\label{eqapproximatedesign} \xi = \left\{ \begin{array}{cccc} \boldsymbol{x}_1 & \boldsymbol{x}_2 & ... & \boldsymbol{x}_k\\ w_1 & w_2 & ...& w_k \end{array} \right\}, \tag{2}\] where \(w_j> 0\) is the proportion of observations assigned to \(\boldsymbol{x}_j\) and \(\sum_{j=1}^{k} w_j = 1\).
An approximate design is implemented by first rounding each value of \(Nw_i\) to a nearest integer \(Nw_i^*\) subject to \(Nw_1^*+\ldots+Nw_k^*=N\) and then taking \(Nw_i^*\) observations at \(\boldsymbol{x}_i,i=1,\ldots,k\). Optimal rounding procedures are available in Pukelsheim and Rieder (1992). When the design criterion is formulated as a convex function of the FIM, there are algorithms for finding many types of optimal approximate designs and theory to confirm the optimality of an approximate design. When the design is not optimal, a theory-based efficiency lower bound of the design is available to determine its proximity to the optimum, without knowing the optimum. For these reasons, we focus on optimal approximate designs found under a convex functional in the rest of the paper.
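To make the rounding step concrete, here is a minimal sketch of a naive largest-remainder rule (in Python rather than R, and not part of ICAOD; the function name is ours). The optimal rounding procedures of Pukelsheim and Rieder (1992) refine this basic idea.

```python
import numpy as np

def round_design(weights, N):
    """Round approximate-design weights w_i to replications n_i with sum n_i = N.

    Naive largest-remainder rule: floor each N*w_i, then hand the remaining
    observations to the points with the largest fractional parts.
    """
    w = np.asarray(weights, dtype=float)
    raw = N * w
    n = np.floor(raw).astype(int)
    short = N - n.sum()                      # observations still unassigned
    order = np.argsort(raw - n)[::-1]        # largest fractional parts first
    n[order[:short]] += 1
    return n
```

For example, an equal-weight three-point design with \(N = 10\) is rounded to replications \(3, 3, 4\) (in some order).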
To find an approximate design that minimizes a convex design criterion \(\psi\) over the space of all designs on \(\chi\), we have to determine the optimal number of support points \(k\), the optimal support points \(\boldsymbol{x}_1,\ldots,\boldsymbol{x}_k\) and their corresponding weights \(w_1,\ldots,w_k\). For example, if estimating the model parameters is of interest, \(D\)-optimality, defined by the logarithm of the determinant of the inverse of the FIM, is a convex functional over the space of all designs on \(\chi\) (Silvey 1980; Fedorov and Leonov 2013), and the design that minimizes it is called \(D\)-optimal. In what follows, we focus on the \(D\)-optimality criterion and briefly discuss other optimality criteria and optimal designs, which can be studied similarly.
Assuming all observation errors are independent and normally distributed with means \(0\) and a constant variance \(\mathrm{var}(Y)\), the FIM of a generic \(k\)-point approximate design \(\xi\) is given by \[\label{eq:FIM} M(\xi,\boldsymbol{\theta}) = \sum_{i=1}^k w_i I(\boldsymbol{x}_i,\boldsymbol{\theta}), \tag{3}\] where \[I(\boldsymbol{x}_i,\boldsymbol{\theta}) = \frac{1}{\mathrm{var}(Y_i)}\nabla f(\boldsymbol{x}_i,\boldsymbol{\theta})\nabla f(\boldsymbol{x}_i,\boldsymbol{\theta})^T,\] and \(\nabla f(\boldsymbol{x}_i,\boldsymbol{\theta})^T=\left( \frac{\partial f(\boldsymbol{x}_i,\boldsymbol{\theta})}{\partial \theta_1}, \cdots,\frac{\partial f(\boldsymbol{x}_i,\boldsymbol{\theta})}{\partial \theta_p} \right)\). Here, \(\frac{\partial f(\boldsymbol{x}_i,\boldsymbol{\theta})}{\partial \theta_j}\) denotes the partial derivative of \(f\) with respect to \(\theta_j\). The FIM is singular if \(k < p\). To avoid singular designs, i.e., designs with singular Fisher information matrices, we assume \(k\geq p\).
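As an illustration of (3), the following sketch (Python, for illustration only; the model and function names are ours) assembles the FIM of a weighted design for a hypothetical exponential mean function \(f(x,\boldsymbol{\theta}) = \theta_1 e^{\theta_2 x}\) with \(\mathrm{var}(Y) = 1\):

```python
import numpy as np

def grad_f(x, theta):
    # gradient of f(x, theta) = theta1 * exp(theta2 * x) w.r.t. (theta1, theta2)
    t1, t2 = theta
    return np.array([np.exp(t2 * x), t1 * x * np.exp(t2 * x)])

def fim(points, weights, theta, var_y=1.0):
    """M(xi, theta) = sum_i w_i * grad f * grad f^T / var(Y), as in (3)."""
    p = len(theta)
    M = np.zeros((p, p))
    for x, w in zip(points, weights):
        g = grad_f(x, theta)
        M += w * np.outer(g, g) / var_y
    return M
```

A design with \(k = p = 2\) distinct support points yields a nonsingular FIM here, while a one-point design (\(k < p\)) is singular, matching the remark above.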
Clearly, the FIM (3) depends on the unknown parameters for nonlinear models. Different approaches have been proposed to deal with this parameter dependency, depending on the type of information available about the unknown parameters. For example, let \(\boldsymbol{\theta}_0\) be an initial guess for \(\boldsymbol{\theta}\) available from a similar study. A locally \(D\)-optimal design \(\xi^*_{loc}\) minimizes \[\label{eq:locallycriterion} \psi_{loc}(\xi) = -\log|M(\xi, \boldsymbol{\theta}_0)|, \tag{4}\] where \(|\cdot|\) denotes the determinant. In practice, it is more realistic to assume that the unknown parameters belong to a user-specified parameter space \(\Theta\), comprised of all plausible values for \(\boldsymbol{\theta}\). Given \(\Theta\), we can find minimax optimal designs that minimize the maximum inefficiency over \(\Theta\) and protect the experiment from the worst-case scenario over the parameter space. A minimax \(D\)-optimal design \(\xi^*_{min}\) is obtained by minimizing \[\label{eq:minimaxcriterion} \psi_{min}(\xi) = \max_{\boldsymbol{\theta}\in \Theta} -\log|M(\xi, \boldsymbol{\theta})|, \tag{5}\] over the space of all designs on \(\chi\). The minimax problem (5) is a bilevel nested optimization problem with inner and outer optimization problems. Given any design, the inner optimization problem maximizes the \(D\)-criterion \(-\log|M(\xi, \boldsymbol{\theta})|\) over \(\Theta\) to find the maximum inefficiency, and the outer optimization problem minimizes the maximum value from the inner problem over the space of all designs on \(\chi\).
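The inner maximization in (5) can be sketched on a discretized \(\Theta\) as follows (Python, illustration only; `neg_log_det` stands for any user-supplied function returning \(-\log|M(\xi,\boldsymbol{\theta})|\) for a fixed design \(\xi\)). Note that discretizing a \(p\)-dimensional box with `n_grid` points per axis evaluates the criterion at `n_grid^p` grid points, which is the same bookkeeping the package uses for its grid option.

```python
import numpy as np
from itertools import product

def psi_minimax(neg_log_det, lp, up, n_grid=11):
    """Inner maximization of (5) on a grid over Theta = [lp, up].

    neg_log_det(theta) must return -log|M(xi, theta)| for the fixed design xi;
    the grid has n_grid^p points for a p-dimensional parameter space.
    """
    axes = [np.linspace(lo, hi, n_grid) for lo, hi in zip(lp, up)]
    return max(neg_log_det(np.array(th)) for th in product(*axes))
```

Minimizing this quantity over designs (the outer problem) is what ICA is used for in the package.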
Alternatively, when a prior distribution \(\pi_{\Theta}(\boldsymbol{\theta})\) is available for the unknown parameters on \(\Theta\), Bayesian optimal designs may be found: a (pseudo) Bayesian \(D\)-optimal design \(\xi_{bayes}^*\) minimizes \[\label{eq:bayesiancriterion} \psi_{bayes}(\xi) = \int_{\boldsymbol{\theta}\in \Theta} -\log|M(\xi, \boldsymbol{\theta})| \, \pi_{\Theta}(\boldsymbol{\theta}) d\boldsymbol{\theta}. \tag{6}\] When \(\pi_{\Theta}(\boldsymbol{\theta})\) is a discrete prior, the resulting designs are sometimes referred to as optimum-on-average or robust designs.
One advantage of working with approximate designs is the existence of an equivalence theorem, which can be used to verify the optimality of a given design when the criterion is a convex function on the set of design measures. Each convex optimality criterion gives rise to a different equivalence theorem, but they all have the same general form. For example, a design \(\xi_{loc}^*\) is locally \(D\)-optimal if and only if the following inequality holds for all \(\boldsymbol{x}\in \chi\), \[\label{eq:sensitivitylocally} c_{loc}(\boldsymbol{x}, \xi_{loc}^*) = \mathop{\mathrm{tr}} M^{-1}(\xi_{loc}^*, \boldsymbol{\theta}_{0})I(\boldsymbol{x}, \boldsymbol{\theta}_{0}) - p \leq 0, \tag{7}\] with equality in (7) at all support points of \(\xi_{loc}^*\). The left-hand side of inequality (7) is called the sensitivity function. The equivalence theorem for the Bayesian \(D\)-optimality criterion is very similar (Kiefer and Wolfowitz 1959; Chaloner and Larntz 1989): a design \(\xi_{bayes}^*\) is a Bayesian \(D\)-optimal design if and only if the following inequality holds for all \(\boldsymbol{x}\in \chi\), \[\label{eq:sensitivitybayesian} c_{bayes}(\boldsymbol{x}, \xi_{bayes}^*) = \int_{\Theta} \mathop{\mathrm{tr}} \{ M^{-1}(\xi_{bayes}^*, \boldsymbol{\theta})I(\boldsymbol{x}, \boldsymbol{\theta})\}\pi_{\Theta}(\boldsymbol{\theta}) d\boldsymbol{\theta} - p \leq 0, \tag{8}\] with equality in (8) at all support points of \(\xi_{bayes}^*\). However, the equivalence theorem for a minimax-type criterion takes on a more complicated form because (5) is not differentiable.
The equivalence theorem states that a design \(\xi^*_{min}\) is minimax \(D\)-optimal among all designs on \(\chi\) if and only if there exists a probability measure \(\mu^*\) on \[\label{eq:Aset} A(\xi_{min}^*) = \left\{\boldsymbol{\nu}\in \Theta \mid -\log|M(\xi_{min}^*, \boldsymbol{\nu})| = \max_{\boldsymbol{\theta}\in \Theta} -\log|M(\xi_{min}^*, \boldsymbol{\theta})| \right\}, \tag{9}\] such that the following inequality holds for all \(\boldsymbol{x}\in \chi\), \[\label{eq:sensitivityminimax} c_{min}(\boldsymbol{x}, \xi_{min}^*) = \int_{A(\xi_{min}^*)} \mathop{\mathrm{tr}} M^{-1}(\xi_{min}^*, \boldsymbol{\nu}) I(\boldsymbol{x}, \boldsymbol{\nu})\, d\mu^*(\boldsymbol{\nu}) - p \leq 0, \tag{10}\] with equality in (10) at all support points of \(\xi_{min}^*\) (Fedorov 1980; Wong 1992; Berger et al. 2000; King and Wong 2000). The set \(A(\xi_{min}^*)\) is sometimes called the answering set of \(\xi_{min}^*\), and the measure \(\mu^*\) is a subgradient of the nondifferentiable criterion evaluated at \(M(\xi_{min}^*,\boldsymbol{\nu})\). Understanding the properties of the subgradients and how to find them efficiently is a key challenge in solving minimax optimal design problems. In particular, there is no theoretical rule for choosing the number of points in \(A(\xi_{min}^*)\) to support the measure \(\mu^*\), and they have to be found by trial and error. For more details, see Masoudi et al. (2017). When \(\chi\) is one- or two-dimensional, it is common to plot the sensitivity function versus \(\boldsymbol{x}\in \chi\) and visually inspect whether the graph meets the conditions of the equivalence theorem. If it does, the generated design is optimal; otherwise, it is not.
We measure the efficiency of one design \(\xi_1\) relative to another design \(\xi_2\) using their criterion values. For example, for \(D\)-optimality (4), we use \[\label{eq:locallyDefficiency} eff_{loc} =\left(\frac{|M(\xi_1, \boldsymbol{\theta})|}{|M(\xi_2, \boldsymbol{\theta})|}\right)^{1/p} =\exp\left(\frac{\psi_{loc}(\xi_2) - \psi_{loc}(\xi_1) }{p}\right). \tag{11}\] The relative \(D\)-efficiency (11) may be interpreted in terms of sample size: if its value is \(\rho\), then \(\xi_1\) requires \(1/\rho\) times as many observations to perform as well as \(\xi_2\). This means that, when \(\xi_2\) is an optimal design, about \((1/\rho-1)\times 100\%\) more observations are required for design \(\xi_1\) to do as well as the optimal design. Minimax and Bayesian \(D\)-efficiencies are defined similarly by replacing \(\psi_{loc}\) with \(\psi_{min}\) and \(\psi_{bayes}\), respectively. In particular, (11) becomes the \(D\)-efficiency of \(\xi_1\) when \(\xi_2\) is the \(D\)-optimal design.
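The computation in (11) and its sample-size reading can be sketched in a few lines (Python, illustration only; the two FIMs below are arbitrary toy matrices, not from any real design):

```python
import numpy as np

def d_efficiency(M1, M2):
    """Relative D-efficiency (11) of design xi_1 w.r.t. xi_2, given p x p FIMs."""
    p = M1.shape[0]
    return (np.linalg.det(M1) / np.linalg.det(M2)) ** (1.0 / p)

# Toy FIMs with p = 2; here eff = 0.64, so xi_1 needs about
# (1/0.64 - 1) * 100% = 56.25% more observations than xi_2.
eff = d_efficiency(0.64 * np.eye(2), np.eye(2))
extra = 1.0 / eff - 1.0
```

The second expression in (11) gives the same number, since \(\psi_{loc} = -\log|M|\).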
When the design criterion is a convex functional, we can use the equivalence theorem to quantify the proximity of a design \(\xi\) to the optimal design, without knowing the latter, by means of the efficiency lower bound (ELB). For example, for \(D\)-optimality, we have \[\label{eq:ELB} ELB = \frac{p}{p + \max_{\boldsymbol{x}\in \chi}c(\boldsymbol{x}, \xi)}, \tag{12}\] where \(c(\boldsymbol{x}, \xi)\) is the sensitivity function associated with \(D\)-optimality. The value of the ratio in (12) lies between \(0\) and \(1\), and it equals \(1\) if and only if the design is optimal. Efficiency bounds are not unique and can be variously derived using similar arguments; for example, see Atwood (1969) and Pázman (1986).
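The ELB (12) is easy to verify on a case where the \(D\)-optimal design is known in closed form. For the simple linear model \(E(Y) = \theta_0 + \theta_1 x\) on \(\chi = [-1, 1]\), the \(D\)-optimal design puts equal weight on \(\pm 1\). The sketch below (Python, illustration only; the grid maximization stands in for \(\max_{\boldsymbol{x}\in\chi} c(\boldsymbol{x},\xi)\)) computes (12) for this model:

```python
import numpy as np

def elb_linear(points, weights, grid):
    """ELB (12) for D-optimality in E(Y) = th0 + th1*x, where
    I(x) = (1, x)(1, x)^T and p = 2; the max over chi is taken on a grid."""
    M = sum(w * np.outer([1.0, x], [1.0, x]) for x, w in zip(points, weights))
    Minv = np.linalg.inv(M)
    # sensitivity function c(x, xi) = tr{M^{-1} I(x)} - p
    sens = [np.trace(Minv @ np.outer([1.0, x], [1.0, x])) - 2.0 for x in grid]
    return 2.0 / (2.0 + max(sens))
```

The equal-weight design on \(\{-1, 1\}\) attains ELB \(= 1\) (it is optimal), while the equal-weight design on \(\{-1, 0, 1\}\) attains ELB \(= 0.8\), i.e., it is guaranteed to be at least 80% efficient.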
The imperialist competitive algorithm (ICA) is an evolutionary algorithm inspired by colonialism and the sociopolitical behavior of humans, where developed countries attempt to take over or colonize less-developed countries to use their resources and extend their power (Atashpaz-Gargari and Lucas 2007). Within the optimization framework, ICA has a population of solutions called countries. In optimal design problems, each country comprises the locations of the support points and the corresponding weights of a design in the space of all possible designs. ICA divides the population of countries into subpopulations called empires. Each empire contains one imperialist and some colonies. The imperialist is the most powerful country within the empire. Here, the power of a country is defined as a function of its cost value, i.e., its criterion value. This means that, in a minimization problem, countries with smaller cost values are stronger. In ICA, there are two types of evolutionary moves: a) evolution within each empire and b) evolution among the empires. In the former, the colonies within each empire move or are absorbed toward their imperialist country in a process called assimilation (Lin et al. 2013). During this process, a colony may reach a better position than its imperialist; in this case, the imperialist loses its rank and the colony becomes the new imperialist. Assimilation improves the search around the current best solutions and so enhances the exploitation capability of the algorithm.
The evolution among the empires is achieved by a process called imperialist competition. In this process, the most powerful empires receive more chances to take possession of the colonies of the weakest empires. The competition step in ICA improves the exploration of the algorithm in its search for the global optimum. When an empire no longer has any colonies, it is eliminated. ICA continues until its stopping rule is satisfied. For more details, see Atashpaz-Gargari and Lucas (2007) and Hosseini and Al Khaled (2014).
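The mechanics above can be condensed into a stripped-down sketch (Python, illustration only; it keeps assimilation, imperialist overtaking and a simplified competition step, and omits empire elimination and other refinements of the full algorithm). All names and tuning values here are ours, not ICAOD's:

```python
import numpy as np

def ica_minimize(cost, lb, ub, n_countries=40, n_emp=4, n_iter=300,
                 beta=2.0, seed=1):
    """Toy imperialist competitive algorithm for box-constrained minimization."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    d = len(lb)
    pop = rng.uniform(lb, ub, size=(n_countries, d))
    cst = np.array([cost(c) for c in pop])
    order = np.argsort(cst)
    # strongest countries become imperialists; the rest are colonies
    imp, imp_cst = pop[order[:n_emp]].copy(), cst[order[:n_emp]].copy()
    col = pop[order[n_emp:]].copy()
    owner = rng.integers(0, n_emp, size=len(col))
    for _ in range(n_iter):
        for j in range(len(col)):
            k = owner[j]
            # assimilation: move the colony toward its imperialist
            step = beta * rng.random(d) * (imp[k] - col[j])
            col[j] = np.clip(col[j] + step, lb, ub)
            c = cost(col[j])
            if c < imp_cst[k]:            # colony overtakes its imperialist
                imp[k], col[j] = col[j].copy(), imp[k].copy()
                imp_cst[k] = c
        # competition: the weakest empire hands one colony to the strongest
        weakest, strongest = int(np.argmax(imp_cst)), int(np.argmin(imp_cst))
        lost = np.flatnonzero(owner == weakest)
        if weakest != strongest and len(lost) > 0:
            owner[rng.choice(lost)] = strongest
    best = int(np.argmin(imp_cst))
    return imp[best], imp_cst[best]
```

Run on a simple test function such as the sphere function on \([-5,5]^2\), the best imperialist cost decreases monotonically and typically ends up close to the global minimum, though, as noted above, there is no rigorous convergence guarantee.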
To apply ICA to an optimal design problem, the user should first provide an initial guess for the number of support points \(k\ (\geq p)\). In practice, the user can start with \(p\) and increment its value by one until the equivalence theorem confirms the optimality of the current best design, which is the country with the smallest cost value. In optimal design problems, the ELB defined by (12) can be used to build a stopping rule for ICA. For example, the algorithm can be stopped when the value of the ELB of the current best design is larger than, say, \(0.95\). Clearly, computing the ELB in each iteration increases the CPU time required by the algorithm because another optimization problem has to be solved to find the maximum of the sensitivity function over \(\chi\). This is especially true for minimax- and Bayesian-type criteria, because the sensitivity function for the former involves solving a bilevel nested optimization problem and the latter requires approximating integrals. Therefore, we prefer to calculate the ELB periodically, say, every \(100\) iterations, instead of every iteration, to save CPU time.
Different functions are available to find optimal designs for nonlinear models in ICAOD: a) locally(): finds locally optimal designs, b) minimax(): finds minimax optimal designs, c) bayes(): finds Bayesian optimal designs and d) robust(): finds optimum-on-average or robust designs. Throughout this paper, we refer to them as "OD functions".
ICAOD uses the S3 object-oriented system and works with objects of class 'minimax'. The class 'minimax' has its own plot, print and update methods. The plot method plots the sensitivity function and also calculates the ELB for the output design. The print method displays a brief profile of the ICA iterations and a summary of the identified optimal design. The update method executes the algorithm for more iterations. By default, the OD functions determine \(D\)-optimal designs. In the section User-Specified Optimality Criteria, we demonstrate how to specify user-defined optimality criteria. In what follows, the OD functions are explained in detail.
The locally() function finds locally optimal designs and its main arguments are:

locally(formula, predvars, parvars, family = gaussian(), fimfunc = NULL,
        ICA.control = list(), sens.control = list(),
        lx, ux, k, iter, crt_func = NULL, sens_func = NULL,
        inipars)
The arguments in the first three lines of code are common to all OD functions. Table 1 provides an overview of them.
The arguments in the first line are required to construct the FIM of the model; inipars corresponds to \(\boldsymbol{\theta}_0\) in (4) and gives the vector of initial estimates for the model parameters.
The ICAOD package includes a formula interface to specify the model of interest. For example, assume the two-parameter logistic model defined by
\[\label{eq:logisticformula}
f(x, \boldsymbol{\theta}) = \frac{1}{1 + \exp(-b(x - a))}, \tag{13}\]
where \(\boldsymbol{\theta}= (a, b)\) is the vector of model parameters and \(x\) is the model predictor. To define (13) in ICAOD, we can set formula = ~1/(1 + exp(-b * (x - a))), predvars = "x", parvars = c("a", "b") and family = "binomial" (or family = binomial()). Alternatively, one may pass the FIM of (13) directly as an R function via the argument fimfunc. In this option, the arguments of the defined function must be a) x: a vector of the support points \((\boldsymbol{x}_1, ..., \boldsymbol{x}_k)\) in (2), b) w: a vector of the weights \((w_1, ..., w_k)\) in (2), and c) param: a vector of the parameters \(\boldsymbol{\theta}\) in (13). The output is the FIM of (13) evaluated at the given x, w and param, returned as a matrix.
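To make the (x, w, param) calling convention concrete, here is a sketch of such a FIM function for model (13) written in Python rather than R (illustration only; in practice fimfunc must be an R function, and the function name here is ours). For a binary response with mean \(f\), the elemental information matrix is \(I(x,\boldsymbol{\theta}) = \nabla f \nabla f^T / \{f(1-f)\}\), which for (13) reduces to \(f(1-f)\,\boldsymbol{u}\boldsymbol{u}^T\) with \(\boldsymbol{u} = (-b, x-a)^T\):

```python
import numpy as np

def fimfunc_2pl(x, w, param):
    """FIM of the two-parameter logistic model (13) in the (x, w, param) form.

    x: support points, w: weights, param: (a, b); returns a 2 x 2 matrix.
    """
    a, b = param
    M = np.zeros((2, 2))
    for xi, wi in zip(x, w):
        f = 1.0 / (1.0 + np.exp(-b * (xi - a)))
        u = np.array([-b, xi - a])   # gradient of the linear predictor b*(x - a)
        M += wi * f * (1.0 - f) * np.outer(u, u)
    return M
```

For a design symmetric about \(a\), such as equal weights on \(a \pm 1\), the off-diagonal entries cancel and the FIM is diagonal.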
The argument sens.control is a list of control parameters for nloptr(), available in the nloptr package (Johnson 2014). This function is used here to solve \(\max_{\boldsymbol{x}\in \chi}c(\boldsymbol{x}, \xi)\) for computing the ELB (12). When not given, the list is created automatically by the function sens.control(). We recommend not changing its default values, as they have been successfully tested on a large number of problems.
The crt_func and sens_func arguments are used to find user-defined optimal designs and are described in the section User-Specified Optimality Criteria.
The minimax() function finds minimax optimal designs and its main arguments are:

minimax(formula, predvars, parvars, family = gaussian(), fimfunc = NULL,
        ICA.control = list(), sens.control = list(),
        lx, ux, k, iter, crt_func = NULL, sens_func = NULL,
        n.grid = 0,
        lp, up, sens.minimax.control = list(), crt.minimax.control = list())
The first three lines of code are similar to those in locally(), and the remaining arguments are used to evaluate the minimax criterion (5) and its sensitivity function (10) at a given design. Table 2 presents an overview of the arguments specifically available in minimax().
In ICAOD, the parameter space \(\Theta\) can be either continuous or discrete. The lower and upper bounds of \(\Theta\) are specified via the arguments lp and up, respectively. When \(\Theta\) is continuous, ICAOD uses nloptr() to solve the inner maximization problem in (5) over \(\Theta\) at a given design. The default optimization algorithm in nloptr() is the DIRECT-L algorithm, a deterministic search algorithm based on the systematic division of the search domain into smaller and smaller hyperrectangles (Gablonsky and Kelley 2001). For our applications, the most influential tuning parameter of nloptr() is the maximum number of function evaluations maxeval (its default value is \(1000\)), which can be set via the crt.minimax.control argument. The parameter space may also be discretized. In this option, the total number of grid points is equal to n.grid^p. When specified, ICA evaluates the criterion at these grid points to solve the maximization problem over \(\Theta\).
Table 2: Overview of the arguments specific to minimax(), bayes() and robust().

| Argument | Function | Description |
|---|---|---|
| lp | minimax() | A vector of lower bounds for \(\boldsymbol{\theta}\). |
| up | minimax() | A vector of upper bounds for \(\boldsymbol{\theta}\). |
| n.grid | minimax() | (optional) When positive, the parameter space \(\Theta\) will be discretized, with the number of grid points equal to n.grid^p (defaults to \(0\)). |
| crt.minimax.control | minimax() | A list of control parameters for the function nloptr(), which is used to maximize the optimality criterion at a given design over \(\Theta\). By default, it is created by crt.minimax.control(). |
| sens.minimax.control | minimax() | A list of control parameters to find the answering set (9), which is required to obtain the sensitivity function and calculate the ELB. By default, it is created by sens.minimax.control(). For more details, see ?sens.minimax.control. |
| prior | bayes() | An object of class 'cprior' that contains the necessary information about the prior distribution of the unknown parameters \(\boldsymbol{\theta}\). For popular prior distributions, it can be created via the uniform(), normal(), skewnormal() and student() functions. For more details, see ?bayes. |
| crt.bayes.control | bayes() | A list of control parameters to approximate the integrals in (6), using either the hcubature() function (an adaptive multidimensional integration method over hypercubes) or the Gaussian quadrature formulas implemented in the mvQuad package. By default, it is created by crt.bayes.control(). |
| sens.bayes.control | bayes() | A list of control parameters required to approximate the integrals in (8). It is very similar to crt.bayes.control and by default is created by sens.bayes.control(). |
| prob | robust() | A vector of the probability measure associated with each vector of initial estimates for the unknown parameters \(\boldsymbol{\theta}\). |
| parset | robust() | A matrix in which each row is a vector of initial estimates for \(\boldsymbol{\theta}\). |
The bayes() function finds Bayesian optimal designs and its main arguments are:

bayes(formula, predvars, parvars, family = gaussian(), fimfunc = NULL,
      ICA.control = list(), sens.control = list(),
      lx, ux, k, iter, crt_func = NULL, sens_func = NULL,
      prior, crt.bayes.control = list(), sens.bayes.control = list())

The first three lines of code are similar to those in locally(), and the remaining arguments are used to approximate the integrals in (6) and (8) at a given design. Table 2 presents an overview of the arguments specifically available in bayes().
By default, ICAOD uses the hcubature() function from the cubature package (Johnson 2013; Narasimhan and Johnson 2017) to approximate the integrals. The function hcubature() implements an adaptive multidimensional integration method over hypercubes known as the hcubature algorithm (Genz and Malik 1980; Berntsen et al. 1991). For our applications, the most important tuning parameters of the hcubature algorithm are the maximum number of integrand evaluations maxEval (its default value is 50000) and a user-specified tolerance tol (its default value is 1e-5). The algorithm stops either when the integral error estimate is less than the integral estimate multiplied by tol or when it reaches the specified maximum number of function evaluations maxEval, whichever happens first. When the prior distribution is less diffuse, it is sometimes more efficient to reduce the value of maxEval to increase the speed of the hcubature algorithm. The control parameters of the hcubature() function can be set via the argument crt.bayes.control. Alternatively, ICAOD also offers the Gauss-Legendre and Gauss-Hermite quadrature formulas to approximate the integrals. These methods are implemented via the mvQuad package (Weiser 2016) and can be requested via the argument crt.bayes.control. For more details, see ?mvQuad::createNIGrid().
The robust() function finds optimum-on-average or robust designs and its main arguments are:

robust(formula, predvars, parvars, family = gaussian(), fimfunc = NULL,
       ICA.control = list(), sens.control = list(),
       lx, ux, k, iter, crt_func = NULL, sens_func = NULL,
       prob, parset)

The first three lines of code are similar to those in locally(), and the remaining arguments are used to evaluate the optimum-on-average criterion at a given design. Table 2 presents an overview of the arguments specifically available in robust().
In this section, we provide two examples to show the functionality of the ICAOD package in determining optimal designs. In the first example, we find locally and minimax \(D\)-optimal designs for a logistic model with an application in educational testing. In the second example, we find Bayesian and robust optimal designs for the sigmoid Emax model with applications in dose-response studies.
The logistic model is very popular for modeling binary outcomes. For example, consider an educational study on the effect of hours of practice on the mastery of a mathematical task. Let \(Y\) be a binary response variable that takes the value \(1\) if a subject has mastered the task and \(0\) otherwise. The logistic model is defined by \[\label{eq:logisticsingle} f(x, \boldsymbol{\theta}) = P(Y=1) = \frac{\exp(\beta_0 + \beta_1 x)}{1 + \exp(\beta_0 + \beta_1 x)}, \tag{14}\] where \(x\) is the hours of practice and \(\boldsymbol{\theta}= (\beta_0, \beta_1)^T\). Assume that for each subject up to six hours of practice are possible, i.e., \(x \in \chi = [0, 6]\). If the purpose of the study is to estimate the model parameters accurately, an appropriate criterion is \(D\)-optimality. The design questions here are: a) what is the best number of levels of \(x\) to apply in the study, b) what are these levels, and c) how many subjects should be assigned to each level? For example, a researcher may choose a uniform design that includes an equal number of subjects who have practiced for \(0, 1, 2, 3, 4, 5, 6\) hours. We denote this design by \[\label{equniformdesign} \xi_{uni} = \left\{ \begin{array}{ccccccc} 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ 1/7& 1/7 & 1/7& 1/7 & 1/7 & 1/7 & 1/7 \end{array} \right\}. \tag{15}\] The FIM of model (14) depends on the unknown parameters through \(\frac{\partial f(\boldsymbol{x},\boldsymbol{\theta})}{\partial \beta_j}, j = 0, 1\). Following Berger and Wong (2005), let \(\boldsymbol{\theta}_0 = (-4, 1.3333)^T\) be the best initial guess for \(\boldsymbol{\theta}\) available from, say, a similar study. In ICAOD, the locally \(D\)-optimal design is found by
R> library("ICAOD")
R> log1 <- locally(formula = ~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x)),
+                  predvars = "x", parvars = c("b0", "b1"),
+                  family = "binomial", lx = 0, ux = 6, iter = 40, k = 2,
+                  inipars = c(-4, 1.3333), ICA.control = list(rseed = 1))
R> print(log1)
Finding locally optimal designs
Call:
~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x))

   iter       x1       x2        w1        w2 min_cost mean_cost
1     1 1.897630 4.289279 0.4480275 0.5519725 3.585707  3.629564
5     5 1.735791 4.135709 0.4818501 0.5181499 3.574277  3.608956
9     9 1.812077 4.132323 0.4965689 0.5034311 3.569134  3.569134
14   14 1.831070 4.143704 0.4982596 0.5017404 3.568777  3.568777
18   18 1.845842 4.158716 0.4996087 0.5003913 3.568684  3.568684
22   22 1.842875 4.159136 0.4999992 0.5000008 3.568680  3.568680
27   27 1.842477 4.157866 0.4999457 0.5000543 3.568679  3.568679
31   31 1.842450 4.157647 0.4999709 0.5000291 3.568679  3.568679
35   35 1.842456 4.157634 0.4999973 0.5000027 3.568679  3.568679
40   40 1.842479 4.157646 0.4999987 0.5000013 3.568679  3.568679

Optimal designs (k=2):
 Points1 Points2
 1.84248 4.15765
 Weights1 Weights2
    0.500    0.500

ICA iteration: 40
Criterion value: 3.568679
Total number of function evaluations: 1768
Total number of successful local search moves: 76
Total number of successful revolution moves: 48
Convergence: Maximum_Iteration
Total number of successful assimilation moves: 701
CPU time: 1.09 seconds!
Throughout this paper, the rseed argument is used to guarantee the reproducibility of the results. The algorithm stopped at iteration number \(40\) because it reached the maximum number of iterations (iter = 40). Here, the design provided by the output assigns equal weights to \(x_1 = 1.84248\) and \(x_2 = 4.15765\). This means that half of the subjects should be assigned to practice slightly less than 2 hours and the other half should practice a little more than 4 hours. The \(D\)-criterion (4) evaluated at this design is equal to \(3.5687\). Alternatively, the optimal design at the final iteration and the detailed profiles of the ICA optimization at each iteration can be obtained by
R> log1$design
  iter       x1       x2        w1        w2 min_cost mean_cost max_sens elb
1   40 1.842479 4.157646 0.4999987 0.5000013 3.568679  3.568679       NA  NA
  time_sec
1       NA
R> log1$out
   iter       x1       x2        w1        w2 min_cost mean_cost
1     1 1.897630 4.289279 0.4480275 0.5519725 3.585707  3.629564
2     2 1.894919 4.287451 0.4875810 0.5124190 3.575292  3.619014
3     3 1.895518 4.285989 0.4875810 0.5124190 3.575159  3.616086
4     4 1.735791 4.135761 0.4818501 0.5181499 3.574278  3.613881
5     5 1.735791 4.135709 0.4818501 0.5181499 3.574277  3.608956
...
36   36 1.842457 4.157634 0.4999973 0.5000027 3.568679  3.568679
37   37 1.842457 4.157634 0.4999973 0.5000027 3.568679  3.568679
38   38 1.842460 4.157632 0.4999980 0.5000020 3.568679  3.568679
39   39 1.842481 4.157638 0.5000023 0.4999977 3.568679  3.568679
40   40 1.842479 4.157646 0.4999987 0.5000013 3.568679  3.568679
The plot of the sensitivity function of the design provided by the output and the value of the ELB are obtained by

R> plot(log1)
Maximum of the sensitivity function is 5.323248e-06
Efficiency lower bound (ELB) is 0.9999973
Verification required 0.33 seconds!
Figure 1: Output from plot(log1) (left) and plot(log3) (right): sensitivity plots obtained by the locally() function for the logistic model over \(\chi = [0,6]\) when \(\boldsymbol{\theta} = \boldsymbol{\theta}_0 = (-4, 1.3333)^T\). The left panel (a) verifies the global optimality of the obtained design and the right panel (b) does not verify the optimality of the obtained design. The solid red dots are the values of the sensitivity function at the obtained design points.
Figure 1 (a) displays the plot of the sensitivity function (7) of the design provided by the output on the design space \([0,6]\). Based on the equivalence theorem, this design is optimal because the sensitivity function is less than or equal to zero on \([0,6]\) and (roughly) equal to zero at \(1.84248\) and \(4.15765\) (see the red points). The value of the ELB is nearly \(1\), which also indicates the optimality of this design.
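Readers can verify the printed criterion and the equivalence-theorem conditions outside the package. The following Python sketch (an independent check, not ICAOD code) builds the 2x2 logistic FIM at the reported design and evaluates the \(D\)-optimality sensitivity function \(\mathrm{tr}(M^{-1} I(x)) - p\):

```python
import math

def p(x, b0, b1):
    """Success probability of the logistic model (14)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def fim(points, weights, b0, b1):
    """2x2 FIM of model (14): sum of w * p(1-p) * (1, x)(1, x)^T."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for x, w in zip(points, weights):
        v = w * p(x, b0, b1) * (1.0 - p(x, b0, b1))
        m[0][0] += v
        m[0][1] += v * x
        m[1][0] += v * x
        m[1][1] += v * x * x
    return m

b0, b1 = -4.0, 1.3333
pts, wts = [1.842479, 4.157646], [0.5, 0.5]
m = fim(pts, wts, b0, b1)
det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
criterion = -math.log(det)   # close to the printed value 3.568679

def sens(x):
    """Sensitivity function p(1-p) (1,x) M^{-1} (1,x)^T - 2 (p = 2 parameters)."""
    inv = [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]
    q = inv[0][0] + 2.0 * inv[0][1] * x + inv[1][1] * x * x
    return p(x, b0, b1) * (1.0 - p(x, b0, b1)) * q - 2.0
```

At the two support points sens() is essentially zero, while at interior points such as x = 3 it is strictly negative, exactly the pattern shown in Figure 1 (a).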
It is interesting to assess the performance of the uniform design \(\xi_{uni}\) with respect to the locally \(D\)-optimal design obtained above. Using (11), we can calculate the \(D\)-efficiency of \(\xi_{uni}\) relative to the locally \(D\)-optimal design by

R> leff(formula = ~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x)),
+       predvars = "x", parvars = c("b0", "b1"),
+       family = "binomial", inipars = c(-4, 1.3333),
+       x1 = c(0:6), w1 = rep(1/7, 7),
+       x2 = log1$evol[[20]]$x, w2 = log1$evol[[20]]$w)
[1] 0.7778719
The value of the relative \(D\)-efficiency indicates that \(\xi_{uni}\) requires about \(100(1/0.7779 - 1)\% \approx 29\%\) more subjects to have the same \(D\)-efficiency as the \(D\)-optimal design when \(\boldsymbol{\theta}= \boldsymbol{\theta}_0\). Therefore, having subjects practice, say, less than 1 hour or more than 5 hours will not increase the efficiency of the parameter estimates very much.
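The efficiency computation itself is just a ratio of FIM determinants, so the value returned by leff() can be checked independently of the package. A Python sketch under that reading of (11):

```python
import math

def logistic_fim_det(points, weights, b0, b1):
    """Determinant of the 2x2 FIM of model (14) at a given design."""
    s0 = s1 = s2 = 0.0
    for x, w in zip(points, weights):
        pr = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        v = w * pr * (1.0 - pr)
        s0 += v
        s1 += v * x
        s2 += v * x * x
    return s0 * s2 - s1 * s1

b0, b1 = -4.0, 1.3333
det_uni = logistic_fim_det(list(range(7)), [1.0 / 7.0] * 7, b0, b1)
det_opt = logistic_fim_det([1.842479, 4.157646], [0.5, 0.5], b0, b1)
d_eff = (det_uni / det_opt) ** 0.5          # D-efficiency, p = 2 parameters
extra_pct = 100.0 * (1.0 / d_eff - 1.0)     # extra subjects needed
```

The computed d_eff agrees with the leff() output to rounding, and extra_pct reproduces the roughly 29% figure quoted above.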
The value of the ELB may also be used to construct a stopping rule for ICA. This feature is activated via the ICA.control argument in all OD functions, as follows.
R> log2 <- locally(formula = ~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x)),
+                  predvars = "x", parvars = c("b0", "b1"),
+                  family = "binomial", lx = 0, ux = 6, iter = 40, k = 2,
+                  inipars = c(-4, 1.3333),
+                  ICA.control = list(rseed = 1,
+                                     checkfreq = 20,
+                                     stop_rule = "equivalence",
+                                     stoptol = .99))
R> print(log2)
Finding locally optimal designs
Call:
~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x))

   iter       x1       x2        w1        w2 min_cost mean_cost
1     1 1.897630 4.289279 0.4480275 0.5519725 3.585707  3.629564
3     3 1.895518 4.285989 0.4875810 0.5124190 3.575159  3.616086
5     5 1.735791 4.135709 0.4818501 0.5181499 3.574277  3.608956
7     7 1.818028 4.160041 0.4797011 0.5202989 3.570617  3.570617
9     9 1.812077 4.132323 0.4965689 0.5034311 3.569134  3.569134
11   11 1.827779 4.137145 0.4958561 0.5041439 3.568919  3.568919
13   13 1.844558 4.142393 0.4961667 0.5038333 3.568856  3.568856
15   15 1.845992 4.165776 0.4984264 0.5015736 3.568713  3.568713
17   17 1.845348 4.155565 0.4996783 0.5003217 3.568688  3.568688
20   20 1.842781 4.159234 0.4999992 0.5000008 3.568680  3.568680

Optimal designs (k=2):
 Points1 Points2
 1.84278 4.15923
 Weights1 Weights2
    0.500    0.500

ICA iteration: 20
Criterion value: 3.56868
Total number of function evaluations: 918
Total number of successful local search moves: 46
Total number of successful revolution moves: 46
Convergence: equivalence
Total number of successful assimilation moves: 345
CPU time: 0.81 seconds!

Maximum of the sensitivity function is 3.483904e-06
Efficiency lower bound (ELB) is 0.9999983
Verification required 0.39 seconds!

R> log2$design
  iter       x1       x2        w1        w2 min_cost mean_cost     max_sens
1   20 1.842781 4.159234 0.4999992 0.5000008  3.56868   3.56868 3.483904e-06
        elb time_sec
1 0.9999983     0.39
We set stop_rule = "equivalence" to activate the stopping rule that is based on the equivalence theorem. In this case, ICA calculates the ELB for the best design every checkfreq = 20 iterations and stops whenever the value of the ELB is larger than stoptol = 0.99. In this example, ICA stopped at the first check because the value of the ELB is 0.9999983 (> stoptol). Note that we requested the ELB calculation every 20 iterations, instead of every iteration, to prevent a significant increase in the CPU time. This equivalence-based stopping rule is also available in the other OD functions. However, we note that optimality verification for Bayesian or minimax-type criteria is more complicated and may slow down the ICA.
ICAOD can also handle the situation where the design points are prespecified, but their optimal associated weights are of interest. For example, assume that the experimental resources only allow prespecified hours of practice, say, \(x_1 = 1\), \(x_2 = 2\) and \(x_3 = 3\) hours. In all OD functions, the design points can be fixed via the argument x (a vector of design points):
R> log3 <- locally(formula = ~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x)),
+                  predvars = "x", parvars = c("b0", "b1"),
+                  family = "binomial", lx = 0, ux = 6, iter = 40,
+                  x = c(1, 2, 3),
+                  inipars = c(-4, 1.3333),
+                  ICA.control = list(rseed = 1, checkfreq = Inf))
R> print(log3)
Finding locally optimal designs
Call:
~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x))

   iter x1 x2 x3        w1           w2        w3 min_cost mean_cost
1     1  1  2  3 0.4528099 2.460808e-03 0.5447293 4.196368  4.205840
5     5  1  2  3 0.5106454 5.660663e-03 0.4836939 4.190011  4.190011
9     9  1  2  3 0.4993132 8.104013e-05 0.5006058 4.187368  4.187368
14   14  1  2  3 0.4993694 9.602963e-06 0.5006210 4.187346  4.187346
18   18  1  2  3 0.4998314 4.227502e-06 0.5001644 4.187343  4.187343
22   22  1  2  3 0.4998286 1.079043e-07 0.5001713 4.187342  4.187342
27   27  1  2  3 0.4999951 1.656952e-08 0.5000049 4.187342  4.187342
31   31  1  2  3 0.4999982 4.628899e-10 0.5000018 4.187342  4.187342
35   35  1  2  3 0.4999994 5.689118e-11 0.5000006 4.187342  4.187342
40   40  1  2  3 0.5000001 2.449702e-12 0.4999999 4.187342  4.187342

Optimal designs (k=3):
 Weights: 0.500 0.000 0.500

ICA iteration: 40
Criterion value: 4.187342
Total number of function evaluations: 1731
Total number of successful local search moves: 39
Total number of successful revolution moves: 30
Convergence: Maximum_Iteration
Total number of successful assimilation moves: 878
CPU time: 1.22 seconds!

Maximum of the sensitivity function is 2.558775
Efficiency lower bound (ELB) is 0.4387143
Verification required 0.35 seconds!
The results show that no weight should be assigned to the subjects with 2 hours of practice. This means that the responses from subjects with 2 hours of practice do not increase the efficiency of the estimation very much. Hence, this level may be eliminated to save resources.
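Because only the weights are free here, the criterion surface is easy to explore directly. The following Python sketch (independent of ICAOD) grid-searches the weight simplex on the fixed points x = 1, 2, 3 and recovers both the zero weight on x = 2 and the criterion value reported above:

```python
import math

def crit(w1, w2):
    """-log det of the logistic FIM on the fixed points x = 1, 2, 3
    with weights (w1, w2, 1 - w1 - w2) and theta_0 = (-4, 1.3333)."""
    b0, b1 = -4.0, 1.3333
    s0 = s1 = s2 = 0.0
    for x, w in zip([1.0, 2.0, 3.0], [w1, w2, 1.0 - w1 - w2]):
        pr = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        v = w * pr * (1.0 - pr)
        s0 += v
        s1 += v * x
        s2 += v * x * x
    d = s0 * s2 - s1 * s1
    # singular FIM (fewer than two supported points) -> infinite cost
    return math.inf if d <= 1e-12 else -math.log(d)

# Coarse grid search over the weight simplex (step 0.01)
best = min(
    ((crit(i / 100.0, j / 100.0), i / 100.0, j / 100.0)
     for i in range(101) for j in range(101 - i)),
    key=lambda t: t[0],
)
```

The minimizer lands at weights close to (0.5, 0, 0.5), matching the ICA result.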
The value of the ELB and the plot of the sensitivity function in Figure 1 (b) clearly show that the obtained design is not globally optimal. This comes as no surprise because the given design points in x do not belong to the support of the optimal design when \(\boldsymbol{\theta}= \boldsymbol{\theta}_0\). Note that checkfreq = Inf requests the optimality verification and sensitivity plot for the design provided by the output, so that a separate call to plot() is no longer required. To save space, we use this option in the rest of this paper.
Locally optimal designs usually lose their efficiency when the parameter estimates are far from their true unknown values. Moreover, in practice, it is more realistic to assume that the parameters belong to a parameter space rather than fixing their values at some points. For example, let \(\boldsymbol{\theta}= (\beta_0, \beta_1)^T\) belong to \(\Theta = [\beta_0^L, \beta_0^U] \times [\beta_1^L, \beta_1^U]\), where \(\beta_0^L = -6\), \(\beta_0^U = -2\), \(\beta_1^L = 0.5\) and \(\beta_1^U = 2\). As a conservative strategy, a minimax \(D\)-optimal design minimizes the maximum inefficiency over \(\Theta\). To find the minimax \(D\)-optimal design for our design setting, we first set \(k = 2\) to search within the class of two-point designs:
R> log4 <- minimax(formula = ~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x)),
+                  predvars = "x", parvars = c("b0", "b1"),
+                  family = "binomial",
+                  lx = 0, ux = 6, lp = c(-6, .5), up = c(-2, 2),
+                  iter = 200, k = 2,
+                  ICA.control = list(rseed = 1,
+                                     checkfreq = 50,
+                                     stop_rule = "equivalence",
+                                     stoptol = .99),
+                  crt.minimax.control = list(optslist = list(maxeval = 200)))
R> print(log4)
Finding minimax optimal designs
Call:
~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x))

    iter        x1       x2        w1        w2 min_cost mean_cost
1      1 0.4446832 4.868664 0.4243584 0.5756416 7.853582  8.061686
23    23 0.7665387 4.895727 0.4965758 0.5034242 7.782827  7.782827
45    45 0.7639494 4.895787 0.4999084 0.5000916 7.782754  7.782754
67    67 0.7636147 4.895791 0.5000079 0.4999921 7.782754  7.782754
89    89 0.7636144 4.895791 0.5000079 0.4999921 7.782754  7.782754
111  111 0.7636144 4.895791 0.5000079 0.4999921 7.782754  7.782754
133  133 0.7635408 4.895792 0.4999934 0.5000066 7.782754  7.782754
155  155 0.7635408 4.895792 0.4999934 0.5000066 7.782754  7.782754
177  177 0.7635408 4.895792 0.4999934 0.5000066 7.782754  7.782754
200  200 0.7635408 4.895792 0.4999934 0.5000066 7.782754  7.782754

Optimal designs (k=2):
 Points1 Points2
 0.76354 4.89579
 Weights1 Weights2
    0.500    0.500

ICA iteration: 200
Criterion value: 7.782754
Total number of function evaluations: 1710132
Total number of successful local search moves: 120
Total number of successful revolution moves: 60
Convergence: Maximum_Iteration
Total number of successful assimilation moves: 1007
Vector of maximum parameter values: -6 0.5
CPU time: 211.47 seconds!

Maximum of the sensitivity function is 21.9395
Efficiency lower bound (ELB) is 0.08354392
Verification required 0.72 seconds!
Adjust the control parameters in 'sens.minimax.control' ('n_seg')
or in 'sens.bayes.control' for higher speed.
To reduce the CPU time, we decreased the value of maxeval from its default value of \(1000\) to \(200\). Figure 2 (a) displays the sensitivity plot of the design provided by the output, and it does not verify the optimality of the two-point design.
Figure 2: Sensitivity plots for k = 2 (left) and k = 3 (right) obtained by the minimax() function for the logistic regression model over \(\chi = [0,6]\) when \(\Theta = [-6,-2] \times [0.5,2]\). The left panel (a) does not verify the optimality of the obtained design and the right panel (b) shows the near-optimality of the three-point design. The solid red dots are the values of the sensitivity function at the obtained design points.
Therefore, we increment the value of \(k\) by one and re-execute the above code:
R> log5 <- minimax(formula = ~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x)),
+                  predvars = "x", parvars = c("b0", "b1"),
+                  family = "binomial",
+                  lx = 0, ux = 6, lp = c(-6, .5), up = c(-2, 2),
+                  iter = 500, k = 3,
+                  ICA.control = list(rseed = 1,
+                                     checkfreq = 50,
+                                     stop_rule = "equivalence",
+                                     stoptol = .99),
+                  crt.minimax.control = list(optslist = list(maxeval = 200)))
R> print(log5)
Finding minimax optimal designs
Call:
~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x))

   iter        x1       x2       x3        w1        w2        w3 min_cost
1     1 1.0613974 2.577126 5.994463 0.1263308 0.5337391 0.3399301 6.862938
6     6 0.9212932 2.161076 5.999569 0.1092450 0.5392481 0.3515069 6.826309
11   11 0.9291691 2.163344 5.998872 0.1131382 0.4602769 0.4265849 6.760707
17   17 0.9358663 2.165053 5.998143 0.1131382 0.4602769 0.4265849 6.758872
22   22 1.0343686 2.163586 5.997440 0.1000379 0.4592324 0.4407297 6.749718
28   28 1.1201234 2.340084 5.995084 0.1526103 0.3628279 0.4845619 6.745782
33   33 1.0084352 2.232590 5.999449 0.1185027 0.3770129 0.5044843 6.738361
39   39 1.0003813 2.245542 5.999894 0.1094120 0.3941922 0.4963958 6.737553
44   44 1.0159978 2.225392 5.999982 0.1148592 0.3916467 0.4934942 6.737084
50   50 1.0269974 2.206648 5.999927 0.1135038 0.3915064 0.4949898 6.736338

    mean_cost
1    7.553929
6    6.951918
11   6.805511
17   6.776728
22   6.757709
28   6.748353
33   6.743650
39   6.742541
44   6.738447
50   6.737655

Optimal designs (k=3):
 Points1 Points2 Points3
 1.02700 2.20665 5.99993
 Weights1 Weights2 Weights3
    0.114    0.392    0.495

ICA iteration: 50
Criterion value: 6.736338
Total number of function evaluations: 511836
Total number of successful local search moves: 309
Total number of successful revolution moves: 92
Convergence: equivalence
Total number of successful assimilation moves: 407
Vector of maximum parameter values: -6 0.5
CPU time: 71.04 seconds!

Maximum of the sensitivity function is 0.01269528
Efficiency lower bound (ELB) is 0.9936924
Verification required 2.16 seconds!
Adjust the control parameters in 'sens.minimax.control' ('n_seg')
or in 'sens.bayes.control' for higher speed.

R> log5$design
  iter       x1       x2       x3        w1        w2        w3 min_cost
1   50 1.026997 2.206648 5.999927 0.1135038 0.3915064 0.4949898 6.736338
  mean_cost   max_sens       elb time_sec
1  6.737655 0.01269528 0.9936924     2.16
Figure 2 (b) displays the plot of the sensitivity function of the generated three-point design and indicates that it is nearly optimal. The optimal design suggests subjects with nearly \(1\), \(2\) and \(6\) hours of practice, where roughly half of the subjects should be assigned to practice for \(6\) hours.
Similar to the locally \(D\)-optimal case, we can assess the minimax \(D\)-efficiency of \(\xi_{uni}\) with respect to the minimax \(D\)-optimal design by
R> meff(formula = ~exp(b0 + b1 * x)/(1 + exp(b0 + b1 * x)),
+       predvars = "x", parvars = c("b0", "b1"),
+       family = "binomial",
+       lp = c(-6, .5), up = c(-2, 2),
+       x1 = c(0:6), w1 = rep(1/7, 7),
+       x2 = log5$evol[[20]]$x, w2 = log5$evol[[20]]$w)
[1] 0.7459795
This value indicates that \(\xi_{uni}\) requires about \(100(1/0.7460 - 1)\% \approx 34\%\) more subjects to have the same minimax \(D\)-efficiency as the minimax \(D\)-optimal design when \(\Theta = [-6, -2]\times[0.5, 2]\).
The sigmoid Emax model is commonly used in pharmacokinetics/pharmacodynamics to describe the S-shaped dose-response relationship (see, e.g., Macdougall 2006; Thomas 2006). This model is defined by \[\label{eq:sigmoidEmax} E(Y) = f(x, \boldsymbol{\theta}) = \beta_1 + (\beta_2-\beta_1) \frac{x^{\beta_4}}{x^{\beta_4} + \beta_3^{\beta_4}}, \tag{16}\] where \(x\) is the dose level (in mg), \(x \in \chi = (0, x_0]\), \(x_0\) is user-selected and \(\boldsymbol{\theta}= (\beta_1, \beta_2, \beta_3, \beta_4)^T\), \(\beta_2>\beta_1\), \(\beta_3>0\). All errors are assumed to be independent and normally distributed with mean zero and constant variance. Here, \(\beta_1\) is the minimum mean response, \(\beta_2\) is the maximum mean response, \(\beta_3\) is the ED50, i.e., the dose at which \(50\) percent of the maximum mean effect is achieved, and \(\beta_4\) is the slope parameter.
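A defining property of model (16) is that the mean response at \(x = \beta_3\) is exactly halfway between \(\beta_1\) and \(\beta_2\), which is easy to confirm numerically. A minimal Python sketch with an illustrative parameter vector (the values are only an example, taken from the prior bounds used later):

```python
def sigmoid_emax(x, b1, b2, b3, b4):
    """Mean response of the sigmoid Emax model (16)."""
    g = x ** b4 / (x ** b4 + b3 ** b4)
    return b1 + (b2 - b1) * g

# At the ED50 (x = b3 = 100) the mean response is halfway between b1 and b2
y50 = sigmoid_emax(100.0, 4.0, 11.0, 100.0, 5.0)   # (4 + 11) / 2 = 7.5
```

For \(\beta_4 > 0\) the response is strictly increasing in the dose, with \(\beta_4\) controlling how sharply it rises around the ED50.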
In dose-response studies, optimal designs usually determine how many doses should be tested, their levels, and how many subjects to allocate to each dose level. Let \(\chi = (0, 1000]\) mg. Similar to Dragalin et al. (2007) and Wang and Yang (2014), we are interested in the efficient estimation of \(\boldsymbol{\theta}\), and \(D\)-optimality is an appropriate design criterion for this purpose.
It is straightforward to show that the FIM of the sigmoid Emax model depends on the unknown parameters \(\boldsymbol{\theta}\). This parameter dependency must be dealt with based on the type of information available on \(\boldsymbol{\theta}\). For example, using information from a pilot study, one may elicit a uniform prior distribution for \(\boldsymbol{\theta}\) and search for Bayesian optimal designs. As an illustrative example, let \(\beta_1 \sim U(4, 8)\), \(\beta_2 \sim U(11, 15)\), \(\beta_3 \sim U(100, 130)\) and \(\beta_4 \sim U(5, 9)\), and let all the uniform prior distributions be independent. For simplicity, we denote the joint distribution of the independent uniform priors for \(\beta_i, i = 1, 2, 3, 4\) by \(\pi_{\Theta}\), where \(\Theta = [4, 8] \times [11, 15] \times[100,130] \times [5, 9]\) is the parameter space. This prior can be defined in ICAOD by the uniform() function as follows.
R> prior1 <- uniform(lower = c(4, 11, 100, 5), upper = c(8, 15, 130, 9))

Here, the output is an object of class ‘cprior’, which can be passed to the argument prior of the bayes() function.
To find the number of support points of the Bayesian \(D\)-optimal design, we repeated the same incremental process used for finding the minimax optimal design. This process is omitted here to save space. The Bayesian \(D\)-optimal design has \(5\) points in its support, which are found by
R> sig1 <- bayes(formula = ~b1 + (b2 - b1) * x^b4/(x^b4 + b3^b4),
+                predvars = "x",
+                parvars = c("b1", "b2", "b3", "b4"),
+                lx = .001, ux = 1000, k = 5, iter = 400, prior = prior1,
+                ICA.control = list(rseed = 1, checkfreq = Inf))
R> print(sig1)
Finding Bayesian optimal designs
Call:
~b1 + (b2 - b1) * x^b4/(x^b4 + b3^b4)

    iter         x1       x2       x3       x4        x5        w1        w2
1      1 18.0475346 96.25014 167.5667 179.3485  867.8058 0.2909521 0.1581145
45    45 17.4961312 89.11527 108.3659 137.0731  866.5956 0.2368958 0.1548766
89    89  0.4786294 95.25557 115.0471 138.2732  554.3556 0.2460043 0.2036432
134  134  1.0415205 94.75054 113.8961 138.3159  959.5441 0.2430275 0.1959315
178  178  0.7994201 94.64836 113.7667 138.3264  999.8505 0.2433772 0.1949845
222  222  1.8666601 94.60760 113.7090 138.3516  999.9069 0.2432310 0.1942385
267  267  0.5492615 94.60194 113.7011 138.3539  999.9987 0.2432085 0.1941306
311  311  0.4263460 94.60176 113.6966 138.3513 1000.0000 0.2432061 0.1941301
355  355  0.4164132 94.60189 113.6965 138.3510 1000.0000 0.2432041 0.1941322
400  400  0.1805450 94.60188 113.6964 138.3510 1000.0000 0.2432040 0.1941319

            w3          w4        w5 min_cost mean_cost
1    0.3418695 0.006883086 0.2021808 14.10396  15.82570
45   0.1428410 0.225597557 0.2397890 12.74113  12.74774
89   0.1043130 0.200569541 0.2454699 12.72302  12.72989
134  0.1140765 0.203490785 0.2434737 12.72086  12.72095
178  0.1150226 0.203128842 0.2434868 12.72082  12.72083
222  0.1158376 0.203138347 0.2435545 12.72082  12.72082
267  0.1159406 0.203152463 0.2435679 12.72082  12.72082
311  0.1159200 0.203173666 0.2435701 12.72082  12.72082
355  0.1159159 0.203177516 0.2435702 12.72082  12.72082
400  0.1159155 0.203178189 0.2435705 12.72082  12.72082

Optimal designs (k=5):
 Points1  Points2   Points3   Points4    Points5
 0.18055 94.60188 113.69639 138.35096 1000.00000
 Weights1 Weights2 Weights3 Weights4 Weights5
    0.243    0.194    0.116    0.203    0.244

ICA iteration: 400
Criterion value: 12.72082
Total number of function evaluations: 85150
Total number of successful local search moves: 2378
Total number of successful revolution moves: 81
Convergence: maxiter
Total number of successful assimilation moves: 1700
CPU time: 611.44 seconds!

Maximum of the sensitivity function is 9.439815e-07
Efficiency lower bound (ELB) is 0.9999998
Verification required 65.3 seconds!
Adjust the control parameters in 'sens.minimax.control' ('n_seg')
or in 'sens.bayes.control' for higher speed.

R> sig1$design
  iter       x1       x2       x3      x4   x5       w1        w2        w3
1  400 0.180545 94.60188 113.6964 138.351 1000 0.243204 0.1941319 0.1159155
         w4        w5 min_cost mean_cost     max_sens       elb time_sec
1 0.2031782 0.2435705 12.72082  12.72082 9.439815e-07 0.9999998     65.3
Figure 3 (a), generated from the output, presents the plot of the sensitivity function of the five-point design and verifies its optimality. In our example, the Bayesian \(D\)-optimal design suggests five dose levels, with four of them located below \(140\) mg and one located at the maximum dose. Roughly \(50\%\) of the observations should be assigned to the lower and upper bounds of the dose interval. Note that the result can also be obtained in less CPU time if we adjust the control parameters of the integral approximation via the argument crt.bayes.control.
Using a non-optimal design may be inefficient even when its design points are sampled uniformly from the design space. As an illustrative example, assume a situation where a researcher decides to work with an equally-weighted uniform design that has \(11\) points located at \(0.001, 100, 200, 300, \ldots, 1000\). This design is not optimal when \(\boldsymbol{\theta}\sim \pi_\Theta\). The Bayesian \(D\)-efficiency of the uniform design with respect to the obtained Bayesian \(D\)-optimal design is calculated by
R> beff(formula = ~b1 + (b2 - b1) * x^b4/(x^b4 + b3^b4),
+       predvars = "x",
+       parvars = c("b1", "b2", "b3", "b4"),
+       prior = prior1,
+       x1 = c(.001, seq(100, 1000, by = 100)),
+       w1 = rep(1/11, 11),
+       x2 = sig1$evol[[400]]$x, w2 = sig1$evol[[400]]$w)
[1] 0.3063289
The non-optimal design may seem reasonable, but its Bayesian \(D\)-efficiency value suggests that roughly \(226\%\) more observations are needed for the non-optimal design to attain the same \(D\)-efficiency as the Bayesian \(D\)-optimal design when \(\boldsymbol{\theta}\sim \pi_\Theta\). The bayes() function is very flexible and can incorporate different prior distributions.
ICAOD can also find robust or optimum-on-average designs when the prior distributions are discrete. As an illustrative example, assume that \(\Theta_0 = \{\boldsymbol{\theta}_{01}, \boldsymbol{\theta}_{02} , \boldsymbol{\theta}_{03} , \boldsymbol{\theta}_{04} , \boldsymbol{\theta}_{05}\}\) is a set of five vectors of initial estimates for \(\boldsymbol{\theta}= (\beta_1, \beta_2,\beta_3, \beta_4)\), where \(\boldsymbol{\theta}_{01} = (4, 11, 100, 5)\), \(\boldsymbol{\theta}_{02} = (5, 12, 110, 6)\), \(\boldsymbol{\theta}_{03} = (6, 13, 120, 7)\), \(\boldsymbol{\theta}_{04} = (8, 15, 130, 9)\) and \(\boldsymbol{\theta}_{05} = (12, 30, 160, 13)\). Let \(\pi_{\Theta_0}\) denote a discrete uniform prior distribution that assigns the same probability to each vector element of \(\Theta_0\). The six-point optimum-on-average design is given by
R> parset1 <- matrix(c(4, 11, 100, 5,
+                      5, 12, 110, 6,
+                      6, 13, 120, 7,
+                      8, 15, 130, 9,
+                      12, 30, 160, 13),
+                    nrow = 5, byrow = TRUE)
R> sig2 <- robust(formula = ~b1 + (b2 - b1) * x^b4/(x^b4 + b3^b4),
+                 predvars = "x",
+                 parvars = c("b1", "b2", "b3", "b4"),
+                 lx = .001, ux = 1000, k = 6, iter = 400,
+                 parset = parset1,
+                 prob = rep(1/5, 5),
+                 ICA.control = list(rseed = 1, checkfreq = Inf))
R> print(sig2)
Finding robust or optimum-on-average optimal designs
Call:
~b1 + (b2 - b1) * x^b4/(x^b4 + b3^b4)

    iter          x1       x2       x3       x4       x5        x6        w1
1      1 52.47474098 86.13143 108.6089 176.5576 847.1196  865.1056 0.3356308
45    45  0.03699118 93.84970 115.0752 144.3279 172.2496  612.5133 0.1921861
89    89  0.26057011 86.13743 112.6560 143.7663 170.6661  899.7495 0.1938690
134  134  0.27440675 86.41506 112.7178 143.7321 170.5697  999.4115 0.1997277
178  178  0.42978429 86.41957 112.7179 143.7262 170.5713  999.5817 0.2001652
222  222  0.86217693 86.41953 112.7092 143.7262 170.5733  999.9999 0.2001538
267  267  0.78134061 86.42156 112.7098 143.7250 170.5722 1000.0000 0.2001708
311  311  0.28372653 86.42155 112.7098 143.7248 170.5723 1000.0000 0.2001738
355  355  0.08123600 86.42156 112.7099 143.7248 170.5723 1000.0000 0.2001734
400  400  0.04980091 86.42158 112.7099 143.7248 170.5723 1000.0000 0.2001734

             w2        w3        w4         w5        w6 min_cost mean_cost
1    0.06870259 0.1056298 0.2022530 0.12858302 0.1592008 14.10402  16.35115
45   0.15186984 0.1444531 0.2155502 0.08196997 0.2139708 12.25422  12.31711
89   0.13222946 0.1570768 0.1882862 0.09776625 0.2307723 12.21447  12.28391
134  0.13190606 0.1549309 0.1855214 0.09816873 0.2297452 12.21398  12.28328
178  0.13154222 0.1547794 0.1858112 0.09840028 0.2293017 12.21398  12.28311
222  0.13149988 0.1548130 0.1857954 0.09845378 0.2292841 12.21398  12.28305
267  0.13150916 0.1547914 0.1857821 0.09847028 0.2292762 12.21398  12.27535
311  0.13150554 0.1547882 0.1857820 0.09847434 0.2292761 12.21398  12.26001
355  0.13150671 0.1547883 0.1857817 0.09847396 0.2292760 12.21398  12.21398
400  0.13150681 0.1547882 0.1857817 0.09847394 0.2292759 12.21398  12.21398

Optimal designs (k=6):
 Points1  Points2   Points3   Points4   Points5    Points6
 0.04980 86.42158 112.70988 143.72485 170.57227 1000.00000
 Weights1 Weights2 Weights3 Weights4 Weights5 Weights6
    0.200    0.132    0.155    0.186    0.098    0.229

ICA iteration: 400
Criterion value: 12.21398
Total number of function evaluations: 20070
Total number of successful local search moves: 3787
Total number of successful revolution moves: 88
Convergence: Maximum_Iteration
Total number of successful assimilation moves: 1895
CPU time: 29.56 seconds!

Maximum of the sensitivity function is 3.960066e-07
Efficiency lower bound (ELB) is 0.9999999
Verification required 2.27 seconds!

R> sig2$design
  iter         x1       x2       x3       x4       x5   x6        w1        w2
1  400 0.04980091 86.42158 112.7099 143.7248 170.5723 1000 0.2001734 0.1315068
         w3        w4         w5        w6 min_cost mean_cost     max_sens
1 0.1547882 0.1857817 0.09847394 0.2292759 12.21398  12.21398 3.960066e-07
        elb time_sec
1 0.9999999     2.27