The R package gset calculates equivalence and futility boundaries based on exact bivariate non-central t-statistics.
Group sequential tests are repeated significance testing on data
accumulated during a study, in contrast to the traditional one-time
analysis at the end of the study. Since the same hypothesis is
repeatedly tested, it is critical to compute the proper critical values
at each interim analysis to keep the overall Type I error rate at a
prespecified nominal level. Applied appropriately, group sequential
designs (GSDs) can help save resources, shorten study duration, and
stop ineffective treatments much earlier than traditional
non-sequential designs. Existing software for GSDs, both commercial
and open-source, includes the PROC SEQDESIGN and PROC SEQTEST
procedures in SAS©, EAST© developed by Cytel, and the R packages
gsDesign (general GSDs and operating characteristics; Anderson (2014)),
GroupSeq (GSD via the Type I error spending approach; Pahl (2014)),
the ldBand function from Hmisc (GSD via the Lan-DeMets method), and
ldbands (boundary calculation using the Lan-DeMets method).
This discussion focuses on GSDs in studies with equivalence hypotheses.
Equivalence studies concern “equivalence” between two groups. Since
mathematical equivalence is impossible to establish, the concept of
“equivalence bounds” is often applied. Denote the parameter that
represents the dissimilarity in a response variable between two groups
by
To the best of our knowledge, no software package or tool
specifically targets GSDs of equivalence studies. Some existing
software applications that are not designed for GSDs in equivalence
studies can be tricked into running sequential tests of equivalence
hypotheses, but the tricking process itself can be confusing and
error-prone. Even if the tricking is successful, other potential
problems remain. First, many of the procedures are based on the
large sample normality assumption. Second, the futility boundaries are
calculated from testing
In this discussion, we introduce the R package gset that fills the
software gap in GSDs for studies with equivalence hypotheses. The
stagewise equivalence and futility boundaries, either binding or
nonbinding, are back-calculated from the equations that are formed based
on the exact dual
Denote by
The
where
The futility boundaries can be binding or non-binding, depending on whether the equivalence boundaries are affected by the action of stopping for futility. When the futility boundaries are binding, the decisions to stop for equivalence or futility are “competitive”: the trial stops at the first look at which either set of boundaries is crossed, and it must stop there, since otherwise the overall Type I error rate would be inflated. The equivalence and binding futility boundaries can be determined from the following equation system
In many practical cases, study sponsors want the option to continue the trial even when the interim data suggest stopping for futility. The futility boundaries calculated under this circumstance are referred to as “non-binding”. A study with non-binding futility boundaries may, but need not, stop for futility at an interim look when the futility boundaries are crossed. Non-binding futility gives a study more flexibility without inflating the Type I error rate. The equivalence and non-binding futility boundaries can be determined from Equations (4), (6), and (7), plus the equation below
In terms of the actual computation of the critical values, gset
employs the Monte Carlo (MC) simulation approach to calculate
equivalence and futility boundaries (either non-binding or binding) from
the two equation systems given above. Liu and Li (2014) prove that there are some
inherent constraints among the critical values including
In the last step, the equivalence and futility boundaries must meet so
that a final dichotomous decision on whether equivalence is achieved can
be made. If the originally planned
The package gset contains 8 functions (Table 1).
Among the 8 functions, 4 can be used to compute the equivalence and
futility boundaries (equivonly, nonbinding, binding, nminmax). The
futility critical values calculated by nonbinding and binding in the
last step are forced to equal the equivalence critical values by
default so that a dichotomous decision on equivalence can be made in
the last stage, though users can specify force = FALSE to remove this
constraint. The sample size at which the futility and equivalence
critical values naturally coincide in the last stage is calculated by
function nminmax, with the nominal Type I and II error rates
maintained. In other words, the futility and equivalence boundaries
agree naturally in the last stage at that sample size. One function
(nfix) computes the sample size of an equivalence study in the
traditional non-sequential setting. Two functions generate the
stagewise boundary plots (figureE, figureEF), and one function (oc)
examines the operating characteristics of a given GSD in equivalence
studies, including the empirical Type I error rate, empirical power,
expected sample size, and the probabilities of stopping at interim
looks due to equivalence or futility.
Function | Description |
---|---|
equivonly | computes equivalence boundaries for GSD in equivalence studies that only stop for equivalence |
nonbinding | computes equivalence and non-binding futility boundaries for GSD in equivalence studies |
binding | computes equivalence and binding futility boundaries for GSD in equivalence studies |
nminmax | calculates the sample size of an equivalence study in the sequential setting as well as the equivalence and futility boundaries (either binding or non-binding) under the calculated sample size |
nfix | calculates the sample size of an equivalence study in the traditional non-sequential setting |
figureE | generates the stagewise equivalence boundary plots |
figureEF | generates the stagewise equivalence and futility boundary plots |
oc | examines the operating characteristics of a given GSD in equivalence studies |
Table 2 lists the arguments to be supplied by users for
calculating equivalence and futility boundaries. To calculate the sample
size of an equivalence study in the traditional non-sequential setting
via nfix, besides l, u, theta, sigma, type1, type2 as listed in
Table 2, users also need to supply the sampling ratio between the two
groups and nrange, the interval from which the sample size is solved;
the default is c(0, 1000). To calculate the sample size of an
equivalence study in the GSD setting via nminmax, besides l, u, theta,
sigma, t.vec, type1, type2, gamma, n.sim as given in Table 2, users
also need to specify a logical argument binding (whether the futility
boundaries are binding) and n1.lower, n1.upper, n2.lower, n2.upper,
which represent the lower and upper bounds of the intervals from which
the sample sizes in groups 1 and 2 will be solved using a bisection
method. The boundary plots are generated directly from functions
equivonly, nonbinding and binding by default (users can suppress the
plots by specifying plot = FALSE), or they can be generated by
functions figureE and figureEF by taking the boundary outputs from
equivonly, nonbinding, binding, nminmax as their input.
Argument | Description |
---|---|
l | the lower equivalence bound as given in the equivalence hypothesis |
u | the upper equivalence bound as given in the equivalence hypothesis |
theta | the true mean difference between 2 groups |
sigma | between-subject standard deviation of the response variable |
n1 | size of group 1 |
n2 | size of group 2 |
t.vec | cumulative interim look time points assuming a constant accrual rate. For example, if a study has 4 equally spaced looks, then t.vec = 1:4/4. t.vec can be any vector as long as it is monotonically increasing and the last element is 1 |
type1 | overall Type I error rate |
type2 | overall Type II error rate |
gamma | the gamma parameter in the gamma cumulative error spending function. gamma is a scalar for equivonly, and a 2-dimensional vector for binding, nonbinding, nminmax; it can be any value; default is |
crange | a 2-dimensional vector containing the end-points of the interval from which the equivalence boundaries will be solved; default is c(-10, 10). |
drange | a 2-dimensional vector containing the end-points of the interval from which the futility boundaries will be solved; default is c(-10, 10). |
force | whether to force the futility boundaries to equal the equivalence boundaries in the last look; default is force = TRUE. |
plot | whether to generate the boundaries plot. Default plot = TRUE. |
ll | a parameter in the boundary plot; the short arm of the |
ul | a parameter in the boundary plot; the long arm of the |
n.sim | number of randomly simulated samples in the MC computation of the boundaries; default n.sim |
seed | seed used in the MC computation. If non-specified, the seed is set randomly. |
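The conditions on t.vec stated in Table 2 are easy to check before calling any boundary function. The sketch below is a minimal base-R illustration; check_tvec is a hypothetical helper, not a gset function, and the requirement that all time points be positive is an added assumption that Table 2 does not state.

```r
# Hypothetical helper (not part of gset): checks the t.vec conditions
# described in Table 2. Positivity of the time points is an assumption.
check_tvec <- function(t.vec) {
  is.numeric(t.vec) && length(t.vec) >= 1 &&
    all(t.vec > 0) &&                            # assumed: looks occur after time 0
    all(diff(t.vec) > 0) &&                      # monotonically increasing
    isTRUE(all.equal(t.vec[length(t.vec)], 1))   # last element is 1
}

ok_equal   <- check_tvec(1:4 / 4)               # 4 equally spaced looks
ok_unequal <- check_tvec(c(0.3, 0.6, 0.8, 1))   # unequally spaced looks
bad_order  <- check_tvec(c(0.5, 0.25, 1))       # not increasing
bad_end    <- check_tvec(c(0.25, 0.5, 0.75))    # does not end at 1
print(c(ok_equal, ok_unequal, bad_order, bad_end))
```

The first two vectors pass and the last two fail, which mirrors the two conditions given in the table.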
As for the error spending function employed by gset, the gamma error
spending function as introduced by Hwang et al. (1990) is used on both Type I and
Type II error. The function, using Type I error rate
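For reference, the gamma family of Hwang et al. (1990) is commonly written as α(t) = α(1 − exp(−γt)) / (1 − exp(−γ)) for γ ≠ 0, and α(t) = αt for γ = 0, where t is the information fraction. The sketch below implements this form in base R; spend_gamma is a hypothetical helper name, and the value γ = −4 is an inference, chosen because it reproduces the $typeI and $typeII vectors printed in the examples of this article, not a value quoted from the package documentation.

```r
# Gamma-family cumulative error spending (Hwang et al., 1990).
# `spend_gamma` is a hypothetical helper, not a gset function.
spend_gamma <- function(t, total, gamma) {
  stopifnot(all(t > 0 & t <= 1))
  if (gamma == 0) return(total * t)   # limiting case: linear spending
  total * (1 - exp(-gamma * t)) / (1 - exp(-gamma))
}

t.vec  <- 1:4 / 4                                        # 4 equally spaced looks
typeI  <- spend_gamma(t.vec, total = 0.05, gamma = -4)   # gamma = -4 is inferred
typeII <- spend_gamma(t.vec, total = 0.20, gamma = -4)
print(typeI)
print(typeII)
```

With these inputs the cumulative Type I error spent at the four looks agrees with the printed $typeI values (0.001602930, 0.005960146, 0.017804287, 0.05), and the Type II spending agrees with $typeII in the same way.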
In this section, we illustrate the implementation of package gset with
3 examples. In all examples, the base design is the complete crossover
design with 4 looks. The equivalence bounds in the hypothesis are
L = -0.2 and U = 0.2, as set in the code below, and n.sim is left at
its default value. Users can use set.seed to have
their results reproduced in different runs.
library(gset)
#### specify the parameters
L <- -0.2
U <- 0.2
theta <- 0
sigma <- 0.4
alpha <- 0.05
beta <- 0.2
K <- 4
r <- 1
#### sample size in the non-sequential setting
n.fix <- nfix(r, L, U, theta, sigma, alpha, beta)
The output is given below. The sample size is 69 for a non-sequential crossover design.
$n1
[1] 69
$n2
[1] 69
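As a rough sanity check on this number, the sketch below applies a textbook large-sample approximation for two one-sided tests with symmetric bounds and theta = 0. It is not the computation nfix performs: it uses normal rather than t quantiles, and it assumes the variance of the estimated mean difference is 2 * sigma^2 / n. The function name n_tost_normal is hypothetical.

```r
# Textbook normal approximation for the per-group sample size of an
# equivalence test with symmetric bounds and theta = 0.
# ASSUMPTIONS: Var(estimated difference) = 2 * sigma^2 / n; normal quantiles.
# This is NOT necessarily the formula used by nfix().
n_tost_normal <- function(l, u, sigma, type1, type2) {
  stopifnot(isTRUE(all.equal(-l, u)), u > 0)       # symmetric bounds only
  z <- qnorm(1 - type1) + qnorm(1 - type2 / 2)     # Type II error split over two sides
  ceiling(2 * sigma^2 * z^2 / u^2)
}

n.approx <- n_tost_normal(l = -0.2, u = 0.2, sigma = 0.4,
                          type1 = 0.05, type2 = 0.2)
print(n.approx)
```

Under these assumptions the approximation also gives 69 (the unrounded value is about 68.5). Agreement of a rough approximation with the exact result in one setting should not be read as a description of how the package works.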
Example 1: If a study considers only stopping for equivalence, then the following command computes the equivalence boundaries and generates the boundary plots.
bound1 <- equivonly(L, U, sigma, n.fix$n1, n.fix$n2, 1:K/K, alpha, beta)
#### the boundary plot can be regenerated by using figureE(bound1, K)
The output is given below. It contains the cumulative Type I error
spending and the equivalence critical values at each look. If
$typeI
[1] 0.001602930 0.005960146 0.017804287 0.050000000
$equivL
[1] 1.857541 2.151003 2.212045 1.739736
$equivU
[1] -1.857541 -2.151003 -2.212045 -1.739736
The operating characteristics of the GSD with the calculated equivalence
boundaries can be investigated by applying commands
oc(L, U, theta = L, sigma, K, 69, 69, bound1, futility = FALSE)
(under $H_0$) and
oc(L, U, theta = 0, sigma, K, 69, 69, bound1, futility = FALSE)
(under $H_1$).
Example 2: If a study considers stopping for equivalence and
futility but would like to have the flexibility to continue even if the
futility boundaries are crossed at an interim look, then the following
commands can be used to get the critical values for a GSD with
non-binding futility. By default, the futility boundaries in the last
step are forced to equal the equivalence boundaries; users can use
argument force = FALSE to remove the constraint.
bound2 <- nonbinding(L, U, theta, sigma, n.fix$n1, n.fix$n2, 1:K/K, alpha, beta)
### the boundary plot can be regenerated by using figureEF(bound2, K)
The output is given below. It contains the cumulative error spending and
the equivalence and non-binding futility boundaries at each look. Note
that the equivalence critical values with non-binding futility in theory
should be the same as the equivalence critical values in a study with
equivalence boundaries only. Comparing with the results from equivonly
above, we can see that the critical values are close but not exactly the
same due to the MC errors. If the same random seed had been used, the
results would have been the same.
$typeI
[1] 0.001602930 0.005960146 0.017804287 0.050000000
$typeII
[1] 0.006411721 0.023840584 0.071217148 0.200000000
$equivL
[1] 1.829343 2.160757 2.197812 1.721998
$equivU
[1] -1.829343 -2.160757 -2.197812 -1.721998
$futilL
[1] -1.2607491 -0.2314749 0.6712118 1.7219983
$futilU
[1] 1.2607491 0.2314749 -0.6712118 -1.7219983
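To make the role of the four boundary vectors concrete, the sketch below applies them to a pair of stagewise test statistics. The decision rule is an assumption inferred from the sign pattern of the output (positive $equivL, negative $equivU): equivalence is claimed when the lower statistic exceeds equivL and the upper statistic falls below equivU, and futility is flagged when the lower statistic falls below futilL or the upper statistic exceeds futilU. The function decide and the statistic values passed to it are hypothetical.

```r
# Hypothetical helper (not part of gset): decision at look k.
# ASSUMED rule: equivalence if tL > equivL[k] and tU < equivU[k];
#               futility    if tL < futilL[k] or  tU > futilU[k].
decide <- function(tL, tU, k, b) {
  if (tL > b$equivL[k] && tU < b$equivU[k]) return("equivalence")
  if (tL < b$futilL[k] || tU > b$futilU[k]) return("futility")
  "continue"
}

# Boundaries copied from the nonbinding() output above
bound2 <- list(
  equivL = c(1.829343, 2.160757, 2.197812, 1.721998),
  equivU = -c(1.829343, 2.160757, 2.197812, 1.721998),
  futilL = c(-1.2607491, -0.2314749, 0.6712118, 1.7219983),
  futilU = c(1.2607491, 0.2314749, -0.6712118, -1.7219983)
)

d1 <- decide(tL = 2.5,  tU = -2.4, k = 2, b = bound2)  # beyond both equivalence values
d2 <- decide(tL = 1.0,  tU = -0.9, k = 2, b = bound2)  # between the boundary sets
d3 <- decide(tL = -0.5, tU = -3.0, k = 2, b = bound2)  # tL below the futility value
print(c(d1, d2, d3))
```

With non-binding futility, a "futility" result at an interim look is advisory and the study may still continue. At the last look the futility and equivalence values coincide (up to print rounding), so under the assumed rule the final decision is dichotomous.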
The boundary plots are given in Figure 2. Since
The operating characteristics of the GSD with the calculated equivalence
and futility boundaries can be investigated by applying commands
oc(L, U, theta = L, sigma, K, 69, 69, bound2, futility = TRUE)
(under $H_0$) and
oc(L, U, theta = 0, sigma, K, 69, 69, bound2, futility = TRUE)
(under $H_1$).
Output under $H_0$ (theta = L):
$reject.rate
[1] 0.0479
$accept.rate
[1] 0.9521
$En1
[1] 66.9
$En2
[1] 66.9
$prob.stop
[1] 0.008025 0.022650 0.055450 0.913875
$prob.stopE
[1] 0.002075 0.004375 0.010900 0.030550
$prob.stopF
[1] 0.005950 0.018275 0.044550 0.883325
Output under $H_1$ (theta = 0):
$reject.rate
[1] 0.787275
$accept.rate
[1] 0.223075
$En1
[1] 62.3
$En2
[1] 62.3
$prob.stop
[1] 0.011450 0.052500 0.299125 0.636925
$prob.stopE
[1] 0.005325 0.034550 0.253925 0.493475
$prob.stopF
[1] 0.006125 0.017950 0.045200 0.153800
To calculate the sample size that yields the desired power (80%), the
following command can be used. The equivalence boundaries should remain
the same (except for some MC numerical errors) as those under
the original sample size.
bound3 <- nminmax(L, U, theta, sigma, 69, 69, 1:K/K, alpha, beta)
The outputs are given below.
$n1minmax
[1] 75
$n2minmax
[1] 75
$typeI
[1] 0.001602930 0.005960146 0.017804287 0.050000000
$typeII
[1] 0.006411721 0.023840584 0.071217148 0.200000000
$equivL
[1] 1.852941 2.193359 2.208099 1.729116
$equivU
[1] -1.852941 -2.193359 -2.208099 -1.729116
$futilL
[1] -1.2006119 -0.1403367 0.8067496 1.7291157
$futilU
[1] 1.2006119 0.1403367 -0.8067496 -1.7291157
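The size of the MC variation mentioned in the text can be read off directly from the printed results. The sketch below collects the $equivL vectors reported for equivonly, nonbinding and nminmax in this article and computes, per look, the range across the three runs. The numbers are copied from the outputs shown; nothing is recomputed with gset.

```r
# $equivL values copied from the three outputs printed in this article
equivL <- rbind(
  equivonly  = c(1.857541, 2.151003, 2.212045, 1.739736),
  nonbinding = c(1.829343, 2.160757, 2.197812, 1.721998),
  nminmax    = c(1.852941, 2.193359, 2.208099, 1.729116)
)
colnames(equivL) <- paste0("look", 1:4)

# Per-look spread (max minus min) across the three MC runs
spread <- apply(equivL, 2, function(x) diff(range(x)))
print(round(spread, 3))
print(max(spread))
```

The largest spread occurs at the second look and stays below 0.05 on the scale of the test statistic, which is consistent with the statement that the three sets of equivalence critical values differ only by MC error.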
The new sample size is 75. The boundary plot, which can be obtained
using figureEF(bound3, K), is similar to Figure 2 and is not
provided due to space limitation. The operating characteristics of the
GSD with the new sample size can be investigated by applying commands
oc(L, U, theta = L, sigma, K, 75, 75, bound3, futility = FALSE)
(under $H_0$) and
oc(L, U, theta = 0, sigma, K, 75, 75, bound3, futility = FALSE)
(under $H_1$).
Example 3: If a study plans to stop whenever the
equivalence or the futility boundaries are crossed at an interim
look, then the following command can be used. By default, the futility
boundaries in the last step are forced to equal the equivalence
boundaries; users can use argument force = FALSE to remove the
constraint.
bound4 <- binding(L, U, theta, sigma, n.fix$n1, n.fix$n2, 1:K/K, alpha, beta)
### the boundary plot can be regenerated by using figureEF(bound4, K)
The output is given below. The equivalence critical values (binding futility) are different from those from studies with nonbinding futilities (Example 2). The boundary plots are given in Figure 3. The futility critical values at the first and second interim looks appear above the identity line in this example, and the trial cannot stop for futility in the first two looks.
$typeI
[1] 0.001602930 0.005960146 0.017804287 0.050000000
$typeII
[1] 0.006411721 0.023840584 0.071217148 0.200000000
$equivL
[1] 1.806721 2.155557 2.220632 1.730838
$equivU
[1] -1.806721 -2.155557 -2.220632 -1.730838
$futilL
[1] -1.2525572 -0.2457909 0.6858693 1.7308381
$futilU
[1] 1.2525572 0.2457909 -0.6858693 -1.7308381
The operating characteristics of the GSD with the equivalence and
binding futility boundaries can be investigated by applying commands
oc(L, U, theta = L, sigma, K, 69, 69, bound4, futility = TRUE, binding = TRUE)
(under $H_0$) and
oc(L, U, theta = 0, sigma, K, 69, 69, bound4, futility = TRUE, binding = TRUE)
(under $H_1$).
Output under $H_0$ (theta = L):
$reject.rate
[1] 0.0491
$accept.rate
[1] 0.9509
$En1
[1] 46.7
$En2
[1] 46.7
$prob.stop
[1] 0.114075 0.306725 0.354850 0.224350
$prob.stopE
[1] 0.002075 0.004200 0.011050 0.031775
$prob.stopF
[1] 0.112000 0.302525 0.343800 0.192575
Output under $H_1$ (theta = 0):
$reject.rate
[1] 0.76775
$accept.rate
[1] 0.23225
$En1
[1] 61.9
$En2
[1] 61.9
$prob.stop
[1] 0.011675 0.052475 0.275500 0.660350
$prob.stopE
[1] 0.005325 0.036075 0.227675 0.498675
$prob.stopF
[1] 0.006350 0.016400 0.047825 0.161675
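The quantities in the two outputs above are linked by simple arithmetic, which the sketch below checks using values copied from the printed results (the first output, with theta = L, and the second, with theta = 0). The per-look sample size ceiling(n * t.vec) is an assumption; it is used here because it reproduces the printed $En1 values, not because the source states it.

```r
# Values copied from the two oc() outputs of Example 3 (binding futility)
n <- 69
t.vec <- 1:4 / 4
oc_first <- list(                      # output with theta = L
  reject.rate = 0.0491,
  prob.stop  = c(0.114075, 0.306725, 0.354850, 0.224350),
  prob.stopE = c(0.002075, 0.004200, 0.011050, 0.031775),
  prob.stopF = c(0.112000, 0.302525, 0.343800, 0.192575)
)
oc_second <- list(                     # output with theta = 0
  prob.stop = c(0.011675, 0.052475, 0.275500, 0.660350)
)

# ASSUMPTION: per-group sample size at look k is ceiling(n * t.vec[k])
n_look <- ceiling(n * t.vec)
expected_n <- function(p_stop) sum(n_look * p_stop)

En_first  <- round(expected_n(oc_first$prob.stop), 1)
En_second <- round(expected_n(oc_second$prob.stop), 1)
print(c(En_first, En_second))

# With binding futility, stopping at a look is either for equivalence or futility
print(all.equal(oc_first$prob.stop, oc_first$prob.stopE + oc_first$prob.stopF))
# The rejection rate is the total probability of stopping for equivalence
print(all.equal(sum(oc_first$prob.stopE), oc_first$reject.rate))
```

Under the stated assumption the expected sample sizes come out as 46.7 and 61.9, matching $En1 in the two outputs, and both consistency checks hold for the first output.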
To calculate the sample size that yields the desired power (80%) in
Example 3, the following command can be used. The equivalence boundaries
should remain the same (except for some MC numerical errors) as those
under the original sample size.
bound5 <- nminmax(L, U, theta, sigma, 69, 69, 1:K/K, alpha, beta, binding = TRUE)
The outputs are given below. The new sample size is 73. The boundary
plot, which can be obtained using figureEF(bound5, K), is not provided
due to space limitation. The operating characteristics of the GSD with
the new sample size can be investigated by applying commands
oc(L, U, theta = L, sigma, K, 73, 73, bound5, futility = FALSE, binding = TRUE)
(under $H_0$) and
oc(L, U, theta = 0, sigma, K, 73, 73, bound5, futility = FALSE, binding = TRUE)
(under $H_1$).
$n1minmax
[1] 73
$n2minmax
[1] 73
$typeI
[1] 0.001602930 0.005960146 0.017804287 0.050000000
$typeII
[1] 0.006411721 0.023840584 0.071217148 0.200000000
$equivL
[1] 1.881127 2.187860 2.217254 1.716097
$equivU
[1] -1.881127 -2.187860 -2.217254 -1.716097
$futilL
[1] -1.2035253 -0.1731546 0.7543735 1.7160975
$futilU
[1] 1.2035253 0.1731546 -0.7543735 -1.7160975
We have introduced the R package gset, which computes the stagewise critical values and sample size for testing equivalence hypotheses in GSDs. We outlined the underlying theory and computational approach on which gset is based, and illustrated the usage of the package with several GSD examples. gset can compute the critical values for GSDs that stop only for equivalence, or that stop for equivalence and futility, either binding or non-binding. It also produces 2-dimensional boundary plots, which give a direct visualization of the stagewise stopping boundaries. The operating characteristics of a proposed GSD can be examined in gset via the computation of empirical Type I error rates, empirical power, stopping probabilities at the interim looks, and expected sample sizes.
gset is the first package in R that specifically targets GSDs
with equivalence hypotheses. Furthermore, it is based on the exact
bivariate non-central t-statistics.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Liu, "gset: An R Package for Exact Sequential Test of Equivalence Hypothesis Based on Bivariate Non-Central t-Statistics", The R Journal, 2015
BibTeX citation
@article{RJ-2014-033, author = {Liu, Fang}, title = {gset: An R Package for Exact Sequential Test of Equivalence Hypothesis Based on Bivariate Non-Central t-Statistics}, journal = {The R Journal}, year = {2015}, note = {https://rjournal.github.io/}, volume = {6}, issue = {2}, issn = {2073-4859}, pages = {174-184} }