This article illustrates the use of the bcmixed package and focuses on the two main functions: bcmarg and bcmmrm. The bcmarg function provides inference results for a marginal model of a mixed effect model using the Box–Cox transformation. The bcmmrm function provides model median inferences based on the mixed effect models for repeated measures analysis using the Box–Cox transformation for longitudinal randomized clinical trials. Using the bcmmrm function, analysis results with high power and high interpretability for treatment effects can be obtained for longitudinal randomized clinical trials with skewed outcomes. Further, the bcmixed package provides summarizing and visualization tools, which would be helpful to write clinical trial reports.
Longitudinal data are often observed in medical or biological research. One of the most popular statistical models for studying longitudinal continuous outcomes is the linear mixed effect model. Several packages are available from CRAN that allow for the implementation of linear mixed effects models (e.g., nlme (Pinheiro et al. 2021), glme (Sam Weerahandi et al. 2021), lme4 (Bates et al. 2015), CLME (Jelsema and Peddada 2016), PLmixed (Rockwood and Jeon 2019), MCMCglmm (Hadfield 2010)).The linear mixed effect models assume that longitudinal outcomes follow a multivariate normal distribution. However, the distribution of the outcome is often right skewed in the medical and biological fields. Therefore, evaluating fixed effects based on the normal distribution theory may result in inefficient inferences such as power loss for some statistical tests. In addition, although a model-based mean for a certain level of the categorical exploratory variables is often estimated when applying the linear mixed effect model (e.g., the model-based mean for each treatment group of the last visit in a randomized clinical trial), the mean may be inadequate as a representative value for the skewed data.
The Box–Cox transformation (Box and Cox 1964) is often applied to skewed longitudinal data (Lipsitz et al. 2000) to reduce the skewness of a skewed outcome and apply existing statistical models based on a normal distribution. However, it is difficult to directly interpret the model mean estimated on the scale after applying some transformations to the outcome variable.
For the sake of the interpretability of the analysis results, (Maruo et al. 2015) propose a model median inference method on the original scale based on the Box–Cox transformation in the context of randomized clinical trials. (Maruo et al. 2017) extend this method to the framework of mixed effects models for repeated measures (MMRM) analysis (Mallinckrodt et al. 2001) in the context of longitudinal randomized clinical trials.
The bcmixed package
(Maruo et al. 2020) contains functions to estimate model medians for
longitudinal data proposed by (Maruo et al. 2017), as well as a sample data
set that is used in (Maruo et al. 2017). In this package, the parameter
estimation is conducted partially using the gls function in the
nlme package. This paper
illustrates the usage of the package with the sample data in several
contexts.
The remainder of this manuscript is organized as follows: In section 2, we describe the methods proposed by (Maruo et al. 2017). Then, we illustrate the usage of the bcmixed package with the example data in section 3. Finally, our contributions are summarized in section 4.
We briefly introduce the method proposed by in (Maruo et al. 2017). The detailed expressions can be found in (Maruo et al. 2017).
Let the outcome be a continuous and positive value. The outcomes are measured over time for each subject \(i=1,\ldots,n\), and the number of planned measurement occasions is \(T\) (occasion index: \(t=1,\ldots,T\)). The outcome vector for the \(i\)th subject is denoted by \(\mathbf{y}_i=(y_{i1},\ldots,y_{in_i})^T\), where \(n_i\) is the number of measurements for the \(i\)th subject. We have \(n_i\le T\) because of missingness. Then, we consider applying the following model (Box and Cox 1964; Lipsitz et al. 2000): \[\label{model} \mathbf{z}_{i}=X_i\mathbf{\beta}+W_i\mathbf{b}_i+\mathbf{\epsilon}_i, \tag{1}\] where \(\mathbf{z}_i\) is the Box–Cox transformed outcome vector such that the \(j\)th element (\(j=1,\ldots,n_i\)) is denoted by \[z_{ij}=\left\{\begin{array}{ll} (y_{ij}^\lambda-1)/\lambda,&\lambda\neq 0,\\ \log y_{ij},&\lambda=0, \end{array} \right.\] and \(\lambda\) is the transformation parameter. The parameter \(\mathbf{\beta}\) is the \(p\)-dimensional vector of the fixed effect, \(\mathbf{b}_i\) is the \(q\)-dimensional vector of random effects distributed as \(MVN_q(\mathbf{0}_q, D)\), \(\mathbf{\epsilon}_i\) is the \(n_i\)-dimensional vector of random errors distributed independently as \(MVN_{n_i}(\mathbf{0}_{n_i}, \Sigma_i)\), and \(X_i\) and \(W_i\) are \(n_i \times p\) and \(n_i \times q\) design matrices relating the fixed and random effects, respectively. The random variables \(\mathbf{b}_i\) and \(\mathbf{\epsilon}_i\) are independent. From Formula ((1)), it is derived that \(\mathbf{z}_i|\lambda\sim MVN_{n_i}(X_i\mathbf{\beta}, V_i\)), where \(V_i = W_i D W_i^T + \Sigma_i\). In this paper, we consider situations where researchers have little interest in random effects and are interested in assessing fixed effects. In such cases, a simple formulation of the linear mixed effect model ((1)) can be implemented wherein the random effects are not explicitly modeled, but are rather included as part of the covariance matrix \(V_i\). We focus on such a “marginal” mean model. The covariance parameter vector in \(V=V_i\) for \(n_i=T\) (i.e., subjects with no missing values) is denoted as \(\mathbf{\alpha}=(\alpha_1,\ldots,\alpha_m)^T\). The dimension of \(\mathbf{\alpha}\), that is \(m\), depends on \(T\) and the specified covariance structure.
Let the model parameter vector be \(\mathbf{\theta}=(\lambda,\mathbf{\beta}^T,\mathbf{\alpha}^T)^T\). The maximum likelihood estimate of \(\mathbf{\theta}\) is obtained through maximizing the profile likelihood for \(\lambda\) (Lipsitz et al. 2000; Maruo et al. 2017). The model-based and robust variance estimators of the maximum likelihood estimator \(\hat{\mathbf{\theta}}\) are given by \(V_{\mathbf{\theta}}^{(M)}=\{-\hat{H}\}^{-1}\) and \(V_{\mathbf{\theta}}^{(R)}=\{-\hat{H}\}^{-1}\hat{J}\{-\hat{H}\}^{-1}\), respectively, where \(H=\frac{\partial^2}{\partial\mathbf{\theta}\partial\mathbf{\theta}^T}l(\mathbf{\theta})\), \(J=\sum_{i=1}^n\left\{\frac{\partial}{\partial\mathbf{\theta}}l_i(\mathbf{\theta})\right\}\left\{\frac{\partial}{\partial\mathbf{\theta}}l_i(\mathbf{\theta})\right\}^T\), \(l(\mathbf{\theta})\) is the likelihood function for \(n\) subjects, and \(l_i(\mathbf{\theta})\) is the likelihood function for the \(i\)th subject. The matrices \(\hat{H}\) and \(\hat{J}\) are obtained from the matrices \(H\) and \(J\) by replacing \(\mathbf{\theta}\) by \(\hat{\mathbf{\theta}}\), respectively. A robust variance estimator is an asymptotically valid estimator even when the model ((1)) is mis-specified (Cox 1961).
We now focus on the continuous and positive outcomes of a certain disease, and consider a situation in which the efficacy of some treatments (group index: \(g=1,...,G\)) is compared based on a randomized, parallel group clinical trial, where the total number of subjects is \(n\). The explanatory variable matrix, \(X_i\), and the fixed effect parameter, \(\mathbf{\beta}\) in model ((1)) are denoted in detail as follows. Now, \(X_i\) is given by the \(n_i\times(GT+C)\) matrix that contains the intercept, \(G-1\) group variables, \(T-1\) occasion variables, group-by-occasion interaction variables, and \(C\) covariates, where the categorical covariates are converted into dummy variables. The fixed effect parameter vector is given by \(\mathbf{\beta}=(\beta_I,\mathbf{\beta}_g^T,\mathbf{\beta}_t^T,\mathbf{\beta}_{gt}^T,\mathbf{\beta}_c^T)^T\), where \(\beta_I\), \(\mathbf{\beta}_g=(\beta_{g(1)},\ldots,\beta_{g(G-1)})^T\), \(\mathbf{\beta}_t=(\beta_{t(1)},\ldots,\beta_{t(T-1)})^T\), \(\mathbf{\beta}_{gt}=(\beta_{gt(1,1)},\beta_{gt(1,2)},\ldots,\beta_{gt(G-1,T-1)})^T\), and \(\mathbf{\beta}_c=(\beta_{c(1)},\ldots,\beta_{c(C)})^T\) are the intercept, group, occasion, group-by-occasion, and covariate parameter vectors, respectively. Although group \(G\) and occasion \(T\) is set to be at the reference level, we set \(\beta_{g(G)}=\beta_{t(T)}=\beta_{gt(G,t)}=\beta_{gt(g,T)}=0\) for the sake of description (\(g=1,\ldots,G,\) \(t=1,\ldots,T\)).
Under these settings, the model median, \(\xi_{(g,t)}\), for the treatment group \(g\) at the occasion \(t\) on the original scale is given by \[\xi_{(g,t)}=\left\{\lambda\left(\beta_I+\beta_{g(g)}+\beta_{t(t)}+\beta_{gt(g,t)}+\mathbf{x}_{\bar{c}}^T\mathbf{\beta}_c\right)+1\right\}^{1/\lambda},\] where \(\mathbf{x}_{\bar{c}}\) is the vector of the mean of each covariate for all subjects. The model median is the inverse Box–Cox transformation of the model means on the Box–Cox transformed scale. The model median can be easily interpreted because it is the estimator of the median on the original scale.
Using the delta method, the variance estimator of the maximum likelihood estimator for the model median, \(\hat{\xi}_{(g,t)}\), is given by \(V_{\xi(g,t)}^{(\cdot)}=\Delta_{\xi(g,t)}^TV_{\mathbf{\theta}}^{(\cdot)}\Delta_{\xi(g,t)},\) where \(\Delta_{\xi(g,t)}=\left.\frac{\partial}{\partial\mathbf{\theta}}\xi_{(g,t)}\right|_{\mathbf{\theta}=\hat{\mathbf{\theta}}}\). If we use \(V_{\mathbf{\theta}}^{(\cdot)}=V_{\mathbf{\theta}}^{(M)}\), we obtain the model-based variance estimator, \(V_{\xi(g,t)}^{(M)}\). On the other hand, the robust variance estimator, \(V_{xi(g,t)}^{(R)}\), is obtained if we use \(V_{\mathbf{\theta}}^{(\cdot)}=V_{\mathbf{\theta}}^{(R)}\).
Thus, the model median difference between groups \(g_1\) and \(g_2\) at occasion \(t\) is given by \(\delta_{(g_1;g_2,t)}=\xi_{(g_1,t)}-\xi_{(g_2,t)}\) and the variance estimators of the maximum likelihood estimator of the model median difference, \(\hat{\delta}_{(g_1;g_2,t)}\), is given by \(V_{\delta(g_1;g_2,t)}=(\Delta_{(g_1,t)}-\Delta_{(g_2,t)})^TV_{\mathbf{\theta}}^{(\cdot)}(\Delta_{(g_1,t)}-\Delta_{(g_2,t)})\). The model-based and robust variance estimators are obtained in the same way as the model median estimator. Using the asymptotic normality of the maximum likelihood estimator, the \(100(1-\alpha)\%\) confidence interval of \(\delta_{(g_1;g_2,t)}\) is obtained as \(\hat{\delta}_{(g_1;g_2,t)}\pm \Phi^{-1}(1-\alpha/2)\sqrt{V_{\delta(g_1;g_2,t)}^{(\cdot)}}\), where \(\Phi^{-1}(\cdot)\) is the quantile function of the standard normal distribution. The Wald-type test for the null hypothesis, \(\delta_{(g_1;g_2,t)}=0\), is performed with the test statistic \(t=\hat{\delta}_{(g_1;g_2,t)}/\sqrt{V_{\delta(g_1;g_2,t)}^{(\cdot)}}\).
The performances of the above-mentioned inference procedures may be low for a small sample because these are based on the asymptotic properties of the maximum likelihood estimation. Therefore, (Maruo et al. 2017) applied the following empirical small sample adjustment for the inferences of the model median differences by referring to the study in (Schluchter and Elashoff 1990).
They provide an adjusted standard error (SE) for the median difference
as \(\sqrt{M/(M-p)V_{\delta(g_1;g_2,t)}^{(\cdot)}}\) for the compound
symmetry (CS) or the first-order auto regression (AR(1)) structure and
\(\sqrt{n^*/(n^*-T)}\)
\(\times\sqrt{V_{\delta(g_1;g_2,t)}^{(\cdot)}}\) for the unstructured (UN)
structure. Further, we approximate the null distribution by the \(t\)
distribution where the degrees of freedom for the CS or the AR(1)
structure and the UN structure are \((n-G)(T-1)-m\) and \(n^*-T\),
respectively.
Although (Maruo et al. 2017) and (Maruo et al. 2020) focus only on the three covariance structures, these three options are sufficient in the applied settings because the MMRM analysis is often applied in the following steps. The UN structure is used, and the CS or AR(1) structure with the robust variance estimation is used when the parameter estimation process is not properly converged (e.g., see (Gosho et al. 2017)).
In this section, we describe the bcmixed package that provides the analysis results based on the mixed effect models with the Box–Cox transformation. The package bcmixed is available from the Comprehensive R Archive Network (CRAN) at https:// CRAN.R-project.org/package=bcmixed. The package bcmixed can be installed and loaded using the following code.
R> install.packages("bcmixed")
R> library(bcmixed)In particular, the following main functions are demonstrated in this article:
bcmarg: Inference on the marginal model of the mixed effect model with the Box–Cox transformation.
bcmmrm: Inference on the model median for longitudinal data in randomized clinical trials.
First, we illustrate a real example for the acquired immune deficiency
syndrome (AIDS) clinical trial data, which is stored as the data frame
aidscd4 in the bcmixed package. The data are from a randomized,
double-blind study of AIDS patients with advanced immune suppression
(cluster of differentiation 4 (CD4) cells count of less than or equal to
50 cells/mm3) (Henry et al. 1998; Fitzmaurice et al. 2011). Patients in the
AIDS clinical trial group study 193A were randomized to dual or triple
combinations of human immunodeficiency virus-1 reverse transcriptase
inhibitors. In particular, the patients were randomized to one of four
daily regimens. The original data can be downloaded from
https://content.sph.harvard.edu/fitzmaur/ala/ (Datasets->AIDS
Clinical Trial Group (ACTG) Study 193A). As for the more detailed data
handling process, see (Maruo et al. 2017). The data frame aidscd4
contains seven variables (Table 1).
| Variable | Description | 
|---|---|
| id | A patient identifier; in total there are 1177 patients. | 
| weekc | A visit variable (weeks 8, 16, 24, 32). | 
| treatment | Allocated treatment regimens; | 
| 1= zidovudine alternating monthly with 400mg didanosine, | |
| 2= zidovudine plus 2.25mg of zalcitabine, | |
| 3= zidovudine plus 400mg of didanosine, and | |
| 4= zidovudine plus 400mg of didanosine plus 400mg of nevirapine. | |
| age | Patients’ age (years). | 
| sex | Patients’ sex ( 1= male,0= female). | 
| cd4.bl | A baseline value of CD4 cells count + 1. | 
| cd4 | A CD4 cells count + 1. | 
Figure 1 shows the longitudinal box-whisker plot of the values of CD4 plus 1 for each group plotted using ggplot2 package (Wickham 2016). The distribution shapes of the measurements were heavily skewed.
This function provides the inference results for the marginal model of the mixed effect model with the Box–Cox transformation described in Section 2.1.
The bcmarg function is called using the following syntax.
bcmarg(formula, data, time = NULL, id = NULL, structure = "UN", 
  lmdint = c(-3, 3))The bcmarg function takes arguments tabulated in Table
2.
| Argument | Description | 
|---|---|
| formula | A two-sided linear formula object describing the model, with the response | 
| on the left of a ~ operator and the terms, separated by +operators, on the right. | |
| data | A data frame containing the variables used in the model. | 
| time | A time variable name for repeated measurements. The default is NULL. | 
| id | A subject id variable name for repeated measurements. The default is NULL. | 
| structure | Specify the covariance structure from c("UN", "CS", "AR(1)"). The default is"UN". | 
| lmdint | A vector containing the end points of the interval to be searched for a | 
| transformation parameter. The default is c(-3, 3). | 
If time and id are not specified, an ordinary linear model with the
Box–Cox transformation is applied.
The bcmarg function returns an object of class "bcmarg". The objects
of this class have methods for generic functions coef, logLik,
vcov, fitted, print, and summary. The object includes the
components for the marginal model parameter inference (Table
3).
| Component | Description | 
|---|---|
| lambda | A numeric value of the estimate of the transformation parameter. | 
| beta | A vector with the estimates of the regression parameters. | 
| alpha | A vector with the estimates of the covariance parameters. | 
| V | A variance-covariance matrix for any subject with no missing values. | 
| betainf | A matrix containing the inference results for betaunder the assumption | 
| that lambdais known. | |
| Vtheta.mod | A model-based variance-covariance matrix for MLE of the vector of all | 
| parameters: c(lambda, beta, alpha). | |
| Vtheta.rob | A robust variance-covariance matrix for MLE of the vector of all parameters. | 
| logLik | A numeric value of the maximized likelihood. | 
| adj.prm | A vector with parameters used for the empirical small sample adjustment in | 
| bcmmrm: c(number of subjects, number of completed subjects, number of | |
| outcome observations, number of missing observations). | |
| glsObject | An object of "gls"(or"lm"whentimeandidare not specified) containing | 
| results of gls(orlm) function on the transformed scale. | 
In bcmarg function, lambda is estimated with the optimize function
by maximizing the profile likelihood function for \(\lambda\). If an error
occurs in the optimize function, problems may be solved by changing
the search area for \(\lambda\), lmdint.
We applied a marginal model to the aidscd4 data, where the fixed
effects were the treatment, visit, treatment-visit interaction, and the
Box–Cox transformed baseline, where the visit was treated as a nominal
variable. The covariance structure was set as unstructured (default
setting). This model is frequently used for MMRM analysis. A sample code
is as follows. The bct.v function returns the Box–Cox transformed
vector.
R> data("aidscd4")
R> aidscd4$cd4.bl.tr <- bct.v(aidscd4$cd4.bl)$transformed
R> res1 <- bcmarg(formula = cd4 ~ as.factor(treatment) * as.factor(weekc) + 
+                 cd4.bl.tr, data = aidscd4, time = weekc, id = id)The summarized analysis results on the transformed scale are obtained by
applying the summary function to the "bcmarg" object as follows.
R> summary(res1)        
Box-Cox transformed mixed model analysis
  Formula: cd4 ~ as.factor(treatment) * as.factor(weekc) + cd4.bl.tr 
  Time: weekc 
  ID: id 
  Covariance structure: "UN" 
  Data: aidscd4 
  Log-likelihood: -13322.96
  Estimated transformation parameter:  0.154
Coefficients on the transformed scale:
                                           Value Std.Error t-value p-value
(Intercept)                               1.0849    0.1249   8.684   0.000
as.factor(treatment)2                     0.2454    0.1214   2.022   0.043
as.factor(treatment)3                     0.4203    0.1212   3.468   0.001
as.factor(treatment)4                     0.7649    0.1199   6.377   0.000
as.factor(weekc)16                       -0.2043    0.0843  -2.424   0.015
as.factor(weekc)24                       -0.4498    0.0899  -5.004   0.000
as.factor(weekc)32                       -0.6750    0.0988  -6.835   0.000
cd4.bl.tr                                 0.5782    0.0200  28.922   0.000
as.factor(treatment)2:as.factor(weekc)16 -0.1183    0.1185  -0.998   0.318
as.factor(treatment)3:as.factor(weekc)16 -0.0560    0.1180  -0.475   0.635
as.factor(treatment)4:as.factor(weekc)16  0.0675    0.1175   0.574   0.566
as.factor(treatment)2:as.factor(weekc)24 -0.1863    0.1264  -1.474   0.141
as.factor(treatment)3:as.factor(weekc)24 -0.0243    0.1264  -0.193   0.847
as.factor(treatment)4:as.factor(weekc)24  0.0870    0.1263   0.689   0.491
as.factor(treatment)2:as.factor(weekc)32 -0.0852    0.1414  -0.603   0.547
as.factor(treatment)3:as.factor(weekc)32 -0.0893    0.1400  -0.638   0.524
as.factor(treatment)4:as.factor(weekc)32  0.1414    0.1381   1.024   0.306
Covariance parameters on the transformed scale:
UN(1,1) UN(1,2) UN(1,3) UN(1,4) UN(2,2) UN(2,3) UN(2,4) UN(3,3) UN(3,4) UN(4,4) 
  1.798   1.156   1.105   0.927   1.957   1.346   1.298   1.903   1.452   2.009 
Correlations on the transformed scale:
       8    16    24    32
8  1.000 0.616 0.598 0.488
16 0.616 1.000 0.697 0.655
24 0.598 0.697 1.000 0.743
32 0.488 0.655 0.743 1.000The results of coefficients on the transformed scale are obtained with
the gls function in the
nlme package. The
transformation parameter was estimated as 0.154, which suggested that
the shape of the residual distribution was close to a multivariate
log-normal distribution. All main effects were significant; however,
treatment-by-week interaction was not significant at a level of 0.05.
Note that inference results for beta under the assumption that the
transformation parameter is known are provided. Although statistical
tests would be asymptotically valid (e.g., see (Doksum and Wong 1983)),
standard errors might be underestimated (e.g., see (Bickel and Doksum 1981)).
This function provides the results for the model median inferences for
longitudinal randomized clinical trial data described in Section
2.2. The parameter inference is conducted by calling the
bcmarg function in the bcmmrm function.
The bcmmrm function is called using the following syntax.
bcmmrm(outcome, group, data, time = NULL, id = NULL, covv = NULL, 
  cfactor = NULL, structure = "UN", lmdint = c(-3, 3), glabel = NULL, 
  tlabel = NULL)The bcmmrm function takes arguments tablated in Table
4.
| Argument | Description | 
|---|---|
| outcome | A name of positive outcome (dependent) variable included in data. | 
| group | A name of treatment group variable included in data. | 
| data | A data frame that may include outcome,group,time,id, and specified covariate | 
| variables. | |
| time | A name of time variable for repeated measurements included in data. | 
| The default is NULL. | |
| id | A name of subject id variable for repeated measurements included in data. | 
| The default is NULL. | |
| covv | A character vector with names of covariate variables included in data. | 
| The default is NULL. | |
| cfactor | An integer vector including nominal variable indicators for covariate variables. | 
| Nominal variable: 1, continuous variable:0. The default isNULL. | |
| structure | Specify the covariance structure from c("UN", "CS", "AR(1)"). The default is"UN". | 
| conf.level | A numeric value of the confidence level for the confidence intervals. | 
| The default is 0.95. | |
| lmdint | A vector containing end points of the interval to be searched for a transformation | 
| parameter. The default is c(-3, 3). | |
| glabel | A vector of length number of treatment groups containing the labels of group | 
| variable. The default is NULLand the levels ofgroupvariable indataare used. | |
| tlabel | A vector of length number of repeated measures containing the labels of time | 
| variable. The default is NULLand the levels oftimevariable indataare used. | 
If time and id are not specified, inference results reduce to the
results for the context of linear regression model provided by
(Maruo et al. 2015).
The bcmmrm function returns an object of class "bcmmrm" representing
the results of model median inference based on the Box–Cox transformed
MMRM analysis. Generic functions boxplot, coef, logLik, vcov,
fitted, plot, print, and summary have methods to demonstrate the
results of the fit. Components tablated in Table 5
must be included in a legitimate "bcmmrm" object.
| Component | Description | 
|---|---|
| call | A list containing an image of the bcmmrmcall that produced the object. | 
| median.mod, | Lists including inference results for the model medians for all groups. | 
| median.rob, | Levels of the list are time points, where the correspondence table is given | 
| median.mod.adj, | as time.tbl$code. mod: model-based inference, rob: robust inference, | 
| median.rob.adj | adj: with empirical small sample adjustment. | 
| meddif.mod, | Lists including inference results for the for the model median differences | 
| meddif.rob, | between all pairs of groups (group1 - group0). The levels of the list are | 
| meddif.mod.adj, | time points, where the correspondence table is given as time.tbl$code. | 
| meddif.rob.adj | mod: model-based inference,rob: robust inference, | 
| adj: with empirical small sample adjustment. | |
| lambda | A numeric value of estimates of the transformation parameter. | 
| R | A correlation matrix for transformed outcomes. | 
| betainf | Inference results for betaunder the assumption thatlambdais known. | 
| time.tbl | A data frame of a correspondence table for the time points. | 
| This object is used when applying the above generic functions. | |
| group.tbl | A data frame of a correspondence table for treatment groups. | 
| This object is used when applying the above generic functions. | |
| inf.marg | A "bcmarg"object with results of thebcmargfunction called in the | 
| bcmmrmfunction. | |
| outdata | A data frame where the transformed outcome ( ytr), the fitted values | 
| on the transformed scale ( ytr.fitted), and the residuals on | |
| the transformed scale ( res.tr) are added to input data. | |
| conf.level | A numeric value of the specified confidence level. | 
lambda, R, betainf, and inf.marg are obtained from the results
of the bcmarg function that is called in the bcmmrm function.
We applied the MMRM analysis with the Box–Cox transformation described
in Section 3.2 to the aidscd4 data, where the Box–Cox
transformed baseline and sex were included as the covariates. The
example code is as follows.
R> data("aidscd4")
R> aidscd4$cd4.bl.tr <- bct.v(aidscd4$cd4.bl)$transformed
R> res2 <- bcmmrm(outcome = cd4, group = treatment, data = aidscd4, time = weekc, 
+                id = id, covv = c("cd4.bl.tr", "sex"),  cfactor = c(0, 1), 
+                glabel = c("Zid/Did", "Zid+Zal", "Zid+Did", "Zid+Did+Nev")The transformed baseline and sex are continuous and categorical
variables, respectively, and therefore, cfactor was set as c(0,1).
The print function only provides information about model detail, the
estimated transformation parameter, the maximized log-likelihood, and
the model median estimate for each time point and group as follows.
R> print(res2)
Model median estimation based on MMRM with Box-Cox transformation
  Outcome: cd4 
  Group: treatment 
  Time: weekc 
  ID: id 
  Covariate(s): c("cd4.bl.tr", "sex") 
  Covariance structure: "UN" 
  Data: aidscd4 
  Estimated transformation parameter:  0.154 
  Log-likelihood: -13322.36
Model median estimates (row: group, col: time):
  treatment | weekc    8   16   24   32
1           Zid/Did 18.9 16.5 14.1 12.1
2           Zid+Zal 22.0 17.9 14.6 13.5
3           Zid+Did 24.5 20.9 18.2 15.1
4       Zid+Did+Nev 30.1 27.8 24.2 21.8The summary function provides more detailed analysis results as
follows.
R> summary(res2)
Model median inference based on MMRM with the Box-Cox transformation
Data and variable information:
  Outcome: cd4 
  Group: treatment 
  Time: weekc 
  ID: id 
  Covariate(s): c("cd4.bl.tr", "sex") 
  Data: aidscd4 
Analysis information:
  Covariance structure: "UN" 
  Robust inference: TRUE 
  Empirical small sample adjustment: TRUE 
  Confidence level: 0.95 
Analysis results:
  Estimated transformation parameter:  0.154 
 
Model median inferences for weekc = 8 
 
    treatment median    SE lower.CL upper.CL
1     Zid/Did   18.9 0.862     17.2     20.6
2     Zid+Zal   22.0 1.124     19.8     24.2
3     Zid+Did   24.5 1.465     21.6     27.4
4 Zid+Did+Nev   30.1 1.597     27.0     33.3
(...Omitted for weeks 16 and 24...)
Model median inferences for weekc = 32 
 
    treatment median    SE lower.CL upper.CL
1     Zid/Did   12.1 0.662     10.8     13.4
2     Zid+Zal   13.5 0.813     11.9     15.1
3     Zid+Did   15.1 1.019     13.1     17.1
4 Zid+Did+Nev   21.8 1.376     19.1     24.5
 
Inferences of model median difference between groups 
  ( treatment 1 - treatment 0 ) for weekc = 8 
 
  treatment 1 treatment 0 delta   SE lower.CL upper.CL t.value p.value
1     Zid+Zal     Zid/Did  3.12 1.40    0.363     5.87    2.22   0.027
2     Zid+Did     Zid/Did  5.64 1.69    2.325     8.96    3.34   0.001
3 Zid+Did+Nev     Zid/Did 11.25 1.80    7.711    14.80    6.24   0.000
4     Zid+Did     Zid+Zal  2.53 1.83   -1.059     6.12    1.39   0.167
5 Zid+Did+Nev     Zid+Zal  8.14 1.93    4.349    11.93    4.22   0.000
6 Zid+Did+Nev     Zid+Did  5.61 2.16    1.372     9.85    2.60   0.010
(...Omitted for weeks 16 and 24...)
 
Inferences of model median difference between groups 
  ( treatment 1 - treatment 0 ) for weekc = 32 
 
  treatment 1 treatment 0 delta   SE lower.CL upper.CL t.value p.value
1     Zid+Zal     Zid/Did  1.35 1.04   -0.692     3.40    1.30   0.194
2     Zid+Did     Zid/Did  3.00 1.20    0.633     5.37    2.49   0.013
3 Zid+Did+Nev     Zid/Did  9.70 1.52    6.705    12.69    6.37   0.000
4     Zid+Did     Zid+Zal  1.65 1.30   -0.907     4.20    1.27   0.206
5 Zid+Did+Nev     Zid+Zal  8.34 1.60    5.206    11.48    5.23   0.000
6 Zid+Did+Nev     Zid+Did  6.70 1.71    3.338    10.06    3.92   0.000Significant differences were detected for the following pairs Zid+Did vs. Zid/Did, Zid+Did+Nev vs. Zid/Did, Zid+Did+Nev vs. Zid+Zal, and Zid+Did+Nev vs. Zid+Did at week 32.
The summary function provides the results using the robust variance
and the small sample adjustment in the default settings. If users want
to summarize results using the model variance and not using the small
sample adjustment, specify
summary(bcmmrmObject, robust = F, ssadjust = F). Further details of
the summary function for the "bcmmrm" object can be obtained with
?summary.bcmmrm.
The inference results for the median differences at week 32 (fourth visit) can also be called as follows although the levels of the group variable in the data frame are used without formatting.
R> res2$meddif.rob.adj[[4]]
  group1 group0    delta       SE   lower.CL  upper.CL  t.value      p.value
1      2      1 1.354438 1.041338 -0.6922404  3.401117 1.300672 1.940595e-01
2      3      1 3.000942 1.204919  0.6327547  5.369129 2.490575 1.312570e-02
3      4      1 9.697631 1.522831  6.7046091 12.690653 6.368159 4.869820e-10
4      3      2 1.646503 1.299231 -0.9070469  4.200054 1.267291 2.057293e-01
5      4      2 8.343192 1.596236  5.2058987 11.480486 5.226792 2.682862e-07
6      4      3 6.696689 1.709027  3.3377114 10.055667 3.918421 1.034880e-04The "bcmmrm" object can be plotted with the plot function as follows
(Figure 2).
plot function to bcmmrm object.R> plot(res2, xlab = "Week", ylab = "CD4+1")The plot function provides a longitudinal plot in the default
settings. However, a plot at a specified time point can be drawn with
the following code (Figure 3):
R> plot(res2, timepoint = 32, xlab = "Treatment", ylab = "CD4+1", col = 1:4)plot function to bcmmrm object with timepoint
option.Further, the plot function provides the results using the robust
variance and the small sample adjustment in the default setting. Many
other options such as main and legend can be used in the plot
function. Further details can be obtained with ?plot.bcmmrm.
A diagnosis of a model fitting can be conducted with the boxplot
function, which provides a box-whisker plot of the Box–Cox transformed
residuals for each group. A sample code is provided as follows (Figure
4):
R> boxplot(res2)boxplot function to bcmmrm
object.The shape of the transformed residual for each group is not skewed and
the median and mean are close to each other, which suggests that the
median would not be biased at week 32. The boxplot function provides
the results at the last time points in the default settings. A
box-whisker plot at another time point can be obtained by specifying
boxplot(bcmmrmObject, timepoint = xx).
We demonstrated the use of the bcmixed package. The bcmarg function
provides the analysis results for a marginal model of a mixed effect
model with the Box–Cox transformation. The results for the statistical
test of the bcmarg function are meaningful. However, it is difficult
to interpret coefficients (\(\mathbf{\beta}\)) on the transformed scale.
The bcmmrm function provides the model median inferences based on the
MMRM with the Box–Cox transformation for longitudinal randomized
clinical trials. Using this function, treatment effects can be
interpreted as median differences between treatment groups at specified
time points. (Maruo et al. 2017) also show that this method controls the
type I error of the statistical test for the model median difference,
and it has moderate or high performance for power compared with the
existing methods (ordinary MMRM, MMRM for the log-transformed outcome,
etc.) from their simulation studies (the simulation program is provided
in https://github.com/kzkzmr/Maruo_2017_StatMed_Simulation with
penalized power results proposed by (Cavus et al. 2019)). Thus, bcmmrm
function analysis results with high power and high interpretability for
longitudinal randomized clinical trials with skewed outcomes. Further,
the bcmmrm function provides summarizing and visualization tools,
which would be helpful to write clinical trial reports.
Although the bcmixed package can be used for data other than that of
randomized clinical trials, the performances of these methods have not
been evaluated well yet. Therefore, users should use them carefully.
Although a model fitting can be diagnosed with the boxplot function,
more model diagnosis tools may be implemented in the future. Users might
apply the MissMech
package(Jamshidian et al. 2014, 2015), which diagnoses
multivariate normality and heteroscedasticity, to the transformed
residuals stored in "bcmarg" object. Note that the MissMech package
assumes missing mechanisms are missing completely at random (MCAR), and
statistical tests for model fittings may lead to significant results for
a medium-to-large sample even when the models fit the data adequately.
This work was supported by JSPS KAKENHI Grant Number JP19K11849.
bcmixed, nlme, glme, lme4, CLME, PLmixed, MCMCglmm, ggplot2, MissMech
Bayesian, ChemPhys, ClinicalTrials, Econometrics, Environmetrics, Finance, MixedModels, NetworkAnalysis, OfficialStatistics, Phylogenetics, Psychometrics, Spatial, SpatioTemporal, Survival, TeachingStatistics
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Maruo, et al., "bcmixed: A Package for Median Inference on Longitudinal Data with the Box--Cox Transformation", The R Journal, 2021
BibTeX citation
@article{RJ-2021-083,
  author = {Maruo, Kazushi and Ishii, Ryota and Yamaguchi, Yusuke and Gosho, Masahiko},
  title = {bcmixed: A Package for Median Inference on Longitudinal Data with the Box--Cox Transformation},
  journal = {The R Journal},
  year = {2021},
  note = {https://doi.org/10.32614/RJ-2021-083},
  doi = {10.32614/RJ-2021-083},
  volume = {13},
  issue = {2},
  issn = {2073-4859},
  pages = {253-265}
}