swgee: An R Package for Analyzing Longitudinal Data with Response Missingness and Covariate Measurement Error

Abstract:

Though longitudinal data often contain missing responses and error-prone covariates, relatively little work has been available to simultaneously correct for the effects of response missingness and covariate measurement error on analysis of longitudinal data. Yi (2008) proposed a simulation based marginal method to adjust for the bias induced by measurement error in covariates as well as by missingness in response. The proposed method focuses on modeling the marginal mean and variance structures, and the missing at random mechanism is assumed. Furthermore, the distribution of covariates is left unspecified. These features make the proposed method applicable to a broad range of settings. In this paper, we develop an R package, called swgee, which implements the method proposed by Yi (2008). Moreover, our package includes additional implementation steps which extend the setting considered by Yi (2008). To describe the use of the package and its main features, we report simulation studies and analyses of a data set arising from the Framingham Heart Study.

Published: Aug. 17, 2019

Received: Jul 31, 2018

Citation: Xiong & Yi, 2019

Volume: 11/1

Pages: 416 - 426


1 Introduction

Longitudinal studies are commonly conducted in the health sciences, biochemistry, and epidemiology; these studies typically collect repeated measurements on the same subject over time. Missing observations and covariate measurement error frequently arise in longitudinal studies, and they present considerable challenges in statistical inference about such data. It has been well documented that ignoring missing responses and covariate measurement error may lead to severely biased results, and thus to invalid inferences.

Regarding longitudinal data with missing responses, there are extensive methods such as maximum likelihood, multiple imputation, and the weighted generalized estimating equations (GEE) method. For handling measurement error in covariates, many methods have been developed for various settings. Comprehensive discussions can be found in monographs on measurement error such as Fuller (1987), Gustafson (2003), Carroll et al. (2006), Buonaccorsi (2010), and Yi (2017). However, there has been relatively little work on simultaneously addressing the effects of response missingness and covariate measurement error in longitudinal data analysis, although some work, such as Liu and Wu (2007), Wang et al. (2008), Yi (2008), and Yi et al. (2012), is available. In particular, Yi (2008) proposed an estimation method based on the marginal model for the response process, which does not require the full specification of the distribution of the response variable but models only the mean and variance structures. Furthermore, a functional method is applied to relax the need for modeling the covariate process. These features make the method of Yi (2008) flexible for many applications.

Relevant to our R package, a set of R packages and statistical software is available for performing GEE and weighted GEE analyses for longitudinal data with missing observations. In particular, packages gee and yags perform GEE analyses under the strong assumption of missing completely at random (MCAR). Package wgeesel can perform multiple model selection based on weighted GEE/GEE. Package geepack implements weighted GEE analyses under the missing at random (MAR) assumption, in which an optional vector of weights can be used in the fitting process but the weight vector has to be externally calculated. In addition, the statistical software SAS/STAT version 13.2 includes an experimental version of the procedure PROC GEE, which fits weighted GEE models.

Our swgee package has several features distinguishing it from existing packages. First, swgee is designed to analyze longitudinal data with both missing responses and error-prone covariates. To the best of our knowledge, this is the first R package that can simultaneously account for response missingness and covariate measurement error. Secondly, this simulation based marginal method can be applied to a broad range of problems because the associated model assumptions are minimal. swgee can be directly applied to handle continuous and binary responses as well as count data with dropouts under the MAR and MCAR mechanisms. Thirdly, observations are weighted inversely proportional to their probability of being observed, with weights calculated internally. Lastly, the swgee package employs the simulation extrapolation (SIMEX) algorithm to account for the effect of measurement error in covariates.

The remainder is organized as follows. Section 2 introduces the notation and model setup. In Section 3, we describe the method proposed by Yi (2008), and Section 4 presents its implementation in R. The developed R package is illustrated with simulation studies and analyses of a data set arising from the Framingham Heart Study in Section 5. General discussion is included in Section 6.

2 Notation and framework

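For $i=1,\ldots,n$ and $j=1,\ldots,m$, let $Y_{ij}$ be the response variable for subject $i$ at time point $j$, let $X_{ij}$ be the vector of covariates subject to error, and let $Z_{ij}$ be the vector of covariates which are error-free. Write $Y_i=(Y_{i1},Y_{i2},\ldots,Y_{im})^{\top}$, $X_i=(X_{i1}^{\top},X_{i2}^{\top},\ldots,X_{im}^{\top})^{\top}$, and $Z_i=(Z_{i1}^{\top},Z_{i2}^{\top},\ldots,Z_{im}^{\top})^{\top}$.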

Response model

For $i=1,\ldots,n$ and $j=1,\ldots,m$, let $\mu_{ij}=E(Y_{ij}\mid X_i,Z_i)$ and $v_{ij}=\mathrm{var}(Y_{ij}\mid X_i,Z_i)$ be the conditional expectation and variance of $Y_{ij}$, respectively, given the covariates $X_i$ and $Z_i$. We model the influence of the covariates on the marginal response mean by means of a regression model:

$$g(\mu_{ij})=X_{ij}^{\top}\beta_x+Z_{ij}^{\top}\beta_z, \tag{1}$$

where $\beta=(\beta_x^{\top},\beta_z^{\top})^{\top}$ is the vector of regression parameters and $g(\cdot)$ is a specified monotone function. The intercept term, if any, of the model may be included as the first element of $\beta_z$ by including the unit vector as the first column of $Z_i$.

To model the variance of $Y_{ij}$, we consider

$$v_{ij}=h(\mu_{ij};\phi), \tag{2}$$

where $h(\cdot\,;\cdot)$ is a given function and $\phi$ is the dispersion parameter that is known or to be estimated. We treat $\phi$ as known, with emphasis placed on estimation of the $\beta$ parameter. Here we assume that $E(Y_{ij}^{k}\mid X_i,Z_i)=E(Y_{ij}^{k}\mid X_{ij},Z_{ij})$ for $k=1$ and $2$, that is, the dependence of the mean $\mu_{ij}$ and the variance $v_{ij}$ on the subject-level covariates $X_i$ and $Z_i$ is completely reflected by the dependence on the time-specific covariates $X_{ij}$ and $Z_{ij}$. This assumption has been widely used in marginal analysis of longitudinal data, and its necessity has been discussed in the literature.
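
As a small illustration of models (1) and (2), the link $g(\cdot)$ and the variance function can be read off base R family objects. This is only a sketch: the object names below are ours, and the family variance functions exclude the dispersion $\phi$, which would multiply them in (2).

```r
# Link g(.) and variance function taken from base R family objects
fam_bin  <- binomial(link = "logit")      # binary response
fam_pois <- poisson(link = "log")         # count response
fam_norm <- gaussian(link = "identity")   # continuous response

mu <- 0.3
g_mu <- fam_bin$linkfun(mu)    # logit(mu) = log(mu / (1 - mu))
v_mu <- fam_bin$variance(mu)   # mu * (1 - mu)

v_pois <- fam_pois$variance(2.5)   # variance function equals the mean
v_norm <- fam_norm$variance(2.5)   # constant variance function (equal to 1)
```

These three families correspond to the binary, count, and continuous responses that swgee is stated to handle.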

Missing data model

For $i=1,\ldots,n$ and $j=1,\ldots,m$, let $O_{ij}$ be 1 if $Y_{ij}$ is observed and 0 otherwise, and let $O_i=(O_{i1},O_{i2},\ldots,O_{im})^{\top}$ be the vector of missing data indicators. Dropouts or monotone missing data patterns are considered here. That is, $O_{ij}=0$ implies $O_{ij'}=0$ for all $j'>j$. We assume that $O_{i1}=1$ for every subject $i$. To reflect the dynamic nature of the observation process over time, we assume an MAR mechanism for the missing process. That is, given the covariates, the missingness probability depends on the observed responses but not on the unobserved response components. Let $\lambda_{ij}=P(O_{ij}=1\mid O_{i,j-1}=1,X_i,Z_i,Y_i)$ and $\pi_{ij}=P(O_{ij}=1\mid X_i,Z_i,Y_i)$; then

$$\pi_{ij}=\prod_{t=2}^{j}\lambda_{it}. \tag{3}$$

Logistic regression models are used to model the dropout process:

$$\mathrm{logit}(\lambda_{ij})=u_{ij}^{\top}\alpha, \tag{4}$$

for $j=2,\ldots,m$, where $u_{ij}$ is the vector consisting of the information of the covariates $X_i$, $Z_i$ and the observed responses, and $\alpha$ is the vector of regression parameters. Write $\theta=(\alpha^{\top},\beta^{\top})^{\top}$ and let $q=\dim(\theta)$.
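
The product form of the observation probabilities can be checked numerically. In the sketch below, the conditional probabilities are made-up values for one hypothetical subject with five visits; only the cumulative-product relationship comes from the text.

```r
# Hypothetical conditional probabilities lambda_i2, ..., lambda_i5 of remaining
# in the study at visit j, given that the subject was observed at visit j - 1
lambda_i <- c(0.95, 0.90, 0.85, 0.80)

# pi_ij is the running product of the lambda_it; pi_i1 = 1 because O_i1 = 1
pi_i <- c(1, cumprod(lambda_i))

# Inverse probability weights for visits at which the response is observed
ipw_i <- 1 / pi_i
```

Later visits receive larger weights because an observed response there stands in for subjects who have already dropped out.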

Measurement error model

For $i=1,\ldots,n$ and $j=1,\ldots,m$, let $W_{ij}$ be the observed measurements of the covariates $X_{ij}$. Covariates $X_{ij}$ and their observed measurements $W_{ij}$ are assumed to follow a classical additive measurement error model:

$$W_{ij}=X_{ij}+e_{ij}, \tag{5}$$

where the $e_{ij}$ are independent of $X_i$, $Z_i$ and $Y_i$, and $e_{ij}$ follows $N(0,\Sigma_e)$ with covariance matrix $\Sigma_e$. This model has been widely used in the context of handling measurement error problems. Yi (2008) assumed that $\Sigma_e$ is known or can be estimated from replication experiments.
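
When replicate measurements are available, the error variance can be estimated from the within-subject spread of the replicates. The following is a minimal sketch for a single scalar covariate, using the standard pooled within-subject variance estimator; the sample size, number of replicates, and true error standard deviation are illustrative assumptions rather than values from the paper.

```r
set.seed(11)
n <- 500          # number of subject-visit pairs (hypothetical)
K <- 2            # replicates per pair (hypothetical)
sigma_e <- 0.5    # true measurement error standard deviation (hypothetical)

x <- rnorm(n)                                        # true covariate
W <- x + matrix(rnorm(n * K, sd = sigma_e), n, K)    # replicates W_k = X + e_k

# Pooled within-subject variance: the true covariate cancels in W - rowMeans(W)
sigma2_hat <- sum((W - rowMeans(W))^2) / (n * (K - 1))
```

The estimate `sigma2_hat` should lie close to `sigma_e^2 = 0.25`. For a vector of error-prone covariates, the analogous pooled within-subject covariance matrix would play the role of $\Sigma_e$.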

3 Methodology

Weighted estimation function

The inverse probability weighted generalized estimating equations method is often employed to accommodate the missing data effects when primary interest lies in the estimation of the marginal mean parameters $\beta$ in model (1). For $i=1,\ldots,n$, let $M_i$ be the random dropout time for subject $i$ and $m_i$ be a realization. Define

$$L_i(\alpha)=(1-\lambda_{im_i})\prod_{t=2}^{m_i-1}\lambda_{it},$$

where $\lambda_{it}$ is determined by model (4). Let $S_i(\alpha)=\partial\log L_i(\alpha)/\partial\alpha$ be the vector of score functions contributed from subject $i$. Let $D_i=\partial\mu_i^{\top}/\partial\beta$ be the matrix of the derivatives of the mean vector $\mu_i=(\mu_{i1},\ldots,\mu_{im})^{\top}$ with respect to $\beta$, and let $\Delta_i=\mathrm{diag}(I(O_{ij}=1)/\pi_{ij},\ j=1,2,\ldots,m)$ be the weight matrix accommodating missingness, where $I(\cdot)$ is the indicator function. Let $V_i=A_i^{1/2}C_iA_i^{1/2}$ be the conditional covariance matrix of $Y_i$, given $X_i$ and $Z_i$, where $A_i=\mathrm{diag}(v_{ij},\ j=1,2,\ldots,m)$ and $C_i=[\rho_{i;jk}]$ is the correlation matrix with diagonal elements equal to 1 and $\rho_{i;jk}$ being the conditional correlation coefficient of response components $Y_{ij}$ and $Y_{ik}$ for $j\neq k$, given $X_i$ and $Z_i$. Define $U_i(\theta)=D_iV_i^{-1}\Delta_i(Y_i-\mu_i)$ and

$$H_i(\theta)=\left(U_i^{\top}(\theta),S_i^{\top}(\alpha)\right)^{\top}. \tag{6}$$

In the absence of measurement error, that is, when covariates $X_{ij}$ are precisely observed, we have $E[H_i(\theta)]=0$. Hence, $H(\theta)=\sum_{i=1}^{n}H_i(\theta)$ are unbiased estimation functions for $\theta$. Under regularity conditions, the consistent estimator $\hat{\theta}$ of $\theta$ can be obtained by solving

$$H(\theta)=0, \tag{7}$$

where the weight matrix $\Delta_i$ is used to adjust for the contributions of subject $i$ with his/her missingness probabilities incorporated. Specifically, the probability $\pi_{ij}$ is determined by (3) in conjunction with (4). The correlation matrix $C_i$ can be replaced by its moment estimate or, alternatively, a working independence matrix $A_i$ may be used to replace $V_i$. A detailed discussion of these choices is available in the weighted GEE literature.
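
Under a working independence matrix and without measurement error, the weighted estimating equations for $\beta$ reduce to a weighted generalized linear model score equation, which can be sketched in base R. Everything below (the data-generating values, the dropout model depending only on the previous response, and the variable names) is an illustrative assumption and is not the swgee implementation; valid standard errors would additionally require a sandwich estimator that accounts for the estimation of $\alpha$.

```r
set.seed(1)
n <- 500; m <- 3
x <- matrix(rnorm(n * m), n, m)
y <- matrix(rbinom(n * m, 1, plogis(-0.5 + 0.8 * x)), n, m)

# Monotone dropout under MAR: O_i1 = 1, and a subject observed at visit j - 1
# stays at visit j with a probability depending on the previous response
obs <- matrix(1, n, m)
for (j in 2:m) {
  lam <- plogis(1.5 - 1.0 * y[, j - 1])
  obs[, j] <- obs[, j - 1] * rbinom(n, 1, lam)
}

# Fit the dropout model using only rows at risk (observed at visit j - 1)
risk <- data.frame(stay   = as.vector(obs[, 2:m]),
                   yprev  = as.vector(y[, 1:(m - 1)]),
                   atrisk = as.vector(obs[, 1:(m - 1)]))
fit_alpha <- glm(stay ~ yprev, family = binomial, data = risk,
                 subset = atrisk == 1)

# Estimated lambda_ij for j >= 2, then pi_ij as the running product
lam_hat <- matrix(predict(fit_alpha, newdata = risk, type = "response"),
                  n, m - 1)
pi_hat <- cbind(1, t(apply(lam_hat, 1, cumprod)))

# Weighted fit with weights I(O_ij = 1) / pi_ij; quasibinomial avoids the
# non-integer weight warning while giving the same point estimates
long <- data.frame(y = as.vector(y), x = as.vector(x),
                   w = as.vector(obs / pi_hat))
fit_beta <- glm(y ~ x, family = quasibinomial, data = long,
                weights = w, subset = w > 0)
coef(fit_beta)
```

Rows with unobserved responses receive weight zero and are dropped, so the fit uses only the observed data, reweighted to represent the full cohort.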

SIMEX approach

When measurement error is present in covariates $X_{ij}$, $H(\theta)$ is no longer unbiased if $X_{ij}$ is naively replaced with its observed measurement $W_{ij}$. Yi (2008) developed a simulation-extrapolation (SIMEX) method to adjust for the bias induced by using $W_{ij}$, as well as the missingness effects in the response variables. This method originates from the SIMEX method of Cook and Stefanski (1994), who considered cross-sectional data with measurement error alone. The basic idea of the SIMEX method is to first add additional variability to the observed measurement $W_{ij}$, then establish the trend of how different degrees of measurement error induce bias in estimation of the model parameters, and finally extrapolate this trend to the case of no measurement error.

Now, we describe the SIMEX method developed by Yi (2008). Let $B$ be a given positive integer and $\Lambda=\{\lambda_1,\lambda_2,\ldots,\lambda_M\}$ be a sequence of nonnegative numbers taken from $[0,\lambda_M]$ with $\lambda_1=0$.

The SIMEX approach is very appealing because it is simple to implement and does not require modeling the true covariates $X_i$. However, to use this method, several aspects need to be considered. As suggested in the SIMEX literature, the specification of $\Lambda$ is not unique; a typical choice of grid $\Lambda$ is the set of equally spaced cut points of the interval $[0,2]$ with $M=5$ or $9$. Choosing $B=100$ or $200$ is often sufficient for many applications. The quadratic regression function is commonly used for Step 3 and yields reasonable results.
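
The three SIMEX stages (simulation, estimation, extrapolation) can be sketched in a deliberately simplified setting. The example below is an assumption-laden illustration and not the swgee algorithm: it uses a single error-prone covariate, no missing data, and ordinary least squares in place of the weighted estimating equations, so that the effect of the extrapolation is easy to see. All numerical values are hypothetical.

```r
set.seed(2024)
n <- 2000
sigma_e <- 0.5                        # known measurement error SD (assumed)
x <- rnorm(n)                         # true covariate, unobserved in practice
w <- x + rnorm(n, sd = sigma_e)       # surrogate following the additive model
y <- 1 + 2 * x + rnorm(n)             # true slope is 2

B <- 100
lambda_grid <- seq(0, 2, by = 0.5)    # M = 5 equally spaced points on [0, 2]

# Simulation and estimation: for each lambda, add extra error with variance
# lambda * sigma_e^2, refit the naive model, and average the B estimates
slope_lambda <- sapply(lambda_grid, function(lam) {
  mean(replicate(B, {
    w_b <- w + sqrt(lam) * rnorm(n, sd = sigma_e)
    coef(lm(y ~ w_b))[2]
  }))
})

# Extrapolation: fit a quadratic in lambda and evaluate it at lambda = -1,
# which corresponds to the case of no measurement error
extrap <- lm(slope_lambda ~ lambda_grid + I(lambda_grid^2))
simex_slope <- unname(predict(extrap, newdata = data.frame(lambda_grid = -1)))
naive_slope <- slope_lambda[1]        # lambda = 0 reproduces the naive fit
```

Comparing `naive_slope` and `simex_slope` with the true slope shows the attenuation of the naive estimate and the partial correction obtained by quadratic extrapolation.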

Finally, we extend the method of Yi (2008) to accommodate the case where the covariance matrix $\Sigma_e$ for model (5) is unknown but repeated surrogate measurements of $X_{ij}$ are available. Let $W_{ijk}$ denote the repeated surrogate measurements of $X_{ij}$ for $i=1,\ldots,n$; $j=1,\ldots,m$; and $k=1,\ldots,K$. The surrogate measurements $W_{ijk}$ and the true covariate $X_{ij}$ are linked by the model

$$W_{ijk}=X_{ij}+e_{ijk}, \tag{8}$$

where the $e_{ijk}$ are independent of $X_i$, $Z_i$ and $Y_i$, and $e_{ijk}$ follows $N(0,\Sigma_e)$ with covariance matrix $\Sigma_e$. We now adapt the arguments of Devanarayan and Stefanski (2002) to modify the simulation step of the preceding SIMEX method. For a given $b$ and $\lambda\in\Lambda$, set

$$W_{ij}(b,\lambda)=\overline{W}_{ij}+\sqrt{\lambda/K}\sum_{k=1}^{K}c_{ijk}(b)W_{ijk}, \tag{9}$$

where $\overline{W}_{ij}=K^{-1}\sum_{k=1}^{K}W_{ijk}$ and $c_{ij}(b)=(c_{ij1}(b),\ldots,c_{ijK}(b))^{\top}$ is a normalized contrast satisfying $\sum_{k=1}^{K}c_{ijk}(b)=0$ and $\sum_{k=1}^{K}c_{ijk}^{2}(b)=1$.

A simple way to generate a contrast $c_{ij}(b)$ is to independently generate $K$ variates, $d_{ijk}(b)$, from $N(0,1)$ for $k=1,\ldots,K$ and a given $b$. Let $\overline{d}_{ij}(b)=K^{-1}\sum_{k=1}^{K}d_{ijk}(b)$. Then $c_{ijk}(b)$ is set as

$$c_{ijk}(b)=\frac{d_{ijk}(b)-\overline{d}_{ij}(b)}{\sqrt{\sum_{k=1}^{K}\{d_{ijk}(b)-\overline{d}_{ij}(b)\}^{2}}}.$$

Once $W_{ij}(b,\lambda)$ of (9) is available, we repeat Steps 2 and 3 to obtain the SIMEX estimator and the associated standard error.
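
The contrast construction can be sketched for a single scalar covariate and one $(i,j)$ pair. The replicate values, $K$, and $\lambda$ below are hypothetical, and the pseudo-measurement assumes the square-root scaling $\sqrt{\lambda/K}$ of the empirical SIMEX approach of Devanarayan and Stefanski (2002).

```r
set.seed(7)
K <- 3                         # number of replicates (hypothetical)
w_rep <- c(1.2, 0.9, 1.4)      # hypothetical replicate measurements
lambda <- 1.5                  # one value from the grid

# Normalized contrast: center K standard normal draws, then scale to unit length
d <- rnorm(K)
c_b <- (d - mean(d)) / sqrt(sum((d - mean(d))^2))

# Pseudo-measurement: replicate mean plus a scaled random contrast of replicates
w_b_lambda <- mean(w_rep) + sqrt(lambda / K) * sum(c_b * w_rep)
```

Because the contrast sums to zero, the true covariate cancels in `sum(c_b * w_rep)`, so the added term carries measurement error only, with variance inflated by the factor `lambda`.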

4 Implementation in R

We implement the SIMEX procedure described in Section 3 in R and develop the package, called swgee. Our package swgee takes advantage of the existing R packages geepack and mvtnorm. Specifically, the function swgee produces the estimates for elements of the parameter vector $\beta$, which are of primary interest, the associated standard errors, and P-values.

Our R function swgee requires the input data set to be sorted by subject $i$ and visit time $j$ for $i=1,\ldots,n$ and $j=1,\ldots,m$. If a subject is missing at a certain time, the corresponding measurements should be recorded as NAs. As long as the user provides the missing data model (4), the function swgee can internally generate the missing data indicators $O_{ij}$ for $i=1,\ldots,n$ and $j=1,\ldots,m$, and then apply the user specified model (4) to fit the data. The missingness probabilities $\pi_{ij}$ are calculated by (3) and then used to construct the weight matrix $\Delta_i$ for the estimating equation (6). The estimate of the missing data model (4) parameter $\alpha$ can also be retrieved from the function swgee output.

The form of calling function swgee is given by

    swgee(formula, data, id, family, corstr, missingmodel, SIMEXvariable,
        SIMEX.err, repeated = FALSE, repind = NULL, B, lambda)

where the arguments are described as follows:

5 Examples

An example data set

To illustrate the usage of the developed R package swgee, we apply the package to a subset of GWA13 (Genetic Analysis Workshops) data arising from the Framingham Heart Study. The data set consists of measurements of 100 patients from a series of exams with 5 assessments for each individual. Measurements such as height, weight, age, systolic blood pressure (SBP) and cholesterol level (CHOL) are collected at each assessment, and 14% of the patients dropped out of the study. The original data have been analyzed previously in the literature. It is of interest to study how an individual's obesity may change with age ($Z_{ij}$) and how it is associated with SBP ($X_{ij1}$) and CHOL ($X_{ij2}$), where $i=1,\ldots,100$ and $j=1,\ldots,5$. The response $Y_{ij}$ is the indicator of obesity status of subject $i$ at assessment $j$; SBP is rescaled as $\log(\mathrm{SBP}-50)$; and CHOL is standardized. The relationship between the response and the covariates is postulated to follow the logistic regression model:

$$\mathrm{logit}\,\mu_{ij}=\beta_0+\beta_{x1}X_{ij1}+\beta_{x2}X_{ij2}+\beta_zZ_{ij},$$

where $\beta_0$, $\beta_{x1}$, $\beta_{x2}$ and $\beta_z$ are regression coefficients of interest. We assume that errors in both risk factors $X_{ij1}$ and $X_{ij2}$ can be represented by model (5). The missing data process is characterized by the logistic regression model:

$$\mathrm{logit}\,\lambda_{ij}=\alpha_1+\alpha_2Y_{i,j-1}+\alpha_3X_{i,j-1,1}+\alpha_4X_{i,j-1,2}+\alpha_5Z_{i,j-1},$$

for $j=2,\ldots,5$.

We now apply the developed R package swgee, which can be downloaded from CRAN and then loaded in R:

    R> library("swgee")

Next, load the data, which must be properly organized with the variable names specified. In the example here, the data set BMI is included in the package; it is loaded and stored under the name bmidata by issuing

    R> data("BMI")
    R> bmidata <- BMI

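We are concerned with how measurement error in SBP and CHOL impacts estimation of the parameter $\beta=(\beta_0,\beta_{x1},\beta_{x2},\beta_z)^{\top}$. For illustrative purposes, we use the setting with $B=100$, $\lambda_M=2$ and $M=5$. In this example, we assume that the parameters in

$$\Sigma_e=\begin{pmatrix}\sigma_1^2 & \sigma_{12}\\ \sigma_{21} & \sigma_2^2\end{pmatrix}$$

with $\sigma_{12}=\sigma_{21}$ are known. This is a typical case when conducting sensitivity analysis. Here we set $\sigma_1=\sigma_2=0.5$ and $\sigma_{12}=\sigma_{21}=0$ as an example.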

The naive GEE approach without considering missingness and measurement error effects in covariates gives the output:

    R> library("gee")
    R> output1 <- gee(bbmi~sbp+chol+age, id=id, data=bmidata,
    +      family=binomial(link="logit"), corstr="independence")

    R> summary(output1)

     GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
     gee S-function, version 4.13 modified 98/01/27 (1998)

    Model:
     Link:                      Logit
     Variance to Mean Relation: Binomial
     Correlation Structure:     Independent

    Call:
    gee(formula = bbmi ~ sbp + chol + age, id = id, data = bmidata,
        family = binomial(link = "logit"), corstr = "independence")

    Summary of Residuals:
            Min          1Q      Median          3Q         Max
    -0.26533967 -0.11385369 -0.08572483 -0.06279540  0.95475735

    Coefficients:
                   Estimate Naive S.E.    Naive z Robust S.E.   Robust z
    (Intercept) -5.43746374 1.42090827 -3.8267521  1.64320527 -3.3090593
    sbp          0.59071183 0.30643396  1.9276970  0.24338420  2.4270755
    chol         0.11109496 0.13654324  0.8136247  0.23086218  0.4812177
    age          0.01297337 0.01339946  0.9682008  0.01814546  0.7149652

    Estimated Scale Parameter:  1.017131
    Number of Iterations:  1
    Working Correlation
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    0    0    0    0
    [2,]    0    1    0    0    0
    [3,]    0    0    1    0    0
    [4,]    0    0    0    1    0
    [5,]    0    0    0    0    1

To adjust for possible effects of missingness as well as measurement error in variables SBP and CHOL, we call the developed function swgee for the analysis:

    R> set.seed(1000)
    R> sigma <- diag(rep(0.25, 2))
    R> output2 <- swgee(bbmi~sbp+chol+age, data=bmidata, id=id,
    +     family=binomial(link="logit"), corstr="independence",
    +     missingmodel=O~bbmi+sbp+chol+age, SIMEXvariable=c("sbp","chol"),
    +     SIMEX.err=sigma, repeated=FALSE, B=100, lambda=seq(0, 2, 0.5))

    R> summary(output2)
    Call: beta
                 Estimate    StdErr t.value   p.value
    (Intercept) -8.004577  2.060967 -3.8839 0.0001028 ***
    sbp          1.196363  0.356868  3.3524 0.0008011 ***
    chol         0.099984  0.264180  0.3785 0.7050810
    age          0.012718  0.017201  0.7394 0.4596520
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Call: alpha
            Estimate    StdErr t.value  p.value
    alpha1  9.019084  3.086533  2.9221 0.003477 **
    alpha2 -0.786135  0.656843 -1.1968 0.231370
    alpha3 -0.568740  0.732885 -0.7760 0.437732
    alpha4 -0.128941  0.247757 -0.5204 0.602761
    alpha5 -0.064257  0.025982 -2.4731 0.013395 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The function swgee can store individual estimated coefficients in the simulation step, and this enables us to show the extrapolation curve through the developed R function plot.swgee. The plot.swgee function plots the extrapolation of the estimate of each covariate effect with the quadratic extrapolants. Figure 1 displays the graph for the variable SBP in the example for which the quadratic extrapolation function is applied from the following command:

    R> plot(output2,"sbp")
Figure 1: Display of the SIMEX estimate for the example: the dot is the SIMEX estimate obtained from the quadratic extrapolation.

Simulation studies

In this section, we conduct simulation studies to investigate the impact of ignoring covariate measurement error and response missingness on estimation, where the implementation is carried out using the usual GEE method. Furthermore, we assess the performance of the swgee method, which accommodates the effects induced by error-prone covariates and missing responses. We set $n=200$ and $m=3$, and generate 500 simulations for each parameter configuration. Consider the logistic regression model

$$\mathrm{logit}(\mu_{ij})=\beta_0+\beta_{x1}x_{ij1}+\beta_{x2}x_{ij2}+\beta_zz_{ij},$$

where $\beta_0=0$, $\beta_{x1}=\log(1.5)$, $\beta_{x2}=\log(1.5)$, $\beta_z=\log(0.75)$, and $z_{ij}$ is generated independently from $\mathrm{Bin}(1,0.5)$ to represent a balanced design. The true covariate $X_{ij}=(x_{ij1},x_{ij2})^{\top}$ is generated from the normal distribution $N(\mu_x,\Sigma_x)$, where $\mu_x=(0.5,0.5)^{\top}$ and

$$\Sigma_x=\begin{pmatrix}\sigma_{x1}^2 & \rho_x\sigma_{x1}\sigma_{x2}\\ \rho_x\sigma_{x1}\sigma_{x2} & \sigma_{x2}^2\end{pmatrix}$$

with $\sigma_{x1}=\sigma_{x2}=1$. The surrogate value $W_{ij}=(W_{ij1},W_{ij2})^{\top}$ is generated from $N(X_{ij},\Sigma_e)$ with

$$\Sigma_e=\begin{pmatrix}\sigma_1^2 & \rho\sigma_1\sigma_2\\ \rho\sigma_1\sigma_2 & \sigma_2^2\end{pmatrix}.$$

Both $\rho$ and $\rho_x$ are set to 0.50 to represent moderate correlations. To feature minor, moderate and severe degrees of measurement error, we consider $\sigma_1,\sigma_2=0.25$, $0.50$ or $0.75$. The missing data indicator is generated from model (4), where $\alpha_0=\alpha_1=0.5$, $\alpha_2=\alpha_3=0.1$, and $\alpha_z=0.2$. In implementing the swgee method, we choose $B=100$, $\lambda_M=2$, $M=5$, and a quadratic regression for each extrapolation step.
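
One replicate of the complete (pre-dropout) data for this design could be generated along the following lines. This is a sketch under stated assumptions: the helper `rmvn`, the variable names, and the choice $\sigma_1=\sigma_2=0.5$ are ours, and the dropout step is omitted because the exact covariates entering model (4) in the simulation are not spelled out in the text.

```r
set.seed(123)
n <- 200; m <- 3
beta <- c(b0 = 0, bx1 = log(1.5), bx2 = log(1.5), bz = log(0.75))

# Hypothetical helper: draw rows from N(mu, Sigma) via a Cholesky factor
rmvn <- function(n, mu, Sigma) {
  z <- matrix(rnorm(n * length(mu)), n)
  sweep(z %*% chol(Sigma), 2, mu, "+")
}

rho_x <- 0.5; rho_e <- 0.5
sig_e <- c(0.5, 0.5)                      # one of the error configurations
corr <- function(r) matrix(c(1, r, r, 1), 2)
Sigma_x <- corr(rho_x)                    # sigma_x1 = sigma_x2 = 1
Sigma_e <- diag(sig_e) %*% corr(rho_e) %*% diag(sig_e)

N <- n * m                                # one row per subject and visit
X <- rmvn(N, c(0.5, 0.5), Sigma_x)        # true covariates
W <- X + rmvn(N, c(0, 0), Sigma_e)        # surrogates
z <- rbinom(N, 1, 0.5)                    # error-free binary covariate
eta <- beta[["b0"]] + beta[["bx1"]] * X[, 1] + beta[["bx2"]] * X[, 2] +
  beta[["bz"]] * z
y <- rbinom(N, 1, plogis(eta))

sim <- data.frame(id = rep(seq_len(n), each = m),
                  visit = rep(seq_len(m), times = n),
                  y = y, w1 = W[, 1], w2 = W[, 2], z = z)
```

The data frame is sorted by subject and visit, which matches the input layout that swgee is stated to require; responses selected for dropout would then be set to NA before fitting.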

In Table 1, we report the biases of the estimates (Bias), the empirical standard errors (SE), and the coverage rates (CR, in percent) for 95% confidence intervals. When measurement error is minor (i.e., $\sigma_1=\sigma_2=0.25$), both gee and swgee provide reasonable results with fairly small finite sample biases and coverage rates close to the nominal level of 95%. When there is moderate or substantial measurement error in covariates $X_{ij}$, the performance of the gee method deteriorates remarkably in estimation of error-prone covariate effects, leading to considerably biased estimates for $\beta_{x1}$ and $\beta_{x2}$. The corresponding coverage rates for 95% confidence intervals can be quite low. In contrast, the swgee method remarkably improves the performance, providing much smaller biases and much higher coverage rates. The estimates for $\beta_z$ are not much affected by measurement error, which is partly attributable to the fact that the precisely observed covariates $z_{ij}$ are generated independently of the error-prone covariates $X_{ij}$ in the current simulation study.

In summary, ignoring measurement error may lead to substantially biased results. Properly addressing covariate measurement error in estimation procedures is necessary. The proposed swgee method performs reasonably well under various configurations. As expected, its performance may become less satisfactory when measurement error becomes substantial. However, the swgee method does significantly improve the performance of the gee analysis.

Table 1: Simulation Results

| σ1 | σ2 | Method | βx1 Bias | βx1 SE | βx1 CR | βx2 Bias | βx2 SE | βx2 CR | βz Bias | βz SE | βz CR |
|------|------|-------|---------|--------|------|---------|--------|------|--------|--------|------|
| 0.25 | 0.25 | gee   | -0.0310 | 0.1228 | 92.6 | -0.0158 | 0.1246 | 92.6 | 0.0063 | 0.2121 | 94.6 |
| 0.25 | 0.25 | swgee | -0.0062 | 0.1420 | 95.0 | 0.0104  | 0.1425 | 95.2 | 0.0036 | 0.2354 | 95.6 |
| 0.25 | 0.50 | gee   | -0.0019 | 0.1212 | 95.4 | -0.0997 | 0.1156 | 83.4 | 0.0082 | 0.2110 | 94.2 |
| 0.25 | 0.50 | swgee | -0.0003 | 0.1415 | 95.0 | -0.0087 | 0.1543 | 93.0 | 0.0035 | 0.2361 | 95.6 |
| 0.25 | 0.75 | gee   | 0.0328  | 0.1189 | 95.4 | -0.1841 | 0.1022 | 51.0 | 0.0101 | 0.2100 | 94.0 |
| 0.25 | 0.75 | swgee | 0.0205  | 0.1407 | 95.8 | -0.0660 | 0.1562 | 86.4 | 0.0046 | 0.2359 | 95.6 |
| 0.50 | 0.25 | gee   | -0.1156 | 0.1114 | 78.2 | 0.0139  | 0.1236 | 94.2 | 0.0078 | 0.2113 | 94.6 |
| 0.50 | 0.25 | swgee | -0.0282 | 0.1520 | 93.2 | 0.0177  | 0.1431 | 95.4 | 0.0031 | 0.2362 | 95.2 |
| 0.50 | 0.50 | gee   | -0.0948 | 0.1114 | 81.8 | -0.0780 | 0.1161 | 85.6 | 0.0102 | 0.2099 | 94.2 |
| 0.50 | 0.50 | swgee | -0.0228 | 0.1510 | 93.8 | -0.0022 | 0.1542 | 93.6 | 0.0030 | 0.2370 | 95.4 |
| 0.50 | 0.75 | gee   | -0.0629 | 0.1103 | 87.8 | -0.1727 | 0.1036 | 55.6 | 0.0125 | 0.2088 | 94.2 |
| 0.50 | 0.75 | swgee | -0.0052 | 0.1499 | 94.8 | -0.0608 | 0.1570 | 87.2 | 0.0042 | 0.2369 | 95.2 |
| 0.75 | 0.25 | gee   | -0.1991 | 0.0966 | 45.6 | 0.0484  | 0.1216 | 94.2 | 0.0092 | 0.2107 | 94.6 |
| 0.75 | 0.25 | swgee | -0.0870 | 0.1508 | 86.4 | 0.0395  | 0.1430 | 93.6 | 0.0034 | 0.2366 | 95.2 |
| 0.75 | 0.50 | gee   | -0.1889 | 0.0976 | 50.0 | -0.0458 | 0.1154 | 89.8 | 0.0121 | 0.2091 | 94.0 |
| 0.75 | 0.50 | swgee | -0.0831 | 0.1509 | 87.8 | 0.0165  | 0.1539 | 94.0 | 0.0034 | 0.2375 | 95.4 |
| 0.75 | 0.75 | gee   | -0.1636 | 0.0974 | 58.8 | -0.1468 | 0.1039 | 66.4 | 0.0147 | 0.2077 | 94.2 |
| 0.75 | 0.75 | swgee | -0.0678 | 0.1505 | 90.0 | -0.0442 | 0.1574 | 88.8 | 0.0046 | 0.2374 | 95.2 |

6 Summary and discussion

Missing observations and covariate measurement error commonly arise in longitudinal data. However, there has been relatively little work on simultaneously accounting for the effects of response missingness and covariate measurement error on estimation of response model parameters for longitudinal data. Yi (2008) described a simulation based marginal method to adjust for the biases induced by both missingness and covariate measurement error. The proposed method does not require the full specification of the distribution of the response vector but only requires modeling its mean and covariance structure. In addition, the distribution of covariates is left unspecified, which is desirable for many practical problems. These features make the proposed method flexible.

Here we not only develop the R package swgee to implement the method of Yi (2008), but also include an extended setting in the package. Our aim is to provide analysts with an accessible tool for the analysis of longitudinal data with missing responses and error-prone covariates. Our illustrations show that the developed package has the advantages of simplicity and versatility.

7 Acknowledgments

Juan Xiong was supported by the Natural Science Foundation of SZU (grant no. 2017094). Grace Y. Yi was supported by the Natural Sciences and Engineering Research Council of Canada. The authors thank Boston University and the National Heart, Lung, and Blood Institute (NHLBI) for providing the data set from the Framingham Heart Study (No. N01-HC-25195) used in the illustration. The Framingham Heart Study is conducted and supported by the NHLBI in collaboration with Boston University. This manuscript was not prepared in collaboration with investigators of the Framingham Heart Study and does not necessarily reflect the opinions or views of the Framingham Heart Study, Boston University, or NHLBI.

Conflict of Interest: None declared.

CRAN packages used

gee, yags, wgeesel, geepack, mvtnorm

CRAN Task Views implied by cited packages

Distributions, Econometrics, Finance, MixedModels

    References

    J. P. Buonaccorsi. Measurement error: Models, methods, and applications. Boca Raton, Florida: Chapman & Hall/CRC, 2010.
    V. J. Carey. gee: Generalized estimation equation solver. 2015. R package version 4.13-19.
    V. J. Carey. yags: Yet another GEE solver. 2011. R package version 6.1-13.
    R. J. Carroll, D. Ruppert, L. A. Stefanski and C. M. Crainiceanu. Measurement error in nonlinear models: A modern perspective. 2nd ed. Boca Raton, Florida: Chapman & Hall/CRC, 2006.
    J. R. Cook and L. A. Stefanski. Simulation-extrapolation estimation in parametric measurement error models. Journal of the American Statistical Association, 89(428): 1314–1328, 1994. URL https://doi.org/10.1080/01621459.1994.10476871.
    V. Devanarayan and L. A. Stefanski. Empirical simulation extrapolation for measurement error models with replicate measurements. Statistics and Probability Letters, 59(3): 219–225, 2002. URL https://doi.org/10.1016/S0167-7152(02)00098-6.
    P. J. Diggle and M. G. Kenward. Informative drop-out in longitudinal data analysis (with discussion). Applied Statistics, 43(1): 49–93, 1994. URL https://doi.org/10.2307/2986113.
    W. A. Fuller. Measurement error models. New York: John Wiley & Sons, 1987.
    A. Genz and F. Bretz. Computation of multivariate normal and t probabilities. New York: Springer-Verlag, 2009.
    A. Genz, F. Bretz, T. Miwa, X. Mi and T. Hothorn. mvtnorm: Multivariate normal and t distributions. 2018. R package version 1.0-7.
    P. Gustafson. Measurement error and misclassification in statistics and epidemiology. Boca Raton, Florida: Chapman & Hall/CRC, 2003.
    W. He, J. Xiong and G. Y. Yi. SIMEX R package for accelerated failure time models with covariate measurement error. Journal of Statistical Software, 46(1): 1–14, 2012. URL https://doi.org/10.18637/jss.v046.c01.
    S. Hojsgaard, U. Halekoh and J. Yan. geepack: Generalized estimating equation package. 2016. R package version 1.2-1.
    M. G. Kenward. Selection models for repeated measurements with non-random dropout: An illustration of sensitivity. Statistics in Medicine, 17(23): 2723–2732, 1998. URL https://doi.org/10.1002/(SICI)1097-0258(19981215)17:23<2723::AID-SIM38>3.0.CO;2-5.
    T. L. Lai and D. S. Small. Marginal regression analysis of longitudinal data with time-dependent covariates: A generalized method-of-moments approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(1): 79–99, 2007. URL https://doi.org/10.1111/j.1467-9868.2007.00578.x.
    K. Y. Liang and S. L. Zeger. Longitudinal data analysis using generalized linear models. Biometrika, 73(1): 13–22, 1986. URL https://doi.org/10.2307/2336267.
    G. Lin and R. N. Rodriguez. Weighted methods for analyzing missing data with the GEE procedure. Paper SAS166-2015, 1–8, 2015.
    R. J. A. Little and D. B. Rubin. Statistical analysis with missing data. 2nd ed. New Jersey: John Wiley & Sons, 2002.
    W. Liu and L. Wu. Simultaneous inference for semiparametric nonlinear mixed-effects models with covariate measurement errors and missing responses. Biometrics, 63(2): 342–350, 2007. URL https://doi.org/10.1111/j.1541-0420.2006.00687.x.
    J. S. Preisser, K. K. Lohman and P. J. Rathouz. Performance of weighted estimating equations for longitudinal binary data with drop-outs missing at random. Statistics in Medicine, 21(20): 3035–3054, 2002. URL https://doi.org/10.1002/sim.1241.
    A. Qu, G. Y. Yi, P. X. K. Song and P. Wang. Assessing the validity of weighted generalized estimating equations. Biometrika, 98(1): 215–224, 2011. URL https://doi.org/10.1093/biomet/asq078.
    J. M. Robins, A. Rotnitzky and L. Zhao. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. Journal of the American Statistical Association, 90(429): 106–121, 1995. URL https://doi.org/10.1080/01621459.1995.10476493.
    SAS Institute Inc. SAS/STAT software, version 13.2. Cary, NC, 2014. URL http://www.sas.com/.
    C. Y. Wang, Y. Huang, E. C. Chao and M. K. Jeffcoat. Expected estimating equations for missing data, measurement error, and misclassification, with application to longitudinal nonignorable missing data. Biometrics, 64(1): 85–95, 2008. URL https://doi.org/10.1111/j.1541-0420.2007.00839.x.
    C. Xu, Z. Li and M. Wang. wgeesel: Weighted generalized estimating equations and model selection. 2018. R package version 1.5.
    G. Y. Yi. A simulation-based marginal method for longitudinal data with dropout and mismeasured covariates. Biostatistics, 9(3): 501–512, 2008. URL https://doi.org/10.1093/biostatistics/kxm054.
    G. Y. Yi. Statistical analysis with measurement error or misclassification. New York: Springer-Verlag, 2017.
    G. Y. Yi, Y. Ma and R. J. Carroll. A functional generalized method of moments approach for longitudinal studies with missing responses and covariate measurement error. Biometrika, 99(1): 151–165, 2012. URL https://doi.org/10.1093/biomet/asr076.

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Xiong & Yi, "swgee: An R Package for Analyzing Longitudinal Data with Response Missingness and Covariate Measurement Error", The R Journal, 2019

    BibTeX citation

    @article{RJ-2019-031,
      author = {Xiong, Juan and Yi, Grace Y.},
      title = {swgee: An R Package for Analyzing Longitudinal Data with Response Missingness and Covariate Measurement Error},
      journal = {The R Journal},
      year = {2019},
      note = {https://rjournal.github.io/},
      volume = {11},
      issue = {1},
      issn = {2073-4859},
      pages = {416-426}
    }