glmmPen: High Dimensional Penalized Generalized Linear Mixed Models

Hillary M. Heiling; Naim U. Rashid; Quefeng Li; Joseph G. Ibrahim

doi:10.32614/RJ-2023-086

0.1 Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2023-086.zip

References

F. A. Archila. mcemGLM: Maximum likelihood estimation for generalized linear mixed models. 2020. URL https://CRAN.R-project.org/package=mcemGLM. R package version 1.1.1.

D. Bates, M. Mächler, B. Bolker and S. Walker. Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1): 1–48, 2015. URL https://doi.org/10.18637/jss.v067.i01.

B. M. Bolker, M. E. Brooks, C. J. Clark, S. W. Geange, J. R. Poulsen, M. H. H. Stevens and J.-S. S. White. Generalized linear mixed models: A practical guide for ecology and evolution. Trends in Ecology & Evolution, 24(3): 127–135, 2009. URL https://doi.org/10.1016/j.tree.2008.10.008.

H. D. Bondell, A. Krishna and S. K. Ghosh. Joint variable selection for fixed and random effects in linear mixed-effects models. Biometrics, 66(4): 1069–1077, 2010. URL https://doi.org/10.1111/j.1541-0420.2010.01391.x.

J. G. Booth and J. P. Hobert. Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61(1): 265–285, 1999. URL https://doi.org/10.1111/1467-9868.00176.

P. Breheny and J. Huang. Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Annals of Applied Statistics, 5(1): 232–253, 2011. URL https://doi.org/10.1214/10-AOAS388.

P. Breheny and J. Huang. Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors. Statistics and Computing, 25(2): 173–187, 2015. URL https://doi.org/10.1007/s11222-013-9424-2.

B. Carpenter, A. Gelman, M. D. Hoffman, D. Lee, B. Goodrich, M. Betancourt, M. Brubaker, J. Guo, P. Li and A. Riddell. Stan: A probabilistic programming language. Journal of Statistical Software, 76(1): 1–32, 2017. URL https://doi.org/10.18637/jss.v076.i01.

Z. Chen and D. B. Dunson. Random effects selection in linear mixed models. Biometrics, 59(4): 762–769, 2003. URL https://doi.org/10.1111/j.0006-341X.2003.00089.x.

C. Dean and J. D. Nielsen. Generalized linear mixed models: A review and some extensions. Lifetime Data Analysis, 13: 497–512, 2007. URL https://doi.org/10.1007/s10985-007-9065-x.

M. Delattre, M. Lavielle, M.-A. Poursat, et al. A note on BIC in mixed-effects models. Electronic Journal of Statistics, 8(1): 456–475, 2014. URL https://doi.org/10.1214/14-EJS890.

M. Donohue, R. Overholser, R. Xu and F. Vaida. Conditional Akaike information under generalized linear and proportional hazards mixed models. Biometrika, 98(3): 685–700, 2011. URL https://doi.org/10.1093/biomet/asr023.

D. Eddelbuettel and R. François. Rcpp: Seamless R and C++ integration. Journal of Statistical Software, 40(8): 1–18, 2011. URL http://www.jstatsoft.org/v40/i08/.

D. Eddelbuettel and C. Sanderson. RcppArmadillo: Accelerating R with high-performance C++ linear algebra. Computational Statistics and Data Analysis, 71: 1054–1063, 2014. URL https://doi.org/10.1016/j.csda.2013.02.005.

Y. Fan and R. Li. Variable selection in linear mixed effects models. Annals of Statistics, 40(4): 2043, 2012. URL https://doi.org/10.1214/12-AOS1028.

D. J. Feaster, S. Mikulich-Gilbertson and A. M. Brincks. Modeling site effects in the design and analysis of multi-site trials. The American Journal of Drug and Alcohol Abuse, 37(5): 383–391, 2011. URL https://doi.org/10.3109/00952990.2011.600386.

G. M. Fitzmaurice, N. M. Laird and J. H. Ware. Applied longitudinal analysis. 2nd ed. John Wiley & Sons, 2012. URL https://doi.org/10.1002/9781119513469.

J. Friedman, T. Hastie and R. Tibshirani. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1): 1–22, 2010. URL https://www.jstatsoft.org/v33/i01/.

R. I. Garcia, J. G. Ibrahim and H. Zhu. Variable selection for regression models with missing data. Statistica Sinica, 20(1): 149, 2010. URL https://pubmed.ncbi.nlm.nih.gov/20336190/.

G. H. Givens and J. A. Hoeting. Computational statistics. 2nd ed. John Wiley & Sons, 2012. URL https://doi.org/10.1111/j.1467-985X.2006.00430_5.x.

A. Groll. glmmLasso: Variable selection for generalized linear mixed models by L1-penalized estimation. 2017. URL https://CRAN.R-project.org/package=glmmLasso. R package version 1.5.1.

M. J. Gurka, L. J. Edwards and K. E. Muller. Avoiding bias in mixed model inference for fixed effects. Statistics in Medicine, 30(22): 2696–2707, 2011. URL https://doi.org/10.1002/sim.4293.

J. D. Hadfield. MCMC methods for multi-response generalized linear mixed models: The MCMCglmm R package. Journal of Statistical Software, 33(2): 1–22, 2010. URL https://www.jstatsoft.org/v33/i02/.

M. D. Hoffman and A. Gelman. The No-U-Turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1): 1593–1623, 2014. URL https://dl.acm.org/doi/abs/10.5555/2627435.2638586.

J. G. Ibrahim, H. Zhu, R. I. Garcia and R. Guo. Fixed and random effects selection in mixed effects models. Biometrics, 67(2): 495–503, 2011. URL https://doi.org/10.1111/j.1541-0420.2010.01463.x.

M. J. Kane, J. Emerson and S. Weston. Scalable strategies for computing with massive data. Journal of Statistical Software, 55(14): 1–19, 2013. URL http://www.jstatsoft.org/v55/i14/.

K. Kleinman, R. Lazarus and R. Platt. A generalized linear mixed models approach for detecting incident clusters of disease in small areas, with an application to biological terrorism. American Journal of Epidemiology, 159(3): 217–224, 2004. URL https://doi.org/10.1093/aje/kwh029.

I. H. Langford. Using a generalized linear mixed model to analyze dichotomous choice contingent valuation data. Land Economics, 507–514, 1994. URL https://doi.org/10.2307/3146644.

J. Lorah and A. Womack. Value of sample size for computation of the Bayesian information criterion (BIC) in multilevel modeling. Behavior Research Methods, 51(1): 440–450, 2019. URL https://doi.org/10.3758/s13428-018-1188-3.

S. Ma, S. Ogino, P. Parsana, R. Nishihara, Z. Qian, J. Shen, K. Mima, Y. Masugi, Y. Cao, J. A. Nowak, et al. Continuity of transcriptomes among colorectal cancer subtypes based on meta-analysis. Genome Biology, 19(1): 142, 2018. URL https://doi.org/10.1186/s13059-018-1511-4.

I. Misztal. Reliable computing in estimation of variance components. Journal of Animal Breeding and Genetics, 125(6): 363–370, 2008. URL https://doi.org/10.1111/j.1439-0388.2008.00774.x.

R. A. Moffitt, R. Marayati, E. L. Flate, K. E. Volmar, S. G. H. Loeza, K. A. Hoadley, N. U. Rashid, L. A. Williams, S. C. Eaton, A. H. Chung, et al. Virtual microdissection identifies distinct tumor- and stroma-specific subtypes of pancreatic ductal adenocarcinoma. Nature Genetics, 47(10): 1168, 2015. URL https://doi.org/10.1038/ng.3398.

A. Pajor. Estimating the marginal likelihood using the arithmetic mean identity. Bayesian Analysis, 12(1): 261–287, 2017. URL https://doi.org/10.1214/16-BA1001.

P. Patil and G. Parmigiani. Training replicable predictors in multiple studies. Proceedings of the National Academy of Sciences, 115(11): 2578–2583, 2018. URL https://doi.org/10.1073/pnas.1708283115.

J. Pinheiro, D. Bates, S. DebRoy, D. Sarkar and R Core Team. nlme: Linear and nonlinear mixed effects models. 2021. URL https://CRAN.R-project.org/package=nlme. R package version 3.1-152.

N. U. Rashid, Q. Li, J. J. Yeh and J. G. Ibrahim. Modeling between-study heterogeneity for improved replicability in gene signature selection and clinical prediction. Journal of the American Statistical Association, 115(531): 1125–1138, 2020. URL https://doi.org/10.1080/01621459.2019.1671197.

M. Riester, W. Wei, L. Waldron, A. C. Culhane, L. Trippa, E. Oliva, S. Kim, F. Michor, C. Huttenhower, G. Parmigiani, et al. Risk prediction for late-stage ovarian cancer by meta-analysis of 1525 patient samples. JNCI: Journal of the National Cancer Institute, 106(5), 2014. URL https://doi.org/10.1093/jnci/dju048.

G. O. Roberts and J. S. Rosenthal. Examples of adaptive MCMC. Journal of Computational and Graphical Statistics, 18(2): 349–367, 2009. URL https://doi.org/10.1002/wics.1307.

SAS Institute Inc. SAS/STAT software, version 9.2. Cary, NC, 2008. URL http://www.sas.com/.

A. W. Schmidt-Catran and M. Fairbrother. The random effects in multilevel models: Getting them wrong and getting them right. European Sociological Review, 32(1): 23–38, 2016. URL https://doi.org/10.1093/esr/jcv090.

Stan Development Team. RStan: The R interface to Stan. 2020. URL http://mc-stan.org/. R package version 2.21.2.

M. Szyszkowicz. Use of generalized linear mixed models to examine the association between air pollution and health outcomes. International Journal of Occupational Medicine and Environmental Health, 19(4): 224–227, 2006. URL https://doi.org/10.2478/v10001-006-0032-7.

J. A. Thompson, K. L. Fielding, C. Davey, A. M. Aiken, J. R. Hargreaves and R. J. Hayes. Bias and inference from misspecified mixed-effect models in stepped wedge trial analysis. Statistics in Medicine, 36(23): 3670–3682, 2017. URL https://doi.org/10.1002/sim.7348.

J. N. Weinstein, E. A. Collisson, G. B. Mills, K. R. Shaw, B. A. Ozenberger, K. Ellrott, I. Shmulevich, C. Sander and J. M. Stuart. The cancer genome atlas pan-cancer analysis project. Nature Genetics, 45(10): 1113–1120, 2013. URL https://doi.org/10.1038/ng.2764.

H. Wickham. ggplot2: Elegant graphics for data analysis. Springer-Verlag New York, 2016. URL https://ggplot2.tidyverse.org.
