glmfitmiss-package {glmfitmiss} | R Documentation |
glmfitmiss: Fitting Binary Regression Models with Missing Data
Description
The glmfitmiss package provides functions for fitting binary regression models when data are missing in the response variable, in the covariates, or in both. It implements likelihood-based methods for handling missing data, primarily the EM algorithm of Ibrahim (1990). The bias-reducing adjusted score approach introduced by Firth (1993) is also incorporated in all supported methods.
Details
This package aims to improve the accuracy of binary regression modeling in the presence of missing data by combining the EM algorithm of Ibrahim (1990) with the bias-reducing adjusted score method of Firth (1993).
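To make the bias-reduction idea concrete, the following is a minimal base R sketch of a Firth-type adjusted score iteration for logistic regression on fully observed data. It is not the package's implementation: `firth_logit()` is a hypothetical helper written for this illustration, and the toy data are invented.

```r
# Firth-type adjusted score for logistic regression, base R only.
# firth_logit() is a hypothetical helper, NOT a glmfitmiss function.
firth_logit <- function(X, y, maxit = 100, tol = 1e-8) {
  beta <- rep(0, ncol(X))
  for (iter in seq_len(maxit)) {
    p <- as.vector(plogis(X %*% beta))      # fitted probabilities
    w <- p * (1 - p)                        # binomial variance weights
    info_inv <- solve(crossprod(X, w * X))  # inverse Fisher information (X'WX)^{-1}
    # Hat values: diagonal of W^{1/2} X (X'WX)^{-1} X' W^{1/2}
    h <- w * rowSums((X %*% info_inv) * X)
    # Adjusted score: the usual score X'(y - p) plus the Firth correction
    score <- crossprod(X, y - p + h * (0.5 - p))
    step <- as.vector(info_inv %*% score)
    beta <- beta + step
    if (max(abs(step)) < tol) break
  }
  beta
}

# Completely separated toy data: ordinary maximum likelihood has no
# finite solution here, while the adjusted score does.
x <- c(-2, -1.5, -1, -0.5, 0.5, 1, 1.5, 2)
y <- c(0, 0, 0, 0, 1, 1, 1, 1)
X <- cbind(1, x)
beta_firth <- firth_logit(X, y)
print(beta_firth)
```

The sketch takes full steps without step-halving, which is enough for this small example but may not be robust in general.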
The main functions in this package are:
- emBinRegMAR: Fits a binary regression model with missing categorical covariates, assuming the data are Missing at Random (MAR).
- emBinRegNonIG: Fits a binary regression model with nonignorable missing responses, based on Ibrahim and Lipsitz (1996).
- emBinRegMixedMAR: Fits a binary regression model with missing responses and missing covariates, assuming that the missing responses are nonignorable and the missing covariates are Missing at Random (MAR).
- logRegMAR: Fits a logistic regression model (binary regression with link = logit) with missing categorical covariates that are Missing at Random (MAR).
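The EM approach for a missing categorical covariate under MAR can be sketched with the "method of weights" in base R. This is an illustrative reconstruction under simplifying assumptions (one binary covariate `x` with missing values, one fully observed covariate `z`, and a logistic model for `x` given `z`), not the code used by the functions above. All variable names and parameter values are invented for the example.

```r
set.seed(1)
n <- 400
z <- rnorm(n)                                   # fully observed covariate
x <- rbinom(n, 1, plogis(-0.3 + 0.5 * z))       # binary covariate, partly missing
y <- rbinom(n, 1, plogis(-0.5 + 1.0 * x + 0.7 * z))
# MAR: the chance that x is missing depends only on the observed y
miss <- rbinom(n, 1, plogis(-1.5 + 0.8 * y)) == 1

# Augmented data: complete cases once, incomplete cases once per level of x
obs  <- data.frame(y = y[!miss], x = x[!miss], z = z[!miss], id = which(!miss))
mis0 <- data.frame(y = y[miss], x = 0, z = z[miss], id = which(miss))
mis1 <- transform(mis0, x = 1)
aug  <- rbind(obs, mis0, mis1)
is_mis <- c(rep(FALSE, nrow(obs)), rep(TRUE, 2 * nrow(mis0)))
aug$w <- ifelse(is_mis, 0.5, 1)                 # starting weights

for (iter in 1:200) {
  # M-step: weighted fits of the outcome model and the covariate model.
  # quasibinomial gives the same point estimates as binomial but does not
  # warn about non-integer weights.
  fit_y <- glm(y ~ x + z, family = quasibinomial, data = aug, weights = w)
  fit_x <- glm(x ~ z, family = quasibinomial, data = aug, weights = w)
  # E-step: weight of each filled-in row is P(x | y, z) under current fits
  py <- predict(fit_y, newdata = aug, type = "response")
  px <- predict(fit_x, newdata = aug, type = "response")
  joint <- dbinom(aug$y, 1, py) * dbinom(aug$x, 1, px)
  w_new <- ifelse(is_mis, joint / ave(joint, aug$id, FUN = sum), 1)
  delta <- max(abs(w_new - aug$w))
  aug$w <- w_new
  if (delta < 1e-8) break
}
print(coef(fit_y))
```

Standard errors reported by the weighted `glm` fit are not valid for the EM estimates; the reference list includes Louis (1982) on obtaining the observed information when using the EM algorithm.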
The other functions and data included in this package are:
- emforbeta: Fits binary regression models with missing categorical covariates using a likelihood-based method, specifically the EM algorithm proposed by Ibrahim (1990).
- est: Example data from an Eastern Cooperative Oncology Group clinical trial.
- meningitis: Example using meningococcal disease data.
- metastmelanoma: Example from a cancer clinical trial on metastatic melanoma (Kirkwood et al., 1996).
- emyxmiss: The main function for fitting binary regression models with missing responses and missing categorical covariates. It implements a novel likelihood-based method using the EM algorithm. For more information, see Pradhan, Nychka, and Bandyopadhyay (2025).
- meningitis60ymis: Meningococcal disease data with a missing response variable.
- llkmiss: Log-likelihood function for models with missing data, evaluated without using the EM algorithm.
- est45: Example data from an Eastern Cooperative Oncology Group clinical trial; a subset of the 'est' data.
- simulateData: Function to simulate response data.
- simulateCovariateData: Function to simulate covariate data.
- felinedata: Data from Sykes et al. (1999) on risk factors for Chlamy, a chlamydial infection in cats.
- simulateMissDfYorX: Generates data with missing covariates or missing responses. Missing values in the last two supplied covariates are generated according to a predefined mechanism. Missing values in the response variable are generated according to the supplied true alpha.
- sixcitydata: Longitudinal study of the health effects of air pollution, using data from six cities (Ware et al., 1984).
- ibrahim: Example dataset used in Ibrahim (1990, JASA).
- testyxm: Function for testing models with missing data.
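As a rough illustration of what generating missing covariates and missing responses under different mechanisms can look like, here is a base R sketch. It does not call the package's simulation functions, whose arguments are not described on this page. The logistic missingness models, the vector `alpha`, and all numeric values are assumptions made for the example.

```r
set.seed(42)
n  <- 500
x1 <- rbinom(n, 1, 0.5)                          # always observed
x2 <- rbinom(n, 1, 0.4)                          # will receive missing values
y  <- rbinom(n, 1, plogis(-0.5 + 0.8 * x1 - 0.6 * x2))

# Covariate missingness, MAR: depends only on the fully observed x1
mis_x2 <- rbinom(n, 1, plogis(-1.2 + 0.6 * x1)) == 1

# Response missingness, nonignorable: depends on the value of y itself.
# alpha = (intercept, x1 effect, y effect) is a hypothetical "true alpha".
alpha <- c(-1.5, 0.5, 1.0)
mis_y <- rbinom(n, 1, plogis(alpha[1] + alpha[2] * x1 + alpha[3] * y)) == 1

dat <- data.frame(
  y  = ifelse(mis_y, NA, y),
  x1 = x1,
  x2 = ifelse(mis_x2, NA, x2)
)
print(colMeans(is.na(dat)))                      # proportion missing per column
```

Setting the last element of `alpha` to zero would make the response missingness depend on observed data only, which turns the nonignorable mechanism into a MAR one.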
Author(s)
Maintainer: Vivek Pradhan vpradhan2009@gmail.com
Authors:
Douglas Nychka nychka@mines.edu
Soutir Bandyopadhyay bsoutir@gmail.com
References
Firth, D. (1993). Bias reduction of maximum likelihood estimates, Biometrika, 80, 27-38. doi:10.2307/2336755.
Ibrahim, J. G. (1990). Incomplete data in generalized linear models. Journal of the American Statistical Association, 85, 765–769.
Ibrahim, J. G., and Lipsitz, S. R. (1996). Parameter Estimation from Incomplete Data in Binomial Regression when the Missing Data Mechanism is Nonignorable, Biometrics, 52, 1071–1078.
Kosmidis, I., Firth, D. (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71-82. doi:10.1093/biomet/asaa052.
Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226-233.
Maiti, T., Pradhan, V. (2009). Bias reduction and a solution of separation of logistic regression with missing covariates. Biometrics, 65, 1262-1269.
Pradhan, V., Nychka, D. and Bandyopadhyay, S. (2025). Beyond the Odds: Fitting Logistic Regression with Missing Data in Small Samples (submitted).
Pradhan, V., Nychka, D., and Bandyopadhyay, S. (2025). Addressing Missing Responses and Categorical Covariates in Binary Regression Modeling: An Integrated Framework (to be submitted).
Pradhan, V., Nychka, D., and Bandyopadhyay, S. (2025). Bridging Gaps in Logistic Regression: Tackling Missing Categorical Covariates with a New Likelihood Method (to be submitted).
Pradhan, V., Nychka, D., and Bandyopadhyay, S. (2025). glmFitMiss: Binary Regression with Missing Data in R (to be submitted).
See Also
emBinRegMAR, emBinRegMixedMAR, logRegMAR, meningitis, emforbeta, meningitis60ymis, emyxmiss, est, metastmelanoma, simulateCovariateData, est45, simulateData, felinedata, sixcitydata, ibrahim, testyxm, llkmiss