step_with_na {guideR}    R Documentation

Apply step(), taking into account missing values

Description

When your data contains missing values, the affected observations are removed when the model is fitted. If you later try to reduce the model with a descending (backward) stepwise approach that minimizes AIC, you may encounter an error because the number of rows in use has changed.
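The row-count change behind that error can be seen without calling step() at all. The short sketch below is only an illustration: it uses the airquality dataset shipped with R (not part of this help page) and compares the number of observations used before and after dropping a predictor that has missing values.

```r
# Illustration only: airquality (from the datasets package) has missing
# values in Ozone and Solar.R.
mod_full <- lm(Ozone ~ Solar.R + Wind, data = airquality)

# Drop the predictor that contains missing values and refit on the same data.
mod_reduced <- update(mod_full, . ~ . - Solar.R)

# lm() silently removes incomplete rows, so the two fits use different rows.
n_full <- nobs(mod_full)
n_reduced <- nobs(mod_reduced)
print(c(full = n_full, reduced = n_reduced))
```

Because AIC values are only comparable between models fitted to the same observations, stats::step() stops when it detects such a change.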

Usage

step_with_na(model, ...)

## Default S3 method:
step_with_na(model, ..., full_data = eval(model$call$data))

## S3 method for class 'svyglm'
step_with_na(model, ..., design)

Arguments

model

A model object.

...

Additional parameters passed to stats::step().

full_data

Full data frame used for the model, including missing data.

design

Survey design previously passed to survey::svyglm().

Details

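step_with_na() applies the following strategy:

• recomputes the model using only complete cases;
• applies stats::step();
• recomputes the reduced model using the full original dataset.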
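For readers who want to see what such a strategy looks like by hand, here is a minimal base R sketch. It assumes the approach amounts to refitting on complete cases, running stats::step(), then refitting the selected formula on the original data. It is not the package's actual implementation, and the dataset (airquality) and the object names are chosen for illustration only.

```r
# Hypothetical manual equivalent, for illustration only.
full_data <- airquality
mod <- lm(Ozone ~ Solar.R + Wind + Temp + Month + Day, data = full_data)

# 1. Keep only the rows that are complete for every variable in the model,
#    then refit, so that the number of rows cannot change during selection.
vars <- all.vars(formula(mod))
cc <- full_data[stats::complete.cases(full_data[, vars, drop = FALSE]), ]
mod_cc <- update(mod, data = cc)

# 2. Backward selection by AIC on the complete-case model.
mod_step <- step(mod_cc, trace = 0)

# 3. Refit the selected formula on the original data, so that rows that were
#    incomplete only because of dropped variables are used again.
mod_final <- update(mod, formula(mod_step), data = full_data)
formula(mod_final)
```

The last refit matters because the selected model may depend on fewer variables, so fewer rows need to be discarded than during selection.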

step_with_na() has been tested with stats::lm(), stats::glm(), nnet::multinom(), survey::svyglm() and survival::coxph(). It may work with other types of models, but this is not guaranteed.

In some cases, it may be necessary to provide the full dataset initially used to estimate the model.

step_with_na() may not work when called inside other functions. In that case, try passing full_data explicitly.

Value

The stepwise-selected model.

Examples

set.seed(42)
d <- titanic |>
  dplyr::mutate(
    Group = sample(
      c("a", "b", NA),
      dplyr::n(),
      replace = TRUE
    )
  )
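# Fit on all predictors; rows with a missing Group are dropped by glm()
mod <- glm(as.factor(Survived) ~ ., data = d, family = binomial())
# step(mod) should produce an error
# step_with_na() handles the missing values; full_data is passed explicitly
mod2 <- step_with_na(mod, full_data = d)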
mod2


## WITH SURVEY ---------------------------------------

library(survey)
ds <- d |>
  dplyr::mutate(Survived = as.factor(Survived)) |>
  srvyr::as_survey()
mods <- survey::svyglm(
  Survived ~ Class + Group + Sex,
  design = ds,
  family = quasibinomial()
)
mod2s <- step_with_na(mods, design = ds)
mod2s


[Package guideR version 0.4.0 Index]