decide_variable_type_univariate {SplitWise}    R Documentation

Decide Variable Type (Univariate)

Description

For each numeric predictor x, this function fits a shallow (maxdepth = 2) rpart tree to Y ~ x and tests whether replacing x with a tree-derived dummy variable fits better than keeping x as a linear term, as measured by AIC or BIC.

Usage

decide_variable_type_univariate(
  X,
  Y,
  minsplit = 5,
  criterion = c("AIC", "BIC"),
  exclude_vars = NULL,
  verbose = FALSE
)

Arguments

X

A data frame of numeric predictors (no response).

Y

A numeric response vector.

minsplit

Minimum number of observations in a node to consider splitting. Default = 5.

criterion

A character string naming the information criterion used to compare candidate forms: either "AIC" or "BIC". Default = "AIC".

exclude_vars

A character vector of variable names to exclude from dummy transformations. These variables will always be treated as linear. Default = NULL.

verbose

Logical; if TRUE, prints messages for debugging. Default = FALSE.

Details

Dummy forms come from a shallow (maxdepth = 2) rpart tree fit to the data. Up to two splits are extracted from the tree, giving a single-split or a double-split dummy candidate.

The function then picks the form (linear, single-split dummy, or double-split dummy) that yields the lowest value of the chosen criterion (AIC or BIC). If a variable is listed in exclude_vars, it is always used as a linear predictor and no dummy transformation is attempted.
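The selection logic described above can be sketched for a single predictor as follows. This is an illustration under stated assumptions, not the package's source: sketch_decide is a hypothetical helper, the two splits are taken to be those with the largest improvement, and the double-split dummy is assumed to flag observations lying between the two cutoffs. The package may resolve these points differently.

```r
library(rpart)

# Hypothetical helper (not part of SplitWise): choose between a linear term
# and tree-derived dummy encodings for one predictor.
sketch_decide <- function(x, y, minsplit = 5, criterion = c("AIC", "BIC")) {
  criterion <- match.arg(criterion)
  score <- if (criterion == "BIC") stats::BIC else stats::AIC
  dat <- data.frame(x = x, y = y)

  # Shallow regression tree on y ~ x; its split points are candidate cutoffs.
  tree <- rpart(y ~ x, data = dat, method = "anova",
                control = rpart.control(maxdepth = 2, minsplit = minsplit,
                                        maxcompete = 0, maxsurrogate = 0))

  fits <- list(linear = lm(y ~ x, data = dat))
  cutoffs <- list(linear = NULL)

  sp <- tree$splits  # NULL when the tree has no splits
  if (!is.null(sp)) {
    # Assumption: keep at most the two splits with the largest improvement.
    keep <- head(order(sp[, "improve"], decreasing = TRUE), 2)
    cuts <- unname(sp[keep, "index"])

    # Single-split dummy: 1 at or above the strongest cutoff.
    dat$d1 <- as.integer(dat$x >= cuts[1])
    fits$single <- lm(y ~ d1, data = dat)
    cutoffs$single <- cuts[1]

    if (length(cuts) == 2) {
      # Assumption: double-split dummy is 1 between the two cutoffs.
      rng <- sort(cuts)
      dat$d2 <- as.integer(dat$x >= rng[1] & dat$x < rng[2])
      fits$double <- lm(y ~ d2, data = dat)
      cutoffs$double <- rng
    }
  }

  # Lowest criterion value wins.
  scores <- vapply(fits, score, numeric(1))
  best <- names(scores)[which.min(scores)]
  list(type = if (best == "linear") "linear" else "dummy",
       cutoffs = cutoffs[[best]],
       tree_model = tree)
}

set.seed(1)
x <- runif(200)
# A step-shaped response and a linear response, both simulated.
step_res <- sketch_decide(x, ifelse(x > 0.5, 2, 0) + rnorm(200, sd = 0.3))
lin_res  <- sketch_decide(x, 2 * x + rnorm(200, sd = 0.1))
step_res$type
lin_res$type
```

The returned list mirrors the type, cutoffs, and tree_model elements documented under Value, so the sketch can be compared against the function's actual output on the same data.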

Value

A named list of decisions, where each element is a list with:

type

Either "dummy" or "linear".

cutoffs

A numeric vector (length 1 or 2) if type = "dummy", or NULL if type = "linear".

tree_model

The fitted rpart model (for reference), or NULL if the variable was listed in exclude_vars.
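
Examples

A minimal usage sketch, not an example shipped with the package. It uses the built-in mtcars data, assumes the returned list is named by the columns of X, and retrieves the function with getFromNamespace so that it runs whether or not the function is exported. The call is skipped if SplitWise is not installed.

```r
# Illustrative data: two numeric predictors of mpg from mtcars.
X <- mtcars[, c("hp", "wt")]
Y <- mtcars$mpg

if (requireNamespace("SplitWise", quietly = TRUE)) {
  # Works for exported and internal functions alike.
  decide <- utils::getFromNamespace("decide_variable_type_univariate",
                                    "SplitWise")

  decisions <- decide(X, Y, minsplit = 5, criterion = "BIC",
                      exclude_vars = "wt", verbose = FALSE)

  # One entry per predictor, each with type, cutoffs, and tree_model.
  str(decisions, max.level = 2)

  # wt was excluded, so it should be reported as linear.
  decisions$wt$type
}
```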


[Package SplitWise version 1.0.0 Index]