ordGroupBoot {Hmisc} | R Documentation |
Minimally Group an Ordinal Variable So Bootstrap Samples Will Contain All Distinct Values
Description
When bootstrapping models for an ordinal Y that is fairly continuous, one or more bootstrap samples will frequently omit one or more of the distinct original Y values. When fitting an ordinal model (including a Cox PH model), this means that an intercept cannot be estimated, and the parameter vectors will not align across bootstrap samples. To prevent this, some grouping of Y may be necessary. The ordGroupBoot function uses cutGn() to group Y so that the number of observations in every group is guaranteed to be at least a certain integer m. ordGroupBoot tries a range of m and stops at the lowest m such that either all B tested bootstrap samples contain all the original distinct values of Y (if B > 0), or the probability that a given sample of size n drawn with replacement will contain all the distinct original values exceeds aprob (if B = 0). This probability is computed using an approximation to the probability of complete sample coverage from the coupon collector's problem, which is quite accurate for this purpose.
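As a rough illustration of the coverage-probability idea, the sketch below uses a standard Poisson-type approximation to the coupon collector's problem: if p holds the relative frequencies of the distinct values and n is the sample size, the probability that a sample drawn with replacement contains every distinct value is approximately exp(-sum((1 - p)^n)). This is an assumption made for illustration and is not necessarily the formula that ordGroupBoot uses internally; the helper name coverage_prob is hypothetical.

```r
# Hypothetical helper (not part of Hmisc): Poisson-type approximation to the
# probability that a bootstrap sample of size n = length(y) contains every
# distinct non-missing value of y.
coverage_prob <- function(y) {
  y <- y[!is.na(y)]
  n <- length(y)
  p <- as.vector(table(y)) / n   # relative frequency of each distinct value
  # sum((1 - p)^n) is the expected number of distinct values missed by a
  # sample; exp(-that) approximates the chance that none is missed
  exp(-sum((1 - p)^n))
}

y_fine   <- 1:22                 # 22 distinct values, each occurring once
y_coarse <- rep(1:2, each = 11)  # 2 distinct values, 11 observations each

coverage_prob(y_fine)    # very small: singleton values are easily missed
coverage_prob(y_coarse)  # very close to 1
```

Comparing the two results shows why grouping helps: larger groups make each (1 - p)^n term negligible, which pushes the approximate coverage probability toward 1 and above a threshold such as aprob.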
Usage
ordGroupBoot(
y,
B = 0,
m = 7:min(15, floor(n/3)),
what = c("mean", "factor", "m"),
aprob = 0.9999,
pr = TRUE
)
Arguments
y |
a numeric vector |
B |
number of bootstrap samples to test, or zero to use a coverage probability approximation |
m |
range of minimum group sizes to test; the default range is usually adequate |
what |
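specifies what to return: the mean of y in each group ("mean"), a factor variable representing grouped y ("factor"), or the minimum m that satisfied the required sample coverage ("m") |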
aprob |
minimum coverage probability sought |
pr |
set to |
Value
a numeric vector corresponding to y but grouped, containing either the mean of y in each group or a factor variable representing grouped y, in either case using the minimum m that satisfied the required sample coverage
Author(s)
Frank Harrell
See Also
Examples
set.seed(1)
x <- c(1:6, NA, 7:22)
ordGroupBoot(x, m=5:10)
ordGroupBoot(x, m=5:10, B=5000, what='factor')
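The check performed when B > 0 can be imitated with base R alone. The sketch below estimates, by simulation, the fraction of bootstrap samples that contain every distinct non-missing value of a vector. The function name boot_coverage and the two-group vector g are hypothetical and only serve the illustration; this is not how ordGroupBoot is implemented.

```r
# Hypothetical helper (not part of Hmisc): fraction of B bootstrap samples
# that contain every distinct non-missing value of y.
boot_coverage <- function(y, B = 1000) {
  y <- y[!is.na(y)]
  k <- length(unique(y))         # number of distinct values to be covered
  hits <- replicate(B, length(unique(sample(y, replace = TRUE))) == k)
  mean(hits)
}

set.seed(1)
x <- c(1:6, NA, 7:22)
boot_coverage(x)                 # ungrouped: almost every sample misses a value

g <- rep(1:2, each = 11)         # a coarse two-group version, for contrast
boot_coverage(g)                 # essentially every sample contains both groups
```

A grouping is acceptable in the sense described above when this fraction equals 1 for the bootstrap samples tested.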