estimate_weights {maicplus} | R Documentation |
Derive individual weights in the matching step of MAIC
Description
Assuming data is properly processed, this function takes individual patient data (IPD) with centered covariates (effect modifiers and/or prognostic variables) as input, and generates weights for each individual in IPD trial to match the covariates in aggregate data.
The plot function displays individuals weights with key summary in top right legend that includes
median weight, effective sample size (ESS), and reduction percentage (what percent ESS is reduced from the
original sample size). There are two options of plotting: base R plot and ggplot
. The default
for base R plot is to plot unscaled and scaled separately. The default
for ggplot
is to plot unscaled and scaled weights on a same plot.
Usage
estimate_weights(
data,
centered_colnames = NULL,
start_val = 0,
method = "BFGS",
n_boot_iteration = NULL,
set_seed_boot = 1234,
boot_strata = "ARM",
...
)
## S3 method for class 'maicplus_estimate_weights'
plot(
x,
ggplot = FALSE,
bin_col = "#6ECEB2",
vline_col = "#688CE8",
main_title = NULL,
scaled_weights = TRUE,
bins = 50,
...
)
Arguments
data |
a numeric matrix, centered covariates of IPD, no missing value in any cell is allowed |
centered_colnames |
a character or numeric vector (column indicators) of centered covariates |
start_val |
a scalar, the starting value for all coefficients of the propensity score regression |
method |
a string, name of the optimization algorithm (see 'method' argument of |
n_boot_iteration |
an integer, number of bootstrap iterations. By default is NULL which means bootstrapping
procedure will not be triggered, and hence the element |
set_seed_boot |
a scalar, the random seed for conducting the bootstrapping, only relevant if
|
boot_strata |
a character vector of column names in |
... |
Additional |
x |
object from estimate_weights |
ggplot |
indicator to print base weights plot or |
bin_col |
a string, color for the bins of histogram |
vline_col |
a string, color for the vertical line in the histogram |
main_title |
title of the plot. For ggplot, name of scaled weights plot and unscaled weights plot, respectively. |
scaled_weights |
(base plot only) an indicator for using scaled weights instead of regular weights |
bins |
( |
Value
a list with the following 4 elements,
- data
a data.frame, includes the input
data
with appended column 'weights' and 'scaled_weights'. Scaled weights has a summation to be the number of rows indata
that has no missing value in any of the effect modifiers- centered_colnames
column names of centered effect modifiers in
data
- nr_missing
number of rows in
data
that has at least 1 missing value in specified centered effect modifiers- ess
effective sample size, square of sum divided by sum of squares
- opt
R object returned by
base::optim()
, for assess convergence and other details- boot_strata
'strata' from a boot::boot object
- boot_seed
column names in
data
of the stratification factors- boot
a n by 2 by k array or NA, where n equals to number of rows in
data
, and k equalsn_boot_iteration
. The 2 columns in the second dimension include a column of numeric indexes of the rows indata
that are selected at a bootstrapping iteration and a column of weights.boot
is NA when argumentn_boot_iteration
is set as NULL
Methods (by generic)
-
plot(maicplus_estimate_weights)
: Plot method for estimate_weights objects
Examples
data(centered_ipd_sat)
centered_colnames <- grep("_CENTERED", colnames(centered_ipd_sat), value = TRUE)
weighted_data <- estimate_weights(data = centered_ipd_sat, centered_colnames = centered_colnames)
# To later estimate bootstrap confidence intervals, we calculate the weights
# for the bootstrap samples:
weighted_data_boot <- estimate_weights(
data = centered_ipd_sat, centered_colnames = centered_colnames, n_boot_iteration = 100
)
plot(weighted_sat)
if (requireNamespace("ggplot2")) {
plot(weighted_sat, ggplot = TRUE)
}