select_opt {vecmatch} | R Documentation |
Select Optimal Parameter Combinations from Optimization Results
Description
select_opt() is a helper function to filter and prioritize
results from optimize_gps() based on the specific goals of a study.
Depending on the research design, certain pairwise comparisons or treatment
groups may be more important than others. For example:
You may want to prioritize matching between specific groups (e.g. a specific disease vs. controls), while ignoring other group comparisons during SMD evaluation.
You may wish to retain as many samples as possible from a critical group or set of groups, regardless of matching rates in other groups.
This function enables targeted selection of optimal parameter combinations by:
Evaluating SMDs for specific pairwise treatment group comparisons,
Selecting key covariates to assess balance,
Prioritizing matched sample size in selected treatment groups.
By combining these criteria, select_opt() allows you to tailor the
optimization output to your study's focus, whether that is covariate
balance in targeted group comparisons or maximum sample retention in
specific subgroups.
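The idea of evaluating SMDs only for chosen pairwise comparisons, then aggregating them by mean or max, can be sketched on toy data. This is a minimal illustration, not the vecmatch implementation: the helper pair_smd(), the toy data frame, and the pooled-SD formula are assumptions introduced here, and the package may compute SMDs differently.

```r
# Toy data; group labels and values are made up for illustration.
toy <- data.frame(
  status = rep(c("controls", "adenoma", "crc_malignant"), each = 4),
  age    = c(50, 52, 54, 56, 58, 60, 62, 64, 51, 53, 55, 57)
)

# Hypothetical helper: absolute standardized mean difference with a pooled SD
# (a common convention, assumed here rather than taken from vecmatch).
pair_smd <- function(data, variable, groups) {
  a <- data[[variable]][data$status == groups[1]]
  b <- data[[variable]][data$status == groups[2]]
  abs(mean(a) - mean(b)) / sqrt((var(a) + var(b)) / 2)
}

# Only the comparisons of interest are evaluated; other pairs are ignored.
smd_groups <- list(
  c("adenoma", "controls"),
  c("crc_malignant", "controls")
)
smds <- vapply(smd_groups, function(g) pair_smd(toy, "age", g), numeric(1))

mean(smds)  # aggregation analogous to smd_type = "mean"
max(smds)   # aggregation analogous to smd_type = "max"
```

Aggregating by max is the stricter choice, since a single poorly balanced comparison dominates the result, whereas mean lets well-balanced comparisons offset it.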
Usage
select_opt(
  x,
  smd_groups = NULL,
  smd_variables = NULL,
  smd_type = c("mean", "max"),
  perc_matched = NULL
)
Arguments
x |
An object of class |
smd_groups |
A |
smd_variables |
A |
smd_type |
A |
perc_matched |
A |
Details
Optimization results are grouped into bins based on the
maximum SMD observed for each parameter combination. These bins follow
the same structure as in optimize_gps():
0.00-0.05
0.05-0.10
0.10-0.15
0.15-0.20
0.20-0.25
0.25-0.30
0.30-0.35
0.35-0.40
0.40-0.45
0.45-0.50
more than 0.50
Within each bin, models are first filtered based on their aggregated SMD
across the specified smd_groups and smd_variables, using the method
defined by smd_type. Then, among the remaining models, the best-performing
one(s) are selected based on the percentage of matched samples, either
overall or in the specified treatment groups (perc_matched).
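The two-stage selection described above can be sketched in base R. This is an illustration under stated assumptions, not the package's code: the results data frame and its column names (overall_smd, agg_smd, perc_matched) are hypothetical, and "filtered" is interpreted here as keeping the combinations with the lowest aggregated SMD in each bin.

```r
# Hypothetical stand-in for optimization results; column names are
# illustrative and not taken from vecmatch.
results <- data.frame(
  iter_ID      = c("ID1", "ID2", "ID3", "ID4", "ID5"),
  overall_smd  = c(0.03, 0.04, 0.12, 0.14, 0.62),  # maximum SMD per combination
  agg_smd      = c(0.02, 0.02, 0.10, 0.08, 0.40),  # SMD over chosen groups/variables
  perc_matched = c(80, 90, 95, 70, 99)
)

# Bin on the maximum SMD: 0.05-wide bins up to 0.50, then one open-ended bin.
breaks <- c(seq(0, 0.5, by = 0.05), Inf)
results$smd_bin <- cut(results$overall_smd, breaks = breaks, include.lowest = TRUE)

# Within a bin: keep the lowest aggregated SMD (assumed filter rule), then
# the highest percentage matched among the remaining rows.
pick_best <- function(bin_df) {
  bin_df <- bin_df[bin_df$agg_smd == min(bin_df$agg_smd), , drop = FALSE]
  bin_df[bin_df$perc_matched == max(bin_df$perc_matched), , drop = FALSE]
}

by_bin <- split(results, results$smd_bin, drop = TRUE)
selected <- do.call(rbind, lapply(by_bin, pick_best))
selected$iter_ID
```

Note that in the second bin the combination with the higher matching rate is dropped, because the aggregated SMD is applied first and the percentage matched only breaks ties among the survivors.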
Value
An S3 object of class select_result, containing the filtered and
prioritized optimization results. The object includes:
A data.frame with selected parameter combinations and performance metrics.
Attribute param_df: A data.frame with full parameter specifications (iter_ID, GPS/matching parameters, etc.), useful for manually refitting or reproducing results.
The object also includes a custom print() method that summarizes:
Number of selected combinations per SMD bin
Corresponding aggregated SMD (mean or max)
Overall or group-specific percentage matched
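Retrieving the param_df attribute for a chosen combination might look like the sketch below. The mock object only imitates the structure described above: its class vector, the perc_matched column, and the caliper column are assumptions made for illustration, and a real select_result may be laid out differently.

```r
# Mock object standing in for the return value of select_opt(); the class
# vector and all columns except iter_ID are hypothetical.
mock <- structure(
  data.frame(iter_ID = c("ID2", "ID4"), perc_matched = c(90, 70)),
  class = c("select_result", "data.frame"),
  param_df = data.frame(iter_ID = c("ID2", "ID4"), caliper = c(0.5, 1.2))
)

# Pull the full parameter specifications stored as an attribute.
params <- attr(mock, "param_df")

# Look up the parameters of one selected combination by its iter_ID.
chosen <- params[params$iter_ID == "ID2", , drop = FALSE]
chosen
```

The row retrieved this way carries the settings one would pass back to the estimation and matching functions when refitting a single combination by hand.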
Examples
# Define formula and set up optimization
formula_cancer <- formula(status ~ age * sex)
opt_args <- make_opt_args(cancer, formula_cancer, gps_method = "m1")
## Not run:
withr::with_seed(8252, {
  opt_results <- optimize_gps(
    data = cancer,
    formula = formula_cancer,
    opt_args = opt_args,
    n_iter = 2000
  )
})
## End(Not run)
# Select optimal combinations prioritizing SMD balance and matching in key
# groups
## Not run:
select_results <- select_opt(
  x = opt_results,
  smd_groups = list(
    c("adenoma", "controls"),
    c("controls", "crc_beningn"),
    c("crc_malignant", "controls")
  ),
  smd_variables = "age",
  smd_type = "max",
  perc_matched = c("adenoma", "crc_malignant")
)
## End(Not run)