select_opt {vecmatch}    R Documentation

Select Optimal Parameter Combinations from Optimization Results

Description

select_opt() is a helper function to filter and prioritize results from optimize_gps() based on the specific goals of a study. Depending on the research design, certain pairwise comparisons or treatment groups may be more important than others. For example:

This function enables targeted selection of optimal parameter combinations by:

By combining these criteria, select_opt() lets you tailor the optimization output to your study's focus, whether that is covariate balance in targeted group comparisons or maximal sample retention in specific subgroups.

Usage

select_opt(
  x,
  smd_groups = NULL,
  smd_variables = NULL,
  smd_type = c("mean", "max"),
  perc_matched = NULL
)

Arguments

x

An object of class best_opt_result, produced by the optimize_gps() function.

smd_groups

A list of pairwise comparisons (as character vectors of length 2) specifying which treatment group comparisons should be prioritized in SMD evaluation. Each element must be a valid pair of treatment levels. If NULL, all pairwise comparisons are used. Example: list(c("adenoma", "crc_malignant"), c("controls", "adenoma"))

smd_variables

A character vector of covariate names to include in the SMD evaluation. Must match variables listed in attr(x, "model_covs").

smd_type

A character string ("mean" or "max"), defining how to aggregate SMDs across covariates and comparisons. "max" selects combinations with the lowest maximum SMD; "mean" uses the average SMD.

perc_matched

A character vector of treatment levels for which the matching rate should be maximized. If NULL, the overall percentage of matched samples is used. If specified, selection within each SMD category is based only on the sum of the matching percentages for the listed groups.
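The interplay of smd_groups, smd_variables and smd_type can be illustrated with a small base R sketch. It is not the package's implementation: the helper aggregate_smd(), the layout of the toy SMD table, and the choice to ignore the order of the two levels within a pair are all assumptions made for illustration, and the SMD values are made up.

```r
# Toy SMD table: one row per (comparison, covariate). Values are invented.
smd <- data.frame(
  group1   = c("adenoma", "adenoma", "controls", "controls"),
  group2   = c("controls", "controls", "crc_malignant", "crc_malignant"),
  variable = c("age", "sex", "age", "sex"),
  smd      = c(0.05, 0.20, 0.10, 0.02)
)

# Hypothetical helper (not part of vecmatch): restrict the table to the
# requested comparisons and covariates, then aggregate with mean or max.
aggregate_smd <- function(smd, smd_groups = NULL, smd_variables = NULL,
                          smd_type = c("mean", "max")) {
  smd_type <- match.arg(smd_type)

  # Order-insensitive key for a pair of treatment levels (an assumption)
  pair_key <- function(a, b) {
    vapply(seq_along(a), function(i) {
      paste(sort(c(a[i], b[i])), collapse = "|")
    }, character(1))
  }

  keep <- rep(TRUE, nrow(smd))
  if (!is.null(smd_groups)) {
    wanted <- vapply(smd_groups, function(p) pair_key(p[1], p[2]), character(1))
    keep <- keep & pair_key(smd$group1, smd$group2) %in% wanted
  }
  if (!is.null(smd_variables)) {
    keep <- keep & smd$variable %in% smd_variables
  }

  vals <- smd$smd[keep]
  if (smd_type == "max") max(vals) else mean(vals)
}

# All comparisons and covariates
all_mean <- aggregate_smd(smd, smd_type = "mean")
all_max  <- aggregate_smd(smd, smd_type = "max")

# Only the age covariate, only the adenoma vs. controls comparison
age_max   <- aggregate_smd(smd, smd_variables = "age", smd_type = "max")
pair_mean <- aggregate_smd(smd, smd_groups = list(c("controls", "adenoma")))
```

In this toy table, restricting the evaluation to "age" lowers the maximum SMD because the largest imbalance sits in "sex"; this is the kind of shift in ranking that the smd_variables and smd_groups arguments are meant to expose.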

Details

Optimization results are grouped into bins based on the maximum SMD observed for each parameter combination. These bins follow the same structure as in optimize_gps():

Within each bin, models are first filtered by their aggregated SMD across the specified smd_groups and smd_variables, using the method defined by smd_type. Then, among the remaining models, the best-performing one(s) are selected by the percentage of matched samples, either overall or summed over the treatment groups listed in perc_matched.
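The two-stage selection described above can be sketched in base R on a toy results table. This is an illustration only, not the code used by select_opt(): the column names, the bin cutoffs, and the rule "keep the rows with the lowest aggregated SMD" are assumptions, and all numbers are invented.

```r
# Toy optimization results; one row per parameter combination.
results <- data.frame(
  id              = 1:6,
  overall_max_smd = c(0.04, 0.03, 0.08, 0.09, 0.22, 0.18),  # used for binning
  agg_smd         = c(0.02, 0.02, 0.05, 0.07, 0.15, 0.15),  # targeted SMD
  perc_adenoma    = c(60, 70, 80, 95, 90, 85),
  perc_malignant  = c(55, 65, 75, 90, 80, 90)
)

# Step 1: assign each combination to a bin by its overall maximum SMD.
# The cutoffs here are hypothetical.
results$bin <- cut(results$overall_max_smd, breaks = c(0, 0.05, 0.10, 0.25))

# Steps 2 and 3, applied within one bin:
#   2. keep the rows with the lowest aggregated SMD (assumed filter rule),
#   3. among those, keep the rows with the highest summed matching
#      percentage for the prioritized groups.
pick_best <- function(df) {
  df <- df[df$agg_smd == min(df$agg_smd), , drop = FALSE]
  score <- df$perc_adenoma + df$perc_malignant
  df[score == max(score), , drop = FALSE]
}

selected <- do.call(rbind, lapply(split(results, results$bin), pick_best))
selected_ids <- selected$id
```

Note that in the middle bin combination 4 has the highest matching percentages but is dropped in step 2 because combination 3 has the lower aggregated SMD; balance takes precedence over sample retention in this sketch.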

Value

An S3 object of class select_result, containing the filtered and prioritized optimization results. The object includes:

The object also includes a custom print() method that summarizes:

Examples

# Define formula and set up optimization
formula_cancer <- formula(status ~ age * sex)
opt_args <- make_opt_args(cancer, formula_cancer, gps_method = "m1")
## Not run: 
withr::with_seed(8252, {
  opt_results <- optimize_gps(
    data = cancer,
    formula = formula_cancer,
    opt_args = opt_args,
    n_iter = 2000
  )
})

## End(Not run)
# Select optimal combinations prioritizing SMD balance and matching in key
# groups
## Not run: 
select_results <- select_opt(
  x = opt_results,
  smd_groups = list(
    c("adenoma", "controls"),
    c("controls", "crc_benign"),
    c("crc_malignant", "controls")
  ),
  smd_variables = "age",
  smd_type = "max",
  perc_matched = c("adenoma", "crc_malignant")
)

## End(Not run)

[Package vecmatch version 1.2.0 Index]