fit_p {binaryRL} | R Documentation |
Step 3: Optimizing parameters to fit real data
Description
This function fits the optimal parameters of black-box functions (models) to real-world data. Provided that the black-box function adheres to the specified interface (demo: TD, RSTD, Utility), this function can employ any of the optimization algorithms detailed below to find the best-fitting parameters for your model.
The function provides several optimization algorithms:

1. L-BFGS-B (from stats::optim)
2. Simulated Annealing (GenSA::GenSA)
3. Genetic Algorithm (GA::ga)
4. Differential Evolution (DEoptim::DEoptim)
5. Particle Swarm Optimization (pso::psoptim)
6. Bayesian Optimization (mlrMBO::mbo)
7. Covariance Matrix Adapting Evolutionary Strategy (cmaes::cma_es)
8. Nonlinear Optimization (nloptr::nloptr)
For more information, please refer to the homepage of this package: https://yuki-961004.github.io/binaryRL/
Usage
fit_p(
data,
id = NULL,
n_trials = NULL,
fit_model = list(TD, RSTD, Utility),
funcs = NULL,
model_name = c("TD", "RSTD", "Utility"),
lower = list(c(0, 0), c(0, 0, 0), c(0, 0, 0)),
upper = list(c(1, 1), c(1, 1, 1), c(1, 1, 1)),
initial_params = NA,
initial_size = 50,
iteration = 10,
seed = 123,
nc = 1,
algorithm
)
Arguments
data
[data.frame] The data to be fitted. It must include the mandatory columns required by the package.

id
[vector] Specifies the subject ID(s) for whom optimal parameters are to be fitted. If you intend to fit all subjects within your dataset, it is strongly recommended to pass every unique subject ID, e.g. 'unique(data$Subject)'.
n_trials
[integer] The total number of trials a single subject experienced in the experiment. If this parameter is kept at its default value of 'NULL', the program will automatically detect the number of trials per subject from the provided data. This information is primarily used for calculating model fit statistics such as AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion).
fit_model
[list] A list of model functions to be fitted to the data.

funcs
[character] A character vector containing the names of all user-defined functions required for the computation. When parallel computation is enabled (i.e., 'nc > 1'), user-defined models and their custom functions might not be automatically accessible within the parallel environment. Therefore, if you have created your own reinforcement learning model that modifies any of the package's four default functions, you must list the names of your custom functions here.
model_name
[character] The names of the fitted models.

lower
[list] The lower bounds of the free parameters for each model.

upper
[list] The upper bounds of the free parameters for each model.
initial_params
[vector] Initial values of the free parameters from which the optimization algorithm starts its search. These are primarily relevant for algorithms that require an explicit starting point, such as 'L-BFGS-B'.
initial_size
[integer] This parameter corresponds to the population size in genetic algorithms (e.g., 'GA').
iteration
[integer] The number of iterations the optimization algorithm will perform when searching for the best-fitting parameters during the fitting phase. A higher number of iterations may increase the likelihood of finding a global optimum but also increases computation time.
seed
[integer] Random seed. This ensures that the results are reproducible and remain the same each time the function is run.
nc
[integer] The number of cores to use for parallel processing. Since fitting optimal parameters for each subject is an independent task, parallel computation can significantly speed up the fitting process.
algorithm
[character] Choose an algorithm package from 'L-BFGS-B', 'GenSA', 'GA', 'DEoptim', 'PSO', 'Bayesian', 'CMA-ES'. In addition, any algorithm from the 'nloptr' package is also supported. If your chosen 'nloptr' algorithm requires a local search, you need to input a character vector: the first element is the algorithm used for the global search, and the second element is the algorithm used for the local search.
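The accepted forms of the 'algorithm' argument can be sketched as follows (a minimal fragment; the 'nloptr' names are standard NLopt identifiers, matching those used in the Examples below):

```r
# One of the built-in choices: gradient-based search via stats::optim
algorithm <- "L-BFGS-B"

# One of the external packages, e.g. simulated annealing via GenSA
algorithm <- "GenSA"

# An 'nloptr' algorithm that requires a local search: the first element
# is the global-search algorithm, the second the local-search algorithm
algorithm <- c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA")
```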
Value
The optimal parameters found by the algorithm for each subject, along with the model fit statistics calculated using these parameters. This is returned as an object of class 'binaryRL' containing the results for all subjects and all models.
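Since the Examples below combine the returned object with 'dplyr::bind_rows()', it can be treated as a list-like collection of per-model result tables; a minimal inspection sketch, assuming 'comparison' holds the return value of 'fit_p':

```r
# 'comparison' is assumed to be the object returned by binaryRL::fit_p()
result <- dplyr::bind_rows(comparison)

# Inspect the combined table of fitted parameters and fit statistics
str(result)
head(result)
```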
Note
The primary difference between 'fit_p' and 'rcv_d' lies in their data source: 'fit_p' fits parameters to real data, while 'rcv_d' fits parameters to synthetic data generated by the models themselves. Many studies inappropriately skip the 'rcv_d' step.
Think of 'fit_p' as the actual boxing match and 'rcv_d' as the weigh-in: boxers of different weight classes shouldn't compete directly.
Similarly, if a competing model is incapable of parameter or model recovery, then even if your proposed model outperforms it in 'fit_p', that does not necessarily make your proposed model a good one.
Examples
## Not run:
comparison <- binaryRL::fit_p(
data = binaryRL::Mason_2024_Exp2,
id = unique(binaryRL::Mason_2024_Exp2$Subject),
##-----------------------------------------------------------------------------##
##----------------------------- black-box function ----------------------------##
#funcs = c("your_funcs"),
fit_model = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility),
model_name = c("TD", "RSTD", "Utility"),
lower = list(c(0, 0), c(0, 0, 0), c(0, 0, 0)),
upper = list(c(1, 10), c(1, 1, 10), c(1, 1, 10)),
##----------------------------- iteration number ------------------------------##
iteration = 10,
##-------------------------------- algorithms ---------------------------------##
nc = 1, # <nc > 1>: parallel computation across subjects
# Base R Optimization
algorithm = "L-BFGS-B" # Gradient-Based (stats)
##-----------------------------------------------------------------------------##
# Specialized External Optimization
#algorithm = "GenSA" # Simulated Annealing (GenSA)
#algorithm = "GA" # Genetic Algorithm (GA)
#algorithm = "DEoptim" # Differential Evolution (DEoptim)
#algorithm = "PSO" # Particle Swarm Optimization (pso)
#algorithm = "Bayesian" # Bayesian Optimization (mlrMBO)
#algorithm = "CMA-ES" # Covariance Matrix Adapting (cmaes)
##-----------------------------------------------------------------------------##
# Optimization Library (nloptr)
#algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA")
##-------------------------------- algorithms ---------------------------------##
#################################################################################
)
result <- dplyr::bind_rows(comparison)
# Ensure the output directory exists before writing
if (!dir.exists("../OUTPUT")) {
dir.create("../OUTPUT", recursive = TRUE)
}
write.csv(result, "../OUTPUT/result_comparison.csv", row.names = FALSE)
## End(Not run)