sensitivity_analysis_qual {commecometrics} | R Documentation |
Perform sensitivity analysis on ecometric models (qualitative environmental variables)
Description
Evaluates how varying sample sizes affect the performance of ecometric models, focusing on two aspects:
- Sensitivity (internal consistency): How accurately the model predicts environmental conditions on the same data it was trained on.
- Transferability (external applicability): How well the model performs on unseen data.
It tests different sample sizes by resampling the data multiple times (bootstrap iterations), training an ecometric model on each subset, and evaluating prediction accuracy on both the training and the testing portion.
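The resampling scheme described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the data frame `toy_points`, the helper `one_iteration()`, and the majority-class predictor that stands in for the ecometric model are all hypothetical.

```r
set.seed(1)

# Hypothetical stand-in for the sampling points: a single categorical column.
toy_points <- data.frame(
  vegetation = sample(c("forest", "grassland", "shrubland"), 100, replace = TRUE)
)

# One iteration: subsample without replacement, split, "train", evaluate.
one_iteration <- function(df, category_col, n, test_split = 0.2) {
  # Draw n communities without replacement.
  subset_df <- df[sample(nrow(df), n), , drop = FALSE]

  # Split the subset into testing and training rows.
  test_idx <- sample(n, round(n * test_split))
  train_idx <- setdiff(seq_len(n), test_idx)
  train <- subset_df[train_idx, , drop = FALSE]
  test <- subset_df[test_idx, , drop = FALSE]

  # Placeholder model (assumption): always predict the most common training class.
  predicted <- names(which.max(table(train[[category_col]])))

  data.frame(
    sample_size = n,
    train_accuracy = mean(train[[category_col]] == predicted),
    test_accuracy = mean(test[[category_col]] == predicted)
  )
}

# Repeat the iteration several times for each sample size.
sample_sizes <- c(40, 60, 80)
iterations <- 5
toy_results <- do.call(rbind, lapply(sample_sizes, function(n) {
  do.call(rbind, lapply(seq_len(iterations), function(i) {
    one_iteration(toy_points, "vegetation", n)
  }))
}))
```

Each row of `toy_results` holds the training and testing accuracy of one iteration at one sample size, which is the kind of table that the two accuracy-versus-sample-size comparisons are built from.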
Usage
sensitivity_analysis_qual(
  points_df,
  category_col,
  sample_sizes,
  iterations = 20,
  test_split = 0.2,
  grid_bins_1 = NULL,
  grid_bins_2 = NULL,
  parallel = TRUE,
  n_cores = parallel::detectCores() - 1
)
Arguments
points_df |
Output first element of the list from |
category_col |
Name of the column containing the categorical trait. |
sample_sizes |
Numeric vector specifying the number of communities (sampling points)
to evaluate in the sensitivity analysis. For each value, a random subset of the data of that
size is drawn without replacement and then split into training and testing sets using the
proportion defined by |
iterations |
Number of bootstrap iterations per sample size (default = 20). |
test_split |
Proportion of data to use for testing (default = 0.2). |
grid_bins_1 |
Number of bins for the first trait axis. If |
grid_bins_2 |
Number of bins for the second trait axis. If |
parallel |
Logical; whether to run iterations in parallel (default = TRUE). |
n_cores |
Number of cores for parallelization (default = detectCores() - 1). |
Details
Two plots are generated:
- Training Accuracy vs. Sample size: Reflects internal model consistency.
- Testing Accuracy vs. Sample size: Reflects external model performance.
Parallel processing is supported to speed up the analysis.
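As a rough illustration of how independent iterations can be spread across workers, the sketch below uses the base `parallel` package. It does not show how the function itself parallelizes its work; the placeholder workload and the cap of two workers are assumptions made for this example. The guard is there because `parallel::detectCores()` can return `NA`, or `1` on a single-core machine, in which case `detectCores() - 1` is not a usable worker count.

```r
# Choose a safe worker count (assumption: cap at 2 workers for this demo).
available <- parallel::detectCores()
n_cores <- if (is.na(available)) 1L else max(1L, min(2L, available - 1L))

cl <- parallel::makeCluster(n_cores)

# Give each worker its own reproducible random-number stream.
parallel::clusterSetRNGStream(cl, iseed = 42)

# Placeholder workload (assumption): each iteration returns a proportion in [0, 1].
iteration_results <- parallel::parLapply(cl, seq_len(5), function(i) {
  mean(sample(c(TRUE, FALSE), 50, replace = TRUE))
})

# Always release the workers when done.
parallel::stopCluster(cl)
```

Setting `parallel = FALSE`, as in the Examples, avoids the cluster start-up cost for small runs.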
Value
A list containing:
combined_results |
All raw iteration results. |
summary_results |
Mean accuracy per sample size. |
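The relationship between the two elements can be illustrated with a small sketch that averages per-iteration accuracies by sample size. The column names `sample_size`, `train_accuracy`, and `test_accuracy` and the values in `raw` are made up for illustration; the columns actually returned by the function may differ.

```r
# Hypothetical per-iteration results (made-up values, not package output).
raw <- data.frame(
  sample_size = rep(c(40, 50), each = 3),
  train_accuracy = c(0.80, 0.90, 0.85, 0.90, 0.95, 0.85),
  test_accuracy = c(0.50, 0.70, 0.60, 0.70, 0.80, 0.75)
)

# Mean training and testing accuracy for each sample size.
summary_toy <- aggregate(
  cbind(train_accuracy, test_accuracy) ~ sample_size,
  data = raw,
  FUN = mean
)
```

Here `summary_toy` has one row per sample size, while `raw` keeps one row per iteration.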
Examples
# Load internal data
data("geoPoints", package = "commecometrics")
data("traits", package = "commecometrics")
data("spRanges", package = "commecometrics")
# Summarize trait values at sampling points
traitsByPoint <- summarize_traits_by_point(
  points_df = geoPoints,
  trait_df = traits,
  species_polygons = spRanges,
  trait_column = "RBL",
  species_name_col = "sci_name",
  continent = FALSE,
  parallel = FALSE
)
# Run sensitivity analysis for dominant land cover class
sensitivityQual <- sensitivity_analysis_qual(
  points_df = traitsByPoint$points,
  category_col = "vegetation",
  sample_sizes = seq(40, 90, 10),
  iterations = 5,
  parallel = FALSE
)
# View results
head(sensitivityQual$summary_results)