p_value {SimBaRepro} | R Documentation |
p_value
Description
Given the observed statistic and the given seeds, this function finds the p-value.
The method uses simulation-based inference, where having fixed seeds, the parameter is searched which makes the observed statistics most "plausible".
In particular, the T_stat
function measures the "plausibility" of any data point and the procedure maximizes the rank of the observed T_stat
value relative to the "repro" 'T_stat' values.
The p-value is determined from the maximum rank and the corresponding parameter is returned.
Usage
p_value(
lower_bds,
upper_bds,
seeds,
generating_fun,
s_obs,
theta_init = NULL,
T_stat = ma_depth,
verbose = FALSE,
check_input = TRUE
)
Arguments
lower_bds |
A vector containing the lower bounds for the parameter search space. |
upper_bds |
A vector containing the upper bounds for the parameter search space. |
seeds |
A matrix (or array) of seeds for generating artificial statistics. |
generating_fun |
A function that takes the random seeds above and a parameter in the search space as inputs to generate artificial statistics. |
s_obs |
A vector representing the observed statistic. |
theta_init |
A vector specifying the starting point for the initial |
T_stat |
See Vignette for detailed explanation. |
verbose |
A Boolean variable indicating whether or not to print out the |
check_input |
A Boolean variable indicating whether or not to run checks on the function inputs. |
Value
A list containing the most likely parameter in the search region (theta_hat
) and its corresponding p-value (p_val
).
Examples
### Regular Normal
set.seed(123)
n <- 50 # sample size
R <- 50 # Repro sample size (should be at least 200 for accuracy in practice)
s_obs <- c(1.12, 0.67) # the observed sample mean and variance
seeds <- matrix(rnorm(R * (n + 2)), nrow = R, ncol = n + 2) # pre-generated seeds
# this function computes the repro statistics given the seeds and the parameter
s_sample <- function(seeds, theta) {
# generate the raw data points
raw_data <- theta[1] + sqrt(theta[2]) * seeds[, 1:n]
# compute the regular statistics
s_mean <- apply(raw_data, 1, mean)
s_var <- apply(raw_data, 1, var)
return(cbind(s_mean, s_var))
}
lower_bds <- c(-5, 0.01) # lower bounds for null hypothesis region
upper_bds <- c(5, 5) # upper bounds for null hypothesis region
result <- p_value(lower_bds, upper_bds, seeds, s_sample, s_obs)
print(result$p_val) # the largest p_value found
print(result$theta_hat) # the parameter corresponding to the largest p value