pval_gen {ezECM}R Documentation

Simulate p-values from earthquakes and explosions

Description

p-values are simulated for depth and first polarity discriminants to use in testing classification models.

Usage

pval_gen(
  sims = 100,
  grid.dim = c(800, 800, 30),
  seismometer = list(N = 100, max.depth = 2, Sig = NULL, sig.draws = c(15, 2)),
  explosion = list(max.depth = 5, prob = 0.4),
  pwave.arrival = list(H0 = 5, V = 5.9, optim.starts = 15),
  first.polarity = list(read.err = 0.95)
)

Arguments

sims

numeric stipulating the number of individual events to generate.

grid.dim

numeric 3-vector providing the extent of the coordinate system in c(X,Y,Z) to be used in km.

seismometer

list stipulating variables relating to the seismometers. Providing an incomplete list reverts to the default values. List elements are:

  • N a numeric stipulating the number of seismometers

  • max.depth is a numeric providing the maximum depth for a seismometer location in km

  • Sig supplying a numeric vector of length N to this argument will stipulate the variance in the error in observed arrival time of each of the N seismometers.

  • sig.draws a numeric 2-vector, if Sig is not provided the variance in arrival time at each station is drawn from MCMCpack::rinvgamma() using shape = sig.draws[1] and scale = sig.draws[2].

explosion

list stipulating variables regarding a detonation event. Providing an incomplete list reverts to the default values. List elements are:

  • max.depth is a numeric providing the maximum depth of an explosion in km

  • prob is the probability of an explosion, controlling the fraction of events which are explosions. Value provided must be in the interval [0,1]

pwave.arrival

list stipulating variables regarding the depth from p-wave arrival discriminant. Providing an incomplete list reverts to the default values. List elements are:

  • H0 a numeric providing the value of depth in km for the null hypothesis

  • V a numeric stipulating the velocity of p-waves in km. Used for simulating p-wave arrival times as an argument of time_fn().

  • optim.starts number of stats::optim() starts used to maximize likelihood of event location.

first.polarity

list stipulating variables regarding the depth from the polarity of first motion discriminant. List elements are:

  • read.err numeric in the interval [0,1] providing the probability of an error in reading the true first polarity.

Details

Methods are adapted from the discriminants listed in (Anderson et al. 2007).

Value

A data frame, with the number of rows equal to sims. Columns contain the p-values observed for each simulation along with the true event type.

Depth Discriminant

The equation below is used to model p-wave arrival time t_i at seismometer i.

t_i = t_0 + T(S_i, S_0) + \epsilon_i

Where t_0 is the time of the event, T() is a function modeling the arrival time (in this case time_fn), S_i is the location of seismometer i, S_0 is the location of the event, and \epsilon_i is normally distributed error with known variance \sigma^2_i. Given N seismometers, the MLE of the event time \hat{t}_0 can be solved as the following:

\hat{t}_0 = \frac{\mathrm{tr}\left(\Sigma^{-1} T_i\right) - \mathrm{tr}\left(\Sigma^{-1}T(S,S_0)\right)}{\mathrm{tr}(\Sigma^{-1})}

Where \mathrm{tr}() is the matrix trace operation, \Sigma is a matrix containing the elements of each \sigma_i^2 on the diagonal, T_i is a diagonal matrix containing each t_i, and T(S, S_0) is a diagonal matrix containing each T(S_i, S_0). This result is then plugged back into the first equation, which is then used in a normal likelihood function. Derivatives are taken of the likelihood so that a fast gradient based approch can be used to find the maximum likelihood estimate (MLE) of S_0.

The remainder of the calculation of the p-value is consistent with the Depth from P-Wave Arrival Times section of (Anderson et al. 2007). First note S_0 is equal to the 3-vector (X_0, Y_0, Z_0)^{\top}. Given the null hypothesis for the depth of \mathrm{H}_0: Z_0 \leq z_0, the MLE (\hat{X}_0, \hat{Y}_0) given Z_0 = z_0 is found. The sum of squared errors (SSE) is calculated as follows:

\mathrm{SSE}(S_0, t_0) = \sum_{i = 1}^N\left(\frac{t_i - t_0 - T(S_i, S_0)}{\sigma_i}\right)^2

If \mathrm{H}_0 is true then the following statistic has a central F distribution with 1 and N-4 degrees of freedom.

F_{1,N-4} = \frac{\mathrm{SSE}(S_0, t_0|Z_0 = z_0) - \mathrm{SSE}(S_0, t_0)}{\mathrm{SSE}(S_0, t_0)}

Because the test has directionality, the F statistic is then converted to a T statistic as such:

T_{N-4} = \mathrm{sign}(\hat{Z}_0 - z_0)\sqrt{F_{1,n-4}}

This T statistic is then used to compute the p-value

Polarity of First Motion

Under the null hypothesis that the event is an explosion, (and therefore the true polarity of first motion is always one), the error rate for mistakenly identifying the polarity of first motion is stipulated as the argument first.polarity$read.err. For an error rate of \theta the p-value can then be calculated as follows:

\sum_{i = 0}^n {N \choose i} \theta^i(1-\theta)^{N-i}

Where n is the number of stations where a positive first motion was observed, and N is the total number of stations.

References

Anderson DN, Fagan DK, Tinker MA, Kraft GD, Hutchenson KD (2007). “A mathematical statistics formulation of the teleseismic explosion identification problem with multiple discriminants.” Bulletin of the Seismological Society of America, 97(5), 1730–1741.

Examples


test.data <- pval_gen(sims = 5)



[Package ezECM version 1.0.0 Index]