par_hierarchy {chopin} | R Documentation
Parallelize spatial computation by hierarchy in input data
Description
"Hierarchy" refers to a system,
which divides the entire study region into multiple subregions.
It is often reflected in an area code system
(e.g., FIPS codes for US Census geographies or the
Nomenclature of Territorial Units for Statistics (NUTS)).
Setting future::multisession, future::multicore, future::cluster, or
future.mirai::mirai_multisession in future::plan
will parallelize the work by splitting the lower level features into
groups, one for each higher level feature.
For details of the terminology in the future package,
please refer to the future::plan documentation.
Each thread will process the lower level features
that belong to one higher level feature. Be advised that
accessing the same file simultaneously from
multiple processes may result in errors.
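As a minimal sketch of the setup described above, the following registers a parallel backend with future::plan before the work is dispatched and restores sequential processing afterwards. The choice of future::multisession and of two workers is an assumption made for illustration; any of the backends listed above could be substituted.

```r
# Register two background R sessions as workers.
# The backend and the worker count are illustrative assumptions.
future::plan(future::multisession, workers = 2)

# Confirm how many workers the current plan provides.
n_workers <- future::nbrOfWorkers()

# ... par_hierarchy() would be called here ...

# Restore sequential processing once the parallel work is done.
future::plan(future::sequential)
n_after <- future::nbrOfWorkers()
```

Resetting the plan at the end releases the background sessions, which mirrors the cleanup step in the Examples section.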
Usage
par_hierarchy(
  regions,
  regions_id = NULL,
  length_left = NULL,
  pad = 0,
  pad_y = FALSE,
  fun_dist,
  ...,
  .debug = FALSE
)
Arguments
regions |
|
regions_id |
character(1). Name of unique ID field in |
length_left |
integer(1). Length of the first characters of
the |
pad |
numeric(1). Padding distance for each subregion defined
by |
pad_y |
logical(1). Whether to filter y with the padded grid.
Should be TRUE when x is where the values are calculated.
Default is |
fun_dist |
|
... |
Arguments passed to the argument |
.debug |
logical(1). Default is |
Details
In dynamic dots (...), the arguments passed to fun_dist should include
x and y, where sf/terra class objects or file paths are accepted.
Hierarchy is interpreted by the regions_id argument first.
regions_id is assumed to be a field name in the x or y argument object.
It is expected that regions represents the higher level boundaries
and that x or y in fun_dist is the lower level boundaries.
However, if that is not the case, with trim argument, the function
will generate the higher level codes from regions_id by extracting
left-t
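The idea of deriving higher level codes from the leading characters of an ID can be sketched in base R as follows. This is an illustration only and not the internal implementation of par_hierarchy; the ID values and the prefix length of 2 are made-up assumptions.

```r
# Hypothetical five-digit lower level IDs whose first two characters
# identify the higher level unit (illustrative values only).
lower_ids <- c("37001", "37003", "45001", "45003", "13001")

# Number of leading characters that encode the higher level (assumed).
n_left <- 2L

# Extract the leading characters to obtain the higher level codes.
higher_codes <- substr(lower_ids, 1L, n_left)

# Group the lower level IDs by their higher level code; each group is
# the kind of unit of work that could be sent to one worker.
groups <- split(lower_ids, higher_codes)
```

Each element of groups holds the lower level IDs that share one prefix, so the number of groups equals the number of distinct higher level codes.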
Whether x or y is searched is determined by the pad_y value.
pad_y = TRUE will make the function attempt to find regions_id in x,
whereas pad_y = FALSE will look for regions_id in y.
If regions_id does not exist in x or y, the function
will use a spatial relationship (intersects) to filter the data.
Note that dispatching computation by subregions based on the spatial
relationship may lead to a slight discrepancy in the result. For
example, if the higher and lower level features are not perfectly
aligned, some features may be omitted from the subregions or duplicated
across them. The function will alert the user if a spatial
relationship is used to filter the data.
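The omission and duplication described above can be reproduced with a small sf sketch. The geometries, the names, and the make_square helper are all hypothetical, and the code only illustrates how an intersects filter behaves; it is not taken from par_hierarchy itself.

```r
library(sf)

# Hypothetical helper: build an axis-aligned square polygon.
make_square <- function(xmin, ymin, size) {
  sf::st_polygon(list(rbind(
    c(xmin, ymin),
    c(xmin + size, ymin),
    c(xmin + size, ymin + size),
    c(xmin, ymin + size),
    c(xmin, ymin)
  )))
}

# Two adjacent higher level regions sharing the border x = 2.
regions <- sf::st_sf(
  region = c("A", "B"),
  geometry = sf::st_sfc(make_square(0, 0, 2), make_square(2, 0, 2))
)

# Three lower level features: one inside A, one straddling the
# A/B border, and one outside both regions.
lower <- sf::st_sf(
  id = c("inside_A", "straddles_AB", "outside"),
  geometry = sf::st_sfc(
    make_square(0.5, 0.5, 1),
    make_square(1.5, 0.5, 1),
    make_square(5, 5, 1)
  )
)

# Count how many regions each lower level feature intersects.
hits <- lengths(sf::st_intersects(lower, regions))
names(hits) <- lower$id
```

A count of 2 marks a feature that would be processed in two subregions, and a count of 0 marks a feature that no subregion would pick up.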
Value
A data.frame object with computation results.
For entries of the results, consult the function used in
the fun_dist argument.
Note
Virtually any sf/terra function that accepts two arguments
can be put in fun_dist; however, be advised that
some spatial operations do not necessarily give the
same result as a single-threaded run would.
For example, distance calculated through this function may return a
lower value than the actual one because the computational region was reduced.
This would be the case especially where the target features
are sparsely distributed in space.
Author(s)
Insang Song geoissong@gmail.com
See Also
future::multisession, future::multicore, future::cluster,
future.mirai::mirai_multisession, future::plan, par_convert_f
Other Parallelization:
par_cut_coords(), par_grid(), par_grid_mirai(), par_hierarchy_mirai(),
par_make_grid(), par_merge_grid(), par_multirasters(),
par_multirasters_mirai(), par_pad_balanced(), par_pad_grid(),
par_split_list()
Examples
lastpar <- par(mfrow = c(1, 1))
library(terra)
library(sf)
library(future)
library(future.mirai)
options(sf_use_s2 = FALSE)
future::plan(future.mirai::mirai_multisession, workers = 2)
nccnty <- sf::st_read(
  system.file("shape/nc.shp", package = "sf")
)
nccnty <- sf::st_transform(nccnty, "EPSG:5070")
nccnty <- nccnty[seq_len(30L), ]
nccntygrid <- sf::st_make_grid(nccnty, n = c(200, 100))
nccntygrid <- sf::st_as_sf(nccntygrid)
nccntygrid$GEOID <- sprintf("%05d", seq_len(nrow(nccntygrid)))
nccntygrid <- sf::st_intersection(nccntygrid, nccnty)
rrast <- terra::rast(nccnty, nrow = 600, ncol = 1320)
terra::values(rrast) <- rgamma(7.92e5, 4, 2)
# Using raster path
rastpath <- file.path(tempdir(), "ncelev.tif")
terra::writeRaster(rrast, rastpath, overwrite = TRUE)
ncsamp <-
  sf::st_sample(
    nccnty,
    size = 1e4L
  )
# sfc to sf
ncsamp <- sf::st_as_sf(ncsamp)
# assign ID
ncsamp$kid <- sprintf("K-%05d", seq_len(nrow(ncsamp)))
res <-
  par_hierarchy(
    regions = nccnty,
    regions_id = "FIPS",
    fun_dist = extract_at,
    y = nccntygrid,
    x = rastpath,
    id = "GEOID",
    func = "mean"
  )
future::plan(future::sequential)
mirai::daemons(0)
par(lastpar)