spod_available_data {spanishoddata} | R Documentation
Get available data list
Description
Get a table with links to available data files for the specified data version. Optionally (see arguments), check the file size and availability of data files previously downloaded into the cache directory specified with the SPANISH_OD_DATA_DIR environment variable (set by spod_set_data_dir()) or a custom path specified with the data_dir argument. By default, the data is fetched from the Amazon S3 bucket where the data is stored. If that fails, the function falls back to downloading an XML file from the Spanish Ministry of Transport website. You can also control this behaviour with the use_s3 argument.
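The fallback described above can be illustrated with a minimal sketch in base R. This is not the package's implementation: fetch_from_s3(), fetch_from_xml(), get_file_list(), and the example URL are hypothetical stand-ins, and the sketch assumes that use_s3 = FALSE simply skips the S3 attempt.

```r
# Hypothetical stand-ins for the two sources; neither is a spanishoddata function.
fetch_from_s3 <- function() {
  stop("S3 bucket unreachable") # simulate a failure of the primary source
}

fetch_from_xml <- function() {
  # Invented one-row result, only to show which source answered
  data.frame(
    target_url = "https://example.org/file.csv.gz",
    source = "xml",
    stringsAsFactors = FALSE
  )
}

# Try the primary source first; on any error, fall back to the secondary one.
get_file_list <- function(use_s3 = TRUE) {
  if (!use_s3) {
    # Assumption: use_s3 = FALSE goes straight to the XML source
    return(fetch_from_xml())
  }
  tryCatch(
    fetch_from_s3(),
    error = function(e) {
      message("S3 failed (", conditionMessage(e), "); falling back to XML")
      fetch_from_xml()
    }
  )
}

files <- get_file_list(use_s3 = TRUE)
```

Because the simulated S3 call always fails here, files comes from the XML stand-in; calling get_file_list(use_s3 = FALSE) takes the same path without attempting S3 first.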
Usage
spod_available_data(
ver = 2,
check_local_files = FALSE,
quiet = FALSE,
data_dir = spod_get_data_dir(),
use_s3 = TRUE,
force = FALSE
)
Arguments
ver |
Integer. Can be 1 or 2. The version of the data to use. v1 spans 2020-2021, v2 covers 2022 and onwards. See more details in codebooks with |
check_local_files |
Logical. Whether to check if the local files exist and get the file size. Defaults to FALSE. |
quiet |
A |
data_dir |
The directory where the data is stored. Defaults to the value returned by spod_get_data_dir(). |
use_s3 |
|
force |
Logical. If |
Value
A tibble with links, release dates of files in the data, dates of data coverage, local paths to files, and the download status.
- target_url: character. The URL link to the data file.
- pub_ts: POSIXct. The timestamp of when the file was published.
- file_extension: character. The file extension of the data file (e.g., 'tar', 'gz').
- data_ym: Date. The year and month of the data coverage, if available.
- data_ymd: Date. The specific date of the data coverage, if available.
- study: factor. Study category derived from the URL (e.g., 'basic', 'complete', 'routes').
- type: factor. Data type category derived from the URL (e.g., 'number_of_trips', 'origin-destination', 'overnight_stays', 'data_quality', 'metadata').
- period: factor. Temporal granularity category derived from the URL (e.g., 'day', 'month').
- zones: factor. Geographic zone classification derived from the URL (e.g., 'districts', 'municipalities', 'large_urban_areas').
- local_path: character. The local file path where the data is (or will be) stored.
- downloaded: logical. Indicator of whether the data file has been downloaded locally. This is only available if check_local_files is TRUE.
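As a rough illustration of how the columns listed above might be used, the sketch below filters a small mock table in base R. The rows and URLs are invented, only a subset of the documented columns is included, and a plain data.frame stands in for the tibble the function returns. The downloaded column is assumed to be present, which per the description above requires check_local_files = TRUE.

```r
# Mock table with a subset of the documented columns; all values are invented.
available <- data.frame(
  target_url = c(
    "https://example.org/a.csv.gz",
    "https://example.org/b.csv.gz",
    "https://example.org/c.tar"
  ),
  file_extension = c("gz", "gz", "tar"),
  data_ymd = as.Date(c("2022-03-01", "2022-03-02", NA)),
  type = factor(c("origin-destination", "origin-destination", "number_of_trips")),
  zones = factor(c("districts", "districts", "municipalities")),
  downloaded = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# District-level origin-destination files that are not yet in the local cache
todo <- subset(
  available,
  type == "origin-destination" & zones == "districts" & !downloaded
)

# URLs of the files that would still need to be fetched
todo$target_url
```

The same filter should work on the real table, since type, zones, and downloaded are documented columns, but the factor levels shown in the list above are examples, so it is worth checking levels(available$type) before relying on a specific value.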
Examples
# Set data dir for file downloads
spod_set_data_dir(tempdir())
# Get available data list for v1 (2020-2021) data
spod_available_data(ver = 1)
# Get available data list for v2 (2022 onwards) data
spod_available_data(ver = 2)
# Get available data list for v2 (2022 onwards) data
# while also checking for local files that are already downloaded
spod_available_data(ver = 2, check_local_files = TRUE)