get_latest_reptile_download {reptiledb.data} | R Documentation
Get Latest Reptile Database Download Link
Description
This function retrieves the most recent download link for reptile database files from the Reptile Database website. It searches for files from the current year first, and if none are found, searches for files from the previous year.
Usage
get_latest_reptile_download(
base_url = "http://www.reptile-database.org/data/",
current_year = as.numeric(format(Sys.Date(), "%Y")),
file_types = c("xls", "xlsx", "zip"),
return_info = FALSE
)
Arguments
- base_url: Character string. The base URL of the reptile database data page. Default is "http://www.reptile-database.org/data/".
- current_year: Numeric. The year whose files are searched first. Default is the current system year.
- file_types: Character vector. File extensions to search for. Default is c("xls", "xlsx", "zip").
- return_info: Logical. If TRUE, returns a list with detailed information about the found file. If FALSE, returns only the URL. Default is FALSE.
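The file_types argument can be pictured as a filter on link targets. The following is a minimal sketch of one way such a filter might work, not the package's actual implementation; the helper build_extension_pattern() and the file names are hypothetical and used only for illustration.

```r
# Hypothetical helper: build a regex that matches strings ending in one of
# the requested extensions. Extensions are assumed to be plain alphanumeric
# strings, so no regex escaping is done.
build_extension_pattern <- function(file_types = c("xls", "xlsx", "zip")) {
  paste0("\\.(", paste(file_types, collapse = "|"), ")$")
}

# Made-up link targets, not real files on the Reptile Database site.
hrefs <- c(
  "reptile_checklist_2024_03.xlsx",
  "about.html",
  "reptile_checklist_2023_12.zip"
)

# Keep only the links whose extension is in the requested set.
pattern <- build_extension_pattern(c("xlsx", "zip"))
matches <- hrefs[grepl(pattern, hrefs, ignore.case = TRUE)]
print(matches)
```

Matching is case-insensitive in this sketch so that links ending in ".ZIP" or ".XLSX" would also be kept; whether the real function does the same is not stated in this page.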
Details
The function performs web scraping on the specified URL to find download links. It prioritizes files from the current year, but will fall back to the previous year if no current year files are available.
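The current-year-then-previous-year fallback described above can be sketched with base R alone. This is an illustrative assumption about the selection logic, not the function's source: select_by_year() is a hypothetical helper, the file names are invented, and the sketch assumes the year appears in the file name and that names sort chronologically.

```r
# Hypothetical helper: return the newest link for `current_year`, falling
# back to the previous year, or NULL if neither year has a match.
select_by_year <- function(links, current_year) {
  for (year in c(current_year, current_year - 1)) {
    hits <- links[grepl(as.character(year), links, fixed = TRUE)]
    if (length(hits) > 0) {
      # Assumption: names sort chronologically, so the last one is newest.
      # method = "radix" gives a locale-independent ordering.
      return(tail(sort(hits, method = "radix"), 1))
    }
  }
  NULL
}

# Made-up link targets for illustration only.
links <- c("checklist_2023_07.zip", "checklist_2023_12.zip", "readme.txt")

from_fallback <- select_by_year(links, 2024)  # no 2024 files, so 2023 is used
no_match      <- select_by_year(links, 2020)  # neither 2020 nor 2019 present
print(from_fallback)
print(is.null(no_match))
```

Only one year of fallback is attempted here, matching the description; a link from two or more years back would not be returned.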
The function requires the following packages: rvest, dplyr, and stringr. These packages must be installed before using this function.
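Because the three packages must already be installed, it can help to check for them before calling the function. The snippet below is one possible check using base R's requireNamespace(); the helper name missing_packages() is hypothetical and not part of reptiledb.data.

```r
required <- c("rvest", "dplyr", "stringr")

# Hypothetical helper: return the names of packages that cannot be loaded.
# Note that requireNamespace() loads a package's namespace if it is found.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

not_installed <- missing_packages(required)
if (length(not_installed) > 0) {
  message(
    "Install first: install.packages(c(",
    paste0('"', not_installed, '"', collapse = ", "),
    "))"
  )
}
```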
Value
If return_info = FALSE, returns a character string with the URL of the most recent file, or NULL if no suitable file is found.
If return_info = TRUE, returns a list containing:
- url: Character. The complete URL of the file.
- filename: Character. The name of the file.
- file_type: Character. The file extension.
- extraction_date: Date. The date when the link was extracted.
- source_page: Character. The source webpage URL.
Returns NULL if no suitable file is found or if an error occurs during web scraping.
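Since the result can be a character string, a list, or NULL, calling code usually needs to handle all three shapes. The sketch below shows one way to do that; extract_url() is a hypothetical helper, and fake_info is a stand-in list with invented values that only mirrors the documented element names.

```r
# Hypothetical helper: reduce any of the documented return shapes to a
# single URL string, using NA when nothing was found.
extract_url <- function(result) {
  if (is.null(result)) {
    return(NA_character_)  # no suitable file, or scraping failed
  }
  if (is.list(result)) {
    return(result$url)     # shape returned when return_info = TRUE
  }
  result                   # shape returned when return_info = FALSE
}

# Stand-in values for illustration, not real output from the function.
fake_info <- list(
  url = "http://www.reptile-database.org/data/example_2024.zip",
  filename = "example_2024.zip",
  file_type = "zip",
  extraction_date = Sys.Date(),
  source_page = "http://www.reptile-database.org/data/"
)

print(extract_url(fake_info))
print(extract_url(NULL))
```

In real use, the argument would be the value returned by get_latest_reptile_download(), and an NA result signals that the caller should stop or retry later.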
See Also
http://www.reptile-database.org/ for more information about the Reptile Database.
Examples
# Get just the URL - requires internet connection
url <- get_latest_reptile_download()
# Get detailed information
info <- get_latest_reptile_download(return_info = TRUE)
# Search for specific file types
zip_url <- get_latest_reptile_download(file_types = "zip")
# Search for files from a specific year
url_2024 <- get_latest_reptile_download(current_year = 2024)