join_it {RcensusPkg} | R Documentation |
join_it
Description
Outer joins two dataframes that share a common column variable. The function uses fast data.table techniques to join two data.tables by their common key values. For example, the "GEOID" variable could serve as a key to join data from RcensusPkg::get_vintage_data() with a simple feature and its geometries (counties, states, or countries, for example) from RcensusPkg::tiger_*_sf(). The resulting dataframe could then display the geometries (with RplotterPkg::create_sf_plot()) with an aesthetic mapping (e.g. fill/color/size) to a joined data column. Joining could also take place between two simple features (created by RcensusPkg::tiger_*_sf()) or between two dataframes (created by RcensusPkg::get_vintage_data()).
The important thing to remember is that all the rows in 'df_2' will be present in the resultant data.table.
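The "all rows in 'df_2' are kept" behavior can be illustrated with plain data.table. This is a minimal sketch of the join semantics only, not the internals of join_it(); the table names, GEOID values, and columns are hypothetical.

```r
library(data.table)

# Hypothetical toy tables; values are made up for illustration.
income_dt <- data.table(GEOID = c("001", "002"), income = c(50000, 62000))
tracts_dt <- data.table(GEOID = c("001", "002", "003"), name = c("A", "B", "C"))

# all.y = TRUE keeps every row of the second table;
# rows with no match in the first table get NA in its columns.
joined_dt <- merge(income_dt, tracts_dt, by = "GEOID", all.y = TRUE)
print(joined_dt)
```

Here the row for GEOID "003" survives the join even though it has no income value, which mirrors the role of 'df_2' described above.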
Usage
join_it(
df_1 = NULL,
df_2 = NULL,
key_1 = NULL,
key_2 = NULL,
negate = FALSE,
match = FALSE,
return_sf = FALSE,
na_rm = FALSE
)
Arguments
df_1 |
The first dataframe to be joined. |
df_2 |
The second dataframe to be joined with 'df_1'. All rows in this dataframe will be present in the resultant dataframe. |
key_1 |
A string that names the column from 'df_1' that is common to 'df_2'. |
key_2 |
A string that names the column from 'df_2' that is common to 'df_1'. |
negate |
An optional logical which if |
match |
An optional logical which if |
return_sf |
An optional logical which if TRUE returns a simple feature object instead of a data.table. |
na_rm |
An optional logical which if |
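When the key columns carry different names in the two dataframes, 'key_1' and 'key_2' name each one separately. A rough analogue in plain data.table, assuming hypothetical column names and values (this is not the join_it() implementation), might look like:

```r
library(data.table)

# Hypothetical tables whose key columns are named differently.
data_dt <- data.table(GEOID = c("11001", "11002"), value = c(10, 20))
shapes_dt <- data.table(
  tract_id = c("11001", "11002", "11003"),
  area = c(1.5, 2.0, 0.7)
)

# by.x / by.y name the key column in each table; all.y = TRUE keeps
# every row of the second table. The result uses the by.x column name.
joined_dt <- merge(
  data_dt, shapes_dt,
  by.x = "GEOID", by.y = "tract_id",
  all.y = TRUE
)
print(joined_dt)
```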
Value
A data.table, or a simple feature object if 'return_sf' is TRUE.
Examples
## Not run:
# Requires Census Bureau API key
# Get the median household income by tract for Washington DC and join
# this data with DC tract boundaries.
library(data.table)
library(httr2)
library(jsonlite)
library(sf)
library(usmap)
library(withr)
library(ggplot2)
library(RcensusPkg)
# Get the 2020 median household income data by tract for DC
dc_fips <- usmap::fips(state = "dc")
dc_B19013_dt <- RcensusPkg::get_vintage_data(
  dataset = "acs/acs5",
  vintage = 2020,
  vars = "B19013_001E",
  region = "tract",
  regionin = paste0("state:", dc_fips)
)
# Get the simple feature DC tract geometries and join the data dataframe "dc_B19013_dt"
output_dir <- withr::local_tempdir()
if(!dir.exists(output_dir)){
  dir.create(output_dir)
}
dc_tracts_sf <- RcensusPkg::tiger_tracts_sf(
  state = dc_fips,
  output_dir = output_dir,
  general = TRUE,
  delete_files = FALSE
)
# Join the data with simple feature object
dc_joined_sf <- RcensusPkg::join_it(
  df_1 = dc_B19013_dt,
  df_2 = dc_tracts_sf,
  key_1 = "GEOID",
  key_2 = "GEOID",
  return_sf = TRUE
)
## End(Not run)
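
Once a joined simple feature is available, a joined data column can be mapped to an aesthetic such as fill. The sketch below uses ggplot2::geom_sf() directly rather than RplotterPkg::create_sf_plot(), and it builds a hypothetical two-polygon stand-in (made-up GEOID and income values) so that it runs without a Census API key.

```r
library(sf)
library(ggplot2)

# Helper (hypothetical) that builds a unit square starting at x0.
square <- function(x0) {
  sf::st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Stand-in for a joined simple feature: geometries plus a data column.
toy_sf <- sf::st_sf(
  GEOID = c("001", "002"),
  income = c(50000, 62000),
  geometry = sf::st_sfc(square(0), square(1))
)

# Map the joined data column to the fill aesthetic.
toy_plot <- ggplot2::ggplot(toy_sf) +
  ggplot2::geom_sf(ggplot2::aes(fill = income))
```

Printing toy_plot draws the two squares shaded by the income column; the same mapping would apply to a column carried over by the join.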