SD_data {MLBC}    R Documentation

Job postings dataset

Description

A subset of job postings data from the Lightcast platform, included to demonstrate bias correction methods for regressions with ML-generated variables.

Usage

SD_data

Format

SD_data

A data frame with 16315 rows and 6 columns:

city_name

Character. City of the job posting

naics_2022_2

Character. Type of business (NAICS industry classification)

salary

Numeric. Salary offered (response variable)

wfh_wham

Numeric. Binary label generated via ML, indicating whether remote work is offered (subject to measurement error)

soc_2021_2

Character. Occupation code (SOC classification)

employment_type_name

Character. Employment type (part-time/full-time)
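
The column layout listed above can be checked programmatically. The following is a minimal sketch, not part of MLBC: the helper `has_documented_layout()` and the one-row `toy` data frame are hypothetical, and the toy values are made-up placeholders used only so the code runs without the package installed. The final block assumes `SD_data` is loadable via `data()` when MLBC is available.

```r
# Documented column names and types for SD_data (taken from the Format list).
expected_types <- c(
  city_name            = "character",
  naics_2022_2         = "character",
  salary               = "numeric",
  wfh_wham             = "numeric",
  soc_2021_2           = "character",
  employment_type_name = "character"
)

# Hypothetical helper (not part of MLBC): TRUE if `df` has exactly the
# documented columns and each column passes the matching is.<type>() check.
has_documented_layout <- function(df, types = expected_types) {
  if (!setequal(names(df), names(types))) return(FALSE)
  ok <- vapply(names(types), function(col) {
    switch(types[[col]],
           character = is.character(df[[col]]),
           numeric   = is.numeric(df[[col]]))
  }, logical(1))
  all(ok)
}

# Toy stand-in row with made-up placeholder values (not real postings).
toy <- data.frame(
  city_name = "Example City", naics_2022_2 = "00", salary = 50000,
  wfh_wham = 1, soc_2021_2 = "00-0000", employment_type_name = "Full-time",
  stringsAsFactors = FALSE
)
toy_ok <- has_documented_layout(toy)

# With MLBC installed, the same check can be applied to the real data,
# along with a quick tabulation of the ML-generated label.
if (requireNamespace("MLBC", quietly = TRUE)) {
  data("SD_data", package = "MLBC")
  print(has_documented_layout(SD_data))
  print(table(SD_data$wfh_wham))
}
```

A check like this is mainly useful before fitting, since a formula that references a missing or mistyped column fails only at model-fitting time.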

Source

Proprietary data from the Lightcast job postings platform.

Examples

## Not run: 
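library(MLBC)
data(SD_data)

# Regress log salary on the ML-generated remote-work label, with
# occupation and industry controls. fpr is the false-positive rate of the
# label and m the size of the sample used to estimate it (see ?ols_bca).
fit <- ols_bca(log(salary) ~ wfh_wham + soc_2021_2 + naics_2022_2,
               data = SD_data, fpr = 0.009, m = 1000)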

## End(Not run)
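
To see why a mislabeled binary regressor such as wfh_wham may call for correction, the sketch below simulates the problem with base R only. Everything in it is an assumption for illustration: the data are simulated rather than drawn from SD_data, and `flip_prob` is the conditional probability that a true 0 is labeled 1, chosen to be large so the effect is visible. It is not necessarily the same quantity as the `fpr` argument of `ols_bca`.

```r
set.seed(1)
n         <- 20000
flip_prob <- 0.2  # illustrative P(label = 1 | truth = 0); an assumption

# Simulated (not real) data: a true remote-work indicator and log salary
# with a true coefficient of 0.5 on the indicator.
remote_true <- rbinom(n, 1, 0.3)
log_salary  <- 1 + 0.5 * remote_true + rnorm(n, sd = 0.5)

# ML-style label: every true 1 is kept, and a true 0 is flipped to 1
# with probability flip_prob.
remote_hat <- ifelse(remote_true == 1, 1, rbinom(n, 1, flip_prob))

# OLS on the true indicator versus OLS on the noisy label.
coef_true  <- unname(coef(lm(log_salary ~ remote_true))["remote_true"])
coef_naive <- unname(coef(lm(log_salary ~ remote_hat))["remote_hat"])

# The coefficient on the noisy label is pulled toward zero.
print(c(true_regressor = coef_true, noisy_label = coef_naive))
```

In this simulated setup the uncorrected coefficient understates the true effect, which is the kind of bias the package's correction methods are meant to address.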

[Package MLBC version 0.2.2 Index]