details_decision_tree_spark {parsnip}R Documentation

Decision trees via Spark

Description

sparklyr::ml_decision_tree() fits a model as a set of if/then statements that create a tree-based structure.

Details

For this engine, there are multiple modes: classification and regression.

Tuning Parameters

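This model has 2 tuning parameters:

tree_depth: Tree Depth (type: integer, default: 5L)

min_n: Minimal Node Size (type: integer, default: 1L)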

Translation from parsnip to the original package (classification)

decision_tree(tree_depth = integer(1), min_n = integer(1)) |> 
  set_engine("spark") |> 
  set_mode("classification") |> 
  translate()
## Decision Tree Model Specification (classification)
## 
## Main Arguments:
##   tree_depth = integer(1)
##   min_n = integer(1)
## 
## Computational engine: spark 
## 
## Model fit template:
## sparklyr::ml_decision_tree_classifier(x = missing_arg(), formula = missing_arg(), 
##     max_depth = integer(1), min_instances_per_node = min_rows(0L, 
##         x), seed = sample.int(10^5, 1))

Translation from parsnip to the original package (regression)

decision_tree(tree_depth = integer(1), min_n = integer(1)) |> 
  set_engine("spark") |> 
  set_mode("regression") |> 
  translate()
## Decision Tree Model Specification (regression)
## 
## Main Arguments:
##   tree_depth = integer(1)
##   min_n = integer(1)
## 
## Computational engine: spark 
## 
## Model fit template:
## sparklyr::ml_decision_tree_regressor(x = missing_arg(), formula = missing_arg(), 
##     max_depth = integer(1), min_instances_per_node = min_rows(0L, 
##         x), seed = sample.int(10^5, 1))
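
The translations above use integer(1) placeholders. As a minimal sketch of how the two main arguments are set in practice, the specification below supplies concrete values; 8 and 20 are arbitrary illustrative choices, not recommendations. Building and translating a specification should not need a Spark connection, which is only required once fit() is called.

```r
library(parsnip)

# Illustrative values only: a tree at most 8 levels deep,
# with at least 20 training rows required in each node.
spec <- decision_tree(tree_depth = 8, min_n = 20) |>
  set_engine("spark") |>
  set_mode("classification")

# Show how the parsnip arguments map onto the sparklyr call
# (tree_depth -> max_depth, min_n -> min_instances_per_node).
translate(spec)
```

The same specification can be switched to regression by changing only the set_mode() call.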

Preprocessing requirements

This engine does not require any special encoding of the predictors. Categorical predictors can be partitioned into groups of factor levels (e.g. {a, c} vs {b, d}) when splitting at a node. Dummy variables are not required for this model.

Case weights

This model can utilize case weights during model fitting. To use them, see the documentation in case_weights and the examples on tidymodels.org.

The fit() and fit_xy() functions have an argument called case_weights that expects a vector of case weights.

Note that, for spark engines, the case_weights argument value should be a character string that names the column containing the numeric case weights.

Other details

For models created using the "spark" engine, there are several things to consider.

[Package parsnip version 1.3.2 Index]