Model predictions are modelled by a single decision tree, which serves as an
easy-to-interpret surrogate for the original model.
As suggested in Molnar (see reference below), the quality of the surrogate
tree can be measured by its R-squared. The size of the tree can be modified
by passing ... arguments to rpart::rpart().
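The idea described above can be sketched with rpart alone. This is a minimal illustration of a global surrogate tree, not the package's internal implementation: the variable names (surrogate_data, tree_pred, r_squared) and the choice of maxdepth = 3 are assumptions made for the example.

```r
library(rpart)  # recommended package shipped with standard R installations

# "Black box" model whose predictions we want to explain
fit <- lm(Sepal.Length ~ ., data = iris)

# Replace the response by the model predictions
surrogate_data <- iris[, setdiff(colnames(iris), "Sepal.Length")]
surrogate_data$pred <- predict(fit, newdata = iris)

# Fit a single, shallow tree to the predictions;
# maxdepth is passed on to rpart.control() through ...
tree <- rpart(pred ~ ., data = surrogate_data, maxdepth = 3)

# R-squared of the surrogate: share of the variance of the model
# predictions that is reproduced by the tree
tree_pred <- predict(tree, newdata = surrogate_data)
r_squared <- 1 - sum((surrogate_data$pred - tree_pred)^2) /
  sum((surrogate_data$pred - mean(surrogate_data$pred))^2)
r_squared
```

An R-squared close to 1 suggests that the tree mimics the original model well on this data; a low value means the tree should not be trusted as an explanation.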
light_global_surrogate(x, ...)
# Default S3 method
light_global_surrogate(x, ...)
# S3 method for class 'flashlight'
light_global_surrogate(
  x,
  data = x$data,
  by = x$by,
  v = NULL,
  use_linkinv = TRUE,
  n_max = Inf,
  seed = NULL,
  keep_max_levels = 4L,
  ...
)
# S3 method for class 'multiflashlight'
light_global_surrogate(x, ...)
x
An object of class "flashlight" or "multiflashlight".
...
Arguments passed to rpart::rpart(), such as maxdepth.
data
An optional data.frame.
by
An optional vector of column names used to additionally group the results. For each group, a separate tree is grown.
v
Vector of variables used in the surrogate model. Defaults to all variables in data except "by", "w" and "y".
use_linkinv
Should the retransformation function be applied? Default is TRUE.
n_max
Maximum number of data rows considered when building the tree.
seed
An integer random seed used to select data rows if n_max is lower than the number of data rows.
keep_max_levels
Number of levels of categorical and factor variables to keep. Other levels are combined into a level "Other". This prevents rpart::rpart() from taking too long to split non-numeric variables with many levels.
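The effect of the last three arguments can be illustrated in base R. The sketch below is not taken from the package: lump_levels() is a hypothetical helper, and it assumes that the levels to keep are the most frequent ones, which the text above does not state. The subsampling step likewise only mimics what n_max and seed are described to do.

```r
# Hypothetical helper, not part of flashlight: keep the `keep` most frequent
# levels of a factor or character vector and combine the rest into "Other"
lump_levels <- function(z, keep = 4L) {
  z <- as.character(z)
  counts <- sort(table(z), decreasing = TRUE)
  if (length(counts) <= keep) {
    return(factor(z))
  }
  kept <- names(counts)[seq_len(keep)]
  # Missing values are left untouched
  z[!is.na(z) & !(z %in% kept)] <- "Other"
  factor(z, levels = c(kept, "Other"))
}

z <- c("a", "a", "a", "b", "b", "c", "d", "e")
lumped <- lump_levels(z, keep = 2L)
table(lumped)

# Reproducible subsampling of rows, in the spirit of n_max and seed
n_max <- 100
seed <- 1
sampled <- iris
if (nrow(iris) > n_max) {
  set.seed(seed)
  sampled <- iris[sample(nrow(iris), n_max), , drop = FALSE]
}
nrow(sampled)
```

Fewer levels mean fewer candidate splits for the tree to evaluate on categorical variables, which is the reason given above for combining levels.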
An object of class "light_global_surrogate" with the following elements:
data
A tibble with results.
by
Same as input by.
light_global_surrogate(default): Default method not implemented yet.
light_global_surrogate(flashlight): Surrogate model for a flashlight.
light_global_surrogate(multiflashlight): Surrogate model for a multiflashlight.
Molnar C. (2019). Interpretable Machine Learning.
library(flashlight)

fit <- lm(Sepal.Length ~ ., data = iris)
x <- flashlight(model = fit, label = "lm", data = iris)
sur <- light_global_surrogate(x)
sur$data$r_squared
#> [1] 0.9225605
plot(sur)
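A further example, given as a sketch, shows how the tree can be made smaller by restricting the variables through v and by forwarding maxdepth to rpart::rpart() through the ... argument. The object name sur_small and the chosen variables are assumptions for illustration; no output is shown because the resulting R-squared depends on these choices.

```r
library(flashlight)

fit <- lm(Sepal.Length ~ ., data = iris)
x <- flashlight(model = fit, label = "lm", data = iris)

# Shallow tree grown on two explicitly chosen variables;
# maxdepth is forwarded to rpart::rpart() through ...
sur_small <- light_global_surrogate(
  x,
  v = c("Petal.Length", "Petal.Width"),
  maxdepth = 2
)

# R-squared of the smaller surrogate tree
sur_small$data$r_squared
```

Comparing this value with the R-squared of the default tree gives a rough sense of how much fidelity is traded for a simpler explanation.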