Updates an "EffectData" object by

  • turning discrete values into factors (especially useful with the next option),

  • collapsing levels of categorical variables with many levels,

  • dropping empty bins,

  • dropping small bins,

  • dropping bins with missing name, or

  • sorting the variables by their importance, see effect_importance().

Except for sort_by, all arguments are vectorized, i.e., you can pass a vector or list of the same length as object.

# S3 method for class 'EffectData'
update(
  object,
  sort_by = c("no", "pd", "pred_mean", "y_mean", "resid_mean", "ale"),
  to_factor = FALSE,
  collapse_m = 30L,
  collapse_by = c("weight", "N"),
  drop_empty = FALSE,
  drop_below_n = 0,
  drop_below_weight = 0,
  na.rm = FALSE,
  ...
)

Arguments

object

Object of class "EffectData".

sort_by

By which statistic ("pd", "pred_mean", "y_mean", "resid_mean", "ale") should the results be sorted? The default is "no" (no sorting). Sorting is applied after all other update steps, e.g., after collapsing or dropping rare levels.

to_factor

Should discrete features be treated as factors? In combination with collapse_m, this can be used to collapse rare values of discrete numeric features.

collapse_m

If a factor or character feature has more than collapse_m levels, rare levels are collapsed into a new level "other". Standard deviations are collapsed via the square root of the weighted average of the variances. The default is 30. Set to Inf for no collapsing.

collapse_by

How to determine which levels count as "rare" for collapse_m: either "weight" (default) or "N". This only matters when case weights w are present.

drop_empty

Drop empty bins. Equivalent to drop_below_n = 1.

drop_below_n

Drop bins with N below this value. Applied after collapsing.

drop_below_weight

Drop bins with weight below this value. Applied after collapsing.

na.rm

Should missing bin centers be dropped? Default is FALSE.

...

Currently not used.
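
To make the collapse_by and collapse_m descriptions more concrete, the following base R sketch works on a hypothetical table of per-level statistics. The data frame, its column names, and the choice to collapse exactly the two rarest levels are assumptions made for illustration; they are not the package's internal representation or selection rule. It shows that ranking by "N" and by "weight" can disagree, and how a collapsed standard deviation follows from the square root of the weighted average of the variances.

```r
# Hypothetical per-level statistics of one categorical feature.
# Column names are illustrative assumptions, not package internals.
bins <- data.frame(
  level  = c("a", "b", "c", "d"),
  N      = c(50, 30, 15, 5),
  weight = c(20, 45, 25, 10),
  y_sd   = c(1, 2, 3, 2)
)

# Rank levels from most to least frequent under both criteria
by_N      <- bins$level[order(-bins$N)]       # "a" "b" "c" "d"
by_weight <- bins$level[order(-bins$weight)]  # "b" "c" "a" "d"

# For illustration only: treat the two rarest levels by weight as "rare"
rare <- bins[bins$level %in% tail(by_weight, 2), ]

# Collapsed standard deviation: root of the weighted average of the variances
sd_other <- sqrt(weighted.mean(rare$y_sd^2, w = rare$weight))
sd_other
```

Note that this pooled value only averages the within-level variances; differences between the level means do not enter the formula as stated.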

Value

A modified object of class "EffectData".

Examples

fit <- lm(Sepal.Length ~ ., data = iris)
xvars <- colnames(iris)[-1]
feature_effects(fit, v = xvars, data = iris, y = "Sepal.Length", breaks = 5) |>
  update(sort_by = "pd", collapse_m = 2) |>
  plot()
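
As a further sketch, the vectorized arguments can be given one value per feature. This assumes that the values are matched to the features in the order in which they appear in the object (here, the order of xvars), and that the effectplots package is installed; the object names eff and eff2 are introduced only for this example.

```r
library(effectplots)

fit <- lm(Sepal.Length ~ ., data = iris)
xvars <- colnames(iris)[-1]
eff <- feature_effects(fit, v = xvars, data = iris, y = "Sepal.Length", breaks = 5)

# One collapse_m value per feature (assumed to follow the order of xvars):
# only the last feature, Species, gets the stricter limit of 2 levels.
# drop_empty = TRUE additionally removes bins without observations.
eff2 <- update(eff, collapse_m = c(30, 30, 30, 2), drop_empty = TRUE)

plot(eff2)
```

Since collapse_m only affects factor or character features, the scalar call collapse_m = 2 from the example above should lead to the same collapsing here; the vector form becomes useful once several categorical features need different limits.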