This is a bare-bones implementation of Apley's ALE. For each bin, the local effect \(D_j\) is calculated, and the local effects are then accumulated over bins. \(D_j\) equals the difference between the partial dependence at the upper and lower bin breaks, using only the observations within the bin. To plot the values, make a line plot of the resulting vector against the upper bin breaks. Alternatively, prepend the value 0 to the vector and plot it against all breaks.
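The calculation described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: it ignores case weights and subsampling, and the names `g`, `local_effect`, `D`, and `ale_manual` are hypothetical. The binning via `findInterval()` with `all.inside = TRUE` is also an assumption of this sketch.

```r
fit <- lm(Sepal.Length ~ ., data = iris)
v <- "Sepal.Width"
breaks <- seq(2, 4, length.out = 5)

# Assign each observation to a right-closed bin. Values outside the breaks
# are placed in the first or last bin (an assumption of this sketch).
g <- findInterval(
  iris[[v]],
  vec = breaks,
  rightmost.closed = TRUE,
  left.open = TRUE,
  all.inside = TRUE
)

# Local effect D_j: average prediction at the upper break minus average
# prediction at the lower break, using only the observations in bin j.
local_effect <- function(j) {
  X_lo <- X_hi <- iris[g == j, , drop = FALSE]
  X_lo[[v]] <- breaks[j]
  X_hi[[v]] <- breaks[j + 1]
  mean(predict(fit, X_hi) - predict(fit, X_lo))
}

D <- vapply(seq_len(length(breaks) - 1), local_effect, numeric(1))

# Accumulate the local effects over bins
ale_manual <- cumsum(D)
ale_manual
```

Because the model here is linear and additive, each local effect equals the coefficient of `Sepal.Width` times the bin width, so the accumulated values grow linearly.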

.ale(
  object,
  v,
  data,
  breaks,
  right = TRUE,
  pred_fun = stats::predict,
  trafo = NULL,
  which_pred = NULL,
  bin_size = 200L,
  w = NULL,
  g = NULL,
  ...
)

Arguments

object

Fitted model.

v

Name of the variable in data for which ALE is calculated.

data

Matrix or data.frame.

breaks

Bin breaks.

right

Should bins be right-closed? The default is TRUE. (No effect if g is provided.)

pred_fun

Prediction function, by default stats::predict. The function takes three arguments (names irrelevant): object, data, and ....

trafo

How should predictions be transformed? A function or NULL (default). Examples are log (to switch to link scale) or exp (to switch from link scale to the original scale). Applied after which_pred.

which_pred

If the predictions are multivariate: which column to pick (integer or column name). By default NULL (picks last column). Applied before trafo.

bin_size

Maximum number of observations used per bin. If a bin contains more observations, bin_size of them are randomly sampled. The default is 200.

w

Optional vector with case weights.

g

For internal use. The result of as.factor(findInterval(...)). By default NULL.

...

Further arguments passed to pred_fun(), e.g., type = "response" in a glm() or (typically) prob = TRUE in classification models.
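
The order in which which_pred and trafo act on the raw predictions can be illustrated with a small helper. This is a minimal sketch of the documented behavior only; `prep_pred` is a hypothetical name and not part of the package.

```r
# Hypothetical helper: select a prediction column first, then transform it
prep_pred <- function(pred, trafo = NULL, which_pred = NULL) {
  if (is.matrix(pred) || is.data.frame(pred)) {
    # By default, pick the last column
    if (is.null(which_pred)) which_pred <- ncol(pred)
    pred <- pred[, which_pred]
  }
  # The transformation is applied after the column has been selected
  if (!is.null(trafo)) pred <- trafo(pred)
  pred
}

# Toy class probabilities for two observations
p <- cbind(no = c(0.8, 0.4), yes = c(0.2, 0.6))

p_default <- prep_pred(p)                                  # last column ("yes")
p_log_no <- prep_pred(p, trafo = log, which_pred = "no")   # log of column "no"
```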

Value

Numeric vector with one accumulated ALE value per bin.

References

Apley, Daniel W., and Jingyu Zhu. 2020. Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82 (4): 1059–1086. doi:10.1111/rssb.12377.

Examples

fit <- lm(Sepal.Length ~ ., data = iris)
v <- "Sepal.Width"
.ale(fit, v, data = iris, breaks = seq(2, 4, length.out = 5))
#> [1] 0.2479445 0.4958889 0.7438334 0.9917779
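
The two plotting options mentioned in the description could look as follows. This is a sketch using base graphics; to keep it self-contained, the ALE values are copied from the output above rather than recomputed, and the axis labels are illustrative choices.

```r
breaks <- seq(2, 4, length.out = 5)
ale <- c(0.2479445, 0.4958889, 0.7438334, 0.9917779)  # values from the output above

# Option 1: plot the values against the upper bin breaks
plot(breaks[-1], ale, type = "b", xlab = "Sepal.Width", ylab = "ALE")

# Option 2: prepend 0 and plot against all breaks
ale0 <- c(0, ale)
plot(breaks, ale0, type = "b", xlab = "Sepal.Width", ylab = "ALE")
```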