This is a bare-bones implementation of Apley's ALE. The local effect \(D_j\) is calculated per bin and then accumulated over bins. \(D_j\) equals the partial dependence at the upper bin break minus the partial dependence at the lower bin break, using only the observations within the bin. To plot the values, make a line plot of the resulting vector against the upper bin breaks. Alternatively, prepend the value 0 to the vector and plot it against all breaks.
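The calculation described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the code behind `.ale()`: the helper name `ale_sketch()` is hypothetical, `data` is assumed to be a data.frame, bins are assumed to be right-closed, and subsampling (`bin_size`), case weights (`w`), and post-processing of predictions (`which_pred`, `trafo`) are left out.

```r
# Hypothetical helper: a minimal sketch of the ALE calculation, NOT the
# package's implementation. Assumes a data.frame and right-closed bins.
ale_sketch <- function(object, v, data, breaks, pred_fun = stats::predict, ...) {
  # Bin index per observation; bins are (breaks[j], breaks[j + 1]],
  # with the first bin also containing its lower break
  g <- findInterval(data[[v]], breaks, left.open = TRUE, rightmost.closed = TRUE)
  n_bins <- length(breaks) - 1L
  local_effect <- numeric(n_bins)
  for (j in seq_len(n_bins)) {
    X <- data[g == j, , drop = FALSE]
    if (nrow(X) == 0L) next  # assumption: an empty bin contributes 0
    lo <- hi <- X
    lo[[v]] <- breaks[j]       # move v to the lower bin break
    hi[[v]] <- breaks[j + 1L]  # move v to the upper bin break
    # D_j: average prediction at the upper break minus at the lower break
    local_effect[j] <- mean(pred_fun(object, hi, ...) - pred_fun(object, lo, ...))
  }
  cumsum(local_effect)  # accumulate the local effects over bins
}

# Usage on simulated data with an additive linear model
set.seed(1)
df <- data.frame(x = runif(200), z = rnorm(200))
df$y <- 2 * df$x + df$z + rnorm(200, sd = 0.1)
fit <- stats::lm(y ~ x + z, data = df)
breaks <- c(0, 0.25, 0.5, 0.75, 1)
ale_values <- ale_sketch(fit, v = "x", data = df, breaks = breaks)
```

For an additive linear model such as this one, every local effect equals the fitted slope of `x` times the bin width, so the accumulated values grow linearly along the upper bin breaks.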
.ale(
  object,
  v,
  data,
  breaks,
  right = TRUE,
  pred_fun = stats::predict,
  trafo = NULL,
  which_pred = NULL,
  bin_size = 200L,
  w = NULL,
  g = NULL,
  ...
)
object: Fitted model.
v: Name of the variable in data for which ALE is calculated.
data: Matrix or data.frame.
breaks: Bin breaks.
right: Should bins be right-closed? The default is TRUE. (No effect if g is provided.)
pred_fun: Prediction function, by default stats::predict. The function takes three arguments (names irrelevant): object, data, and ....
trafo: How should predictions be transformed? A function or NULL (default). Examples are log (to switch to link scale) or exp (to switch from link scale to the original scale). Applied after which_pred.
which_pred: If the predictions are multivariate: which column to pick (integer or column name). By default NULL (picks last column). Applied before trafo.
bin_size: Maximum number of observations used per bin. If a bin contains more observations, bin_size indices are randomly sampled. The default is 200.
w: Optional vector with case weights.
g: For internal use. The result of as.factor(findInterval(...)). By default NULL.
...: Further arguments passed to pred_fun(), e.g., type = "response" for a glm() or (typically) prob = TRUE in classification models.
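The interplay of the prediction function, which_pred, and trafo can be illustrated as follows. This is a sketch under assumptions: `pick_and_transform()` and `pred_response()` are hypothetical names introduced here, and the helper only mirrors the documented order (column selection first, transformation second), not the actual internals.

```r
# Hypothetical helper mirroring the documented order:
# which_pred is applied first, trafo second.
pick_and_transform <- function(pred, which_pred = NULL, trafo = NULL) {
  if (is.matrix(pred) || is.data.frame(pred)) {
    if (is.null(which_pred)) which_pred <- ncol(pred)  # default: last column
    pred <- pred[, which_pred]
  }
  if (!is.null(trafo)) pred <- trafo(pred)
  pred
}

# A prediction function with the expected three arguments (object, data, ...)
pred_response <- function(object, data, ...) {
  stats::predict(object, data, type = "response", ...)
}

# Poisson GLM with log link on simulated data
set.seed(1)
df <- data.frame(x = runif(100))
df$y <- rpois(100, lambda = exp(0.5 + df$x))
fit <- stats::glm(y ~ x, family = stats::poisson(), data = df)

# trafo = log moves response-scale predictions back to the link scale
p_link <- stats::predict(fit, df)  # predict.glm defaults to the link scale
p_back <- pick_and_transform(pred_response(fit, df), trafo = log)

# Multivariate predictions: pick the last column (default) or a named one
P <- cbind(a = 1:3, b = 4:6)
last_col <- pick_and_transform(P)
col_a <- pick_and_transform(P, which_pred = "a")
```

Passing `type = "response"` through `...` to the default stats::predict would have the same effect as the custom `pred_response()` shown here.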
Vector representing one ALE per bin.
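The two plotting options mentioned in the description can be sketched with base graphics. The numbers below are made up purely for illustration and are not output of any model; `ale_values` stands in for the returned vector.

```r
# Illustrative inputs (made-up numbers): five breaks define four bins
breaks <- c(0, 1, 2, 3, 4)
ale_values <- c(0.2, 0.5, 0.4, 0.9)  # one accumulated value per bin

# Option 1: plot the vector against the upper bin breaks
upper_breaks <- breaks[-1]
plot(upper_breaks, ale_values, type = "b", xlab = "v", ylab = "ALE")

# Option 2: prepend 0 and plot against all breaks
ale_extended <- c(0, ale_values)
plot(breaks, ale_extended, type = "b", xlab = "v", ylab = "ALE")
```

The second variant starts the curve at 0 at the first break, which makes the accumulated nature of the values visible.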
Apley, Daniel W., and Jingyu Zhu. 2020. Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82 (4): 1059–1086. doi:10.1111/rssb.12377.