vivid: An R package for Variable Importance and Variable Interactions Displays for Machine Learning Models

We present vivid, an R package for visualizing variable importance and variable interactions in machine learning models. The package provides a range of displays including heatmap and graph-based displays for viewing variable importance and interaction jointly and partial dependence plots in both a matrix layout and an alternative layout emphasizing important variable subsets. With the intention of increasing a machine learning models' interpretability and making the work applicable to a wider readership, we discuss the design choices behind our implementation by focusing on the package structure and providing an in-depth look at the package functions and key features. We also provide a practical illustration of the software in use on a data set.


Introduction
Our motivation behind the creation of the vivid package is to investigate machine learning models in a way that is simple to understand while also offering helpful insights into how variables affect the fit.We do this through the use of heatmaps, network graphs, and both a generalized pairs plot style partial dependence plot (PDP) (Friedman 2000) and a space saving PDP based on key variable subsets.While the techniques and fundamental goals of these visualizations have been discussed in Inglis, Parnell, and Hurley (2022a), we focus here on the implementation details of the package by providing a complete listing of the functions and arguments included in the vivid package with further examples indicating advanced usage beyond that previously shown.In this work we examine the decisions made when designing the package and provide an in-depth look at the package functions and features with the intention of making the work applicable to a larger readership.This article outlines the general architectural principles implemented in vivid, such as the data structures we use and data formatting, function design, filtering techniques, and more.We illustrate each function by way of a practical example.Our package is available on the Comprehensive R Archive Network at https://cran.rproject.org/web/packages/vividor on GitHub at https://github.com/AlanInglis/vivid.
In recent years machine learning (ML) algorithms have emerged as a valuable tool for both industry and science.However, due to the black-box nature of many of these algorithms it can be challenging to communicate the reasoning behind the algorithm's decision-making processes.With the need for transparency in ML growing it is important to gain understanding and clarity about how these algorithms are making predictions (Antunes et al. 2018;Felzmann et al. 2019).Many R packages are now available that aid in creating interpretable machine learning (IML) models such as iml (Molnar, Bischl, and Casalicchio 2018), DALEX (Biecek 2018), and lime (Hvitfeldt, Pedersen, and Benesty 2022).For a comprehensive review of IML, see Molnar (2022) and Biecek and Burzykowski (2021).
How we choose to visualize aspects of the model output is of vital importance in how a researcher can interpret and communicate their findings.Consequently, model summaries such as variable importance and variable interactions (VImp and VInt; together we term these VIVI) are frequently used in various fields to comprehend and explain the hidden structure in an ML fit.In ecology they are employed to determine the causes of ecological phenomena (e.g.Murray and Conner 2009); in meteorology VImp measures and partial dependence plots are used to examine air quality (e.g.Grange et al. 2018); in bioinformatics, understanding gene-environment interactions have made these measures an important tool for genomic analysis (e.g.Chen and Ishwaran 2012).
In Table 1 we summarize VIVI measures and visualizations provided by a selection of R packages.VIVI measures are categorized as global or local, depending on whether they refer to the entire predictor space or a specific sub-region.For instance, Individual Conditional Expectation (ICE) curves (Goldstein et al. 2015) and Local Interpretable Model-agnostic Explanations (LIME) (Ribeiro, Singh, and Guestrin 2016) are considered local methods (implementations of which can be found in the ICEbox package (Goldstein et al. 2015) and the lime package (Hvitfeldt, Pedersen, and Benesty 2022)), whereas partial dependence and permutation importance represent global methods.Packages that incorporate local or global methods in Table 1 can be found under the column 'Method'.Additionally, VIVI measures can be broken down into model specific (embedded) methods or model agnostic methods.The packages listed in Table 1 are grouped by whether they can compute model specific or agnostic measures and can be found under the column 'Measure'.In embedded methods the variable importance is incorporated into the ML algorithm.For example, random forests (RF) ( (Friedman 2000) use the tree structure to evaluate the performance of the model.Bayesian additive regression tree models (BART) (Chipman, George, and McCulloch 2010) also use an embedded method to obtain VIVI measures by looking at the proportion of splitting rules for variables or variable pairs used in the trees.The package randomForestExplainer (Paluszynska, Biecek, and Jiang 2020) provides a set of tools to understand what is happening inside a random forest and uses the concept of minimal depth (Ishwaran et al. 2010) to assess both importance and interaction strength by examining the position of a variable within the trees.Additionally the varImp (Probst 2020) package can be used to compute importance scores for the conditional inference random forest of the party package (Strobl et al. 2008).For gradient boosted machines, the EIX (Maksymiuk, Karbowiak, and Biecek 2021) package can be used to measure and identify VIVI and visualize the results.
Model-agnostic methods are techniques that can, in principle, be applied to any ML algorithm.Agnostic methods not only provide flexibility in relation to model selection but are also useful for comparing different fitted ML models.An example of a model agnostic approach for evaluating VImps is permutation importance (Breiman 2001).This method calculates the difference in a model's predictive performance following a variable's permutation; implementations are available in the packages iml, flashlight, DALEX and vip (Greenwell and Boehmke 2020).Each package provides options for specifying the performance metric to use in computing the model performance as well as providing options to select the number of permutations, with flashlight and DALEX packages additionally providing an option to select the number of observations that should be sampled for the calculation of variable importance.The vip package also provides a PDP/ICE-based variable importance method, which computes quantifies the variability in PDP and ICE plots.For VInts, Friedman's H-statistic (Friedman and Popescu 2008) is an agnostic interaction measure derived from the partial dependence by comparing a pair of variables' joint effects with the sum of their marginal effects.Packages iml and flashlight provide implementations.
Partial dependence plots (PDPs) were first introduced by Friedman (2000) as a model agnostic way to visualize the relationship between a specified predictor variable and the fit, averaging over other predictors' effects.Similar to PDPs, individual conditional expectation curves (ICE) (Goldstein et al. 2015) show the relationship between a specified predictor and the fit, fixing the levels of other predictors at those of a particular observation.PDP curves are then the average of the ICE curves over all observations in the dataset.R packages offering PDPs include iml, DALEX (which calls the ingredients package (Biecek and Baniecki 2023)), flashlight, and pdp (Greenwell 2017); the package ICEbox (Goldstein et al. 2015) provides ICE curves and variations.Each package provides options for selecting the size of the grid for evaluating the predictions, with flashlight and DALEX providing options to select the number of observations to consider in calculating the partial dependence.
In vivid we provide a suite of functions (see Table 2) for calculating and visualizing variable importance, interactions and the partial dependence.Our displays conveniently show (either model specific and agnostic) VImp and VInt jointly using heatmaps and network graphs.Through the use of seriation techniques, we group together variables with the greatest impact on the response.For our network plot, additional filtering to remove less influential variables and clustering options to group highly interacting variables are provided, thus providing a more informative picture identifying relevant features.Additionally, our implementation makes it possible to apply our methods to subsets of data, which leads to locally-based importance measures.Our generalized PDP (GPDP) displays partial dependence plots in a matrix layout combining univariate and bivariate partial dependence plots with variable scatterplots.This layout, coupled with seriation, allows for quick assessment of of how pairs of variables have an impact on the model fit.We furthermore provide a more compact version of the GDPD, the so-called zen-partial dependence plot (ZPDP) consisting only of those bivariate partial dependence plots with high VInt.All of our displays are designed to quickly identify how variables, both singly and jointly, affect the fitted response and can be used for regression or classification fits.As the output of our displays are ggplot2 objects (Wickham 2016), they are easily customizable and provide the flexibility to create custom VIVI visualizations.This paper is structured as follows.First we introduce a dataset and fit models that will be used as examples throughout this paper.Following this, we describe vivid functionality for calculating VIVI.We then move on to visualizations and focus on the functionality provided by the two functions viviHeatmap and viviNetwork for displaying VIVI, and two functions for displaying PDPs namely, pdpPairs and pdpZen.Finally we provide some concluding discussion.

Example: Data and Models
The well-known Boston housing data (Harrison Jr and Rubinfeld 1978) from the R package MASS (Venables and Ripley 2002) concerns prices of 506 houses and 14 predictor variables including property attributes such as number of rooms and social attributes including crime rate and pollution levels.The The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 Takes measured importance and interactions from the vip package and turns them into vivid matrix which can be used for plotting Utility response is the median value of owner-occupied homes in $1000s (medv).
We first fit a random forest (using the randomForest package).In order to avail of all the available embedded variable importance scores, the importance argument must be TRUE when calling the randomForest function.This allows any of the provided importance metrics to be used in vivid.However, in our following examples, we use an agnostic VImp measure supplied by the vivid package, which allows us to directly compare VImp values across different fits.library("randomForest") library("MASS") set.seed(1701)data("Boston") rf <-randomForest(medv ~., data = Boston) Next we fit a gradient boosted machine (using the xgboost package).For the GBM we set the maximum number of boosting iterations, nrounds, to 100 as no default is provided in xgboost.library("xgboost") gbst 1:13]), label = as.matrix(Boston[,14]),nrounds = 100, verbose = 0) In the following sections we will explain how aspects of the two fits can be compared with vivid software.We will also explain aspects of our software design with reference to these fits.

Calculating VIVI
The first step in using vivid is to calculate variable importance and interactions for a model fit.The vivi function calculates both of these, creating a square, symmetric matrix containing variable importance on the diagonal and variable interactions on the off-diagonal.Required inputs are a fitted ML model, a data frame on which the model was trained, and the name of the response variable for the fit.The returned matrix has importance and interaction values for all variables in the supplied data frame, excluding the response.Our visualizations functions viviHeatmap and viviNetwork are designed to show the results of a vivi calculation, but will work equally well for any square matrix with identical row and column names.(Note, the symmetry assumption is not required for viviHeatmap and viviNetwork uses interaction values from the lower-triangular part of the matrix only.) The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 The code snippet below shows the creation of a vivid matrix for the random forest fit.For clarity, we include all of the vivi function arguments for the random forest fit, though only the first three are required.Other inputs will be described in the section vivi function additional arguments.library("vivid") set.seed(1701)viviRf <-vivi(fit = rf, data = Boston, response = "medv", reorder = FALSE, normalized = FALSE, importanceType = "agnostic", gridSize = 50, nmax = 500, class = 1, predictFun = NULL, numPerm = 4) In the absence of any model-specific importance measure we use an agnostic permutation method described by Fisher, Rudin, and Dominici (2019) to obtain the variable importance scores.In this method a model error score (root mean square error) is calculated, then each feature is randomly permuted and the model error is re-calculated.The difference in performance is considered to be the variable importance score for that feature.By default the permutation is set to be replicated four times to account for variability.However, we provide an option to select a desired number of permutations (via the numPerm argument)).
The vivi function calculates both the importance and interactions using S3 methods.By default, the agnostic importance and interaction scores in vivid are computed using the generalized predict function from the condvis2 package (C.Hurley, OConnell, and Domijan 2022).Consequently, vivid can be used out-of-the-box with any model type that works with condvis2 predict (see CVpredict from condvis2 for a full list of compatible model types).To allow vivid to run with other model fits, a custom predict function must be passed to the predictFun argument (as discussed below).
The S3 method used to obtain the importance is called vividImportance.vivid relies on the flashlight package to calculate agnostic importance via flashlight::light_importance which currently works for numeric and numeric binary responses only.For model-specific variable importance, we provide individual methods to access importance scores for some of the most popular model fitting R packages, namely; ranger (Wright and Ziegler 2017), randomForest, mlr (Bischl et al. 2016), mlr3 (Lang et al. 2019), and parsnip (Kuhn and Vaughan 2022) (however, more could be added via additional methods).To access any available model-specific variable importance from the aforementioned packages, the importanceType argument must be set to equal the selected importance metric.For example, to select the percent increase in mean square error importance score from the randomForest package, the importanceType argument must be set to equal this measure as it is called within the randomForest package, that is; importanceType = "%IncMSE".If the importanceType is not set or is set to equal agnostic, then the agnostic importance is calculated.It should be noted that when comparing different model fits, using model-specific variable importance will result in importance measures that are not directly comparable (however, comparing model specific scores could be useful when comparing the same fitting procedure evaluated using different parameters).In the case that a practitioner may wish to use our visualizations to compare different model fits, we recommend using the agnostic permutation approach supplied by vivid to make a direct comparison of the importance measures.
For variable interactions, we use the model-agnostic Friedman's H-statistic to identify any pairwise interactions.As discussed in Inglis, Parnell, and Hurley (2022a), we recommend the unnormalized version of the H-statistic which prevents detection of spurious interactions which can occur when the bivariate partial dependence function (used in the construction of the H-statistic) is flat.In the case of a binary response classification model, we follow Hastie, Tibshirani, and Friedman (2009) and compute the H-statistic and partial dependence on the logit scale.
The vivi function calculates interactions using an S3 method called vividInteraction, which again relies on the flashlight package to calculate Friedman's H-statistic via flashlight::light_interaction.Friedman's H-statistic is the only interaction measure currently available in vivid, though the method of Greenwell, Boehmke, and McCarthy (2018) could also be used for this purpose.Embedded interaction measures could easily be incorporated via S3 methods in future.
flashlight simplifies the calculation of VIVI values as it allows a custom predict function to be supplied for the calculation of agnostic importance and the H-statistic; this flexibility means The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 importance and the H-statistic can be calculated for any ML model.Note that as flashlight importance and interaction functions act in a model agnostic way, they will give a VIVI of zero for variables in the dataset (except the response) that are not used by the supplied ML fit.We supply an internal custom predict function called CVpredictfun to both flashlight::light_importance and flashlight::light_interaction.CVpredictfun is a wrapper around CVpredict from the condvis2 package, which adds an option for the classification to select (via the class argument to vivi) the class to be used for prediction and calculates predictions on the logit scale by default.CVpredict accepts a broad range of fit classes thus streamlining the process of calculating VIVI.
In situations where the fit class is not handled by CVpredict (as is the case for the GBM model created from xgboost), supplying a custom predict function to the vivi function by way of the predictFun argument allows the agnostic VIVI values to be calculated.In the code snippet below, we build the vivid matrix for the GBM fit using a custom predict function, which must be of the form given in the code snippet.For brevity we omit some of the optional vivi function arguments.By default, the agnostic variable importance is used to allow for a direct comparison of the importance measures for both model fits.

vivi function additional arguments
The vivi function has 11 arguments.Some of these have been discussed above, including fit, data, response, importanceType, and predictFun.Here we provide a summary of the remaining arguments.First, the normalized argument determines if Friedman's H-statistic should be normalized or not (see Inglis, Parnell, and Hurley (2022a), for the pros and cons of each version).The arguments gridSize and nmax are used to set the size of the grid for evaluating the predictions and maximum number of data rows to consider, respectively, in the calculation of the H-statistic.Lowering the grid size can provide a significant speed boost, though at the expense of predictive accuracy.Additionally, sampling the data via nmax offers a speed boost.The default values for gridSize and nmax are 50 and 500, respectively.

Speed tests
A drawback of using Friedman's H-statistic as a measure of interaction is that it is a computationally expensive calculation, and may be especially time-consuming for models where prediction is slow.Figure 1 shows the build time (rounded to the nearest second) averaged over five runs for the creation of a vivid matrix with default parameters for different ML algorithms using the Boston Housing data.As the Boston housing data has 13 predictor variables, Friedman's H-statistic is computed for 91 predictor pairs.The ML algorithms are: GBM, random forest, support vector machine (SVM), neural network (NN), and k-nearest neighbors (KNN).The SVM, NN, and KNN were built using the e1071 (Meyer et al. 2021), nnet (Venables and Ripley 2002), and kknn (Schliep and Hechenbichler 2016) packages, with the KNN being built through the mlr3 (Lang et al. 2019) framework.Each of the models were built using their default settings and, for each model fit, the agnostic VIVI was measured.
The speed tests were performed on both a 2017 Mac-book Pro 2.3 GHz Dual-Core Intel Core i5 with 8GB of RAM and a 2021 32GB Mac-book M1 Pro.Here we are essentially comparing predict times for the various fits.The NN fit created using the nnet package was the fastest, followed by GBM.Both random forests are the slowest, and surprisingly, the older Mac beats its higher spec cousin for the randomForest fit.

Alternative construction of a vivid matrix
A vivid matrix may also be obtained from variable importance and interaction values calculated elsewhere.The package vip offers these, and evaluates interactions using a method called the feature importance ranking measure (FIRM, see Greenwell, Boehmke, and McCarthy (2018), for more details).

Heatmap of Variable Importance and Variable Interactions
The viviHeatmap function constructs a heatmap displaying both importance and interactions, with importance on the diagonal and interactions on the off-diagonals.A vivid matrix is the only required input, which does not necessarily need to be symmetric (for example, the interaction measures from the randomForestExplainer package are asymmetric and could be visualized using our heatmap. Color palettes for the importance and interactions are optionally provided via impPal and intPal arguments.For the default color palette we choose single-hue, color-blind friendly sequential color palettes from Zeileis et al. (2020), where low and high VIVI values are represented by low and high luminance color values respectively, aiding in highlighting values of interest.
The ordering of the heatmap is taken from the ordering of the input matrix.As reorder was set to FALSE when building both the random forest and GBM fit vivid matrix, the ordering of the heatmaps matches the variable order in the dataset.This is useful for directly comparing multiple heatmaps, however it does not necessarily lend itself for easy identification of the largest VIVI values.If we were to seriate both vivi matrices separately, we would end up with different optimal orderings for each matrix.An alternative is to create a common ordering by averaging over the two vivid matrix objects and applying the vividReorder function to the result.(This function uses a seriation algorithm based on the techniques of Earle and Hurley (2015) designed to place high interaction variables adjacently and to pull high VIVI variables towards the top-left; see Inglis, Parnell, and Hurley (2022a) for details.)Both VIVI matrices are then re-ordered using the newly obtained variable order.Figure 2: Agnostic variable importance and variable interaction scores for a random forest fit in (a) and GBM fit in (b) on the Boston housing data displayed as a heatmap.The random forest fit has weaker interactions and lower importance scores than the GBM fit.Both fits identify lstat as the most important followed by rm.In both fits, lstat has numerous interactions with other variables, notably crim in the random forest fit and nox in the GBM fit.

viviGBst <-viviGBst[ord,ord]
Arguments impLims and intLims specify the range of importance and interaction values to be mapped to colors.Default values are calculated from the maximum and minimum VIVI values in the vivid matrix.Importance and interaction values falling outside the supplied limits are squished to the closest limit.It can be useful to specify these limits in the situation where there is an extremely large VIVI value that dominates the display, or where we wish two or more plots to have the same limits for comparison purposes, as in the example below.The angle argument is used to rotate the x-axis labels.
Figure 2 shows our improved ordering so that variables with high VIVI values are pushed to the top left of the plots.Filtering can also be applied to the input matrix to display a subset of variables.When compared to the GBM fit in (b), the random forest fit in (a) has weaker interactions and lower importance scores.Both plots identify lstat as being the most important.Both fits also show that lstat (the percentage of lower status of the population) interacts with several other variables, though the interactions are much stronger for the GBM.Notably, the strongest interaction in both fits are different.These are lstat : crim (where crim is the per capita crime rate by town) for the random forest fit and lstat : nox (where nox is parts per 10 million nitrogen oxides concentration) for the GBM fit.

Network of Variable Importance and Variable Interactions
The viviNetwork function constructs a network graph displaying both importance and interactions.Similarly to viviHeatmap , this function takes a vivid matrix as the only required input and provides a visual representation of the magnitude of the importance and interaction values through the size of the nodes and edges in the graph, in addition to color.In the plot each variable is represented as a node, with its importance being represented through size and color such that larger, darker nodes indicate a higher importance.Each pairwise interaction is represented by a connecting edge, where larger interaction values get thicker, darker edges; Figure 3 provides an example.This type of plot benefits from being able to quickly identify the magnitude of the importance and interactions of the variables that have the most impact on the response.The viviNetwork function optional arguments follows the same conventions as viviHeatmap: custom color palettes for importance and interactions are provided via the impPal and intPal, and the range of VIVI values to be mapped to the colors are specified via the impLims and intLims.
By default, we choose a circular layout to display the graphs, as when coupled with the seriated vivid matrix, variables with high VIVI are grouped in a clock-wise arrangement starting at the top.This arrangement allows for easy identification of variables with high VIVI.Custom layouts are possible by providing a numeric matrix with two columns and one row per node to the layout argument.Additionally, any of the layouts available in the igraph package (Csardi and Nepusz 2006) The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 can be specified.The subject of network graph layouts has been extensively studied (for examples see Purchase (1997), Herman, Melançon, and Marshall (2000), Freeman (2000)).It has been shown that certain layouts of network graphs can significantly aid in cognitive interpretation.For example, Purchase (1997) note that reducing the number of edge crossovers is by far the most important aesthetic (even for small amounts of data), while maximizing symmetry has a lesser effect.Several of the layouts provided by the igraph package can aid interpretation (such as the Sugiyama layout algorithm (Sugiyama, Tagawa, and Toda 1981), which tries to minimize edge crossover.) We provide options to filter the graph via the intThreshold and removeNode arguments.This helps to highlight variables with high VIVI scores, which is useful in settings with many predictors.The intThreshold argument filters edges with weight (i.e., VInt value) below a specified value and removeNode removes nodes with no connecting edges after thresholding interaction values.We can optionally cluster similar variables together with respect to their VIVI scores via the cluster argument, thereby aiding in the process of highlighting variables of interest.The cluster argument can take either a vector of cluster memberships for nodes or an appropriate igraph clustering function.
We demonstrate network plots displaying VIVI values for the GBM fit.In Figure 3, we show both a default network plot including all variables in (a) and a filtered and clustered network plot in (b).For the filtered plot we select VIVI values in the top decile.This selection allows us to focus only on the variables with the most impact on the response.The variables that remain are lstat, nox, rm, crim, dis (weighted mean of distances to five Boston employment centers), tax (full-value property-tax rate per $10,000), and ptratio (pupil-teacher ratio by town).We then perform a hierarchical clustering treating variable interactions as similarities, with the goal of grouping together high-interaction variables.Here we manually select the number of groups we want to show via the cutree function (which cuts clustered data into a desired number of groups).Finally we rearrange the layout using igraph.Here, igraph::layout_as_star places the first variable (deemed most relevant using the VIVI seriation process above) at the center, which in Figure 3 (b) emphasizes its key role as the most important predictor which also has the strongest interactions.
We provide a conversion of vivid matrix objects to a data frame via an as.data.framemethod, as demonstrated below.This facilitates plotting with base R and ggplot2, for example a barplot of either VImp or VInt values.Note that while vivi returns a matrix of class vivid, the class attribute was dropped when the matrix was re-ordered.

Partial Dependence and Individual Conditional Expectation Curves Univariate Partial Dependence Plot
The pdpVars function constructs a grid of univariate PDPs with ICE curves for selected variables.We use ICE curves to assist in the identification of linear or non-linear effects.The fit, data frame used to train the model, and the name of the response variable are required inputs.In the code below, we show an example of the partial dependence and ICE curves for the first five features from the GBM vivid matrix, with output shown in Figure 4. We use the custom GBM predict function given previously.the number of ICE curves to be drawn.This is either a single number specifying the number of observations to be sampled for the ICE curves, or a vector of row indices, an option that is useful for example to display ICE curves from particular classes.The default value for nIce is 30, which allows individual curves to be seen.
The ordering of the PDPs is taken from the ordering of variables in the data set, or may be specified via the vars argument.In Figure 4, the ordering is taken directly from our seriated vivid matrix, thereby showing the top five most influential variables.As with the construction of the vivid matrix, the gridSize and nmax arguments determine the number of predictions.
In Figure 4 we can see from the changing PDP and ICE curves that lstat and rm have the clearest impact on the response, with the predicted median house price being higher for low values of lstat and high values of rm.Additionally, the predicted median house price appears to be higher for low values of dis before leveling off at around 2.5.The remaining variables have generally flat partial dependence and ICE curves.

Generalized Pairs Partial Dependence Plot
The pdpPairs function creates a generalized pairs partial dependence plot (GPDP).In our GPDP, we use a matrix layout and plot the univariate partial dependence (with ICE curves) on the diagonal, bivariate partial dependence on the upper diagonal and a scatterplot of raw variable values on the lower diagonal, where all colors are assigned to points and ICE curves by the predicted ŷ value.In the case of categorical predictors, the partial dependence for each factor level is shown in the upper-diagonal (for an example of this, see Inglis, Parnell, and Hurley (2022a)).As with the univariate PDP, the fit, data frame used to train the model, and the name of the response variable are required inputs.
set.seed( 1701) pdpPairs(data = Boston, fit = gbst, response = "medv", gridSize = 20, nIce = 50, vars = top5, convexHull = TRUE, fitlims = "pdp", predictFun = pFun) As with the univariate PDP, the ordering can be controlled via the var argument.By default, the ordering is taken from the order of the data.In the code above, we display only the interesting variables seen in previous plots by selecting the first five variables from our seriated vivid matrix.We also chose to display 50 ICE curves.As with pdpVars, additional arguments specify the color palette and number of ice curves, while gridSize and nmax determine the number of predictions.
For our GPDP, we follow the general design choices in vivid and specify the range of predicted values to be mapped to the colors via the fitlims argument.We set the default fit range for the color map for the GPDP to the range of the collection of PDP surfaces with fitlims = "pdp".The setting of this argument at its default value allows for maximum resolution of the bivariate PDPs.Since predictions for specific observations and ICE curves would likely exceed these bounds, the closest value within the color map's bounds is used to allocate colors.Alternatively fitlims = all" specifies that limits are calculated as the full range of predictions shown.
In the upper diagonals we exclude extrapolated areas from the bivariate PDPs to prevent interpretation of the PDPs in areas where there is no data.The removal of extrapolated areas can be prevented by specifying convexHull = FALSE.
In Figure 5, in addition to the univariate PDPs, we capture the effects of the variables on the response via the bivariate PDP on the upper-diagonal and the distribution of the data in the lowerdiagonal.The scatterplots are useful for determining if variables are highly correlated, as highly correlated variables may spuriously affect the partial dependence and give erroneous results (Apley and Zhu 2020).Of note are the variables lstat and rm.We can clearly see that when the number of rooms (rm) is high and the percentage of lower status of the population (lstat) is low, the predicted ŷ median house price value is high.This is exemplified in the changing bivariate PDP.From both the univariate and bivariate PDPs, we can see that lstat and rm have an impact on the response.As lstat decreases and rm increases, predicted median house price value goes up.The bivariate PDP of lstat : nox shows that as nox increases, the predicted value decreases.

Zen Partial Dependence Plots
The pdpZen function creates partial dependence plots utilizing a space-saving method based on graph Eulerians (Hierholzer and Wiener 1873).An Eulerian path, also known as an Eulerian trail, is a route that traverses each edge of a graph exactly once.When this path forms a closed loop, the traversal is referred to as an Eulerian tour.We call this display zen-partial dependence plots (ZPDP).The display is based on the zigzag expanded navigation plots, known as zenplots , available in the zenplots package (Hofert and Oldford 2020).Zenplots were created to display paired graphs of high-dimensional data focusing on the most important 2D displays.In our adaptation we show bivariate PDPs that focus on the variables with the largest interaction values in a compact zigzag layout, which is helpful when predictor space is high-dimensional.
The code below illustrates pdpZen, here displaying the first five variables from GBM's vivid matrix.Later we show an example focusing on high-interacting pairs of variables.We use the same convention as our previous PDPs with regard to color palette and limits, grid size, and the number of rows considered for evaluation.The ZPDP also has a variable rug plot on each axis to avoid interpretation problems that may occur in the presence of skewness.
pdpZen(data = Boston, fit = gbst, response = "medv", convexHull = TRUE, zpath = top5, predictFun = pFun) The argument zpath specifies the variables to be plotted, defaulting to all dataset variables aside from the response.In the code above, zpath is the vector lstat, nox, rm, crim and dis.The resulting plot shown in Figure 6 presents the bivariate PDP for every consecutive pair of variables in a zigzag layout.

Zen-paths
ZPDP are most useful when the bivariate PDPs plotted are selected to be an interesting subset of all pairwise plots.To obtain this subset, we consider a network graph displaying VIVI values, such as that in Figure 3  The zPath function takes four arguments.These are: viv -a matrix of interaction values, cutoffexclude interaction values below this threshold, method -a string indicating which method to use to create the path, and connect -a logical value indicating if separate Eulerians should be connected Two methods are provided, either "greedy.weighted"or "strictly.weighted".The first option uses the greedy Eulerian path algorithm (C.B. Hurley andOldford 2011, 2022) for connected graphs.This visits each edge at least once, beginning at the edge with the highest weight and traversing through the remaining edges, giving priority to the highest-weighted edge.Some edges may be visited more than once or additional edges may be visited if the number of nodes in the graph is not even.The second method "strictly.weighted"(provided by zenplot) visits edges strictly in decreasing order by weight (here the interaction values).If connect is TRUE the sequences obtained by the strictly weighted method are concatenated to form a single path.
response = "medv", zpath = zpGw, convexHull = TRUE, predictFun = pFun) pdpZen(data = Boston, fit = gbst, response = "medv", zpath = zpSw, convexHull = TRUE, predictFun = pFun) Note that there are 7 different variables involved in high interactions, which could be displayed in a 7 × 7 GPDP, showing a total of 21 bivariate PDPs.But only 8 of these have VInt values above the 90% quantile, and Figure 7(b) using the strictly.weightedpath shows just these bivariate PDPs compact layout.Using the greedy.weightedsorting method in (a) produces a smaller, neater plot but at the expense of including some plots that are not particularly interesting (for example the pair dis : ptratio).However, it should be noted that any of the arguments from the zenplot function from the zenplots package can be used with the pdpZen function.These include multiple options for different or custom layouts.

Summary
We have presented a detailed exposition of our R package vivid which contains a suite of integrated functions implementing algorithms and novel visualizations for exploring variable importance and variable interactions in machine learning models.Our techniques are intuitive, adaptable, easy to customize and facilitate model comparison.When building the vivid matrix to use in our heatmap and network visualizations, VIVI metrics that are model specific or model-agnostic may be employed.
For measuring interactions we currently only provide the option to use the agnostic Friedman's H-statistic.However, as outlined in the Calculating VIVI section, the inclusion of different VIVI measures is easily possible.
Our vivid package is a useful addition to the other packages in the area of model visualization, such as those discussed in the Introduction section.Our heatmap and network plots efficiently determine which variables have the greatest impact on the response.When coupled with the seriation, filtering, and clustering techniques, these visualizations enhance the interpretation of ML predictions.Our GPDP and ZPDP can be used to provide a thorough examination of the behavior of a fitted ML model by examining the individual variable effects and their pairwise interactions.These plots combine the bivariate PDP, ICE curves, and scatterplots of the raw variable values.They further allow focusing on subsets of variables with high VInt, and so allow us to efficiently explore a fitted ML model by focusing attention to only the most crucial aspects.
It has been noted that the presence of correlated variables can lead to biased VIVI measures.For example, Friedman and Popescu (2008) note that when subsets of variables are highly correlated, it becomes difficult to distinguish between low and higher order interactions among them in a straightforward manner when using the H-statistic.Similarly, Fisher, Rudin, and Dominici (2019) state that for permutation variable importance, the presence of highly correlated variables can affect the measured variable importance and propose using a method of conditional importance, such as that implemented for conditional random forests in the party package.Our GPDP can be used to identify any evident correlations by observing the plot of the data and the PDP with convex hull.However, careful consideration should be employed when interpreting the VIVI measures from vivid and we recommend evaluating any potential correlation between variables in conjunction with our proposed visualizations.Several R packages are available for assessing and visualising correlations, such as corrplot (Wei and Simko 2021) and corrgrapher (Morgen and Biecek 2020).
For future work, the inclusion of other model summaries could be incorporated into vivid, such as the interaction statistics described in Greenwell, Boehmke, and McCarthy (2018) or the use of accumulated local effects (ALE) of Apley and Zhu (2020).This latter method was created to address bias problems with partial dependency functions and could be used in place of the bivariate PDPs seen in both the GPDP and ZPDP.However the calculation of an agnostic, easily interpretable variable interaction measure that accounts for correlated variables remains an ongoing research goal.Allowing a user to view the variation across replicates when performing permutation importance could also be incorporated.This could be added into the heatmap graphic by way of a value suppressing uncertainty palette (Correll, Moritz, and Heer 2018), where the uncertainty in included in the visualization (for example, see Inglis, Parnell, and Hurley (2022b)).Additionally, providing alternative metrics such as AUC or accuracy for computation of permutation importance in classification, as well as computations and visualizations of higher order interactions, interactive graphics and incorporating multivariate response variables are interesting areas for future research.

Figure 1 :
Figure 1: Mean time over five runs, on two MacBooks, for the creation of a vivid matrix for different models.Times are highly dependent on the model fit, with NN the fastest and random forests the slowest.

Figure 3 :Figure 4 :
Figure 3: Network plots showing VIVI scores obtained from a GBM fit on the Boston housing data.In (a) we display the all values in a circle.In (b) we use a hierarchical clustering to group variable with high VIVI together and rearrange the layout using an igraph function.
All of our PDP variants handle categorical responses and predictors.The color palette is customized via the pal argument.In all of our PDPs, this defaults to a diverging palette which accentuates fitted values that differ from the average.Dark red and dark blue are used to indicate high and low values of ŷ respectively.The middle values are displayed in yellow.The nIce argument specifiesThe R Journal Vol.XX/YY, AAAA 20ZZISSN 2073-4859

TheFigure 5 :
Figure5: Filtered generalized pairs partial dependence plot for a GBM fit on the Boston housing data.From both the univariate and bivariate PDPs, we can see that lstat and rm have an impact on the response.As lstat decreases and rm increases, predicted median house price value goes up.The bivariate PDP of lstat : nox shows that as nox increases, the predicted value decreases.

Figure 6 :
Figure 6: Zen partial dependence plot for the GBM fit on the Boston data.Here we display first five variables from the GBM's 'vivid' matrix.Only plots for consecutive variables are shown.

Table 1 :
Summary of a selection of R packages that can be used to assess the variable importance, variable interactions, or partial dependence and if these metrics are global or local and model-specific or model-agnostic.A brief description of available visualizations for evaluating model behavior is also provided.

Table 2 :
Summary of functions available in the vivid package.The main construction function is vivi which is used to calculate the VIVI values for subsequent use in the visualizations.