SortedEffects: Sorted Causal Effects in R

Chernozhukov et al. (2018) proposed the sorted effect method for nonlinear regression models. This method consists of reporting percentiles of the partial effects, the sorted effects, in addition to the average effect commonly used to summarize the heterogeneity in the partial effects. They also propose to use the sorted effects to carry out classiﬁcation analysis where the observational units are classiﬁed as most and least affected if their partial effect are above or below some tail sorted effects. The R package SortedEffects implements the estimation and inference methods therein and provides tools to visualize the results. This vignette serves as an introduction to the package and displays basic functionality of the functions within.


The sorted effects method
Many empirical questions in econometrics, machine learning and statistics boil down to studying how changes in a key variable T affect an outcome variable of interest Y, holding fixed some control variables W. Such effects are called predictive partial effects (PEs), or treatment or structural partial effects when they have a causal or structural interpretation.Depending on the context, researchers often work with models that feature nonlinearity in the key variable of interest or nonlinearity in parameters.The first kind of models includes mean and quantile regressions in which T is interacted with W, while the second type includes generalized linear models such as logit and probit.The methods implemented in SortedEffects are designed to estimate and make inference on PEs in nonlinear models.
In nonlinear models the PEs vary with the underlying control variables.Consider the probit model as an example.The conditional probability of {Y = 1} is where Φ denotes CDF of standard normal distribution.Then, the PE of changing T on Y, holding W fixed at w, is Φ β + w γ − Φ w γ when T is binary with values 0 and 1, or φ tβ + w γ β, φ (u) = ∂Φ (u) /∂u, when T is continuous.Since W typically differ among observational units, the PEs are heterogeneous.
More generally, suppose we have a regression function g (X) corresponding to some characteristic of Y conditional on the covariates X := [T, W].Let ∆ (x) denote the PE of T on Y, holding W fixed.Then, ∆ (x) = g (t 1 , w) − g (t 0 , w) if T is discrete and takes values t 0 and t 1 , or ∆ (x) = ∂g (t, w) /∂t if T is continuous and t → g (t, w) is differentiable.As ∆ (x) is a function of x, the PE of an observa- tional unit, ∆ (X), is a random variable with a distribution induced by the distribution of X.A popular statistic to summarize the heterogeneity of ∆ (X) is the average partial effect (APE): where µ denotes the distribution of X in the population of interest.However, the APE might provide an incomplete summary of ∆ (X) as it neglects all the heterogeneity by design.Chernozhukov et al. (2018) proposed the sorted partial effect (SPE) method to provide a more complete summary of ∆ (X).This method consists on reporting the entire set of PEs, sorted in increasing order and indexed by a user-specified ranking in the population of interest.More specifically, the SPEs are defined as percentiles of the PE in the population of interest, that is The SPEs are also useful to conduct classification analysis (CA).This analysis consists of two steps.First, classify the observational units with PEs above or below some thresholds defined by tail SPEs in the most or least affected groups.Second, report and compare the moments or distributions of the The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 outcome and covariates in the two groups.
To apply the methods in practice, we need to replace ∆ and µ by sample analogs ∆ and μ, and quantify the sampling uncertainty.Chernozhukov et al. (2018) derived the theoretical underpinnings of the resulting empirical SPEs.We refer the interested reader to that paper for details.We expect the method to be helpful to a large audience.For example, medical researchers are often interested in estimating the treatment effect of a drug.Treated units can experience different effects due to individual characteristics such as age and health status.The SortedEffects package thus provides a way to visualize the heterogeneity in the effect.Researchers can also define the most and least affected groups and compare their characteristics.
The rest of the vignette proceeds as follows.We first introduce the main functions within the SortedEffect package and use an application to racial-based discrimination in mortgage lending to illustrate the command options.Then we provide another application to gender wage gap using the Current Population Survey (CPS) data.The empirical results show that the SPE method is effective in uncovering heterogeneous effects and demographic differences between the most and least affected groups in both applications.

Functions in the package
The package contains three main commands: spe, ca, and subpop.The package adds three methods to the generic plot() (plot.spe,plot.ca and plot.subpop) and summary() (summary.spe,summary.caand summary.subpop).In practice, users only need to type plot and summary since the generic commands will automatically dispatch the appropriate method.In this section we explain the options in each function respectively.Lastly we briefly explain the bootstrap procedures for inference and bias correction.We provide the mathematical expressions and an application to racial discrimation in mortgage lending to facilitate the understanding of the options of the commands.
The option samp_weight allows the user to input sampling weights.When samp_weight = NULL, the default, the package automatically uses a vector of ones, i.e. no sampling weights are used.
The package provides three options for the variable of interest var: 1. var_type = "binary" if var is a binary variable such as a black or female indicator.
2. var_type = "categorical" if var is a categorical (factor) variable with more than 2 levels.One example is means of transportation with labels "bus", "train" or "bike".In this case, the user needs to further specify the two labels to be compared with the option compare.If the user is interested in the effect of changing from bus to bike, then compare = c("bus","bike").If the data only have levels, say "1","2" and "3", the users need to input levels instead.
3. var_type = "continuous" if var is a continuous variable.The package obtains the PEs using a central difference numerical derivative of the form: where h is set to be 1e-7.We avoid the symbolic derivative because it cannot correctly interpret terms involving I() for transformations of the variables such as powers, and can therefore cause erroneous estimation.The code of this part is inspired by Thomas Leeper's vignette on the margins package (Leeper et al., 2018).
The option subgroup allows the user to specify the population of interest.The default is NULL, which corresponds to the entire population.If the user is interested in subpopulations, say households whose income is lower than 10, 000 dollars, and the data in use is called Data, then the user can set subgroup = Data[,"income"] <10000.When T is a binary treatment indicator, the user can specify that the population of interest is the treated population with subgroup = Data[,var] == 1.Note that users cannot input subgroup directly to the data option or using the subset option because the SPE methods use the whole sample to estimate the PEs.The option subgroup only specifies the population for the estimation of µ, the distribution of X.
The option us specifies the set of quantile indexes corresponding to the estimated SPE to be reported.The mathematical definition of each empirical SPE is The empirical SPE function then outputs the SPEs with quantile indexes in the set U .The option us specifies U .For example, us = c(0.25,0.5,0.75)specifies to report the SPEs corresponding to the three quartiles.
The option alpha specifies the significance level of the confidence bands.The default is alpha = 0.1, i.e. 90% confidence level.The option b specifies the number of bootstrap repetitions.The default is b = 500.
The package supports two types of bootstrap: nonparametric (boot_type = "nonpar") and weighted with standard exponential weights (boot_type = "weighted").The user can fix the random seed for bootstrap simulation with the option seed and decide whether or not to bias-correct the estimates with the option bc.The default is bc = TRUE.Bootstrap and bias-correction will be discussed in section 2.5.The package features parallel computing, which is convenient to speed up the bootstrap.The user can use the option parallel to turn on or off parallel computing.If parallel = TRUE, the option ncores allows the user to specify the number of cores in the parallel computing.The default is ncores = detectCores(), where detectCores() is a command of the package parallel.
The output of spe is a list containing four components: spe, ape, us and alpha.As the names indicate, spe stores the results for the SPE, and ape stores the results for the APE.Each component is a list with four elements: the estimates, lower and upper bounds of the uniform confidence bands, and bootstrapped standard errors obtained as rescaled interquartile ranges; see Chernozhukov et al. (2018).The other two components respectively store the percentile indices of the SPE and significance level of the confidence bands, which are used in the functions plot.spe and summary.spe.
The plot.spe function plots the result of spe in one graph that includes the SPEs, APE and their corresponding confidence bands.Its general syntax is plot.spe(object,ylim = NULL, main = NULL, sub = NULL, xlab = "Percentile Index", ylab = "Sorted Effects", ...) where object is the output of spe.The range of the x-axis is fixed to be the range of user-specified quantile index us.The options ylim, xlab and ylab respectively denote range of the y-axis and labels of the two axes.The options main and sub allow users to specify the main and sub titles of the plot.The option ... is an argument of the generic plot command that allows for further graphic parameters.
The syntax of summary.spe is as follows summary.spe(object,result = c("sorted", "average"), ...) If result = "sorted", the method provides a table that contain the SPE percentile indexes, estimates, bootstrap standard errors, pointwise and uniform confidence bands.If result = "average", the methods tabulates APE estimate, bootstrap standard error and confidence interval.
We illustrate the usage of the command with an empirical application to racial discrimination in mortgage lending.We use data on mortgage applications in Boston from 1990 (Munnell et al., 1996).The Federal Reserve Bank of Boston collected these data in relation to the Home Mortgage Disclosure Act (HMDA), which was passed to monitor minority access to the mortgage market.To retrieve the data from the package, issue the command data("mortgage") The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 The outcome variable, Y, is deny, a binary indicator for mortgage denial.The key variable of interest, T, is black, a binary indicator for the applicant being black, while the control variables, W, are financial and demographical characteristics that might affect the mortgage decision of the bank.These characteristics include the debt-to-income ratio (p_irat), expenses-to-income ratio (hse_inc), bad consumer credit (ccred), bad mortgage credit (mcred), credit problems (pubrec), denied mortgage insurance (denpmi), medium loan-to-house value (ltv_med), high loan-to-house value (ltv_high), self employed (selfemp), single (single), and high school graduate (hischl).The regression formula for the estimation of the PEs is specified as: fm <-deny ~black + p_irat + hse_inc + ccred + mcred + pubrec + ltv_med + ltv_high + denpmi + selfemp + single + hischl We invoke the spe command to calculate the bias-corrected estimates of the SPE at the quantile indexes {0.02, 0.03, . . . , 0.98} for the entire population using a logit model.We can also tabulate the result using summary.First, we apply the command to the APE and display the results in Table (1).
summary(test, result = "average") The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 confidence bands, whereas the columns labelled as ULB and UUB correspond to their uniform counterparts.
(t) denote the objects of interest in the least and most affected groups for the CA.Define These objects are indexed by the vector t, which specifies the variables of interest among the outcome and covariates.The option t is a vector that specifies t.Suppose the data has 5 variables ("a","b","c","d","e") and we are interested in "a" and "c", then we can either set t = c("a","c") directly or t = c(1,0,1,0,0).The second approach requires the user to know the order of the variables in the data set, which can be found with the command View.
Let Z denote the set of variables of interest.The package provides two types of objects of interest.If interest = "moment", then Λ u ∆, µ (t) include the means of the variables in Z for the least and most affected groups.If interest = "dist", then Λ u ∆, µ (t) includes the distributions of the variables in Z for the least and most affected groups.
If interest = "moment", ca estimates and makes inference on features of the chosen variables of interest in the least and most affected groups.These features are specified with the option cl.For example, if cl = "both", the command estimates the mean of the variables in the two affected groups.
The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 On the other hand, cl = "diff" estimates the differences of the means of the variables between the two groups.
If interest = "dist", the option range_cb specifies the region of interest for the domain of the distribution 2 .For example, if the variable of interest x is discrete, we can specify the region of interest as the support of x with range_cb = NULL.If x is continuous, we can specify range_cb = c(1:99)/100 if we are interested in the percentiles with indexes {0.01, 0.03, . . . , 0.99}.The default is range_cb = c(1:99)/100.The choice range_cb = NULL shuts down this feature by settimg the region of interest as all the distint values of the variable.
The output of ca depends on the choice of interest.For interest = "moment", the output is a list containing the estimates of Λ u ∆, µ (t), bootstrapped standard errors, pointwise and adjusted p-values.The null hypothesis for the p-values is that the estimated coefficient is zero.The p-values are adjusted for multiplicity to account for joint testing for all variables.In addition, users can adjust the pointwise p-values to account for joint testing for all simulataneous tests of categories within a factor.For example, if the variables of interest include a marital status factor "ms" with labels ("nevermarried","married","divorced","separated","widowed"), then users could consider adjusting the pointwise p-values within this factor.To illustrate how to define the option cat, suppose we have selected specified 3 variables of interest: t = c("a","b","c").Without loss of generality, assume "a" is not a factor, while "b" and "c" are two factors.Then we need to specify cat as cat = c("b","c").If cat = NULL, we report the unadjusted pointwise p-values.If interest = "dist", the output is a list containing the rearranged estimates, upper confidence bands and lower confidence bands for the variables of interest in both groups.
When interest = "moment", the user can use method summary.ca to tabulate the output.The general syntax is summary.ca(object,...) If cl = "both", the p-values are omitted from the table.When interest = "dist", users can plot the output for better visualization.The general syntax is plot.ca(object,var, main = NULL, sub = NULL, xlab = NULL, ylab = NULL, ...) The user needs to input the variable for plotting with the option var.Note that the variable must be one of the variables specified in t.
CA <-ca(fm = fm, data = mortgage, var = "black", method = "logit", cl = "both", t = t, b = 500, bc = TRUE) summary(CA) Table (3) shows that the 10% of the applicants most affected by the racial mortgage denial gap are more likely to have either of the following characteristics relative to the 10% of the least affected applicants: mortgage denied, high debt-to-income ratio, black, high expense-to-income ratio, bad consumer or credit scores, credit problems, self employed, single, no high school diploma, and medium or high loan-to-income ratio.
Next we test if the differences in the characteristics between the two groups are statistically significant.To do so we set cl = "diff", which means taking difference between the two groups.The full command is as follows CAdiff <-ca(fm = fm, data = mortgage, var = "black", t = t, method = "logit", cl = "diff", b = 500, bc = TRUE) summary(CAdiff)  4) shows the results.The joint p-values account for the fact that we conduct simultaneous inference on 13 differences of variables.We employ the so-called "single-step" methods for controlling the family-wise error rate and obtain the p-values by bootstrap.We find that 8 differences are jointly statistically different from zero at the 5% level and 9 at the 10% level.We also plot the distributions of monthly debt-to-income ratio (p_irat) and monthly housing expenses-to-income ratio (hse_inc) for both groups.Such plots are useful if the user wants to visualize if there is stochastic dominance between the two groups.To do so we use the ca command and change the interest to "dist".t2 <-c("p_irat", "hse_inc") CAdist <-ca(fm = fm, data = mortgage, var = "black", method = "logit", t = t2, b = 500, interest = "dist") plot(CAdist, var = "p_irat", ylab = "Prob", xlab = "Monthly Debt-to-Income Ratio", sub = "logit model") plot(CAdist, var = "hse_inc", ylab = "Prob", xlab = "Monthly Housing Expenses-to-Income Ratio", sub = "logit model")

subpop
In addition to means and distributions, we can conduct inference on the sets of most and least affected units.Let Z be a compact subset of the support of the outcome and covariates.Define The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 as the set of the least affected units and as the set of the most affected units.The command subpop provides estimation and inference methods for these sets.The general syntax is subpop(fm, data, method = c("ols", "logit", "probit", "QR"), var_type = c("binary", "continuous", "categorical"), var, compare, subgroup = NULL, samp_weight = NULL, taus = c(5:95)/100, u = 0.1, alpha = 0.1, b = 500, seed = 1, parallel = FALSE, ncores = detectCores(), boot_type = c("nonpar", "weighted")) No new option is introduced in the command.For theoretical details, we refer the reader to Chernozhukov et al. (2015).
The output of subpop is a list containing six components: cs_most, cs_least, u, subgroup, most and least.As the names indicate, cs_most and cs_least denote the confidence sets for the most and least affected groups.u stores the percentile index that defines the most and least affected groups.subgroup stores the indicators for the population of interest specified with the option subgroup.most and least store the estimates of the most and least affected units respectively.The first four components are used in plot.subpop and the last two components can be visualized with summary.subpop.
The general syntax of summary.subpop is summary.subpop(object,affected = c("most", "least"), vars = NULL, ...) The option object is the output of subpop.The option affected allows users to tabulate either the most or the least affected units, and the option vars provides summary statistics for user-specified variables of interest.The summary statistics include the minimum, 1st quartile, median, mean, 3rd quaritle and maximum.The default is NULL, which produces summary statistics of all the variables.The plot.subpop function plots 2-dimensional projections of the confidence sets for the most and least affected units with respect to two variables.The general syntax is plot.subpop(object,varx, vary, xlim = NULL, ylim = NULL, main = NULL, sub = NULL, xlab = NULL, ylab = NULL, overlap = FALSE, ...) The user needs to specify the two variables for the projection with varx and vary, and object should be specified as the output of subpop.The option overlap allows users to either keep or drop common observations in both confidence sets.The default is overlap = FALSE, which drops the observations.
We estimate the 10% most and least affected applicants in the mortgage application.
set_b <-subpop(fm, data = mortgage, method = "logit", var = "black", u = 0.1, alpha = 0.1, b = 500) Using summary, we can estimate the most/least affected applicants and report summary statistics of the variables of interest in the most/least affected groups.Table 5 lists the estimated most affected applicants.For the purpose of illustration we only show the first ten rows and columns.
groups <-summary(set_b, vars = c("p_irat", "hse_inc")) most_affected <-groups$most_affected The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 We can also report summary statistics of the variables of interest in the most and least affected groups using the output of summary.sum_stats_most <-groups$stats_most We finally plot the projection of the confidence sets for the most and least affected applicants with respect to p_irat and hse_inc.The package features nonparametric and weighted bootstrap.When boot_type = "nonpar", the package draws samples with replacement of the variables and samp_weight and run all estimation commands weighted by samp_weight.When boot_type = "weighted", the package draws weights from the standard exponential distribution and runs all estimation commands weighted by the product of these weights and samp_weight.We use the boot package (Canty and Ripley, 2017), which is flexible enough to accommodate both types.
Inference on CA The joint p-value for the hypothesis where c = (−1, 1) , tu (T ) is a bootstrap estimator of and Σu (t) is a bootstrap estimator of Σ u (t), the asymptotic variance of Λ u ∆, µ (t).5 The poitwise p-value for the hypothesis Λ +u ∆,µ (t) = Λ −u ∆,µ (t) is obtained by setting T = {t}.Inference on sets of least/most affected units The outer (1 where Σ (x, u) is an estimator of the asymptotic variance of ∆ (x) The critical value ĉ (1 − α) is the (1 − α)-quantile of the statistic: while the critical value c (1 − α) is the (1 − α)-quantile of the statistic: To implement sup x ∈ X : ∆ (x) = ∆ * µ (u) in the code, we find the minimum of | ∆ (x) − ∆ * µ (u) | among all x's.

Gender wage gap application
We analyze the gender wage gap using data from the U.S. March Supplement of the Current Population Survey (CPS) in 2015.The gender wage gap measures the difference in wages between female and male workers with the same observable characteristics.The SPE method allows us to look for heterogeneity in the gender wage gap and to identify the characteristics of the most and least affected workers.To retrieve the data, issue the command data(wage2015) The data contain the following variables: log hourly wages (lnw); a marital status factor ms with 5 categories widowed,divorced,separated,nevermarried,married; CPS sampling weights (weight); a indicator for female worker (female); an education attainment factor educ with 5 categories lhs (less than high school graduate), hsg (high school graduate), sc (some college), cg (college) and ad (advanced degree); a region factor region with 4 categories mw (midwest), so (south), we (west) and ne (northeast); potential work experience exp1 computed as max {0, age -years of educ − 7}; 4 powers of experience (exp2,exp3,exp4); an occupation factor occ with 5 categories manager, service, sales, construction and production, and an industry factor ind with 12 categories minery, construction, manufacture, retail, transport, information, finance, professional, education, leisure, services and public.
The CPS data contains sampling weights in the variable weight, so we will set samp_weight = wage2015$weight.Because women in general earn less than men, the PEs are predominately negative if we use the female indicator.To facilitate the interpretation of most and least affected groups we create an indicator called male, which assigns 0 to female workers instead.
cat <-c("ms", "educ", "region", "occ", "ind") Then we issue the ca command and tabulate the mean characteristics of the two groups Char <-ca(fm = fmla1, data = wage2015, samp_weight = wage2015$weight, var = "male", t = tw, cl = "both", b = 500, subgroup = wage2015[,"female"] == 1, boot_type = "weighted", bc = FALSE, u = 0.05) The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 Table ( 7) reports the estimates and standard errors obtained by weighted bootstrap with 500 replications.We find that, compared to the 5% least affected women, the 5% most affected women are much more likely to be married, much less likely to be never married, less likely to have an advanced degree, live in the South, don't live in Northeast and West, possess more potential experience, are more likely to have sales and non-managerial occupations, and work more often in manufacture, retail, transport and finance, and less often in education and leisure industries.

Estimate
Figure ( 6) shows that there are relatively more least affected women with low experience at all wage levels, more high affected women with high wages with between 15 and 45 years of experience, and more least affected women which are not married at all experience levels.USA ivanf@bu.edu

Figure 1 :
Figure 1: Output of plot.spefunction.Estimates and 90% confidence bands for the APE and SPE of black indicator on probability of mortgage denial.Confidence bands obtained by bootstrap with 500 replications.

Figure ( 2 )
Figure (2)  shows that for both variables the distribution in the most affected group first-order stochastically dominates the distribution in least affected group.

Figure 2 :
Figure 2: Output of plot.ca.Bias-corrected estimates and 90% confidence bands for the distribution of p_irat and hse_inc in the 10% most and least affected groups by the racial mortgage denial gap.Confidence bands obtained by bootstrap with 500 replications.
Figure (3) keeps the overlapped observations and shows that the most affected applicants tend to have higher levels of debt to income and expenses to income ratios.plot(set_b, varx = mortgage$p_irat, vary = mortgage$hse_inc, xlim = c(0, 1.5), ylim = c(0, 1.5), xlab = "Debt/Income", ylab = "Housing expenses/Income", overlap = TRUE) Inference Chernozhukov et al. (2018) derived the asymptotic distributions and bootstrap validity for the estimators of the SPE and classification analysis.The package uses bootstrap to compute standard errors and critical values for tests and confidence bands.

Figure 3 :
Figure 3: Projection of the 90% confidence sets for the groups of the 10% most and least affected applicants by the racial mortgage denial gap with respect to p_irat and hse_inc.Confidence regions obtained by bootstrap with 500 replications.

Figure 4 :
Figure 4: Estimates and 90% confidence bands for the APE and SPE of the gender wage gap for women.Confidence bands obtained by weighted bootstrap with 500 replications.

Figure 5 :
Figure 5: Estimates and 90% confidence bands for the APE and SPE of the gender wage gap for married and never married women.Confidence bands obtained by weighted bootstrap with 500 replications.

Table 1 :
Bias-corrected estimate and 90% band for APE of black indicator on probability of mortgage denial.
Next, we obtain the results for the SPE.To save space we only show the first 15 rows of summary(test) in Table(2).The columns labelled as PLB and PUB correspond to the lower and upper pointwise

Table 2 :
Bias-corrected estimates and 90% pointwise and uniform bands for SPE of black indicator on probability of mortgage denial.
caThe command ca provides estimation and inference methods for the CA.The general syntax is:

Table 3 :
Bias-corrected estimates and standard errors for the average characteristics of the groups with the 10% most and least effected applicants by the racial mortgage denial gap.

Table 4 :
Bias-corrected estimates, standard errors, and p-values for the differences in average characteristics between the groups with the 10% most and least affected applicants by the racial mortgage denial gap.P-values are for the hypothesis that the difference is zero.P-vals are pointwise p-values for each difference and JP-vals are joint p-values that account for simultaneous inference on all the differences.

Table 5 :
Elements of the estimated group of the 10% most affected applicants by racial mortgage denial gap.
Table (6) reports summary statistics for p_irat and hse_inc for the applicants in the most affected group.

Table 6 :
Summary statistics of the estimated group of the 10% most affected applicants by racial mortgage denial gap.
Nonlinear estimators are prone to finite-sample bias, and bootstrap methods can estimate the bias up to some asymptotic order.To bias correct the SPE, replace ∆ * µ with 2 ∆ * µ − ∆ * µ , where ∆ * µ is the mean of bootstrap draws.Similarly, for CA we replace Λ u is the mean of the bootstrap draws.Bias-corrected estimates and corresponding inference will be reported if bc = TRUE, the default.