This article describes an R package bqror that estimates Bayesian quantile regression in ordinal models introduced in Rahman (2016). The paper classifies ordinal models into two types and offers computationally efficient yet simple Markov chain Monte Carlo (MCMC) algorithms for estimating ordinal quantile regression. The generic ordinal model with 3 or more outcomes (labeled ORI model) is estimated by a combination of Gibbs sampling and Metropolis-Hastings algorithm, whereas an ordinal model with exactly 3 outcomes (labeled ORII model) is estimated using a Gibbs sampling algorithm only. In line with the Bayesian literature, we suggest using the marginal likelihood for comparing alternative quantile regression models and explain how to compute it. The models and their estimation procedures are illustrated via multiple simulation studies and implemented in two applications. The article also describes several functions contained within the bqror package, and illustrates their usage for estimation, inference, and assessing model fit.
Quantile regression defines the conditional quantiles of a continuous dependent variable as a function of the covariates without assuming any error distribution (Koenker and Bassett 1978). The method is robust and offers several advantages over least squares regression such as desirable equivariance properties, invariance to monotone transformation of the dependent variable, and robustness against outliers (Koenker 2005; Davino et al. 2014; Furno and Vistocco 2018). However, quantile regression with discrete outcomes is more complex because quantiles of discrete data cannot be obtained through a simple inverse operation of the cumulative distribution function (cdf). Besides, discrete outcome (binary and ordinal) modeling requires location and scale restrictions to uniquely identify the parameters (see Section 2 for details). Kordas (2006) estimated quantile regression with binary outcomes using simulated annealing, while Benoit and Van den Poel (2010) proposed Bayesian binary quantile regression where a working likelihood for the latent variable was constructed by assuming the error follows an asymmetric Laplace (AL) distribution (Yu and Moyeed 2001). The estimation procedure for the latter is available in the bayesQR package of R software (Benoit and Van den Poel 2017). A couple of recent works on Bayesian quantile regression with binary longitudinal (panel) outcomes are Rahman and Vossmeyer (2019) and Bresson et al. (2021). Extending the quantile framework to ordinal outcomes is more intricate due to the difficulty in satisfying the ordering of cut-points while sampling. Rahman (2016) introduced Bayesian quantile analysis of ordinal data and proposed two efficient MCMC algorithms. Since Rahman (2016), ordinal quantile regression has attracted some attention, such as in Alhamzawi (2016), Alhamzawi and Ali (2018), Ghasemzadeh et al. (2018), Rahman and Karnawat (2019), Ghasemzadeh et al. (2020), and Tian et al. (2021).
Ordinal outcomes occur in a wide class of applications in economics, finance, marketing, and the social sciences. Here, ordinal regression (e.g. ordinal probit, ordinal logit) is an important tool for modeling, analysis, and inference. Given the prevalence of ordinal models in applications and the recent theoretical developments surrounding ordinal quantile regression, an estimation package is essential so that applied econometricians and statisticians can benefit from a more comprehensive data analysis. At present, no statistical software (such as R, MATLAB, Python, Stata, SPSS, and SAS) have any package for estimating quantile regression with ordinal outcomes. The current paper fills this gap in the literature and describes the implementation of bqror package (version 1.6.0) for estimation and inference in Bayesian ordinal quantile regression.
The bqror package offers two MCMC algorithms. Ordinal model with 3 or more outcomes is estimated through a combination of Gibbs sampling (Casella and George 1992) and Metropolis-Hastings (MH) algorithm (Chib and Greenberg 1995). The method is implemented in the function quantregOR1
. For ordinal models with exactly 3 outcomes, the package presents a Gibbs sampling algorithm that is implemented in the function quantregOR2
. We recommend using this procedure for an ordinal model with 3 outcomes, since its simpler and faster. Both functions, quantregOR1
and quantregOR2
, report typical posterior summaries such as the mean, standard deviation, 95% credible interval, and inefficiency factor of the model parameters. To compare alternative quantile regression models, we recommend using the marginal likelihood over the deviance information criterion (DIC). This is because the “Bayesian approach” to compare models is via the marginal likelihood (Chib 1995; Chib and Jeliazkov 2001). So, the bqror package also provides functions for computing the marginal likelihood. Additionally, the package includes functions for calculating the covariate effects and example codes to produce trace plots of MCMC draws. Lastly, this paper uses the bqror package to demonstrate the estimation of quantile ordinal models on simulated data and real-life applications.
Ordinal outcomes are common in a wide class of applications in economics, finance, marketing, social sciences, statistics in medicine, and transportation. In a typical study, the observed outcomes are ordered and categorical; so for the purpose of analysis scores/numbers are assigned to each outcome level. For example, in a study on public opinion about offshore drilling (Mukherjee and Rahman 2016), responses may be recorded as follows: 1 for ‘strongly oppose’, 2 for ‘somewhat oppose’, 3 for ‘somewhat support’, and 4 for ‘strongly support’. The numbers have an ordinal meaning but have no cardinal interpretation. We cannot interpret a score of 2 as twice the support compared to a score of 1, or the difference in support between 2 and 3 is the same as that between 3 and 4. With ordinal outcomes, the primary modeling objective is to express the probability of outcomes as a function of the covariates. Ordinal models that have been extensively studied and employed in applications include the ordinal probit and ordinal logit models (Johnson and Albert 2000; Greene and Hensher 2010), but they only give information about the average probability of outcomes conditional on the covariates.
Quantile regression with ordinal outcomes can be estimated using the monotone equivariance property and provides information on the probability of outcomes at different quantiles. In the spirit of Albert and Chib (1993), the ordinal quantile regression model can be presented in terms of an underlying latent (or unobserved) variable zi as follows:
zi=x′iβp+ϵi,∀i=1,⋯,n,
Working directly with the AL likelihood (3) is not convenient for MCMC sampling. Therefore, the latent formulation of the ordinal quantile model (1), following Kozumi and Kobayashi (2011), is expressed in the normal-exponential mixture form as follows,
zi=x′iβp+θwi+τ√wiui,∀i=1,⋯,n,
The term “ORI model” describes an ordinal model in which the number of outcomes (J) is equal to or greater than 3, location restriction is imposed by setting γp,1=0, and scale restriction is achieved via constant variance (for a given value of p, variance of a standard AL distribution is constant; see Rahman (2016)). Note that in contrast to Rahman (2016), our definition of ORI model includes an ordinal model with exactly 3 outcomes. The location and scale restrictions are necessary to uniquely identify the parameters (see Jeliazkov et al. (2008), and Jeliazkov and Rahman (2012) for further details and a pictorial representation).
During the MCMC sampling of the ORI model, we need to preserve the ordering of cut-points (γp,0=−∞<γp,1<γp,2<…<γp,J−1<γp,J=∞). This is achieved by using a monotone transformation from a compact set to the real line. Many such transformations are available (e.g., log-ratios of category bin widths, arctan, arcsin), but the bqror package utilizes the logarithmic transformation (Albert and Chib 2001; Rahman 2016),
δp,j=ln(γp,j−γp,j−1),2≤j≤J−1.
With all the modeling ingredients in place, we employ the Bayes’ theorem and express the joint posterior distribution as proportional to the product of the likelihood and the prior distributions. As in Rahman (2016), we employ independent normal priors: βp∼N(βp0,Bp0), δp∼N(δp0,Dp0) in the bqror package. The augmented joint posterior distribution for the ORI model can thus be written as,
π(z,βp,δp,w|y)∝f(y|z,βp,δp,w)π(z|βp,w)π(w)π(βp)π(δp),∝{n∏i=1f(yi|zi,δp)}π(z|βp,w)π(w)π(βp)π(δp),∝n∏i=1{J∏j=11{γp,j−1<zi≤γp,j}N(zi|x′iβp+θwi,τ2wi)E(wi|1)}×N(βp|βp0,Bp0)N(δp|δp0,Dp0).
The conditional posterior distributions are derived from the augmented joint posterior distribution (6), and the parameters are sampled as per Algorithm 1. This algorithm is implemented in the bqror package. The parameter βp is sampled from an updated multivariate normal distribution and the latent weight w is sampled element-wise from a generalized inverse Gaussian (GIG) distribution. The cut-point vector δp is sampled marginally of (z,w) using a random-walk MH algorithm. Lastly, the latent variable z is sampled element-wise from a truncated normal (TN) distribution.
Algorithm~1: Sampling in ORI model.
The term “ORII model” is used for an ordinal model with exactly 3 outcomes (i.e., J=3) where both location and scale restrictions are imposed by fixing the cut-points. Since there are only 2 cut-points and both are fixed at some values, the scale of the distribution needs to be free. Therefore, a scale parameter σp is introduced and the quantile ordinal model is rewritten as follows:
zi=x′iβp+σpϵi=x′iβp+σpθwi+σpτ√wiui,∀i=1,⋯,n,γj−1<zi≤γj⇒yi=j,∀i=1,⋯,n;j=1,2,3,
The next step is to specify the prior distributions required for Bayesian inference. We follow Rahman (2016) and assume βp∼N(βp0,Bp0) and σp∼IG(n0/2,d0/2); where IG stands for an inverse-gamma distribution. These are the default prior distributions in the bqror package. Employing the Bayes’ theorem, the augmented joint posterior distribution can be expressed as,
π(z,βp,ν,σp|y)∝f(y|z,βp,ν,σp)π(z|βp,ν,σp)π(ν|σp)π(βp)π(σp),∝{n∏i=1f(yi|zi,σp)}π(z|βp,ν,σp)π(ν|σp)π(βp)π(σp),∝{n∏i=13∏j=11(γj−1<zi≤γj)N(zi|x′iβp+θνi,τ2σpνi)E(νi|σp)}×N(βp|βp0,Bp0)IG(σp|n0/2,d0/2),
The augmented joint posterior distribution, given by equation (8), can be utilized to derive the conditional posterior distributions and the parameters are sampled as presented in Algorithm 2. This involves sampling βp from an updated multivariate normal distribution and sampling σp from an updated IG distribution. The latent weight ν is sampled element-wise from a GIG distribution and similarly, the latent variable z is sampled element-wise from a truncated normal distribution.
Algorithm~2: Sampling in ORII model.
Rahman (2016) employed DIC (Spiegelhalter et al. 2002; Gelman et al. 2013) for model comparison. However, in the Bayesian framework alternative models are typically compared using the marginal likelihood or the Bayes factor (Poirier 1995; Greenberg 2012). Therefore, we prefer using the marginal likelihood (or the Bayes factor) for comparing two or more regression models at a given quantile.
Consider a model Ms with parameter vector Θs. Let f(y|Ms,Θs) be its sampling density and π(Θs|Ms) be the prior distribution; where s=1,…,S. Then, the marginal likelihood for the model Ms is given by the expression, m(y|Ms)=∫f(y|Θs,Ms)π(Θs|Ms)dΘs. The Bayes factor is the ratio of marginal likelihoods. So, for any two models Mq versus Mr, the Bayes factor, Bqr=m(y|Mq)m(y|Mr)=∫f(y|Mq,Θq)π(Θq|Mq)dΘq∫f(y|Mr,Θr)π(Θr|Mr)dΘr, can be easily computed once we have the marginal likelihoods.
Chib (1995) and later Chib and Jeliazkov (2001) showed that a simple and stable estimate of marginal likelihood can be obtained from the MCMC outputs. The approach is based on the recognition that the marginal likelihood can be written as the product of likelihood function and prior density over the posterior density. So, the marginal likelihood m(y|Ms) for model Ms is expressed as,
m(y|Ms)=f(y|Ms,Θs)π(Θs|Ms)π(Θs|Ms,y).
Chib (1995) refers to equation (9) as the basic marginal likelihood identity since it holds for all values in the parameter space, but typically computed at a high density point (e.g., mean, mode) denoted Θ∗ to minimize estimation variability. The likelihood ordinate f(y|Ms,Θ∗) is directly available from the model and the prior density π(Θ∗|Ms) is assumed by the researcher. The novel part is the computation of posterior ordinate π(Θ∗|y,Ms), which is estimated using the MCMC outputs. Since the marginal likelihood is often a large number, it is convenient to express it on the logarithmic scale. An estimate of the logarithm of marginal likelihood is given by,
lnˆm(y)=lnf(y|Θ∗)+lnπ(Θ∗)−lnˆπ(Θ∗|y),
We know from Section 2.1 that the MCMC algorithm for estimating the ORI model is defined by the following conditional posterior densities: π(βp|z,w), π(δp|βp,y), π(w|βp,z), and π(z|βp,δp,w,y). The conditional posteriors for βp, w, and z have a known form, but that of δp is not tractable and is sampled using an MH algorithm. Consequently, we adopt the approach of Chib and Jeliazkov (2001) to calculate the marginal likelihood for the ORI model.
To simplify the computational process (specifically, to keep the computation over a reasonable dimension), we estimate the marginal likelihood marginally of the latent variables (w,z). Moreover, we decompose the posterior ordinate as,
π(β∗p,δ∗p|y)=π(δ∗p|y)π(β∗p|δ∗p,y),
To obtain an estimate of π(δ∗p|y), we need to express it in a computationally convenient form. The parameter δp is sampled using an MH step, which requires specifying a proposal density. Let q(δp,δ′p|βp,w,z,y) denote the proposal density for the transition from δp to δ′p, and let,
αMH(δp,δ′p)=min{1,f(y|βp,δ′p)π(βp)π(δ′p)f(y|βp,δp)π(βp)π(δp)×q(δ′p,δp|βp,w,z,y)q(δp,δ′p|βp,w,z,y)},
We closely follow the derivation in Chib and Jeliazkov (2001) and arrive at the following expression of the posterior ordinate,
π(δ∗p|y)=E1{αMH(δp,δ∗p|βp,w,z,y)q(δp,δ∗p|βp,w,z,y)}E2{αMH(δ∗p,δp|βp,w,z,y)},
The estimation of the quantity in the denominator is tricky. This requires generating an additional sample (say of H iterations) from the reduced conditional densities: π(βp|w,z), π(w|βp,z), and π(z|βp,δ∗p,w,y), where note that δp is fixed at δ∗p in the MCMC sampling, and thus corresponds to a reduced run. Moreover, at each iteration, we generate
δ(h)p∼q(δ∗p,δp|β(h)p,w(h),z(h),y)≡fN(δp|δ∗p,ι2ˆD).
ˆπ(δ∗p|y)=M−1∑Mm=1{αMH(δ(m)p,δ∗p|Λ(m),y)q(δ(m)p,δ∗p|Λ(m),y)}H−1∑Hh=1{αMH(δ∗p,δ(h)p|Λ(h),y)}.
The computation of the posterior ordinate π(β∗p|δ∗p,y) is trivial. We have the sample of H draws {w(h),z(h)} from the reduced run, which are marginally of βp from the distribution π(w,z|δ∗p,y). These draws are utilized to estimate the posterior ordinate as,
ˆπ(β∗p|δ∗p,y)=H−1H∑h=1π(β∗p|δ∗p,w(h),z(h),y).
We know from Section 2.2 that the ORII model is estimated by Gibbs sampling and hence we follow Chib (1995) to compute the marginal likelihood. The Gibbs sampler consists of four conditional posterior densities given by π(βp|σp,ν,z), π(σp|βp,ν,z), π(ν|βp,σp,z), and π(z|βp,σp,ν,y). However, the variables (ν,z) are latent. So, we integrate them out and write the posterior ordinate as π(β∗p,σ∗p|y)=π(β∗p|y)π(σ∗p|β∗p,y), where the terms on the right hand side can be written as,
π(β∗p|y)=∫π(β∗p|σp,ν,z,y)π(σp,ν,z|y)dσpdνdz,π(σ∗p|β∗p,y)=∫π(σ∗p|β∗p,ν,z,y)π(ν,z|β∗p,y)dνdz,
The posterior ordinate π(β∗p|y) can be estimated as the ergodic average of the conditional posterior density with the posterior draws of (σp,ν,z). Therefore, π(β∗p|y) is estimated as,
ˆπ(β∗p|y)=G−1G∑g=1π(β∗p|σ(g)p,ν(g),z(g),y).
Substituting the two density estimates given by equations (16) and (17) into equation (10), we obtain an estimate of the logarithm of marginal likelihood,
lnˆm(y)=lnf(y|β∗p,σ∗p)+ln[π(β∗p)π(σ∗p)]−ln[ˆπ(β∗p|y)ˆπ(σ∗p|β∗p,y)],
This section explains the data generating process for simulation studies, the functions offered in the bqror package, and usage of the functions for estimation and inference in ordinal quantile models.
Data Generation: The data for the simulation study of the ORI model is generated from the regression: zi=x′iβ+ϵi, where β=(−4,5,6), (x2,x3)∼U(0,1), and ϵi∼AL(0,σ=1,p) for i=1,…,n. Here, U and AL denote a uniform distribution and an asymmetric Laplace distribution, respectively. The z values are continuous and are classified into 4 categories based on the cut-points (0,2,4) to generate ordinal values of y, the outcome variable. We follow the above procedure to generate 3 data sets with 500 observations (i.e., n=500) each. The 3 data sets correspond to the quantile p equaling 0.25, 0.50, and 0.75, and are stored as data25j4
, data50j4
, and data75j4
, respectively. Note that the last two letters in the name of the data object (i.e., j4
) denote the number of unique outcomes in the y variable.
We now describe the major functions for Bayesian quantile estimation of ORI model, demonstrate their usage, and note the inputs and outputs of each function.
quantregOR1: The quantregOR1
is the primary function for estimating Bayesian quantile regression in ordinal models with 3 or more outcomes (i.e., ORI model) and implements Algorithm 1. In the code snippet below, we first read in the data and then do the following: define the ordinal response variable (y
) and covariate matrix (xMat
), specify the number of covariates (k
) and number of outcomes (J
), and set the prior means and covariances for βp and δp. We then call the quantregOR1
function and specify the inputs: ordinal outcome variable (y
), covariate matrix including a column of ones (xMat
), prior mean (b0
) and prior covariance matrix (B0
) for the regression coefficients, prior mean (d0
) and prior covariance matrix (D0
) for the transformed cut-points, burn-in size (burn
), post burn-in size (mcmc
), quantile (p
), the tuning factor (tune
) to adjust the MH acceptance rate, and the auto correlation cutoff value (accutoff
). The last input verbose
, when set to TRUE
will print the summary output.
In the code below, we use a diffuse normal prior βp∼N(0k,10∗Ik) where 0k and Ik are matrices of dimension k×1 and k×k, respectively. The prior distribution can be further diffused (i.e., made less informative) by increasing the prior variance from 10 to say 100. Besides, the prior variance for δp should be small, such as 0.25∗IJ−2, since the distribution is on the transformed cut-points, which is on the logarithmic scale. If there is a need for prior elicitation, they can be designed from previous subject based knowledge or by estimating the model on a training sample and then using the results to form prior distributions (See Greenberg (2012), for examples.)
library('bqror')
data("data25j4")
y <- data25j4$y
xMat <- data25j4$x
k <- dim(xMat)[2]
J <- dim(as.array(unique(y)))[1]
b0 <- array(rep(0, k), dim = c(k, 1))
B0 <- 10*diag(k)
d0 <- array(0, dim = c(J-2, 1))
D0 <- 0.25*diag(J - 2)
modelORI <- quantregOR1(y = y, x = xMat, b0, B0, d0, D0, burn = 1125,
mcmc = 4500, p = 0.25, tune = 1, accutoff = 0.5,
verbose = TRUE)
[1] Summary of MCMC draws:
Post Mean Post Std Upper Credible Lower Credible Inef Factor
beta_1 -3.6434 0.4293 -2.8369 -4.5073 2.3272
beta_2 4.8283 0.5597 5.9577 3.7624 2.4529
beta_3 5.9929 0.5996 7.2474 4.8819 2.7491
delta_1 0.7152 0.1110 0.9616 0.5004 3.2261
delta_2 0.7456 0.0940 0.9281 0.5543 2.1497
[1] MH acceptance rate: 31.82%
[1] Log of Marginal Likelihood: -545.72
[1] DIC: 1060.56
The outputs from the quantregOR1
function are the following quantities: summary.bqrorOR1
, postMeanbeta
, postMeandelta
, postStdbeta
, postStddelta
, gammacp
, acceptancerate
, logMargLike
, dicQuant
, ineffactor
, betadraws
, and deltadraws
. A detailed description of each output is presented in the bqror package help file. In the summary, we report the posterior mean, posterior standard deviation, 95% posterior credible (or probability) interval, and the inefficiency factor of the quantile regression coefficients βp and transformed cut-points δp. These quantities are presented in the last five columns and labeled appropriately.
The posterior means of (βp,δp) are close to the true values used to generate the data with small standard deviations. So, the quantregOR1
function is successful in recovering the true values of the parameters. The inefficiency factor is computed from the MCMC samples using the batch-means method (Greenberg 2012). They indicate the cost of working with correlated samples. For example, an inefficiency factor of 3 implies that it takes 3 correlated draws to get one independent draw. As such, low inefficiency factor indicates better mixing and a more efficient MCMC algorithm. Inefficiency factor also bears a direct relationship with effective sample size, where the latter can be obtained as the total number of (post burn-in) MCMC draws divided by the inefficiency factor (Chib 2012). The inefficiency factors for (βp, δp) are stored in the object modelOR1
of class bqrorOR1
and can be obtained by calling modelORI$ineffactor
.
The third last row displays the random-walk MH acceptance rate for δp, for which the preferred acceptance rate is around 30 percent. The last two rows present the model comparison measures, the logarithm of marginal likelihood and the DIC. The logarithm of marginal likelihood is computed using the MCMC outputs from the complete and reduced runs as explained in Section 3, while the principle for computing the DIC is presented in Gelman et al. (2013). For any two competing models at the same quantile, the model with a higher (lower) marginal likelihood (DIC) provides a better model fit.
While the two model comparison measures are printed as part of the summary output, they can also be called individually. For example, the logarithm of marginal likelihood can be obtained by calling modelORI$logMargLike
. Whereas, the DIC can be obtained by calling modelORI$dicQuant$DIC
. Two more quantities that are part of the object modelORI$dicQuant
are effective number of parameters denoted pD and the deviance computed at the posterior mean. They can be obtained by calling modelORI$dicQuant$pd
and modelORI$dicQuant$dev
, respectively. Besides, post estimation, one may also use the command modelORI$summary
or summary.bqrorOR1(modelORI)
to extract and print the summary output.
covEffectOR1: The function covEffectOR1
computes the average covariate effect for different outcomes of ORI model at a specified quantile, marginally of the parameters and the remaining covariates. While a demonstration of this function is best understood in a real-life study and is presented in the application section, here we present the mechanics behind the computation of average covariate effect.
Suppose, we want to compute the average covariate effect when the l-th covariate {xi,l} is set to the values a and b, denoted as {xai,l} and {xbi,l}, respectively. We split the covariate and parameter vectors as follows: xai=(xai,l,xi,−l), xbi=(xbi,l,xi,−l), and βp=(βp,l,βp,−l), where −l in the subscript denotes all covariates (parameters) except the l-th covariate (parameter). We are interested in the distribution of the difference {Pr(yi=j|xbi,l)−Pr(yi=j|xai,l)} for 1≤j≤J, marginalized over {xi,−l} and (βp,δp), given the data y=(y1,⋯,yn)′. We marginalize the covariates using their empirical distribution and the parameters based on the posterior distribution.
To obtain draws from the distribution {Pr(yi=j|xbi,l)−Pr(yi=j|xai,l)}, we use the method of composition (see Chib and Jeliazkov (2006), Rahman and Vossmeyer (2019), and Bresson et al. (2021), for additional details). In this process, we randomly select an individual, extract the corresponding sequence of covariate values, draw a value (βp,δp) from the posterior distributions, and evaluate {Pr(yi=j|xbi,βp,δp)−Pr(yi=j|xai,βp,δp)}, where,
Pr(yi=j|xqi,βp,δp)=FAL(γp,j−xqi,lβp,l−x′i,−lβp,−l)−FAL(γp,j−1−xqi,lβp,l−x′i,−lβp,−l),
Data Generation: The data generating process for the ORII model closely resembles that of ORI model. In particular, 500 observations are generated for each value of p from the regression model: zi=x′iβ+ϵi, where β=(−4,6,5), (x2,x3)∼U(0,1) and ϵi∼AL(0,σ=1,p) for i=1,…,n. The continuous values of z are classified based on the cut-points (0,3) to generate 3 ordinal values for y, the outcome variable. Once again, we choose p equal to 0.25, 0.50, and 0.75 to generate three samples from the model, which are referred to as data25j3
, data50j3
, and data75j3
, respectively. The last two letters in the names of the data objects (i.e., j3
) denote the number of unique outcomes in the y variable.
quantregOR2: The function quantregOR2
implements Algorithm 2 and is the main function and for estimating Bayesian quantile regression in ORII model i.e., an ordinal model with exactly 3 outcomes. In the code snippet below, we first read the data, define the required quantities, and then call the quantregOR2
for estimating the quantile model. The function inputs are as follows: the ordinal outcome variable (y
), covariate matrix including a column of ones (xMat
), prior mean (b0
) and prior covariance matrix (B0
) for βp, prior shape (n0
) and scale (d0
) parameters for σp, second cut-point (gammacp2
), burn-in size (burn
), post burn-in size (mcmc
), quantile (p
), auto correlation cutoff value (accutoff
), and the verbose
option which when set to TRUE
(FALSE
) will (not) print the outputs. We use a relatively diffuse prior distributions on (βp,σp) to allow the data to speak for itself.
library('bqror')
data("data25j3")
y <- data25j3$y
xMat <- data25j3$x
k <- dim(xMat)[2]
b0 <- array(rep(0, k), dim = c(k, 1))
B0 <- 10*diag(k)
n0 <- 5
d0 <- 8
modelORII <- quantregOR2(y = y, x = xMat, b0, B0, n0, d0, gammacp2 = 3,
burn = 1125, mcmc = 4500, p = 0.25, accutoff = 0.5,
verbose = TRUE)
[1] Summary of MCMC draws:
Post Mean Post Std Upper Credible Lower Credible Inef Factor
beta_1 -3.8900 0.4560 -3.0578 -4.8346 2.4784
beta_2 5.8257 0.5341 6.9263 4.8243 2.1231
beta_3 4.7194 0.5227 5.7502 3.7194 2.2693
sigma 0.8968 0.0763 1.0587 0.7626 2.4079
[1] Log of Marginal Likelihood: -404.34
[1] DIC: 790.72
The outputs from the quantregOR2
function are the following: summary.bqrorOR2
, postMeanbeta
, postMeansigma
, postStdbeta
, postStdsigma
, logMargLike
, dicQuant
, ineffactor
, betadraws
, and sigmadraws
. A detailed description of each output is presented in the bqror package help file. Once again, we summarize the MCMC draws by reporting the posterior mean, posterior standard deviation, 95% posterior credible (or probability) interval, and the inefficiency factor of the quantile regression coefficients βp and scale parameter σp. The output also exhibits the logarithm of marginal likelihood and the DIC for ORII model, where the former is computed using the Gibbs output as explain in Section 3.
covEffectOR2: The function covEffectOR2
computes the average covariate effect for the 3 outcomes of ORII model at a specified quantile, marginally of the parameters and remaining covariates. The principle underlying the computation is analogous to that of ORI model and is explained below. An implementation of the function is presented in the tax policy application.
Suppose, we are interested in computing the average covariate effect for the l-th covariate {xi,l} for two different values a and b, and split the covariate and parameter vectors as: xai=(xai,l,xi,−l), xbi=(xbi,l,xi,−l), β=(βp,l,βp,−l). We are interested in the distribution of the difference {Pr(yi=j|xbi,l)−Pr(yi=j|xai,l)} for 1≤j≤J=3, marginalized over {xi,−l} and (βp,σp), given the data y=(y1,⋯,yn)′. We again employ the method of composition i.e., randomly select an individual, extract the corresponding sequence of covariate values, draw a value (βp,σp) from their posterior distributions, and lastly evaluate {Pr(yi=j|xbi,βp,σp)−Pr(yi=j|xai,βp,σp)}, where
Pr(yi=j|xqi,βp,σp)=FAL(γp,j−xqi,lβp,l−x′i,−lβp,−lσp)−FAL(γp,j−1−xqi,lβp,l−x′i,−lβp,−lσp),
In this section, we consider the educational attainment and tax policy applications from Rahman (2016) to demonstrate the real data applications of the proposed bqror package. While the educational attainment study shows the implementation of ordinal quantile regression in ORI model, the tax policy study highlights the use of ordinal quantile regression in ORII model. Data for both the applications are included as a part of the bqror package.
In this application, the goal is to study the effect of family background, individual level variables, and age cohort on educational attainment of 3923 individuals using data from the National Longitudinal Study of Youth (NLSY, 1979) (Jeliazkov et al. 2008; Rahman 2016). The dependent variable in the model, education degrees, has four categories: (i) Less than high school, (ii) High school degree, (iii) Some college or associate's degree, and (iv) College or graduate degree. A bar chart of the four categories is presented in Figure 1. The independent variables in the model include intercept, square root of family income, mother’s education, father’s education, mother’s working status, gender, race, indicator variables to point whether the youth lived in an urban area or South at the age of 14, and three indicator variables to indicate the individual’s age in 1979 (serves as a control for age cohort effects).
Figure 1: Bar chart showing the different categories of educational attainment. The number of responses (percentage) for each category are shown inside (at the top) of each bar.
To estimate the Bayesian ordinal quantile model on educational attainment data, we load the bqror package, prepare the required inputs and feed them into the quantregOR1
function. Specifically, we specify the outcome variable, covariate matrix (with covariates in order as in Rahman 2016), prior distributions for (βp,δp), burn-in size, number of post burn-in MCMC iterations, quantile value (p=0.5 for this illustration), and the values for tuning factor and autocorrelation cutoff.
library('bqror')
data("Educational_Attainment")
data <- na.omit(Educational_Attainment)
data$fam_income_sqrt <- sqrt(data$fam_income)
cols <- c("mother_work","urban","south", "father_educ","mother_educ",
"fam_income_sqrt","female", "black","age_cohort_2","age_cohort_3",
"age_cohort_4")
x <- data[cols]
x$intercept <- 1
xMat <- x[,c(12,6,5,4,1,7,8,2,3,9,10,11)]
yOrd <- data$dep_edu_level
k <- dim(xMat)[2]
J <- dim(as.array(unique(yOrd)))[1]
b0 <- array(rep(0, k), dim = c(k, 1))
B0 <- 1*diag(k)
d0 <- array(0, dim = c(J-2, 1))
D0 <- 0.25*diag(J - 2)
p <- 0.5
EducAtt <- quantregOR1(y = yOrd, x = xMat, b0, B0, d0, D0, burn = 1125,
mcmc = 4500, p, tune=1, accutoff = 0.5, TRUE)
[1] Summary of MCMC draws:
Post Mean Post Std Upper Credible Lower Credible Inef Factor
intercept -3.2546 0.2175 -2.8350 -3.6798 2.3143
fam_income_sqrt 0.2788 0.0230 0.3254 0.2337 2.1396
mother_educ 0.1242 0.0190 0.1619 0.0878 1.9499
father_educ 0.1866 0.0154 0.2165 0.1578 2.3062
mother_work 0.0664 0.0821 0.2287 -0.0930 1.9349
female 0.3492 0.0786 0.5070 0.2022 1.9086
black 0.4413 0.0997 0.6400 0.2506 1.9270
urban -0.0777 0.0971 0.1104 -0.2712 1.9019
south 0.0842 0.0880 0.2529 -0.0895 1.9153
age_cohort_2 -0.0345 0.1192 0.1963 -0.2660 1.7502
age_cohort_3 -0.0426 0.1223 0.2033 -0.2849 1.9053
age_cohort_4 0.4938 0.1212 0.7256 0.2570 1.7512
delta_1 0.8988 0.0276 0.9534 0.8461 4.6186
delta_2 0.5481 0.0313 0.6146 0.4890 3.6003
[1] MH acceptance rate: 26.8%
[1] Log of Marginal Likelihood: -4923.48
[1] DIC: 9781.91
The posterior results
mcmc <- 4500
burn <- round(0.25*mcmc)
nsim <- mcmc + burn
mcmcDraws <- cbind(t(EducAtt$betadraws), t(EducAtt$deltadraws))
color_scheme_set('darkgray')
bayesplot_theme_set(theme_minimal())
mcmc_trace(mcmcDraws[(burn+1):nsim, ], facet_args = list(ncol = 3))
Figure 2: Trace plots of the MCMC draws in the educational attainment study.
Figure 2 presents the trace plots of the MCMC draws, which can be generated by loading the bayesplot package and using the codes presented above. The purpose of trace plots is to show that the Markov chains have converged to the joint posterior distribution, such as shown in Figure 2. The idea is that the trace plots should show variation around a central value if the chain has converged, rather than drift without settling down.
Next, we utilize the covEffectOR1
function to compute the average covariate effect for a $10,000 increase in family income on the four categories of educational attainment. In general, the calculation of average covariate effect requires creation of either one or two new covariate matrices depending whether the covariate is continuous or indicator (binary), respectively. Since income is a continuous variable, a modified covariate matrix is created by adding $10,000 to each observation of family income. This is xMod2
in the code below and the increased income variable corresponds to xbi,l for i=1,⋯,n in Section 4.1. The second covariate matrix xMod1
(in the code below) is simply the covariate or design matrix xMat
; so with reference to Section 4.1, xai,l=xi,l for i=1,⋯,n. When the covariate of interest is an indicator variable, then xMod1
also requires modification as illustrated in the tax policy application.
We now call the covEffectOR1
function and supply the inputs to get the results.
xMat1 <- xMat
xMat2 <- xMat
xMat2$fam_income_sqrt <- sqrt((xMat1$fam_income_sqrt)^2 + 10)
EducAttCE <- covEffectOR1(EducAtt, yOrd, xMat1, xMat2, p = 0.5, verbose = TRUE)
[1] Summary of Covariate Effect:
Covariate Effect
Category_1 -0.0314
Category_2 -0.0129
Category_3 0.0193
Category_4 0.0250
The results shows that at the 50th quantile and for a $10,000 increase in family income, the probability of obtaining less than high school (high school degree) decreases by 3.14 (1.29) percent, while the probability of achieving some college or associate's degree (college or graduate degree) increases by 1.93 (2.50) percent.
Here, the objective is to analyze the factors that affect public opinion on the proposal to raise federal taxes for couples (individuals) earning more than $250,000 ($200,000) per year in the United States (US). The proposal was designed to extend the Bush Tax cuts for the lower and middle income classes, but restore higher rates for the richer class. Such a policy is considered pro-growth, since it is aimed to promote economic growth in the US by augmenting consumption among the low-middle income families. After extensive debate, the proposed policy received a two year extension and formed a part of the “Tax Relief, Unemployment Insurance Reauthorization, and Job Creation Act of 2010”.
Figure 3: Bar chart for public opinion on tax increase. The number of responses (percentage) for each category are shown inside (at the top) of each bar.
The data for the study was taken from the 2010-2012 American National Election Studies (ANES) on the Evaluations of Government and Society Study 1 (EGSS 1) and contains 1,164 observations. The dependent variable in the model, individual’s opinion on tax increase, has 3 categories: Oppose, Neither favor nor oppose, and Favor (see Figure 3). The covariates included in the model are the intercept, indicator variables for employment status, income above $75,000, bachelors’ degree, post-bachelors’ degree, computer ownership, cell phone ownership, and white race.
To estimate the quantile model on public opinion about federal tax increase, we load the bqror package, prepare the data, and provide the inputs into the quantregOR2
function. Specifically, we define the outcome variable, covariate matrix (with covariates in order as in Rahman 2016), specify the prior distributions for (βp,σp), choose the second cut-off value, burn-in size, number of post burn-in MCMC iterations, and values for quantile (p=0.5 for this illustration) and autocorrelation cutoff.
library(bqror)
data("Policy_Opinion")
data <- na.omit(Policy_Opinion)
cols <- c("Intercept","EmpCat","IncomeCat","Bachelors","Post.Bachelors",
"Computers","CellPhone", "White")
x <- data[cols]
xMat <- x[,c(1,2,3,4,5,6,7,8)]
yOrd <- data$y
k <- dim(x)[2]
b0 <- array(rep(0, k), dim = c(k, 1))
B0 = 1*diag(k)
n0 <- 5
d0 <- 8
FedTax <- quantregOR2(y = yOrd, x = xMat, b0, B0, n0, d0, gammacp2 = 3,
burn = 1125, mcmc = 4500, p = 0.5, accutoff = 0.5, TRUE)
[1] Summary of MCMC draws :
Post Mean Post Std Upper Credible Lower Credible Inef Factor
Intercept 2.0142 0.4553 2.9071 1.0959 1.4473
EmpCat 0.2496 0.2953 0.8294 -0.3270 1.6760
IncomeCat -0.5083 0.3323 0.1329 -1.1580 1.6700
Bachelors 0.0809 0.3744 0.8569 -0.6324 1.6726
Post.Bachelors 0.5082 0.4406 1.3964 -0.3435 1.5053
Computers 0.7167 0.3509 1.4078 0.0219 1.4975
CellPhone 0.8464 0.4027 1.6191 0.0444 1.4524
White 0.0627 0.3659 0.7579 -0.6502 1.4931
sigma 2.2205 0.1442 2.5300 1.9552 2.3161
[1] Log of Marginal Likelihood: -1174.11
[1] DIC: 2334.58
The results (see Footnote 1) from the MCMC draws are summarized above, where we report the posterior mean, posterior standard deviation, 95% posterior credible interval, and the inefficiency factor of the parameters (βp,σp). Additionally, the summary displays the logarithm of marginal likelihood and the DIC. Figure 4 presents the trace plots of the Gibbs draws, which can be generated by loading the bayesplot package and using the codes below.
mcmc <- 500
burn <- round(0.25*mcmc)
nsim <- mcmc + burn
mcmcDraws <- cbind(t(FedTax$betadraws), t(FedTax$sigmadraws))
color_scheme_set('darkgray')
bayesplot_theme_set(theme_minimal())
mcmc_trace(mcmcDraws[(burn+1):nsim, ], facet_args = list(ncol = 3))
Figure 4: Trace plots of the MCMC draws in the tax policy study.
Finally, we utilize the covEffectOR2
function to demonstrate the calculation of average covariate effect within the ORII framework. Below, we compute the average covariate effect for computer ownership (assume this is the l-th variable) on the 3 categories of public opinion about the tax policy. Here, the covariate is an indicator variable since you may either own a computer (coded as 1) or not (coded as 0). So, we need to create two modified covariate matrices. In the code snippet below, the first matrix xMat1
and the second matrix xMat2
are created by replacing the column on computer ownership with a column of zeros and ones, respectively. With reference to notations in Section 4.1 and Section 4.2, xai,l=0 and xbi,l=1 for i=1,⋯,n. We then call the covEffectOR2
function and supply the inputs to get the results.
xMat1 <- xMat
xMat1$Computers <- 0
xMat2 <- xMat
xMat2$Computers <- 1
FedTaxCE <- covEffectOR2(FedTax, yOrd, xMat1, xMat2, gammacp2 = 3, p = 0.5,
verbose = TRUE)
[1] Summary of Covariate Effect:
Covariate Effect
Category_1 -0.0396
Category_2 -0.0331
Category_3 0.0726
The result on covariate effect shows that at the 50th quantile, ownership of computer decreases probability for the first category Oppose (Neither favor nor oppose) by 3.96 (3.31) percent, and increases the probability for the third category Favor by 7.26 percent.
A wide class of applications in economics, finance, marketing, and the social sciences have dependent variables which are ordinal in nature (i.e., they are discrete and ordered, and are characterized by an underlying continuous variable). Modeling and analysis of such variables has been typically confined to ordinal probit or ordinal logit models, which offer information on the average probability of outcome variable given the covariates. However, a recently proposed method by Rahman (2016) allows Bayesian quantile modeling of ordinal data and thus presents the tool for a more comprehensive analysis and inference. The prevalence of ordinal responses in applications is well known and hence a software package that allows Bayesian quantile analysis with ordinal data will be of immense interest to applied researchers from different fields, including economics and statistics.
The current paper presents an implementation of the bqror package – the only package available for estimation and inference of Bayesian quantile regression in ordinal models. The package offers two MCMC algorithms for estimating ordinal quantile models. An ordinal quantile model with 3 or more outcomes is estimated by a combination of Gibbs sampling and MH algorithm, while estimation of an ordinal quantile model with exactly 3 outcomes utilizes a simpler and computationally faster algorithm that relies solely on Gibbs sampling. For both forms of ordinal quantile models, the bqror package also provides functions for calculating the covariate effects (for continuous as well as binary regressors) and measures for model comparison – marginal likelihood and the DIC. The paper explains how to compute the marginal likelihood from the MCMC outputs and recommends its use over the DIC for model comparison. Additionally, this paper demonstrates the usage of functions for estimation and analysis of Bayesian quantile regression with ordinal data on simulation studies and two applications related to educational attainment and tax policy. In the future, the current package will be extended to include ordinal quantile regression with longitudinal data and variable selection in ordinal quantile regression with cross section and/or longitudinal data.
Supplementary materials are available in addition to this article. It can be downloaded at RJ-2023-042.zip
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Maheshwari & Rahman, "bqror: An R package for Bayesian Quantile Regression in Ordinal Models", The R Journal, 2023
BibTeX citation
@article{RJ-2023-042, author = {Maheshwari, Prajual and Rahman, Mohammad Arshad}, title = {bqror: An R package for Bayesian Quantile Regression in Ordinal Models}, journal = {The R Journal}, year = {2023}, note = {https://doi.org/10.32614/RJ-2023-042}, doi = {10.32614/RJ-2023-042}, volume = {15}, issue = {2}, issn = {2073-4859}, pages = {39-55} }