The ridge regression estimator, one of the commonly used alternatives to the conventional ordinary least squares (OLS) estimator, avoids the adverse effects that arise when there is a considerable degree of multicollinearity among the regressors. Many software packages can estimate ridge regression coefficients; however, most of them offer only a limited set of methods for estimating the ridge biasing parameter and provide no testing procedures. Our package, lmridge, can be used to estimate ridge coefficients for a range of existing biasing parameters, to test these coefficients with more than 25 ridge related statistics, and to present different graphical displays of these statistics.
For data collected either from a designed experiment or from an observational study, the ordinary least squares (OLS) method does not provide precise estimates of the effect of any explanatory variable (regressor) when the regressors are interdependent (collinear with each other). Consider a multiple linear regression (MLR) model,
$$y = X\beta + \varepsilon,$$
where $y$ is an $n \times 1$ vector of observations on the response, $X$ is an $n \times p$ matrix of regressors, $\beta$ is a $p \times 1$ vector of unknown coefficients, and $\varepsilon$ is an $n \times 1$ vector of random errors. The OLS estimator (OLSE) of $\beta$ is
$$\hat{\beta} = (X'X)^{-1}X'y,$$
which becomes unstable when $X'X$ is nearly singular, i.e., when the regressors are collinear.
Researchers may be tempted to eliminate the regressor(s) causing the problem by consciously removing them from the model. However, this approach may destroy the usefulness of the model by discarding relevant regressor(s). To control the variance and instability of the OLS estimates, one may instead regularize the coefficients using regularization methods such as ridge regression (RR), Liu regression, and Lasso regression as alternatives to OLS. Computationally, RR suppresses the effects of collinearity and reduces the apparent magnitude of the correlation among regressors in order to obtain more stable estimates of the coefficients than the OLS estimates, and it also improves the accuracy of prediction (see Hoerl and Kennard 1970b; Montgomery and Peck 1982; Tripp 1983; Myers 1986; Rawlings et al. 1998; Seber and Lee 2003).
There are only a few software programs and R packages capable of estimating and/or testing ridge coefficients. The design goal of our lmridge package (Imdad and Aslam 2018a) is primarily to provide the functionality of all possible ridge related computations. The output of our developed package (lmridge) is consistent with the output of existing software/R packages. The lmridge package also provides the most complete suite of tools for ordinary RR, comparable to those listed in Table 1. For package development and R documentation, we followed Hadley (2015), Leisch (2008), and R Core Team (2015). The ridge package by Moritz and Cule (2017) and lm.ridge() from MASS (Venables and Ripley 2002) also provided guidance in coding.
Table 1: Ridge related features compared across NCSS, SAS, Stata, StatGraphics and the R packages lrmest, ltsbase, penalized, glmnet, ridge and lmridge.

Feature group | Features
---|---
Standardization of regressors |
Estimation and testing of ridge coefficients | estimation; testing; SE of coefficients
Ridge related statistics | adj-$R^2$, m-scale & ISRM, variance, bias$^2$, MSE, F-test, shrinkage factor, CN, $C_k$, DF, EDF, Eft, hat matrix, Var-Cov matrix, VIF, residuals, ridge fitted values, prediction
Ridge model selection | CV & GCV, AIC & BIC, PRESS
Ridge related graphs | ridge trace, VIF trace, bias/variance/MSE, CV & GCV, AIC & BIC, m-scale & ISRM, DF/RSS/PRESS
All available software and R packages mentioned in Table 1 are compared with our lmridge package. For multicollinearity detection, NCSS statistical software (NCSS 11 Statistical Software 2016) computes VIF/TOL. In SAS, the collin option in the model statement is used to perform collinearity diagnostics, while for the remedy of multicollinearity, RR can be performed using the ridge option in the proc reg statement; the outVIF option outputs the VIF values. For RR, Stata (StataCorp 2014) has no built-in command; however, the ridgereg add-on is available, which performs the calculation for a scalar k.
The lrmest package (Dissanayake et al. 2016) computes estimators such as OLS, ordinary RR (ORR), the Liu estimator (LE), LE type-1, 2, 3, the adjusted Liu estimator (ALTE), and their type-1, 2, 3 variants. Moreover, lrmest provides scalar mean square error (MSE) and prediction residual error sum of squares (PRESS) values for some of the estimators. The testing of ridge coefficients is performed only for a scalar k; for a vector of k values, the function rid() of the lrmest package returns only the MSE along with the value of the biasing parameter used. The function optimum() of the lrmest package can be used to get the optimal scalar MSE and PRESS values (Arumairajan and Wijekoon 2015). Statgraphics standardizes the dependent variable and computes some statistics for the detection of collinearity. The ltsbase package computes ridge and Liu estimates based on the least trimmed squares (LTS) method; a plot of the MSE values can be obtained if plot=TRUE is passed to the ltsbase() function. There are three main functions: (i) ltsbase() computes the minimum MSE values for six models: OLS, ridge, ridge based on LTS, LTS, Liu, and Liu based on the LTS method, for sequences of biasing parameters ranging from 0 to 1. If print=TRUE, ltsbase() prints all the MSEs (along with the minimum MSE) for ridge, Liu, and ridge & Liu based on the LTS method for the sequence of biasing parameters given by the user; (ii) the ltsbaseDefault() function returns the fitted values and residuals of the six models having minimum MSE; and (iii) the ltsbaseSummary() function returns the coefficients and the biasing parameter for the best MSE among the four regression models.
The penalized package (Goeman et al. 2017) is designed for penalized estimation in generalized linear models. The supported models are linear regression, logistic regression, Poisson regression and the Cox proportional hazards model. The penalized package allows an L1 absolute value ("LASSO") penalty, an L2 quadratic ("ridge") penalty, or a combination of the two. It is also possible to have a fused LASSO penalty with an L1 absolute value penalty on the coefficients and their differences. The penalized package also includes facilities for likelihood cross-validation and for optimization of the tuning parameter. The glmnet package (Friedman et al. 2010) has some efficient procedures for fitting the entire LASSO or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression and the Cox model. The glmnet package can also be used to fit the RR model by setting the alpha argument to zero.
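For instance, a minimal sketch (an illustration, not taken from glmnet's documentation; the lambda grid is arbitrary) of fitting an RR model with glmnet on the Hald data used later in this paper:

> library("glmnet")
> library("lmridge")                  # for the bundled Hald data
> x <- as.matrix(Hald[, -1])          # regressors X1-X4
> y <- Hald[, 1]                      # response
> fit <- glmnet(x, y, alpha = 0, lambda = c(1, 0.1, 0.01))  # alpha = 0 gives the pure L2 (ridge) penalty
> coef(fit)                           # one column of ridge coefficients per lambda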
The ridge package fits linear and logistic RR models, including functions for fitting linear and logistic RR models to genome-wide SNP data supplied as file names when the data are too big to read into R. The RR biasing parameter is chosen automatically using the method proposed by Cule and De Iorio (2012); however, the value of the biasing parameter can also be specified for the estimation and testing of ridge coefficients. The function lm.ridge() from MASS only fits the linear RR model and returns the ridge biasing parameters given by Hoerl and Kennard (1970b) and Venables and Ripley (2002) and the vector of GCV criterion values given by Golub et al. (1979).
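As a brief illustration (a sketch assuming only MASS and the Hald data), lm.ridge() can be fitted over a grid of biasing parameters, and select() then reports the HKB and L-W estimates along with the GCV-minimizing value:

> library("MASS")
> library("lmridge")    # for the Hald data
> rfit <- lm.ridge(y ~ ., data = as.data.frame(Hald), lambda = seq(0, 1, 0.001))
> select(rfit)          # modified HKB and L-W estimators, smallest-GCV lambda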
There are other software packages that can be used to perform RR analysis, such as S-PLUS (S-PLUS 2008) and Shazam (Shazam 2011), and R packages such as RXshrink (Obenchain 2014), rrBLUP (Endelman 2011), RidgeFusion (Price 2014), bigRR (Shen et al. 2013), lpridge (Seifert 2013), genridge (Friendly 2017) and CoxRidge (Perperoglou 2015).
This paper outlines the collinearity detection methods available in the existing literature and uses the mctest (Imdad and Aslam 2018b) package through an illustrative example. To overcome the issues of the collinearity effect on regressors, a thorough introduction to ridge regression, the properties of the ridge estimator, different methods for selecting the value of the biasing parameter k, and the testing of ridge coefficients are presented.
Diagnosing collinearity is important to many researchers. It consists of two related but separate elements: (1) detecting the existence of collinear relationships among regressors and (2) assessing the extent to which these relationships have degraded the parameter estimates. There are many diagnostic measures for the detection of collinearity in the existing literature, provided by various authors (Klein 1962; Farrar and Glauber 1967; Marquardt 1970; Theil 1971; Gunst and Mason 1977; Koutsoyiannis 1977; Belsley et al. 1980; Kovács et al. 2005; Curto and Pinto 2011; Fox and Weisberg 2011; Imdadullah et al. 2016). These diagnostic methods assist in determining whether, and where, some corrective action is necessary (Belsley et al. 1980). The most widely used and suggested diagnostics are the values of pairwise correlations, the variance inflation factor (VIF)/tolerance (TOL) (Marquardt 1970), eigenvalues and eigenvectors (Kendall 1957), CN & CI (Belsley et al. 1980; Maddala 1988; Chatterjee and Hadi 2006), Leamer's method (Greene 2002), Klein's rule (Klein 1962), the tests proposed by Farrar and Glauber (1967), the Red indicator (Kovács et al. 2005), the corrected VIF (Curto and Pinto 2011), and Theil's measures (Theil 1971); see also Imdadullah et al. (2016). All of these diagnostic measures are implemented in the R package mctest.

Below, we use the Hald dataset (Hald 1952) to test for collinearity among the regressors. We then use the lmridge package to compute the ridge coefficients and different ridge related statistics; the selection of the ridge biasing parameter by different methods is also performed. For the optimal choice of the ridge biasing parameter, graphical representations of the ridge coefficients, VIF values, cross-validation criteria (CV & GCV), ridge DF, RSS, PRESS, ISRM and m-scale versus the ridge biasing parameter used are considered. In addition, a graphical representation of the model selection criteria (AIC & BIC) of ridge regression versus ridge DF is also presented. The Hald data concern the heat generated during the setting of 13 cement mixtures of 4 basic ingredients, as used by Hoerl et al. (1975). Each ingredient percentage appears to be rounded down to a full integer. The dataset is bundled in both the mctest and lmridge packages.
> library("mctest")
> x <- Hald[, -1]
> y <- Hald[, 1]
> mctest(x, y)
Call:
omcdiag(x = x, y = y, Inter = TRUE, detr = detr, red = red, conf = conf,
theil = theil, cn = cn)
Overall Multicollinearity Diagnostics
MC Results detection
Determinant |X'X|: 0.0011 1
Farrar Chi-Square: 59.8700 1
Red Indicator: 0.5414 1
Sum of Lambda Inverse: 622.3006 1
Theil's Method: 0.9981 1
Condition Number: 249.5783 1
1 --> COLLINEARITY is detected
0 --> COLLINEARITY is not detected by the test
The results from all overall collinearity diagnostic measures indicate the existence of collinearity among the regressors. However, these results do not tell us which regressor(s) cause the collinearity. The individual collinearity diagnostic measures can be obtained through:
> imcdiag(x = x, y, all = TRUE)
Call:
imcdiag(x = x, y = y, method = method, corr = FALSE, vif = vif,
tol = tol, conf = conf, cvif = cvif, leamer = leamer, all = all)
Individual Multicollinearity Diagnostics
VIF TOL Wi Fi Leamer CVIF Klein
X1 1 1 1 1 0 0 0
X2 1 1 1 1 1 0 1
X3 1 1 1 1 1 0 0
X4 1 1 1 1 0 0 1
1 --> COLLINEARITY is detected
0 --> COLLINEARITY is not detected by the test
X1, X2, X3, X4, coefficient(s) are non-significant may be due to multicollinearity
R-square of y on all x: 0.9824
* use method argument to check which regressors may be the reason of collinearity
Results from most of the individual collinearity diagnostics suggest that all of the regressors contribute to the collinearity. The last line of the imcdiag() function's output suggests that the method argument should be used to check which regressors may be the cause of collinearity. For further information about the method argument, see the help file of imcdiag().
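For example, a brief sketch (assuming mctest's imcdiag() accepts a single diagnostic via its method argument) of checking the VIF diagnostic alone:

> imcdiag(x = x, y, method = "VIF")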
In seminal work, Hoerl (1959, 1962, 1964) and Hoerl and Kennard (1970a,b) developed the ridge analysis technique, which addresses the departure of the data from orthogonality. Hoerl (1962) introduced RR, based on the James-Stein estimator, by stating that the existence of correlation among regressors can cause errors in estimating the regression coefficients when the OLS method is applied. RR is similar to the OLS method; however, it shrinks the coefficients towards zero by minimizing the MSE of the estimates, making the RR technique better than the OLSE with respect to MSE when the regressors are collinear with each other. A penalty (degree of bias) is imposed on the size of the coefficients in RR to reduce their variances. However, the expected values of these estimates are not equal to the true values and tend to underestimate the true parameters. Though the ridge estimators are biased, they have lower MSE (more precision) than the OLSEs and are less sensitive to sampling fluctuations or model misspecification, even if the number of regressors exceeds the number of observations in a dataset (i.e., when n < p).
The design matrix $X$ is standardized so that $X'X$ takes the correlation form. The ridge coefficients are then estimated as
$$\hat{\beta}_R = (X'X + kI_p)^{-1}X'y, \qquad k \ge 0,$$
where $k$ is the ridge biasing (shrinkage) parameter and $I_p$ is the identity matrix of order $p$. The addition of the constant term $k$ to the diagonal of $X'X$ guarantees the invertibility of $(X'X + kI_p)$, so that a unique ridge estimate always exists; for $k = 0$, the ridge estimator reduces to the OLSE. It is desirable to select the smallest value of $k$ for which the estimated coefficients stabilize. Let $H_{R,k} = X(X'X + kI_p)^{-1}X'$ denote the ridge hat matrix, so that the ridge fitted values are $\hat{y}_R = X\hat{\beta}_R = H_{R,k}\,y$.
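To make the estimator concrete, a minimal numerical sketch (illustration only; plain standardization is used here, so the numbers will differ from lmridge's correlation-form scaled coefficients):

> library("lmridge")                 # for the bundled Hald data
> X <- scale(as.matrix(Hald[, -1]))  # standardized regressors
> yc <- Hald[, 1] - mean(Hald[, 1])  # centered response
> k <- 0.012
> solve(crossprod(X) + k * diag(ncol(X)), crossprod(X, yc))  # (X'X + kI)^(-1) X'y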
RR is the most popular among the biased estimation methods because of its relationship to the OLS method, and the statistical properties of the ridge estimator are well defined. Most of the RR properties have been discussed, proved and extended by many researchers, such as Hoerl and Kennard (1970a,b), Marquardt (1970), Newhouse and Oman (1971), Allen (1974), Hemmerle (1975), and McDonald and Galarneau (1975). Table 2 lists the RR properties.
Property | Formula / Description
---|---
Mean | $E(\hat{\beta}_R) = (X'X + kI_p)^{-1}X'X\beta$
Shorter regression coeffs. | $\hat{\beta}'_R\hat{\beta}_R \le \hat{\beta}'\hat{\beta}$
Linear transformation | $\hat{\beta}_R = Z_k\hat{\beta}$, where $Z_k = (X'X + kI_p)^{-1}X'X$
Total variance | $\sigma^2\sum_j \lambda_j/(\lambda_j + k)^2$
Var-Cov matrix | $\sigma^2(X'X + kI_p)^{-1}X'X(X'X + kI_p)^{-1}$
Bias | $-k(X'X + kI_p)^{-1}\beta$
MSE | $\sigma^2\sum_j \lambda_j/(\lambda_j + k)^2 + k^2\beta'(X'X + kI_p)^{-2}\beta$
Distance between $\hat{\beta}_R$ and $\beta$ | Shorter than the corresponding OLS distance for suitable $k > 0$
Inflated RSS | The residual sum of squares increases as $k$ increases
Sampling fluctuations | $\hat{\beta}_R$ is less sensitive to sampling fluctuations than the OLSE
Accurate prediction | RR can yield more accurate prediction than OLS when regressors are collinear
Wide range of $k$ | A wide range of $k$ values can give smaller MSE than the OLSE
Optimal $k$ | An optimal $k$, for which $MSE(\hat{\beta}_R) < MSE(\hat{\beta})$, always exists
DF ridge | $df_R = \mathrm{tr}(H_{R,k}) = \sum_j \lambda_j/(\lambda_j + k)$, where $\lambda_j$ is the $j$th eigenvalue of $X'X$
Effective no. of parameters | $EP = \mathrm{tr}(2H_{R,k} - H_{R,k}H'_{R,k})$
Residual EDF | $REDF = n - EP$
Theoretically and practically, RR is used to propose new methods for the choice of the biasing parameter k. The optimal value of k is one that balances the decrease in total variance against the increase in squared bias, i.e., one for which the MSE of the ridge estimator is smaller than that of the OLSE. We classify these estimation methods as either (i) subjective or (ii) objective. In all of these methods, the selection of k is data dependent. Suppose we have a set of n observations on a response and p regressors following the RR model described above; Table 3 lists some widely used objective methods for the selection of k.
Method | Reference
---|---
$C_k$ | Kennard (1971); Mallows (1973)
PRESS | Allen (1971, 1974)
CV | Delaney and Chatterjee (1986)
GCV | Golub et al. (1979)
ISRM | Vinod (1976)
m-scale | Vinod (1976)
Information criteria (AIC & BIC) | Akaike (1973); Schwarz (1978)
Effectiveness index (Eft) | Lee (1979)
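As one concrete instance, a hedged sketch of the GCV criterion, $GCV(k) = (RSS_k/n)/\{1 - \mathrm{tr}(H_{R,k})/n\}^2$, computed directly on standardized Hald data (plain standardization, for illustration only):

> X <- scale(as.matrix(Hald[, -1])); yc <- Hald[, 1] - mean(Hald[, 1]); n <- nrow(X)
> gcv <- sapply(c(0.01, 0.1, 0.5), function(k) {
+   H <- X %*% solve(crossprod(X) + k * diag(ncol(X)), t(X))  # ridge hat matrix
+   (sum((yc - H %*% yc)^2) / n) / (1 - sum(diag(H)) / n)^2
+ })
> gcv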
There are other methods to estimate the biasing parameter k; Table 4 lists further estimators of k proposed by various researchers.
Sr. # | Proposed by
---|---
1) | Hoerl and Kennard (1970b)
2) | Thisted (1976)
3) | Lawless and Wang (1976)
4) | Dwividi and Shrivastava (1978)
5) | Venables and Ripley (2002)
6) | Kibria (2003)
7) | Kibria (2003)
8) | Kibria (2003)
9) | Muniz and Kibria (2009)
10) | Muniz and Kibria (2009)
11) | Muniz and Kibria (2009)
12) | Muniz and Kibria (2009)
13) | Muniz and Kibria (2009)
14) | Muniz et al. (2012)
15) | Muniz et al. (2012)
16) | Muniz et al. (2012)
17) | Muniz et al. (2012)
18) | Muniz et al. (2012)
19) | Dorugade and Kashid (2010)
20) | Dorugade (2014)
Investigating the individual coefficients in a linear but biased regression model such as RR requires non-exact t-type tests. It has been established that $\hat{\beta}_R$ is approximately normally distributed with mean and Var-Cov matrix as given in Table 2. The statistic
$$t_{R_j} = \frac{\hat{\beta}_{R_j}}{SE(\hat{\beta}_{R_j})},$$
where $SE(\hat{\beta}_{R_j})$ is the square root of the $j$th diagonal element of the Var-Cov matrix of the ridge estimator, can therefore be used to test the significance of an individual ridge coefficient. Similarly, for testing the hypothesis that all ridge coefficients are jointly zero, an F-statistic based on the ridge DF and the residual EDF is used; both the t-values and the F-statistic are reported by the summary() method described below.
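These t-values can be reproduced by hand from the package output (a quick sketch using functions documented later in this paper):

> library("lmridge")
> mod <- lmridge(y ~ ., data = as.data.frame(Hald), K = 0.012)
> se <- sqrt(diag(vcov(mod)[["K=0.012"]]))  # SEs of the scaled coefficients
> mod$coef[, 1] / se                        # compare with the t-values (Sc) in summary(mod)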
Our R package lmridge contains functions related to the fitting of the RR model and provides a simple way of obtaining the estimates of the RR coefficients, testing of the ridge coefficients, and computation of different ridge related statistics, which prove helpful for the selection of an optimal biasing parameter k.
The lmridge objects support a set of standard methods such as print(), summary(), plot() and predict(). Therefore, inferences can be made easily using the summary() method, which reports the estimates of the regression coefficients, their standard errors, t-values and respective p-values. The main function lmridge() calls lmridgeEst() to perform the required computations and estimation for given values of the non-stochastic biasing parameter K. The usage of the main function is:

lmridge(formula, data, scaling = c("sc", "scaled", "centered"), K, ...)
The four arguments of lmridge() are described in Table 5:
Argument | Description
---|---
formula | Symbolic representation of the RR model, of the form response ~ predictors.
data | Contains the variables to be used in the RR model.
K | The biasing parameter; may be a scalar or a vector. If no value is supplied, K = 0 (the OLS solution) is used by default.
scaling | The method for scaling the predictors. The sc option (the default) scales the predictors to correlation form, as described in Belsley (1991) and Draper and Smith (1998); the scaled option standardizes the predictors to have zero mean and unit variance; and the centered option centers the predictors.
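To see the effect of the scaling argument (an illustrative sketch; note that the same K implies different amounts of effective shrinkage under different scalings, so the estimates need not agree):

> m1 <- lmridge(y ~ ., data = as.data.frame(Hald), scaling = "sc", K = 0.1)
> m2 <- lmridge(y ~ ., data = as.data.frame(Hald), scaling = "scaled", K = 0.1)
> coef(m1)  # de-scaled coefficients under correlation-form scaling
> coef(m2)  # de-scaled coefficients under zero-mean, unit-variance scaling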
The lmridge() function returns an object of class "lmridge". The functions summary(), kest() and rstats1(), etc., are used to compute and print a summary of the RR results, the list of biasing parameters given in Table 4, and ridge related statistics such as the estimated squared bias, variance and MSE, respectively. Table 6 lists the components of an object of class "lmridge".
Object | Description
---|---
coef | A named vector of fitted ridge coefficients.
xscale | The scales used to standardize the predictors.
xs | The scaled matrix of predictors.
y | The centered response variable.
Inter | Whether an intercept is included in the model or not.
K | The RR biasing parameter(s).
xm | A vector of means of the design matrix X.
rfit | Matrix of ridge fitted values for each biasing parameter K.
d | Singular values of the SVD of the scaled predictors.
div | Eigenvalues of the scaled regressors for each biasing parameter K.
scaling | The method of scaling used to standardize the predictors.
call | The matched call.
terms | The terms object used.
Z | A matrix, computed from the scaled predictors and each biasing parameter K, used in obtaining the ridge coefficients.
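A quick sketch of accessing some of these components (illustrative only):

> mod <- lmridge(y ~ ., data = as.data.frame(Hald), K = c(0.012, 0.1))
> mod$K         # biasing parameters used
> head(mod$xs)  # scaled predictor matrix
> mod$d         # singular values of the scaled predictors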
Table 7 lists the functions and methods available in lmridge package:
Functions | Description
---|---
Ridge coefficient estimation and testing |
lmridgeEst() | The main model fitting function for the implementation of RR models in R.
coef() | Displays de-scaled ridge coefficients.
lmridge() | Generic function and default method that calls lmridgeEst() and returns an object of S3 class "lmridge" with different sets of methods for standard generics. It has a print method for displaying the de-scaled ridge coefficients.
summary() | Standard RR output (coefficient estimates, scaled coefficient estimates, standard errors, t-values and p-values); has a print method.
Residuals, fitted values and prediction |
predict() | Produces predicted value(s) by evaluating lmridgeEst() in the frame newdata.
fitted() | Displays ridge fitted values for the observed data.
residuals() | Displays ridge residual values.
press() | Generic function that computes the prediction residual error sum of squares (PRESS) for ridge coefficients.
Methods to estimate k |
kest() | Displays various k values proposed by different researchers (see Table 4).
Ridge statistics |
vcov() | Displays the Var-Cov matrix associated with each ridge biasing parameter K.
hatr() | Generic function that displays the hat matrix from RR.
infocr() | Generic function that computes the information criteria AIC and BIC.
vif() | Generic function that computes VIF values.
rstats1() | Generic function that displays different statistics of RR such as MSE, squared bias and R² etc.; has a print method.
rstats2() | Generic function that displays different statistics of RR such as df, m-scale and ISRM etc.; has a print method.
Ridge plots |
plot() | Ridge trace and VIF trace plots against the biasing parameter K.
bias.plot() | Bias-variance tradeoff plot: plot of ridge MSE, squared bias and variance against K.
cv.plot() | Cross-validation plots of CV and GCV against the biasing parameter K.
info.plot() | Plot of AIC and BIC against ridge DF.
isrm.plot() | Plots of the ISRM and m-scale measures.
rplots.plot() | Miscellaneous ridge related plots, such as the df trace, RSS and PRESS plots.
The use of lmridge is explained below through examples using the Hald dataset.
> library("lmridge")
> mod <- lmridge(y ~ X1 + X2 + X3 + X4, data = as.data.frame(Hald),
+ scaling = "sc", K = seq(0, 1, 0.001))
The output of the linear RR from the lmridge() function is assigned to an object mod. The first argument of the function is formula, which specifies the required linear RR model for the data provided as the second argument. The print method for mod, an object of class "lmridge", displays the de-scaled coefficients. The output (de-scaled coefficients) from the above command is shown below for only a few selected values of the biasing parameter.
Call:
lmridge.default(formula = y ~ ., data = as.data.frame(Hald),
K = seq(0, 1, 0.001))
Intercept X1 X2 X3 X4
K=0.01 82.67556 1.31521 0.30612 -0.12902 -0.34294
K=0.05 85.83062 1.19172 0.28850 -0.21796 -0.35423
K=0.5 89.19604 0.78822 0.27096 -0.36391 -0.28064
K=0.9 90.22732 0.65351 0.24208 -0.34769 -0.24152
K=1 90.42083 0.62855 0.23540 -0.34119 -0.23358
To get the scaled ridge coefficients, mod$coef can be used:
> mod$coef
K=0.01 K=0.05 K=0.5 K=0.9 K=1
X1 26.800306 24.28399 16.061814 13.316802 12.808065
X2 16.500987 15.55166 14.606166 13.049400 12.689060
X3 -2.862655 -4.83610 -8.074509 -7.714626 -7.570415
X4 -19.884534 -20.53939 -16.272482 -14.004088 -13.543744
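The de-scaled slopes printed earlier can be recovered from these scaled coefficients (a quick check; for the default correlation-form scaling, the predictor scales are stored in mod$xscale):

> mod$coef[, "K=0.01"] / mod$xscale  # compare with the K=0.01 row of the de-scaled output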
Objects of class "lmridge" contain components such as rfit, K and coef, etc. For a fitted ridge model, the generic method summary() is used to investigate the ridge coefficients. The parameter estimates of the ridge model are summarized using a matrix of 5 columns, namely Estimate, Estimate (Sc), StdErr (Sc), t-value (Sc) and Pr(>|t|). The following results are shown only for K = 0.012:
> summary(mod)
Call:
lmridge.default(formula = y ~ ., data = as.data.frame(Hald), K = 0.012)
Coefficients: for Ridge parameter K= 0.012
Estimate Estimate (Sc) StdErr (Sc) t-value (Sc) Pr(>|t|)
Intercept 83.1906 -246.5951 269.2195 -0.916 0.3837
X1 1.3046 26.5843 3.8162 6.966 0.0001 ***
X2 0.3017 16.2649 4.6337 3.510 0.0067 ***
X3 -0.1378 -3.0585 3.7655 -0.812 0.4377
X4 -0.3470 -20.1188 4.7023 -4.279 0.0021 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Ridge Summary
R2 adj-R2 DF ridge F AIC BIC
0.96990 0.95980 3.04587 134.14893 23.24068 58.30578
Ridge minimum MSE= 390.5195 at K= 0.012
P-value for F-test ( 3.04587 , 9.779581 ) = 2.914733e-08
The summary() function also displays ridge related statistics, such as R², adjusted R², ridge DF, the F-statistic, AIC, BIC and the minimum ridge MSE, for each biasing parameter supplied in the call to lmridge().
The kest() function, which works with a fitted ridge model, computes different biasing parameters developed by researchers; see Table 4. The list of different k values (22 in number) may help in deciding the amount of bias that needs to be introduced in RR.
> kest(mod)
Ridge k from different Authors
k values
Thisted (1976): 0.00581
Dwividi & Srivastava (1978): 0.00291
LW (lm.ridge) 0.05183
LW (1976) 0.00797
HKB (1975) 0.01162
Kibria (2003) (AM) 0.28218
Minimum GCV at 0.01320
Minimum CV at 0.01320
Kibria 2003 (GM): 0.07733
Kibria 2003 (MED): 0.01718
Muniz et al. 2009 (KM2): 14.84574
Muniz et al. 2009 (KM3): 5.32606
Muniz et al. 2009 (KM4): 3.59606
Muniz et al. 2009 (KM5): 0.27808
Muniz et al. 2009 (KM6): 7.80532
Mansson et al. 2012 (KMN8): 14.98071
Mansson et al. 2012 (KMN9): 0.49624
Mansson et al. 2012 (KMN10): 6.63342
Mansson et al. 2012 (KMN11): 0.15075
Mansson et al. 2012 (KMN12): 8.06268
Dorugade et al. 2010: 0.00000
Dorugade et al. 2014: 0.00000
The rstats1() and rstats2() functions can be used to compute different statistics for the ridge biasing parameters specified in a call to lmridge(). These ridge statistics include the MSE, squared bias, R², adjusted R², CN, C_k, DF ridge, EP, REDF, Eft, ISRM, m-scale and PRESS. The following are the results from the rstats1() and rstats2() functions for some values of K (0, 0.012, 0.1 and 0.2):
> rstats1(mod)
Ridge Regression Statistics 1:
Variance Bias^2 MSE rsigma2 F R2 adj-R2 CN
K=0 3309.5049 0.0000 3309.5049 5.3182 125.4142 0.9824 0.9765 1376.8806
K=0.012 72.3245 318.1951 390.5195 4.9719 134.1489 0.9699 0.9598 164.9843
K=0.1 19.8579 428.4112 448.2692 5.8409 114.1900 0.8914 0.8552 22.9838
K=0.2 16.5720 476.8887 493.4606 7.6547 87.1322 0.8170 0.7560 12.0804
> rstats2(mod)
Ridge Regression Statistics 2:
CK DF ridge EP REDF EF ISRM m scale PRESS
K= 0 6.0000 4.0000 4.0000 9.0000 0.0000 3.9872 0.0000 110.3470
K= 0.012 4.8713 3.0459 3.2204 9.7796 10.1578 3.6181 0.9541 92.8977
K= 0.1 4.2246 2.5646 2.9046 10.0954 7.6829 2.8471 1.4354 121.2892
K= 0.2 3.8630 2.2960 2.7290 10.2710 6.9156 2.5742 1.7040 162.2832
The residuals, fitted values from the RR, and predicted values of the response variable can be computed using the functions residuals(), fitted() and predict(), respectively. To obtain the Var-Cov matrix, VIF values and hat matrix, the functions vcov(), vif() and hatr() can be used. The df are computed following Hastie and Tibshirani (1990). The results for the VIF, Var-Cov and diagonal elements of the hat matrix from the vif(), vcov() and hatr() functions are given below, with the hat and Var-Cov matrices shown for K = 0.012:
> hatr(mod)
> hatr(mod)[[2]]
> diag(hatr(mod)[[2]])
> diag(hatr(lmridge(y ~ ., as.data.frame(Hald), K = c(0, 0.012)))[[2]])
1 2 3 4 5 6 7 8 9 10 11
0.39680 0.21288 0.10286 0.16679 0.24914 0.04015 0.28424 0.30163 0.12502 0.58426 0.29625
12 13
0.12291 0.16294
> vif(mod)
X1 X2 X3 X4
k=0 38.49621 254.42317 46.86839 282.51286
k=0.012 2.92917 4.31848 2.85177 4.44723
k=0.1 1.28390 0.51576 1.20410 0.39603
k=0.2 0.78682 0.34530 0.75196 0.28085
> vcov(mod)
$`K=0.012`
X1 X2 X3 X4
X1 14.563539 1.668783 11.577483 4.130232
X2 1.668783 21.471027 3.066958 19.075274
X3 11.577483 3.066958 14.178720 4.598000
X4 4.130232 19.075274 4.598000 22.111196
The following are possible uses of some functions to compute different ridge related statistics. For a detailed description of these functions/commands, see the lmridge package documentation.
> mod$rfit
> resid(mod)
> fitted(mod)
> infocr(mod)
> press(mod)
For given values of X, such as the first five rows of the Hald regressors, the predicted values of the response for each biasing parameter can be obtained using predict():
> predict(mod, newdata = as.data.frame(Hald[1 : 5, -1]))
K=0 K=0.012 K=0.1 K=0.2
1 78.49535 78.52225 79.75110 80.73843
2 72.78893 73.13500 74.32678 75.38191
3 105.97107 106.39639 106.04958 105.62451
4 89.32720 89.48443 89.52343 89.65432
5 95.64939 95.73595 96.56710 96.99781
The model selection criteria AIC and BIC can be computed for each value of the biasing parameter used in the call to lmridge() by means of the infocr() function. The results for some values of K are:
> infocr(mod)
AIC BIC
K=0 24.94429 60.54843
K=0.012 23.24068 58.30578
K=0.1 24.78545 59.57865
K=0.2 27.98813 62.62961
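For instance, the value of K with the smallest AIC can be picked out programmatically (a sketch; the first column of the matrix returned by infocr() is assumed to be the AIC, as in the printed output above):

> ic <- infocr(mod)
> mod$K[which.min(ic[, 1])]  # K minimizing the AIC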
The effect of multicollinearity on the coefficient estimates can be identified by using different graphical displays, such as the ridge trace, the VIF trace, the df trace, and plots of RSS against df and PRESS against K. The ridge and VIF traces are drawn with the plot() function, whose main arguments are abline = TRUE and type = c("ridge", "vif"). By default, the ridge trace is plotted, with a horizontal line parallel to the horizontal axis and a vertical line marking a suggested optimal value of the biasing parameter K.
> mod <- lmridge(y ~ ., data = as.data.frame(Hald), K = seq(0, 0.5, 0.001))
> plot(mod)
> plot(mod, type = "vif", abline = FALSE)
> plot(mod, type = "ridge", abline = TRUE)
> bias.plot(mod, abline = TRUE)
> info.plot(mod, abline = TRUE)
> cv.plot(mod, abline = TRUE)
The vertical lines in the ridge trace and VIF trace plots suggest the optimal value of the biasing parameter K.
The bias-variance tradeoff plot (Figure 3), drawn with the bias.plot() function, may be used to select an optimal value of K. The vertical line in the bias-variance tradeoff plot marks the value of the biasing parameter K at which the ridge MSE is minimum.
To plot the model selection criteria AIC and BIC for choosing an optimal value of K, the info.plot() function may be used.
The function cv.plot() plots the CV and GCV cross-validation criteria against the biasing parameter K.
> isrm.plot(mod)
The m-scale and ISRM measures (Figure 6) by Vinod (1976) can also be plotted with the isrm.plot() function and can be used to judge the optimal value of K.
The function rplots.plot() plots a panel of three plots, namely (i) the df trace, (ii) RSS against K, and (iii) PRESS against K:
> rplots.plot(mod)
Our lmridge package provides the most complete suite of tools for RR available in R, comparable to those listed in Table 1. We have implemented functions to compute the ridge coefficients, to test these coefficients, to compute different ridge related statistics, and to compute the biasing parameter by the different existing methods of various authors (see Table 4).
We have also greatly expanded the set of ridge related statistics and graphical methods available in R for the selection of the biasing parameter K.
Up to now, a complete suite of tools for RR has not been available in either open source or commercial statistical software, resulting in reduced awareness and use of the ridge related statistics that have been developed. The lmridge package provides a complete open source suite of tools for the estimation of ridge coefficients, their testing, and the computation of different ridge statistics. We believe the availability of these tools will lead to increased utilization and better ridge related practices.