lmridge: A Comprehensive R Package for Ridge Regression

The ridge regression estimator, one of the commonly used alternatives to the conventional ordinary least squares estimator, avoids the adverse effects that arise when there is a considerable degree of multicollinearity among the regressors. Many software packages are available for estimating ridge regression coefficients. However, most of them offer only a limited set of methods for estimating the ridge biasing parameter and no testing procedures. Our package, lmridge, can be used to estimate ridge coefficients for a range of existing biasing parameters, to test these coefficients with more than 25 ridge related statistics, and to present different graphical displays of these statistics.


Introduction
For data collected either from a designed experiment or from an observational study, the ordinary least squares (OLS) method does not provide precise estimates of the effect of any explanatory variable (regressor) when regressors are interdependent (collinear with each other). Consider a multiple linear regression (MLR) model,

$$y = X\beta + \varepsilon, \qquad (1)$$

where $y$ is an $n \times 1$ vector of observations on the dependent variable, $X$ is a known design matrix of order $n \times p$, $\beta$ is a $p \times 1$ vector of unknown parameters and $\varepsilon$ is an $n \times 1$ vector of random errors with mean zero and variance $\sigma^2 I_n$, where $I_n$ is an identity matrix of order $n$.
The OLS estimator (OLSE) of $\beta$ is given by

$$\hat{\beta} = (X'X)^{-1}X'y, \qquad (2)$$

which depends on characteristics of the matrix $X'X$. If $X'X$ is ill-conditioned (near dependencies exist among various columns (regressors) of $X'X$) or $\det(X'X) \approx 0$, then the OLS estimates are sensitive to a number of errors, such as non-significant or imprecise regression coefficients (Kmenta, 1980) with wrong sign and a non-uniform eigenvalue spectrum. Moreover, the OLS method can yield high variances of estimates, large standard errors, and wide confidence intervals. The quality and stability of the fitted model may be questionable due to the erratic behaviour of the OLSE when regressors are collinear.
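The OLS estimator and the ill-conditioning it suffers from can be illustrated with a few lines of base R. The sketch below is only illustrative: it types in the Hald cement data (introduced later in the paper) so that it runs on its own, and the object names (`hald`, `Xd`, `beta_ols`, `det_R`) are chosen here and are not part of any package.

```r
# Hald cement data (13 mixtures, 4 ingredients), typed in so that the sketch
# is self-contained; the same data are bundled with lmridge and mctest.
hald <- data.frame(
  y  = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9,
         83.8, 113.3, 109.4),
  X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
  X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
  X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
  X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12)
)
X <- as.matrix(hald[, c("X1", "X2", "X3", "X4")])
y <- hald$y

# OLS via the normal equations, beta_hat = (X'X)^{-1} X'y, with an intercept.
Xd <- cbind(Intercept = 1, X)
beta_ols <- drop(solve(crossprod(Xd), crossprod(Xd, y)))

# Reference fit: lm() solves the same problem through a QR decomposition.
fit <- lm(y ~ X1 + X2 + X3 + X4, data = hald)

# Near-singularity shows up in the correlation matrix of the regressors:
# a determinant close to zero and a very small trailing eigenvalue.
R <- cor(X)
lambda <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
det_R <- det(R)

print(round(cbind(normal_eq = beta_ols, lm = coef(fit)), 4))
print(signif(c(det_R = det_R, min_eigenvalue = min(lambda)), 3))
```

In practice `lm()` is preferable to forming $(X'X)^{-1}$ explicitly; the normal-equation form is shown only because it mirrors Eq. (2).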
Researchers may be tempted to eliminate the regressor(s) causing the problem by consciously removing them from the model. However, this may destroy the usefulness of the model by removing relevant regressor(s). To control the variance and instability of the OLS estimates, one may regularize the coefficients with a regularization method such as ridge regression (RR), Liu regression, or Lasso regression, as an alternative to OLS. Computationally, RR suppresses the effects of collinearity and reduces the apparent magnitude of the correlation among regressors in order to obtain more stable estimates of the coefficients than the OLS estimates, and it also improves the accuracy of prediction (see Hoerl and Kennard, 1970a; Montgomery and Peck, 1982; Myers, 1986; Rawlings et al., 1998; Seber and Lee, 2003; Tripp, 1983, etc.).
There are only a few software programs and R packages capable of estimating and/or testing ridge coefficients. The design goal of our lmridge (Imdad and Aslam, 2018b) is primarily to provide functionality for all possible ridge related computations. The output of our package (lmridge) is consistent with the output of existing software/R packages. The package lmridge also provides the most complete suite of tools for ordinary RR, comparable to those listed in Table 1. For package development and R documentation, we followed Hadley (2015), Leisch (2008) and R Core Team (2015). The ridge package by Moritz and Cule (2017) and lm.ridge() from MASS (Venables and Ripley, 2002) also provided guidance in coding.
All available software and R packages mentioned in Table 1 are compared with our lmridge package. For multicollinearity detection, NCSS statistical software (NCSS 11 Statistical Software, 2016) computes VIF/TOL, $R^2$, eigenvalues, eigenvectors, incremental and cumulative percentages of eigenvalues, and CN. For RR, an ANOVA table, the coefficient of variation, a plot of residuals vs predicted values, and a histogram and density trace of residuals are also available in NCSS. In SAS (Inc., 2011), the collin option in the model statement is used to perform collinearity diagnostics, while as a remedy for multicollinearity, RR can be performed using the ridge option in the proc reg statement. The outVIF option results in

The R Journal Vol. 10/2, December 2018 ISSN 2073-4859

The penalized package allows an L1 absolute value ("LASSO") penalty, an L2 quadratic ("ridge") penalty, or a combination of the two. It is also possible to have a fused LASSO penalty with an L1 absolute value penalty on the coefficients and their differences. The penalized package also includes facilities for likelihood, cross-validation and optimization of the tuning parameter. The glmnet package (Friedman et al., 2010) has efficient procedures for fitting the entire LASSO or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression and the Cox model. glmnet can also be used to fit the RR model by setting the alpha argument to zero. The ridge package fits linear and logistic RR models, including functions for fitting linear and logistic RR models for genome-wide SNP data supplied as file names when the data are too big to read into R. The RR biasing parameter is chosen automatically using the method proposed by Cule and De Iorio (2012); however, the value of the biasing parameter can also be specified for estimation and testing of ridge coefficients. The function lm.ridge() from MASS only fits the linear RR model and returns the ridge biasing parameters given by Hoerl and Kennard (1970a) and Venables and Ripley (2002) and the vector of the GCV criterion given by Golub et al. (1979).
There are other software and R packages that can be used to perform RR analysis, such as S-PLUS (S-PLUS, 2008), Shazam (Shazam, 2011) and R packages such as RXshrink (Obenchain, 2014), rrBLUP (Endelman, 2011), RidgeFusion (Price, 2014), bigRR (Shen et al., 2013), lpridge (Seifert, 2013), genridge (Friendly, 2017) and CoxRidge (Perperoglou, 2015). This paper outlines the collinearity detection methods available in the existing literature and uses the mctest (Imdad and Aslam, 2018a) package through an illustrative example. To overcome the effect of collinearity on regressors, a thorough introduction to ridge regression, properties of the ridge estimator, different methods for selecting values of k, and testing of the ridge coefficients are presented. Finally, estimation of the ridge coefficients, methods of selecting a ridge biasing parameter, testing of the ridge coefficients, and different ridge related statistics are implemented in R within the lmridge package.

Collinearity detection
Diagnosing collinearity is important to many researchers. It consists of two related but separate elements: (1) detecting the existence of a collinear relationship among regressors and (2) assessing the extent to which this relationship has degraded the parameter estimates. Many diagnostic measures for the detection of collinearity are available in the existing literature (Belsley et al., 1980; Curto and Pinto, 2011; Farrar and Glauber, 1967; Fox and Weisberg, 2011; Gunst and Mason, 1977; Imdadullah et al., 2016; Klein, 1962; Koutsoyiannis, 1977; Kovács et al., 2005; Marquardt, 1970; Theil, 1971). These diagnostic methods assist in determining whether and where some corrective action is necessary (Belsley et al., 1980). The most widely used and most often suggested diagnostics are the values of pair-wise correlations, variance inflation factor (VIF)/tolerance (TOL) (Marquardt, 1970), eigenvalues and eigenvectors (Kendall, 1957), CN & CI (Belsley et al., 1980; Chatterjee and Hadi, 2006; Maddala, 1988), Leamer's method (Greene, 2002), Klein's rule (Klein, 1962), the tests proposed by Farrar and Glauber (Farrar and Glauber, 1967), the Red indicator (Kovács et al., 2005), corrected VIF (Curto and Pinto, 2011) and Theil's measures (Theil, 1971) (see also Imdadullah et al. (2016)). All of these diagnostic measures are implemented in the R package mctest. Below, we use the Hald dataset (Hald, 1952) for testing collinearity among regressors. We then use the lmridge package to compute the ridge coefficients and different ridge related statistics, and to apply methods for selecting the ridge biasing parameter. For the optimal choice of the ridge biasing parameter, graphical representations of the ridge coefficients, VIF values, cross validation criteria (CV & GCV), ridge DF, RSS, PRESS, ISRM and m-scale versus the ridge biasing parameter used are considered. In addition, a graphical representation of the model selection criteria (AIC & BIC) of ridge regression versus ridge DF is produced. The Hald data concern the heat generated during the setting of 13 cement mixtures of 4 basic ingredients and were used by Hoerl et al. (1975). Each ingredient percentage appears to be rounded down to a full integer. The data set is already bundled in the mctest and lmridge packages.

Collinearity detection: Illustrative example
The results from all overall collinearity diagnostic measures indicate the existence of collinearity among regressor(s). These results do not tell which regressor(s) are the reasons for collinearity. The individual collinearity diagnostic measures can be obtained through the imcdiag() function, whose output ends with the following lines:

X1, X2, X3, X4, coefficient(s) are non-significant may be due to multicollinearity

R-square of y on all x: 0.9824

* use method argument to check which regressors may be the reason of collinearity
Results from most of the individual collinearity diagnostics suggest that all of the regressors are involved in the collinearity. The last line of the imcdiag() output suggests that the method argument should be used to check which regressors may be the reason for collinearity among the regressors. For further information about the method argument, see the help file of the imcdiag() function.
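Some of the individual diagnostics named above (VIF, TOL, eigenvalues, CN and CI) follow directly from the correlation matrix of the regressors. The base R sketch below computes them for the Hald data without calling mctest; it is an illustration of the definitions, not a reproduction of the mctest output, and the variable names are chosen here. CN is taken as $\lambda_1/\lambda_p$; some authors and software report its square root instead.

```r
# Hald cement data, typed in so that the sketch is self-contained.
hald <- data.frame(
  y  = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9,
         83.8, 113.3, 109.4),
  X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
  X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
  X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
  X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12)
)
X <- as.matrix(hald[, c("X1", "X2", "X3", "X4")])
R <- cor(X)

# VIF_j is the j-th diagonal element of the inverse correlation matrix,
# equivalently 1 / (1 - R_j^2); TOL_j is its reciprocal.
vif <- diag(solve(R))
tol <- 1 / vif

# Eigenvalue-based measures.
lambda <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
cn <- max(lambda) / min(lambda)    # condition number, lambda_1 / lambda_p
ci <- sqrt(max(lambda) / lambda)   # condition indices

# Overall fit of y on all regressors.
r2 <- summary(lm(y ~ X1 + X2 + X3 + X4, data = hald))$r.squared

print(round(rbind(VIF = vif, TOL = tol), 4))
print(round(c(CN = cn, R2 = r2), 4))
```

A common rule of thumb treats VIF values above 10 as a sign of harmful collinearity; every regressor in this data set exceeds that threshold.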

Ridge regression analysis
In their seminal work, Hoerl (1959, 1962, 1964) and Hoerl and Kennard (1970b,a) developed the ridge analysis technique, which accommodates the departure of the data from orthogonality. Hoerl (1962) introduced the RR, based on the James-Stein estimator, by stating that the existence of correlation among regressors can cause errors in estimating regression coefficients when applying the OLS method. The RR is similar to the OLS method; however, it shrinks the coefficients towards zero by minimizing the MSE of the estimates, making the RR technique better than the OLSE with respect to MSE when regressors are collinear with each other. A penalty (degree of bias) is imposed on the size of the coefficients in the RR to reduce their variances. However, the expected values of these estimates are not equal to the true values and tend to underestimate the true parameter. Although the ridge estimators are biased, they have lower MSE (more precision) than the OLSEs and are less sensitive to sampling fluctuations, to model misspecification when the number of regressors exceeds the number of observations in a data set (i.e., p > n), and to omitted variables specification bias (Theil, 1957). In summary, the RR procedure is intended to overcome the ill-conditioned situation and is used to improve the estimation of regression coefficients when regressors are correlated; it also improves the accuracy of prediction (Seber and Lee, 2003). Obtaining the ridge model coefficients ($\hat{\beta}_R$) is relatively straightforward, because the ridge coefficients are obtained by solving a slightly modified form of the OLS method.
The design matrix $X$ in Eq. (1) can be standardized, scaled or centered. Usually, standardization of the $X$ matrix is done as described by Belsley et al. (1980) and Draper and Smith (1998), that is,

$$\tilde{X}_j = \frac{X_j - \bar{X}_j}{\sqrt{\sum_{i=1}^{n} (x_{ij} - \bar{X}_j)^2}},$$

where $j = 1, 2, \cdots, p$, such that the mean of $\tilde{X}_j$ is 0 and $\tilde{X}_j'\tilde{X}_j = 1$, where $\tilde{X}_j$ is the $j$th column of the standardized matrix. In this way, the new design matrix (say $\tilde{X}$) contains the $p$ standardized columns and the matrix $\tilde{X}'\tilde{X}$ is the correlation matrix of the regressors. To avoid complexity of different notations and terms, the centered and scaled design matrix $\tilde{X}$ will be represented by $X$ and the centered response variable by $y$.
The ridge model coefficients are estimated as

$$\hat{\beta}_{R_k} = (X'X + kI_p)^{-1}X'y, \qquad (3)$$

where $\hat{\beta}_{R_k}$ is the vector of standardized RR coefficients of order $p \times 1$ and $kI_p$ is a positive semi-definite matrix added to the $X'X$ matrix. Note that for $k = 0$, $\hat{\beta}_{R_k} = \hat{\beta}_{ols}$. The addition of the constant term $k$ to the diagonal elements of $X'X$ (in other words, the addition of $kI_p$ to $X'X$) in Eq. (3) is known as the penalty, and $k$ is called the biasing or shrinkage parameter. For $k > 0$, the addition of this biasing parameter guarantees the invertibility of $X'X + kI_p$, so that a unique solution $\hat{\beta}_{R_k}$ always exists (Draper and Smith, 1998; Hoerl and Kennard, 1970a; McCallum, 1970), and the condition number (CN) of $X'X + kI_p$, $CN_k = \frac{\lambda_1 + k}{\lambda_p + k}$, also becomes smaller than that of $X'X$, where $\lambda_1$ is the largest and $\lambda_p$ is the smallest eigenvalue of the correlation matrix $X'X$. Therefore, the ridge estimator (RE) is an improvement over the OLSE for collinear data.
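The standardization and the ridge estimator of Eq. (3) can be sketched in base R as follows. This is a minimal illustration on the Hald data, not the lmridge implementation; the helper names `ridge_coef()` and `cn_k()` are hypothetical and chosen only for this example.

```r
# Hald cement data, typed in so that the sketch is self-contained.
hald <- data.frame(
  y  = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9,
         83.8, 113.3, 109.4),
  X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
  X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
  X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
  X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12)
)
X <- as.matrix(hald[, c("X1", "X2", "X3", "X4")])
y <- hald$y

# Correlation-form scaling: center each column and scale it to unit length,
# so that X'X becomes the correlation matrix; y is only centered.
Xc <- sweep(X, 2, colMeans(X))
Xs <- sweep(Xc, 2, sqrt(colSums(Xc^2)), "/")
yc <- y - mean(y)
p  <- ncol(Xs)

# Ridge estimator (X'X + k I_p)^{-1} X'y on the standardized scale.
ridge_coef <- function(k) {
  drop(solve(crossprod(Xs) + k * diag(p), crossprod(Xs, yc)))
}

# Condition number of X'X + k I_p: (lambda_1 + k) / (lambda_p + k).
lambda <- eigen(crossprod(Xs), symmetric = TRUE, only.values = TRUE)$values
cn_k <- function(k) (max(lambda) + k) / (min(lambda) + k)

ks <- c(0, 0.012, 0.1, 0.2)
print(round(sapply(ks, ridge_coef), 4))   # one column per k
print(round(sapply(ks, cn_k), 2))
```

Setting `k = 0` reproduces the OLS fit on the standardized data, and both the length of the coefficient vector and the condition number shrink as `k` grows.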
It is desirable to select the smallest value of k for which stabilized regression coefficients occur. There always exists a particular value of k for which the total MSE of the REs is less than the MSE of the OLSE; however, the optimum value of k (the one that produces the minimum MSE compared with other values of k) varies from one application to another, and hence the optimal value of k is unknown. Any estimator that has a small amount of bias, less variance, and is substantially more precise than an unbiased estimator may be preferred, since it will have a larger probability of being close to the true parameter being estimated. Therefore, the criterion of goodness of estimation considered in the RR is the minimum total MSE.
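The existence claim can be made concrete with the standard decomposition of the total MSE due to Hoerl and Kennard (1970a). The display below is a sketch in the notation of this section; the symbols $G$ (the matrix of normalized eigenvectors of $X'X$) and $\alpha = G'\beta$ are introduced here for illustration, and all eigenvalues $\lambda_j$ are assumed to be strictly positive.

```latex
\begin{align*}
\mathrm{MSE}(\hat{\beta}_{R_k})
  &= \underbrace{\sigma^2 \sum_{j=1}^{p} \frac{\lambda_j}{(\lambda_j + k)^2}}_{\text{total variance}}
   + \underbrace{k^2 \sum_{j=1}^{p} \frac{\alpha_j^2}{(\lambda_j + k)^2}}_{\text{squared bias}}, \\[4pt]
\frac{d}{dk}\,\mathrm{MSE}(\hat{\beta}_{R_k})\Big|_{k=0}
  &= -2\sigma^2 \sum_{j=1}^{p} \frac{1}{\lambda_j^{2}} \;<\; 0 .
\end{align*}
```

The variance term decreases in $k$ while the squared bias increases from zero, and at $k = 0$ the derivative of the bias term vanishes. The total MSE therefore initially falls as $k$ moves away from zero, which is why some $k > 0$ improves on the OLSE. The gain is largest when a small $\lambda_j$ inflates the variance term.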

Properties of the ridge estimator
Let $X_j$ denote the $j$th column of $X$ ($j = 1, 2, \cdots, p$), where $X_j = (x_{1j}, x_{2j}, \cdots, x_{nj})'$. As already discussed, assume that the regressors are centered and scaled such that $\sum_{i=1}^{n} x_{ij} = 0$ and $\sum_{i=1}^{n} x_{ij}^2 = 1$, and that the response variable $y$ is centered.
The RR is the most popular among biased methods because of its relationship to the OLS method and because the statistical properties of the RE are well defined. Most of the RR properties have been discussed, proved and extended by many researchers, such as Allen (1974); Hemmerle (1975); Hoerl and Kennard (1970b,a); Marquardt (1970); McDonald and Galarneau (1975); Newhouse and Oman (1971). Table 2 lists the RR properties.
Both in theory and in practice, work on the RR has focused on proposing new methods for the choice of the biasing parameter k and on investigating the properties of the RE, since the biasing parameter plays a key role and the optimal choice of k is the main issue in this context. In the literature, there are many methods for estimating the biasing parameter k (see Allen, 1974; Guilkey and Murphy, 1975; Hemmerle, 1975; Hoerl and Kennard, 1970b,a; McDonald and Galarneau, 1975; Obenchain, 1977; Hocking et al., 1976; Lawless and Wang, 1976; Vinod, 1976; Kasarda and Shih, 1977; Hemmerle and Brantle, 1978; Wichern and Churchill, 1978; Nordberg, 1982; Saleh and Kibria, 1993; Singh and Tracy, 1999; Wencheko, 2000; Kibria, 2003; Khalaf and Shukur, 2005; Alkhamisi et al., 2006; Alkhamisi and Shukur, 2007; Khalaf, 2013, among many more); however, there is no consensus about which method is preferable (Chatterjee and Hadi, 2006). Similarly, no method of estimating the biasing parameter is guaranteed to give a better k, or even an MSE smaller than that of the OLS.

Methods of selecting values of k
The optimal value of k is the one that gives the minimum MSE. There is one optimal k for any problem, while a wide range of k ($0 < k < k_{opt}$) gives a smaller MSE than that of the OLS. For collinear data, a small change in k varies the RR coefficients rapidly. At some values of k, the ridge coefficients stabilize and the rate of change slows down gradually to almost zero. Therefore, a disciplined way of selecting the shrinkage parameter that minimizes the MSE is required. The biasing parameter k depends on the true regression coefficients ($\beta$) and the variance of the residuals $\sigma^2$; unfortunately, these are unknown, but they can be estimated from the sample data.

Rows of Table 2 recoverable at this point:
    Var-Cov matrix
    Distance between $\hat{\beta}_R$ and $\beta$: $\hat{\beta}_R$ and the true vector of $\beta$ have minimum distance
9)  Inflated RSS
11) Sampling fluctuations: $\hat{\beta}_R$ is less sensitive to sampling fluctuation
12) Accurate prediction
    An optimal k always exists that gives minimum MSE
15) DF Ridge

We classify these estimation methods as (i) subjective or (ii) objective.

Subjective methods
In all these methods, the selection of k is subjective or of a judgmental nature; the methods provide graphical evidence of the effect of collinearity on the regression coefficient estimates and also account for the variation of the RE compared with the OLSE. In these methods, a reasonable choice of k is made using the ridge trace, df trace, VIF trace and plots of bias, variance, and MSE. The ridge trace is a graphical representation of the regression coefficients $\hat{\beta}_R$ as a function of k over the interval [0, 1]. The df trace and VIF trace are like the ridge trace plot, with EDF and VIF values plotted against k. Similarly, plotting the bias, variance, and MSE of the RE may also be helpful in selecting an appropriate value of k. All these graphs can be used to select an optimal (but judgmental) value of k from the horizontal axis and to assess the effect of collinearity on each of the coefficients. The effect of collinearity is suppressed as the value of k increases, and the values of the ridge coefficients, EDF and VIF decrease and/or may stabilize after a certain value of k. These graphical representations do not provide a unique solution; rather, they render a vaguely defined class of acceptable solutions. However, these traces are still useful graphical representations to check for some optimal k.
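The three traces described above can be computed directly from the standardized data. The sketch below is a base R illustration on the Hald data and does not use the plotting functions of lmridge. The helper names are hypothetical, the ridge VIF is taken as the diagonal of $(X'X + kI_p)^{-1}X'X(X'X + kI_p)^{-1}$, and the EDF as the trace of the ridge hat matrix, $\sum_j \lambda_j/(\lambda_j + k)$; both definitions are assumptions of this sketch.

```r
# Hald cement data, typed in so that the sketch is self-contained.
hald <- data.frame(
  y  = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9,
         83.8, 113.3, 109.4),
  X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
  X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
  X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
  X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12)
)
X <- as.matrix(hald[, c("X1", "X2", "X3", "X4")])
y <- hald$y

# Correlation-form scaling of X and centering of y.
Xc <- sweep(X, 2, colMeans(X))
Xs <- sweep(Xc, 2, sqrt(colSums(Xc^2)), "/")
yc <- y - mean(y)
R  <- crossprod(Xs)
p  <- ncol(Xs)
lambda <- eigen(R, symmetric = TRUE, only.values = TRUE)$values

ridge_coef <- function(k) drop(solve(R + k * diag(p), crossprod(Xs, yc)))
ridge_vif  <- function(k) { A <- solve(R + k * diag(p)); diag(A %*% R %*% A) }
ridge_edf  <- function(k) sum(lambda / (lambda + k))

# Evaluate each quantity on a grid of k over [0, 1]; one row per k.
k_grid     <- seq(0, 1, by = 0.01)
coef_trace <- t(sapply(k_grid, ridge_coef))
vif_trace  <- t(sapply(k_grid, ridge_vif))
edf_trace  <- sapply(k_grid, ridge_edf)

# Draw the traces only in an interactive session.
if (interactive()) {
  op <- par(mfrow = c(1, 3))
  matplot(k_grid, coef_trace, type = "l", lty = 1, xlab = "k",
          ylab = "standardized coefficient", main = "Ridge trace")
  abline(h = 0, lty = 3)
  matplot(k_grid, vif_trace, type = "l", lty = 1, log = "y", xlab = "k",
          ylab = "VIF (log scale)", main = "VIF trace")
  plot(k_grid, edf_trace, type = "l", xlab = "k", ylab = "EDF",
       main = "df trace")
  par(op)
}
```

Reading the plots from left to right along the k axis, a candidate k is one beyond which the curves have flattened.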

Objective methods
Suppose we have a set of observations $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$ and the RR model as given in Eq. (3). Objective methods are, to some extent, similar to judgmental methods for the selection of the biasing parameter k, but they require some calculations to obtain these biasing parameters. Table 3 lists widely used methods to estimate the biasing parameter k that are available in the existing literature. Table 3 also lists other statistics that can be used for the selection of the biasing parameter k.

method    formula    reference
                     Kennard (1971); Mallows (1973)
                     Allen (1971, 1974)
CV                   Delaney and Chatterjee (1986)
                     Lee (1979)
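Two of the objective rules named in this paper, the Hoerl and Kennard (1970a) and Thisted (1976) estimators, together with the GCV criterion of Golub et al. (1979), can be sketched in base R as follows. This is an illustration under stated assumptions rather than the lmridge implementation: $\sigma^2$ is estimated from the OLS residuals with $n - p - 1$ degrees of freedom, $\hat{\beta}$ is the OLS estimate on the standardized scale, and GCV is taken as $n \cdot RSS_k / (n - \mathrm{trace}(H_k))^2$. Packages differ in these conventions, so numerical values need not match their output.

```r
# Hald cement data, typed in so that the sketch is self-contained.
hald <- data.frame(
  y  = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9,
         83.8, 113.3, 109.4),
  X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
  X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
  X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
  X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12)
)
X <- as.matrix(hald[, c("X1", "X2", "X3", "X4")])
y <- hald$y

# Correlation-form scaling of X and centering of y.
Xc <- sweep(X, 2, colMeans(X))
Xs <- sweep(Xc, 2, sqrt(colSums(Xc^2)), "/")
yc <- y - mean(y)
n  <- nrow(Xs)
p  <- ncol(Xs)
R  <- crossprod(Xs)
lambda <- eigen(R, symmetric = TRUE, only.values = TRUE)$values

ridge_coef <- function(k) drop(solve(R + k * diag(p), crossprod(Xs, yc)))
rss <- function(k) sum((yc - Xs %*% ridge_coef(k))^2)
edf <- function(k) sum(lambda / (lambda + k))   # trace of the ridge hat matrix

# Plug-in quantities from the OLS fit (k = 0).
# Assumption: n - p - 1 residual d.f., the extra 1 for the centering of y.
beta_ols <- ridge_coef(0)
sigma2   <- rss(0) / (n - p - 1)

k_hkb <- p * sigma2 / sum(beta_ols^2)         # Hoerl and Kennard (1970a)
k_th  <- (p - 2) * sigma2 / sum(beta_ols^2)   # Thisted (1976)

# GCV evaluated on a grid; the selected k is the grid minimizer.
gcv      <- function(k) n * rss(k) / (n - edf(k))^2
k_grid   <- seq(0, 1, by = 0.001)
gcv_vals <- sapply(k_grid, gcv)
k_gcv    <- k_grid[which.min(gcv_vals)]

print(signif(c(HKB = k_hkb, Thisted = k_th, GCV = k_gcv), 3))
```

The closed-form rules need only one OLS fit, whereas GCV requires a search over k; the grid resolution therefore limits the precision of `k_gcv`.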

Testing of the ridge coefficients
For investigating the individual coefficients in a linear but biased regression model such as ridge, exact and non-exact t type tests and the F test can be used. The exact t-statistic derived by Obenchain (1977), based on the RR for the matrix $G$ whose columns are the normalized eigenvectors of $X'X$, is

$$t^{*}_j = \frac{\hat{\beta}_{R_j} - \beta_j}{\sqrt{\widehat{var}(\hat{\beta}_{R_j} - \beta_j)}}, \qquad (4)$$

where $j = 1, 2, \cdots, p$, $\widehat{var}(\hat{\beta}_{R_j} - \beta_j)$ is an unbiased estimator of the variance of the numerator in Eq. (4), and …, where $g_i$ is the $i$th row of $G$, $\Delta$ is the $(p \times p)$ diagonal matrix with $i$th diagonal element given by $\delta_i = \frac{\lambda_i}{\lambda_i + k}$, and $e_i$ is the $i$th row of the identity matrix. It has been established that $\hat{\beta}_R \sim N(ZX\beta, \phi = Z\Omega Z')$, where $Z = (X'X + kI_p)^{-1}X'$. Therefore, for the $j$th ridge coefficient, $\hat{\beta}_{R_j} \sim N(Z_j X\beta, \phi_{jj} = Z_j \Omega Z_j')$ (see Aslam, 2014; Halawa and El-Bassiouni, 2000). To tackle the problem of testing $H_0: \beta_j = 0$, Halawa and El-Bassiouni (2000) considered a non-exact t type test of the form

$$t_{R_j} = \frac{\hat{\beta}_{R_j}}{\sqrt{S^2(\hat{\beta}_{R_j})}},$$

where $\hat{\beta}_{R_j}$ is the $j$th element of the RE and $S^2(\hat{\beta}_{R_j})$ is an estimate of the variance of $\hat{\beta}_{R_j}$ given by the $j$th diagonal element of the matrix $\sigma^2 (X'X + kI_p)^{-1} X'X (X'X + kI_p)^{-1}$.
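The non-exact t type test of Halawa and El-Bassiouni (2000) described above can be sketched in base R. The function `ridge_ttest()` is a hypothetical helper written for this illustration and is not the lmridge code. As assumptions of this sketch, $\sigma^2$ is estimated from the ridge residuals and the reference distribution uses `n - p - 1` degrees of freedom, where `p` counts the regressors without the intercept, so that `k = 0` reproduces the usual OLS t tests.

```r
# Hald cement data, typed in so that the sketch is self-contained.
hald <- data.frame(
  y  = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9,
         83.8, 113.3, 109.4),
  X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
  X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
  X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
  X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12)
)
X <- as.matrix(hald[, c("X1", "X2", "X3", "X4")])
y <- hald$y

# Correlation-form scaling of X and centering of y.
Xc <- sweep(X, 2, colMeans(X))
Xs <- sweep(Xc, 2, sqrt(colSums(Xc^2)), "/")
yc <- y - mean(y)
n  <- nrow(Xs)
p  <- ncol(Xs)
R  <- crossprod(Xs)

ridge_ttest <- function(k, df_resid = n - p - 1) {
  A <- solve(R + k * diag(p))                 # (X'X + k I_p)^{-1}
  b <- drop(A %*% crossprod(Xs, yc))          # ridge coefficients
  sigma2 <- sum((yc - Xs %*% b)^2) / df_resid # residual variance estimate
  # S^2(b_j): j-th diagonal element of sigma^2 A X'X A.
  se <- sqrt(diag(sigma2 * A %*% R %*% A))
  t_val <- b / se
  data.frame(estimate = b, se = se, t = t_val,
             p = 2 * pt(abs(t_val), df = df_resid, lower.tail = FALSE))
}

print(round(ridge_ttest(0), 4))       # coincides with the OLS t tests
print(round(ridge_ttest(0.012), 4))   # a small amount of shrinkage
```

Other choices of degrees of freedom, such as $n - \mathrm{trace}(H_{R_k})$, can be tried through the `df_resid` argument.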
The statistic $t_{R_j}$ is assumed to follow a Student's t distribution with $(n - p)$ d.f. (Halawa and El-Bassiouni, 2000). Hastie and Tibshirani (1990) and Cule and De Iorio (2012) suggested using $[n - \mathrm{trace}(H_{R_k})]$ d.f. For large sample sizes, the asymptotic distribution of this statistic is normal (Halawa and El-Bassiouni, 2000). Thus, $H_0$ is rejected when $|T| > Z_{1 - \alpha/2}$. Similarly, for testing the hypothesis $H_0: \beta = \beta_0$, where $\beta_0$ is a vector of fixed values, the F statistic for significance testing of the ORR estimator $\hat{\beta}_R$, with $E(\hat{\beta}_R) = ZX\beta$ and an estimate of $Cov(\hat{\beta}_R)$, is distributed as $F(DF_{ridge}, REDF)$.

Sr. #   Formula                                                  Reference
1)      $K_{HKB} = p\hat{\sigma}^2 / (\hat{\beta}'\hat{\beta})$        Hoerl and Kennard (1970a)
2)      $K_{TH} = (p - 2)\hat{\sigma}^2 / (\hat{\beta}'\hat{\beta})$   Thisted (1976)
3)                                                               Lawless and Wang (1976)
4)                                                               Dwividi and Shrivastava (1978)
                                                                 Venables and Ripley (2002)
                                                                 Muniz and Kibria (2009)
10)                                                              Muniz and Kibria (2009)
                                                                 Muniz and Kibria (2009)
                                                                 Muniz and Kibria (2009)
                                                                 Muniz and Kibria (2009)
14)

The R package lmridge
Our R package lmridge contains functions for fitting the RR model and provides a simple way of obtaining the estimates of the RR coefficients, testing the ridge coefficients, and computing different ridge related statistics, which prove helpful for the selection of the optimal biasing parameter k. The package computes different ridge related measures available for the selection of the biasing parameter k, and also computes the values of different biasing parameters proposed by researchers in the literature.
The lmridge objects come with a set of standard methods such as print(), summary(), plot() and predict(). Therefore, inferences can be made easily using the summary() method for assessing the estimates of the regression coefficients, their standard errors, t values and their respective p values. The default function lmridge() calls lmridgeEst() to perform the required computations and estimation for given values of the non-stochastic biasing parameter k. The syntax of the default function is

lmridge(formula, data, scaling = c("sc", "scaled", "centered"), K, ...)

The four arguments of lmridge() are described in Table 5:

Argument    Description
formula     Symbolic representation for the RR model of the form response ~ predictors.
data        Contains the variables that have to be used in the RR model.
K           The biasing parameter, may be a scalar or vector. If a K value is not provided, K = 0 will be used as the default value, i.e., the OLS results will be produced.
scaling     The methods for scaling the predictors. The sc option uses the default scaling of the predictors in correlation form as described in (Belsley, 1991; Draper and Smith, 1998); the scaled option standardizes the predictors to have zero mean and unit variance; and the centered option centers the predictors.
The lmridge() function returns an object of class "lmridge". The functions summary(), kest(), rstats1(), etc., are used to compute and print a summary of the RR results, the list of biasing parameters given in Table 4, and ridge related statistics such as the estimated squared bias, $R^2$ and variance, after the addition of k to the diagonal of the $X'X$ matrix. An object of class "lmridge" is a list, the components of which are described in Table 6. Table 7 lists the functions and methods available in the lmridge package.

The lmridge package implementation in R
The use of lmridge is explained through examples by using the Hald dataset.
> library("lmridge")
> mod <- lmridge(y ~ X1 + X2 + X3 + X4, data = as.data.frame(Hald),
+                scaling = "sc", K = seq(0, 1, 0.001))

The output of the linear RR from the lmridge() function is assigned to the object mod. The first argument of the function is formula, which is used to specify the required linear RR model for the data provided as the second argument. The print method for mod, an object of class "lmridge", will display the de-scaled coefficients. The output (de-scaled coefficients) from the above command is shown only for a few selected biasing parameter values.
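The idea of de-scaling mentioned above, that is, mapping standardized ridge coefficients back to the original units of the regressors, can be sketched in base R. The helper `descale()` is hypothetical and written for this illustration; whether lmridge performs exactly these steps internally is an assumption. The sketch uses correlation-form scaling (the "sc" option described in this paper) and recovers the intercept from the column means.

```r
# Hald cement data, typed in so that the sketch is self-contained.
hald <- data.frame(
  y  = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9,
         83.8, 113.3, 109.4),
  X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
  X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
  X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
  X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12)
)
X <- as.matrix(hald[, c("X1", "X2", "X3", "X4")])
y <- hald$y

# Correlation-form scaling: keep the scale factors for de-scaling later.
Xc <- sweep(X, 2, colMeans(X))
scale_len <- sqrt(colSums(Xc^2))
Xs <- sweep(Xc, 2, scale_len, "/")
yc <- y - mean(y)
p  <- ncol(Xs)

# Standardized ridge coefficients for a given k.
ridge_coef <- function(k) {
  drop(solve(crossprod(Xs) + k * diag(p), crossprod(Xs, yc)))
}

# Back-transform: divide each slope by its scale factor, then choose the
# intercept so that the fitted plane passes through the point of means.
descale <- function(b_std) {
  slopes <- b_std / scale_len
  c("(Intercept)" = mean(y) - sum(colMeans(X) * slopes), slopes)
}

print(round(sapply(c(0, 0.012, 0.1, 0.2),
                   function(k) descale(ridge_coef(k))), 4))
```

With `k = 0` the de-scaled coefficients agree with those from `lm()` on the raw data, which is a convenient sanity check for any scaling convention.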

Functions        Description

Ridge coefficient estimation and testing
lmridgeEst()     The main model fitting function for implementation of RR models in R.
lmridge()        Generic function and default method that calls lmridgeEst() and returns an object of S3 class "lmridge" with a different set of methods for standard generics. It has a print method for display of the ridge de-scaled coefficients.
summary()        Standard RR output (coefficient estimates, scaled coefficient estimates, standard errors, t values and p values); returns an object of class "summaryridge" containing the relative summary statistics and has a print method.

Residuals, fitted values and prediction
predict()

isrm.plot()      Plots ISRM and m-scale measure.
rplots.plot()    Miscellaneous ridge related plots such as df-trace, RSS and PRESS plots.

The rstats1() and rstats2() functions can be used to compute different statistics for a given ridge biasing parameter specified in a call to lmridge. The ridge statistics are MSE, squared bias, F statistics, ridge variance, degrees of freedom by Hastie and Tibshirani (1990), condition numbers, PRESS, $R^2$, ISRM, etc. Following are the results from the rstats1() and rstats2() functions for K = 0, 0.012, 0.1, 0.2. The residuals, fitted values from the RR and predicted values of the response variable y can be computed using the functions resid(), fitted() and predict(), respectively. To obtain the Var-Cov matrix, VIF and hat matrix, the functions vcov(), vif() and hatr() can be used. The df are computed following Hastie and Tibshirani (1990). The results for VIF, Var-Cov and the diagonal elements of the hat matrix from the vif(), vcov() and hatr() functions are given below for K = 0.012.
> mod$rfit
> resid(mod)
> fitted(mod)
> infocr(mod)
> press(mod)

For given values of X, such as the first five rows of the X matrix, the predicted values for K = 0, 0.012, 0.1, and 0.2 are computed by predict():

> predict(mod, newdata = as.data.frame(Hald[1 : 5,

The effect of multicollinearity on the coefficient estimates can be identified using different graphical displays such as the ridge, VIF and df traces, plots of RSS against df and PRESS vs k, and plots of bias, variance, and MSE against K. Therefore, for the selection of the optimal k using subjective (judgmental) methods, different plot functions are also available in the lmridge package. For example, the ridge trace (Figure 1) or VIF trace (Figure 2) can be plotted using the plot() function. The arguments to the plot function are abline = TRUE and type = c("ridge", "vif"). By default, the ridge trace is plotted with a horizontal line at y = 0 and a vertical line at the value of k having the minimum GCV.
The bias-variance tradeoff plot (Figure 3), produced by the bias.plot() function, may be used to select the optimal k. The vertical line in the bias-variance tradeoff plot shows the value of the biasing parameter k, and the horizontal line shows the minimum MSE for ridge.
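The quantities behind such a bias-variance tradeoff plot can be computed by hand. The base R sketch below is an illustration and not the bias.plot() implementation: it uses the decomposition of the total MSE into $\sigma^2 \sum_j \lambda_j/(\lambda_j + k)^2$ (variance) and $k^2 \sum_j \alpha_j^2/(\lambda_j + k)^2$ (squared bias), with the unknown $\sigma^2$ and $\alpha = G'\beta$ replaced by OLS plug-in estimates. These plug-ins and all object names are assumptions of this sketch.

```r
# Hald cement data, typed in so that the sketch is self-contained.
hald <- data.frame(
  y  = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9,
         83.8, 113.3, 109.4),
  X1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
  X2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
  X3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
  X4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12)
)
X <- as.matrix(hald[, c("X1", "X2", "X3", "X4")])
y <- hald$y

# Correlation-form scaling of X and centering of y.
Xc <- sweep(X, 2, colMeans(X))
Xs <- sweep(Xc, 2, sqrt(colSums(Xc^2)), "/")
yc <- y - mean(y)
n  <- nrow(Xs)
p  <- ncol(Xs)
R  <- crossprod(Xs)

# Eigen-decomposition of X'X and OLS plug-in estimates.
eig       <- eigen(R, symmetric = TRUE)
lambda    <- eig$values
beta_ols  <- drop(solve(R, crossprod(Xs, yc)))
sigma2    <- sum((yc - Xs %*% beta_ols)^2) / (n - p - 1)
alpha_hat <- drop(crossprod(eig$vectors, beta_ols))   # G' beta_hat

ridge_var   <- function(k) sigma2 * sum(lambda / (lambda + k)^2)
ridge_bias2 <- function(k) k^2 * sum(alpha_hat^2 / (lambda + k)^2)

k_grid  <- seq(0, 0.5, by = 0.005)
var_k   <- sapply(k_grid, ridge_var)
bias2_k <- sapply(k_grid, ridge_bias2)
mse_k   <- var_k + bias2_k
k_min   <- k_grid[which.min(mse_k)]   # grid value with the smallest estimated MSE

if (interactive()) {
  matplot(k_grid, cbind(var_k, bias2_k, mse_k), type = "l", lty = 1:3,
          log = "y", xlab = "k", ylab = "estimate (log scale)")
  abline(v = k_min, lty = 3)
  legend("topright", c("variance", "squared bias", "MSE"), lty = 1:3, col = 1:3)
}
```

Because the curves rest on estimated rather than true parameters, the minimizing k should be read as a guide rather than as the optimal value.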
To plot the model selection criteria AIC and BIC for choosing the optimal k (Figure 4), the info.plot() function may be used. The function cv.plot() plots the CV and GCV cross validation criteria against the biasing parameter k for the optimal selection of k (see Figure 5).

> isrm.plot(mod)
The m-scale and ISRM measures (Figure 6) by Vinod (1976) can also be plotted with the isrm.plot() function and can be used to judge the optimal value of k.
The function rplots.plot() plots a panel of three plots, namely (i) the df trace, (ii) RSS vs k and (iii) PRESS vs k, and may be used to judge the optimal value of k (see Figure 7).

Summary
Our lmridge package provides the most complete suite of tools for RR available in R, comparable to those listed in Table 1. We have implemented functions to compute the ridge coefficients, to test these coefficients, to compute different ridge related statistics, and to compute the biasing parameter by different existing methods from various authors (see Table 4).
We have greatly increased the ridge related statistics and different graphical methods for the selection of the biasing parameter k through the lmridge package in R.
Up to now, a complete suite of tools for RR was not available in either open source or paid statistical software packages, resulting in reduced awareness and use of developed ridge related statistics. The package lmridge provides a complete open source suite of tools for the computation of ridge coefficients estimation, testing and computation of different statistics. We believe the availability of these tools will lead to increased utilization and better ridge related practices.

Table 1: Software and R packages.

Stata (StataCorp, 2014) has no built-in command, however the ridgereg add-on is available that performs calculation on scalar k. The lrmest package (Dissanayake et al., 2016) computes estimators such as OLS, ordinary RR (ORR), Liu estimator (LE), LE type-1,2,3, Adjusted Liu Estimator (ALTE), and their type-1,2,3 etc. Moreover, lrmest provides scalar mean square error (MSE) and prediction residual error sum of squares (PRESS) values of some of the estimators. The testing of ridge coefficients is performed only on scalar k; however, for a vector of k, the function rid() of the lrmest package returns only the MSE along with the value of the biasing parameter used. The function optimum() of the lrmest package can be used to get the optimal scalar MSE and PRESS values (Arumairajan and Wijekoon, 2015). Statgraphics standardizes the dependent variable and computes some statistics for the detection of collinearity such as $R^2$, adj-$R^2$, and VIF. Statgraphics also facilitates performing RR and computes different RR related statistics such as VIF and ridge trace for the different biasing parameters used, $R^2$, adj-$R^2$ and the standard error of estimates etc. The ltsbase package (Kan-Kilinc and Alpu, 2013, 2015) computes ridge and Liu estimates based on the least trimmed squares (LTS) method. The MSE values from four regression models can be compared graphically if the argument plot=TRUE is passed to the ltsbase() function. There are three main functions: (i) ltsbase() computes the minimum MSE values for six models: OLS, ridge, ridge based on LTS, LTS, Liu, and Liu based on LTS method for sequences of biasing parameters ranging from 0 to 1. If print=TRUE, ltsbase() prints all the MSEs (along with the minimum MSE) for ridge, Liu, and ridge & Liu based on LTS method for the sequence of biasing parameters given by the user, (ii) the ltsbaseDefault() function returns the fitted values and residuals of the six models (OLS, ridge, Liu, LTS, and ridge & Liu based LTS methods) having minimum MSE, and (iii) the ltsbaseSummary() function returns the coefficients and the biasing parameter for the best MSE among the four regression models. The penalized package (Goeman et al., 2017) is designed for penalized estimation in generalized linear models. The supported models are linear regression, logistic regression, Poisson regression and the Cox proportional hazard models.

Table 2: Properties of the ridge estimator.

Table 3: Objective methods for selection of biasing parameter k.

There are other methods to estimate biasing parameter k. Table 4 lists various methods for the selection of biasing parameter k, proposed by different researchers.

Table 4: Different available methods to estimate k.

Table 7: Functions and methods in lmridge package.

The summary() function also displays the ridge related $R^2$, adjusted-$R^2$, df, F statistics, AIC, BIC and minimum MSE at the values of k given in lmridge(). The kest() function, which works with a ridge fitted model, computes different biasing parameters developed by researchers (see Table 4). The list of different k values (22 in number) may help in deciding the amount of bias that needs to be introduced in RR.
The model selection criteria AIC and BIC can be computed using the infocr() function for each value of K used in the call to lmridge(). For K = 0, 0.012, 0.1, and 0.2, the AIC and BIC values are: