Estimability Tools for Package Developers

When a linear model is rank-deficient, predictions based on that model become questionable because not all of them are uniquely estimable. However, some of them are, and the estimability package provides tools that package developers can use to tell which is which. With these tools, a model object's predict method could return estimable predictions as-is while flagging non-estimable ones in some way, so that the user knows which predictions to believe. The estimability package also provides, as a demonstration, an estimability-enhanced epredict method to use in place of predict for models fitted using the stats package.


Introduction
Consider a linear regression or mixed model whose fixed component has the matrix form $X\beta$. If $X$ is not of full column rank, then there is not a unique estimate $b$ of $\beta$. However, consider using $\lambda'b$ to estimate the value of some linear function $\lambda'\beta = \sum_j \lambda_j \beta_j$. (We use $x'$ to denote the transpose of a vector $x$.) For some $\lambda$s, the prediction depends on the solution $b$; but for others (the estimable ones) it does not.
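As a small numerical sketch of this distinction, consider a toy one-way layout in base R. The design matrix, the response, and the two solutions b1 and b2 below are made up for illustration; they are not the article's example.

```r
# Toy one-way layout: intercept plus indicators for two groups.
# The columns satisfy int = g1 + g2, so X has rank 2 rather than 3.
X <- cbind(int = 1, g1 = c(1, 1, 0, 0), g2 = c(0, 0, 1, 1))
y <- c(3, 5, 10, 12)

# Two different solutions b of the normal equations X'X b = X'y
b1 <- c(11, -7, 0)   # g2 coefficient constrained to zero
b2 <- c(4, 0, 7)     # g1 coefficient constrained to zero
normal_eq_ok <- function(b) {
    isTRUE(all.equal(drop(crossprod(X) %*% b), drop(crossprod(X, y))))
}
c(normal_eq_ok(b1), normal_eq_ok(b2))     # both TRUE

lam_est <- c(1, 1, 0)   # intercept + g1 effect, i.e. the group-1 mean
lam_non <- c(0, 1, 0)   # g1 effect alone

c(sum(lam_est * b1), sum(lam_est * b2))   # 4 4: same value from either solution
c(sum(lam_non * b1), sum(lam_non * b2))   # -7 0: depends on the solution chosen
```

Both b1 and b2 reproduce the same fitted values, yet only the first linear function has a value that does not depend on which solution is used.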

An illustration
An example will help illustrate the issues. In the following commands, we create four predictors x1-x4 and a response variable y:

Note that in each model, two regression coefficients are NA. This indicates that the associated predictors were excluded due to their linear dependence on the others.
It happens that cases 1, 3, and 4 are estimable, and the others are not. It would be helpful to know which predictions to trust, rather than a vague warning. The epredict function in estimability (Lenth, 2015) accomplishes this:

> require("estimability")
> rbind(epred1234 = epredict(mod1234, newdata = testset),
+       epred4321 = epredict(mod4321, newdata = testset))
           1  2  3 4  5  6
epred1234 14 NA 23 5 NA NA
epred4321 14 NA 23 5 NA NA

The results for non-estimable cases are indicated by NAs. Note that in both models, the same new-data cases are identified as estimable, and they are the ones that yield the same predictions. Note also that estimability was determined separately from each model. We do not need to compare two different fits to determine estimability.

Adding estimability checking to a modeling package
It is a simple matter to add estimability checking to the predict method(s) in a new or existing package.
1. The package should import the estimability package.
2. Use estimability::nonest.basis to obtain the basis for non-estimable functions of the regression coefficients, preferably in the model-fitting code.

} ... }
The nonest.basis function returns a matrix whose columns span the null space of the model matrix $X$. If, for some reason, the QR decomposition is not used in model fitting, there is also a nonest.basis method that can be called with $X$ itself. But if the QR decomposition of $X$ is available, it is most efficient to use it.
The is.estble function tests each row of its first argument for estimability against the null basis provided in its second argument, and returns a logical vector. A third argument may be added to adjust the tolerance used in this test. It is important that all columns of the model matrix be included. In particular, do not exclude the columns corresponding to NAs in the coefficient estimates, as they are needed for the estimability testing. Typically, the results of is.estble would be used to skip any non-estimable predictions and replace them with NA.
If there is not a rank deficiency in the model, nonest.basis returns estimability::all.estble, which is a trivial null basis (still of matrix class) with which is.estble will return all TRUE. The nonest.basis function, when called with a qr object, can immediately detect if there is no rank deficiency and will return all.estble. If your model-fitting algorithm does not use the QR decomposition, it may be worth including additional code to check for non-singular models and, in that case, set the null basis to estimability::all.estble, rather than have nonest.basis perform the additional computation to figure this out.
The sample code above is typical for the S3 object model. If S4 objects are being used, you may want to instead include a slot of class "matrix" for nonest; or incorporate nonest as part of some other slot.
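As a rough sketch of the S3 pattern just described, the following shows a fitting function that stores the null basis and a predict method that uses it. The names myfit and predict.myfit, and the structure of the returned object, are hypothetical and only for illustration; the sketch assumes numeric predictors, and only nonest.basis and is.estble come from the estimability package.

```r
# Hypothetical fitting function: saves the null basis as a 'nonest' member.
myfit <- function(formula, data) {
    mf <- model.frame(formula, data)
    X <- model.matrix(formula, mf)
    y <- model.response(mf)
    QR <- qr(X)
    structure(list(coefficients = qr.coef(QR, y),   # NA for excluded columns
                   terms = attr(mf, "terms"),
                   nonest = estimability::nonest.basis(QR)),
              class = "myfit")
}

# Hypothetical predict method: non-estimable cases are returned as NA.
predict.myfit <- function(object, newdata, ...) {
    # Keep ALL columns of the model matrix, including those with NA coefficients
    X <- model.matrix(delete.response(object$terms), newdata)
    b <- object$coefficients
    b[is.na(b)] <- 0                # NA coefficients are constrained to zero
    pred <- drop(X %*% b)
    pred[!estimability::is.estble(X, object$nonest)] <- NA
    pred
}

# Made-up data with an exact dependence: x3 = x1 + x2
dat <- data.frame(x1 = 1:6, x2 = c(1, 0, 1, 0, 1, 0))
dat$x3 <- dat$x1 + dat$x2
dat$y <- c(2, 3, 5, 4, 8, 7)
fit <- myfit(y ~ x1 + x2 + x3, data = dat)

# Row 1 respects x3 = x1 + x2; row 2 does not and should be flagged with NA
newd <- data.frame(x1 = c(2, 2), x2 = c(1, 1), x3 = c(3, 5))
pred <- predict(fit, newd)
```

Computing the null basis once at fitting time, as here, avoids repeating the decomposition on every call to the predict method.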
To illustrate briefly, consider the previous example. The null basis for mod1234 is obtained as follows:

Theory of estimability
The theory of estimability in linear models is well established, and can be found in almost any text on linear models, such as classics like Searle (1997) and Seber and Lee (2003). In this discussion, I will make specific references to portions of a more recent reference, Monahan (2008).
Before delving in, though, it is worth noting (based on what is said in some contributed packages' documentation and code) that there seems to be some confusion in the R community between nonestimability of a linear function $\lambda'\beta$ and the placement of nonzero $\lambda_j$ coefficients relative to the positions of NA values in the estimate $b$. It is not possible to assess estimability with this information. Note, for instance, that in the example in the Introduction, we obtained two different $b$s. Between them we can find an NA in the estimates for every predictor except the intercept. In fact, while NA usually refers to unknown values, that is not the case here. An NA regression coefficient signals a coefficient that is constrained to zero in order to obtain an estimate. And there is nothing wrong with multiplying some of the $\lambda_j$ by zero. To assess estimability of $\lambda'\beta$, we must look at all of the elements of $\lambda$, even those corresponding to NAs in the estimates.
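The point about NA positions can be seen in a small base-R sketch. The data below are made up (they are not the article's example), with one exact dependence x3 = x1 + x2; the same model is fitted with the predictors in two different orders.

```r
# Made-up data with an exact dependence: x3 = x1 + x2
dat <- data.frame(x1 = 1:6, x2 = c(1, 0, 1, 0, 1, 0))
dat$x3 <- dat$x1 + dat$x2
dat$y <- c(2, 3, 5, 4, 8, 7)

m123 <- lm(y ~ x1 + x2 + x3, data = dat)   # x3 is excluded
m321 <- lm(y ~ x3 + x2 + x1, data = dat)   # x1 is excluded
names(which(is.na(coef(m123))))            # "x3"
names(which(is.na(coef(m321))))            # "x1"

# Treat NA coefficients as zeros and put both solutions in the same order
b_a <- coef(m123); b_a[is.na(b_a)] <- 0
b_b <- coef(m321)[names(b_a)]; b_b[is.na(b_b)] <- 0

# Order of elements: (Intercept), x1, x2, x3
lam_est <- c(1, 2, 1, 3)   # consistent with x3 = x1 + x2
lam_non <- c(1, 2, 1, 5)   # inconsistent with x3 = x1 + x2

isTRUE(all.equal(sum(lam_est * b_a), sum(lam_est * b_b)))   # TRUE
isTRUE(all.equal(sum(lam_non * b_a), sum(lam_non * b_b)))   # FALSE
```

Here lam_est has nonzero entries in the positions of both NA coefficients, yet the two fits agree on its value; the position of the NAs alone says nothing about estimability.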

General results
When $X$ is not full-rank, the solution $b$ to the normal equations is not unique. However, the predicted values, $Xb$, are uniquely estimable; that is, for any two solutions $b_1$ and $b_2$, $Xb_1 = Xb_2$. Note that the $i$th element of $Xb$ is $x_i'b$, where $x_i'$ is the $i$th row of $X$. Thus, $x_i'\beta$ is estimable for each $i$, and so are any linear combinations of these. Indeed, $\lambda'\beta$ is estimable if and only if $\lambda'$ is in the row space of $X$ or, equivalently, $\lambda$ is in the column space of $X'$ (Monahan, 2008, Result 3.1).
The R Journal Vol. XX/YY, AAAA 20ZZ, ISSN 2073-4859
From the above result, we can establish estimability of $\lambda'\beta$ either by showing that $\lambda$ is in the column space of $X'$, or by showing that it is orthogonal to the null space of $X$ (Monahan, 2008, Methods 3.2 and 3.3, respectively). One way to implement the second idea is to note that if $(X'X)^-$ is any generalized inverse of $X'X$, then $P = (X'X)(X'X)^-$ is a projection onto the column space of $X'$, and so $I - P$ is a projection onto the null space of $X$. The columns of $I - P$ thus span this null space. Accordingly, estimability of $\lambda'\beta$ can be determined by showing that $\lambda'(I - P) = 0'$.
SAS uses $I - P$, as described above, to test estimability. Of course, in computation we need to set a tolerance for being close enough to 0. SAS deems $\lambda'\beta$ non-estimable if $\lambda \ne 0$ and $\max|\lambda'(I - P)| > \psi \cdot \max|\lambda|$, where $\psi$ is a tolerance factor with a default value of $10^{-4}$. For details, see SAS Institute Inc. (2013), the chapter on "Shared Concepts," documentation for the ESTIMATE statement and its SINGULAR option.
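The projection-based check can be sketched in base R as follows. This is an illustration only, not SAS code: the toy design is made up, the function name sas_nonest is hypothetical, and the Moore-Penrose inverse is assumed as the particular choice of generalized inverse (which makes $P$ symmetric).

```r
# Toy rank-deficient design: int = g1 + g2
X <- cbind(int = 1, g1 = c(1, 1, 0, 0), g2 = c(0, 0, 1, 1))
XtX <- crossprod(X)

# Moore-Penrose inverse of X'X from its eigendecomposition
# (one valid generalized inverse; an assumption of this sketch)
e <- eigen(XtX, symmetric = TRUE)
keep <- e$values > 1e-10 * max(e$values)
V <- e$vectors[, keep, drop = FALSE]
XtX_ginv <- V %*% (t(V) / e$values[keep])

P <- XtX %*% XtX_ginv          # projection onto the column space of X'
ImP <- diag(ncol(X)) - P       # projection onto the null space of X

# Hypothetical helper applying the max-based criterion quoted above
sas_nonest <- function(lambda, ImP, psi = 1e-4) {
    any(lambda != 0) && max(abs(lambda %*% ImP)) > psi * max(abs(lambda))
}

sas_nonest(c(1, 1, 0), ImP)    # FALSE: intercept + g1 effect is estimable
sas_nonest(c(0, 1, 0), ImP)    # TRUE:  g1 effect alone is non-estimable
```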

Methods used by estimability
In the estimability package, we obtain the null basis in a different way than shown above, in part because the QR decomposition of $X$ is already (usually) available in an lm object. Suppose that $X$ is $n \times p$. Then we can write $X = QR$, where $Q$ ($n \times p$) has orthonormal columns and $R$ ($p \times p$) is upper triangular, with nonzero diagonal elements when $X$ is of full column rank. If $X$ is of rank $r < p$, then columns are pivoted so that the linearly dependent ones come last, and we only need $r$ columns of $Q$ and the top $r$ rows of $R$, along with the pivoting information. Call these pivoted, down-sized matrices $\tilde{Q}$ and $\tilde{R}$ respectively; and let $\tilde{X}$ (still $n \times p$) denote $X$ with its columns permuted according to the pivoting. We now have that $\tilde{X} = \tilde{Q}\tilde{R}$, and $\tilde{R}$ comprises the first $r$ rows of an upper triangular matrix. Observe that each row of $\tilde{X}$ is thus a linear combination of the rows of $\tilde{R}$; that is, the row space of $\tilde{X}$ is the same as the row space of $\tilde{R}$. So the much smaller matrix $\tilde{R}$ has everything we need to know to determine estimability.
Now, let $d = p - r$ denote the rank deficiency, and define
$$S = \begin{bmatrix} \tilde{R}' & \begin{matrix} 0 \\ I_d \end{matrix} \end{bmatrix},$$
where $0$ is an $r \times d$ block of zeros and $I_d$ is the $d \times d$ identity. Then $S$ is a lower triangular matrix with nonzero diagonal; hence it is of full rank $p = r + d$. Let $S = TU$ be the QR decomposition of $S$. We then have that $T$ has orthonormal columns, and in fact it is a Gram-Schmidt orthonormalization of $S$. Accordingly, the first $r$ columns of $T$ comprise an orthonormalization of the columns of $\tilde{R}'$, and thus they form a basis for the row space of $\tilde{R}$ and hence of $\tilde{X}$.
Letting $\tilde{N}$ denote the last $d$ columns of $T$, we have that $\tilde{N}$ has orthonormal columns, all of which are orthogonal to the first $r$ columns of $T$ and hence to the row space of $\tilde{R}$ and $\tilde{X}$. It follows that $\tilde{N}$ is an orthonormal basis for the null space of $\tilde{X}$. After permuting the rows of $\tilde{N}$ according to the original pivoting in the QR decomposition of $X$, we obtain a basis $N$ for the null space of $X$.
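The construction described above can be sketched in base R. This is an illustrative reimplementation of the steps as described in the text, not the package's source code; the function name null_basis_qr and the toy matrices are hypothetical.

```r
# Sketch: null basis of X from the pivoted QR decomposition of X.
null_basis_qr <- function(X) {
    QR <- qr(X)                                   # pivoted QR of X
    p <- ncol(X); r <- QR$rank; d <- p - r
    if (d == 0) return(matrix(0, p, 0))           # full rank: empty null basis
    Rt <- qr.R(QR)[seq_len(r), , drop = FALSE]    # R-tilde: first r rows (r x p)
    S <- cbind(t(Rt), rbind(matrix(0, r, d), diag(d)))   # p x p, full rank
    Tm <- qr.Q(qr(S))                             # orthonormalization of S
    Nt <- Tm[, r + seq_len(d), drop = FALSE]      # last d columns: N-tilde
    N <- matrix(0, p, d)
    N[QR$pivot, ] <- Nt                           # undo the column pivoting
    N
}

# Toy designs: in X2 the dependent column sits in the middle, so pivoting matters
X1 <- cbind(1, c(1, 1, 0, 0), c(0, 0, 1, 1))
x <- c(1, 2, 4, 7, 11)
X2 <- cbind(1, x, 2 * x, c(0, 1, 0, 1, 1))

N1 <- null_basis_qr(X1)
N2 <- null_basis_qr(X2)
max(abs(X1 %*% N1))   # essentially zero
max(abs(X2 %*% N2))   # essentially zero
```

Working with the small $r \times p$ matrix rather than with $X$ itself keeps the cost independent of the number of observations.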
To test estimability of $\lambda'\beta$, first compute $z' = \lambda'N$, where $z$ is a $d$-vector. Theoretically, $\lambda'\beta$ is estimable if and only if $z'z = 0$; but for computational purposes, we need to set a tolerance for its being suitably small. Again, we opt to deviate from SAS's approach, which is based on $\max|z_i|$. Instead, we deem $\lambda'\beta$ non-estimable when $\lambda \ne 0$ and $z'z \ge \tau \cdot \lambda'\lambda$, where $\tau$ is a tolerance value. Our default value is $\tau = (10^{-4})^2 = 10^{-8}$. The rationale for this criterion is that it is rotation-invariant: we could replace $N \leftarrow NV$, where $V$ is any $d \times d$ orthogonal matrix, and $z'z$ will be unchanged. Such rotation invariance seems a desirable property because the assessment of estimability then does not depend on which orthonormal null basis is used.
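A minimal base-R sketch of this criterion follows. The function name is_estimable and the example null bases are made up for illustration; in the package itself, this role is played by is.estble.

```r
# Sketch of the sum-of-squares criterion: estimable when z'z < tau * lambda'lambda
is_estimable <- function(lambda, N, tau = 1e-8) {
    ssq <- sum(lambda^2)
    if (ssq == 0) return(TRUE)        # lambda = 0 is trivially estimable
    z <- crossprod(N, lambda)         # z = N' lambda, a d-vector
    sum(z^2) < tau * ssq
}

# Null basis of a toy design with int = g1 + g2
N <- matrix(c(1, -1, -1) / sqrt(3), ncol = 1)
is_estimable(c(1, 1, 0), N)   # TRUE
is_estimable(c(0, 1, 0), N)   # FALSE

# Rotation invariance with d = 2: z'z is the same for N2 and N2 %*% V
N2 <- cbind(c(1, -1, -1, 0) / sqrt(3), c(0, 0, 0, 1))
theta <- 0.7
V <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
lam <- c(0, 1, 0, 2)
all.equal(sum(crossprod(N2, lam)^2), sum(crossprod(N2 %*% V, lam)^2))   # TRUE
```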

Bells and whistles
The estimability package's epredict function serves as a demonstration of adding estimability checking to the predict methods in the stats package. It works for lm, aov, glm, and mlm objects. It also provides additional options for the type argument of predict. When newdata is provided, type = "estimability" returns a logical vector showing which rows of newdata are estimable; and type = "matrix" returns the model matrix for newdata.
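The following sketch shows how these type options might be used on an lm fit. The data are made up (with one exact dependence, x3 = x1 + x2) and are not the article's example; the sketch assumes the estimability package is installed.

```r
library(estimability)

# Made-up data with an exact dependence: x3 = x1 + x2
dat <- data.frame(x1 = 1:6, x2 = c(1, 0, 1, 0, 1, 0))
dat$x3 <- dat$x1 + dat$x2
dat$y <- c(2, 3, 5, 4, 8, 7)
fit <- lm(y ~ x1 + x2 + x3, data = dat)

# Row 1 respects x3 = x1 + x2; row 2 does not
newd <- data.frame(x1 = c(2, 2), x2 = c(1, 1), x3 = c(3, 5))

est <- epredict(fit, newdata = newd, type = "estimability")   # which rows are estimable
pred <- epredict(fit, newdata = newd)                         # NA for non-estimable rows
mm <- epredict(fit, newdata = newd, type = "matrix")          # model matrix for newd
```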
An accompanying enhancement is eupdate, which runs update on a model object and also adds a nonest member. Subsequent epredict calls on this object will use that as the null basis instead of having to reconstruct it. For example,

> mod123 <- eupdate(mod1234, .

Conclusion
In rank-deficient linear models used for predictions on new predictor sets, it is important for users to know which predictions are to be trusted, and which of them would change if the same model had been specified in a different way.The estimability package provides an easy way to add this capability to new and existing model-fitting packages.