We provide a publicly available library FarmTest in the R programming system. This library implements a factor-adjusted robust multiple testing principle proposed by Fan et al. (2019) for large-scale simultaneous inference on mean effects. We use a multi-factor model to explicitly capture the dependence among a large pool of variables. Three types of factors are considered: observable, latent, and a mixture of observable and latent factors. The non-factor case, which corresponds to standard multiple mean testing under weak dependence, is also included. The library implements a series of adaptive Huber methods integrated with fast data-driven tuning schemes to estimate model parameters and to construct test statistics that are robust against heavy-tailed and asymmetric error distributions. Extensions to two-sample multiple mean testing problems are also discussed. The results of some simulation experiments and a real data analysis are reported.
In the era of big data, large-scale multiple testing problems arise from a wide range of fields, including biological sciences such as genomics and neuroimaging, social science, signal processing, marketing analytics, and financial economics. When testing multitudinous statistical hypotheses simultaneously, researchers seek statistically significant evidence against the null hypotheses while guaranteeing control of the false discovery rate (FDR) (Benjamini and Hochberg 1995). Since the seminal work of Benjamini and Hochberg (1995), multiple testing with FDR control has been extensively studied and successfully used in many applications. Most of the existing testing procedures are tailored to independent or weakly dependent hypotheses or tests; see Storey (2002), Genovese and Wasserman (2004), and Lehmann and Romano (2005), to name a few. The independence assumption, however, is restrictive in real applications, as correlation effects are ubiquitous in high dimensional measurements. Ignoring such strong dependence and directly applying standard FDR-controlling procedures can lead to inaccurate false discovery control, loss of statistical power, and unreliable scientific conclusions.
Over the past decade, multi-factor models have proven to be an effective tool for modeling cross-sectional dependence, with applications in genomics, neuroscience, and financial economics. Related references in the context of multiple testing include Leek and Storey (2008), Friguet et al. (2009), Fan et al. (2012), Desai and Storey (2011), and Fan and Han (2017). A common thread of the aforementioned works is that the construction of test statistics and p-values relies heavily on the assumed joint normality of factors and noise, which is often an unrealistic idealization for high dimensional data. Therefore, it is imperative to develop large-scale multiple testing tools that properly adjust for cross-sectional dependence and, at the same time, are robust to heavy-tailedness.
Recently, Fan et al. (2019) developed a
Factor-Adjusted Robust
Multiple Test (FarmTest) procedure for
large-scale simultaneous inference with highly correlated and
heavy-tailed data. Their emphasis is on achieving robustness against
both strong cross-sectional dependence and heavy-tailed sampling
distributions. Specifically, let
This article describes an R library named FarmTest, which implements the FarmTest procedures developed in Fan et al. (2019). It is a user-friendly tool for conducting large-scale hypothesis testing, especially when one or several of the following scenarios are present: the dimensionality is far larger than the available sample size; the data are heavy-tailed and/or asymmetric; there is strong cross-sectional dependence among the data. FarmTest is implemented using the Armadillo library (Sanderson and Curtin 2016) with Rcpp interfaces (Eddelbuettel and Francois 2011; Eddelbuettel and Sanderson 2014). A simple call to the FarmTest package requires only a data matrix and the null hypotheses to be tested as input. It outputs the rejected hypotheses, along with the p-values and some estimated parameters that may be of use in further analysis. Testing can be carried out for both one-sample and two-sample problems.
Another key feature of our package is that it implements several recently developed robust methods for fitting regression models (Zhou et al. 2018; Sun et al. 2020) and estimating covariance matrices (Ke et al. 2019). When data are generated from a heavy-tailed distribution, test statistics based on the least-squares method are sensitive to outliers, which often causes spurious discoveries and suboptimal power (Zhou et al. 2018). The effect of heavy-tailedness is amplified by high dimensionality: even moderately heavy-tailed distributions can generate very large outliers by chance, making it difficult to separate true signals from spurious variables. As a result, large-scale multiple testing based on non-robust statistics may engender an excessive false discovery rate, which arguably is one of the causes of the current reproducibility crisis in science. Moreover, to choose the multiple tuning parameters in robust regression and covariance estimation, we employ the recently developed data-driven procedures of Ke et al. (2019) and Wang et al. (2020), which are specifically designed for adaptive Huber regression and are considerably faster than the cross-validation method used in Fan et al. (2019).
We further remark that most existing multiple testing R packages do not address robustness against both heavy-tailed distributions and strong dependence. The built-in hypothesis testing function t.test neither adjusts for strong dependence in the data nor estimates the parameters of interest robustly. The built-in function p.adjust and the package qvalue (Storey 2002) only adjust user-supplied p-values for multiple testing and do not address the problem of estimating the p-values themselves. The package multcomp (Hothorn et al. 2008) provides simultaneous testing tools for general linear hypotheses in parametric models under the assumption that the central limit theorem holds. The package multtest (Pollard et al. 2005) implements non-parametric bootstrap and permutation resampling-based multiple testing procedures. multtest can calculate test statistics based on ranked data, which are robust against outliers but yield biased mean estimates; in addition, it cannot explicitly model the dependence structure in the data. The package mutoss is designed to apply many existing multiple hypothesis testing procedures with FDR control and p-value correction. Nevertheless, none of the tools in mutoss is suited to handling both strong dependence and heavy-tailedness. Moreover, existing packages are often difficult to navigate, since users need to combine many functions to perform multiple tests.
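To make this contrast concrete, the following sketch compares a typical hand-assembled pipeline with a single FarmTest call; here X stands for a generic n-by-p data matrix with observations in rows, and the naive pipeline shown is only one of many possible combinations of base R functions.

# Naive pipeline: coordinate-wise t-tests followed by Benjamini-Hochberg adjustment
pvals <- apply(X, 2, function(x) t.test(x)$p.value)
naiveReject <- which(p.adjust(pvals, method = "BH") < 0.05)
# FarmTest: robust estimation, factor adjustment, and FDR control in one call
library(FarmTest)
output <- farm.test(X)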
In this section, we revisit the problem of simultaneous inference on the mean effects under a factor model and discuss the main ideas behind the FarmTest method developed by Fan et al. (2019).
Suppose we observe
Here we take the two-sided test (1) as an example
to discuss the false discovery rate (FDR) control. For
Assume that there are
In this section, we discuss the construction of test statistics under
strong cross-sectional dependency captured by common factors.
Specifically, we allow the
Our package allows the common factor
As another key feature, the FarmTest method is robust against
heavy-tailed sampling distributions. Under such scenarios, the ordinary
least squares estimators can be suboptimal. Recently, Fan et al. (2017) and Sun et al. (2020) proposed the adaptive Huber regression method, the core of
which is Huber’s
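For reference, the Huber loss with robustification parameter $\tau > 0$ is quadratic near the origin and grows only linearly in the tails:

$$
\ell_{\tau}(u) =
\begin{cases}
u^{2}/2, & |u| \le \tau, \\
\tau |u| - \tau^{2}/2, & |u| > \tau.
\end{cases}
$$

A larger $\tau$ brings the estimator closer to least squares (smaller bias but less robustness), whereas a smaller $\tau$ downweights extreme observations more aggressively; the adaptive approach lets $\tau$ grow with the sample size to balance these two effects.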
In this section, we formally describe the algorithms for the FarmTest procedure. We revisit and discuss procedures for the two scenarios with observable and unobservable/latent factors (Zhou et al. 2018; Fan et al. 2019). Notice that the two scenarios are inherently different in terms of estimating unknown parameters and constructing test statistics. Moreover, the selection of tuning parameters is based on the recent methods proposed by Ke et al. (2019) and Wang et al. (2020).
Suppose we observe independent data vectors
Input: Data
For
Estimation of residual variances
Construct test statistics
Compute p-values
Estimate the proportion of true alternatives:
Order the p-values as
Compute the rejection threshold
Reject each hypothesis in the set (see the generic sketch below)
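The last few steps above are the generic ingredients of adaptive FDR control: a Storey-type estimate of the null proportion followed by a Benjamini-Hochberg-type cutoff. The sketch below illustrates these generic steps only and is not the exact FarmTest implementation; pv, alpha, and lambda are placeholder names for a p-value vector, the nominal FDR level, and a tuning constant.

# Generic Storey/BH-type rejection rule (illustration only)
storeyBH <- function(pv, alpha = 0.05, lambda = 0.5) {
  p <- length(pv)
  pi0 <- min(1, sum(pv > lambda) / ((1 - lambda) * p))  # estimated null proportion
  ord <- order(pv)                                      # order the p-values
  ok <- which(pv[ord] <= alpha * seq_len(p) / (pi0 * p))
  if (length(ok) == 0) return(integer(0))
  ord[seq_len(max(ok))]                                 # indices of rejected hypotheses
}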
An extension of Algorithm 1 to the two-sample problem is also
implemented in the package. Suppose we observe two independent samples
In this section, suppose we are given independent observations
Input: Data
For
Define the paired data
Define the covariance matrix estimator
For
Construct test statistics
Compute p-values
Estimate the proportion of true alternatives:
Compute the rejection threshold
Reject each hypothesis in the set
An extension of Algorithm 2 to the two-sample problem is also
included in the library. Suppose we observe two independent samples
$$
where
The FarmTest procedure involves multiple tuning parameters, including
the number of factors
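When the number of factors is not supplied, it has to be chosen in a data-driven way; the eigenRatio output reported later in Table 1 points to an eigenvalue-ratio type rule. A minimal sketch of such a rule, with Sigma standing for an estimated covariance matrix and Kmax for a user-specified upper bound (both placeholder names), is:

# Eigenvalue-ratio rule for choosing the number of factors (illustration only)
estimateK <- function(Sigma, Kmax = 10) {
  ev <- eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values  # sorted eigenvalues
  Kmax <- min(Kmax, length(ev) - 1)
  ratios <- ev[1:Kmax] / ev[2:(Kmax + 1)]  # ratios of consecutive eigenvalues
  which.max(ratios)                        # index of the largest ratio
}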
The robustification parameter in the Huber loss plays an important role
in controlling the bias-robustness tradeoff. According to the
theoretical analysis in Zhou et al. (2018), the optimal choice of
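As a point of reference (constants and the exact calibration are omitted, and the display below is recalled from this line of work rather than derived here), the suggested robustification parameter for the $j$-th coordinate is of order

$$
\tau_j \asymp \sigma_j \sqrt{\frac{n}{\log(np)}},
$$

where $\sigma_j$ denotes the noise scale of the $j$-th coordinate, so that $\tau_j$ grows with the sample size $n$ and decreases slowly as the number of tests $p$ increases. In the package, these parameters are calibrated automatically by the data-driven schemes of Ke et al. (2019) and Wang et al. (2020) rather than by cross-validation.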
The FarmTest package is publicly available from the Comprehensive R
Archive Network (CRAN) and its GitHub page
https://github.com/XiaoouPan/FarmTest. It contains four core
functions. The main function farm.test carries out the entire FarmTest procedure and outputs the testing results along with several useful estimated model parameters. User-friendly summary, print, and plot functions that summarize and visualize the test outcome accompany farm.test. The other three functions, huber.mean, huber.cov, and huber.reg, implement data-driven robust methods for estimating the mean vector and covariance matrix (Ke et al. 2019) as well as the regression coefficients (Wang et al. 2020). In particular, the huber.reg function uses a gradient descent algorithm with the Barzilai-Borwein step size (Barzilai and Borwein 1988). In this section, we focus primarily on the farm.test function and demonstrate its usage with numerical experiments.
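As a brief illustration of the standalone estimators (shown here with single-argument calls; see the package documentation for the available tuning options), one may run:

# Robust mean and covariance estimation for heavy-tailed data;
# the robustification parameters are tuned automatically.
library(FarmTest)
set.seed(1)
Z <- matrix(rt(100 * 5, df = 3), 100, 5)  # 100 observations of 5 variables with t_3 tails
huber.mean(Z[, 1])                        # robust mean estimate of the first variable
Sigma <- huber.cov(Z)                     # robust 5-by-5 covariance estimate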
We first present an example by applying the package to a synthetic
dataset. To begin with, we use the
rstiefel package
(Hoff 2012) to simulate a uniformly distributed random orthonormal matrix
as the loading matrix
library(FarmTest)
library(rstiefel)
library(mvtnorm)
n <- 120     # sample size
p <- 400     # dimension (number of hypotheses)
K <- 5       # true number of factors
set.seed(100)
B <- rustiefel(p, K) %*% diag(rep(sqrt(p), K))   # factor loading matrix
FX <- rmvnorm(n, rep(0, K), diag(K))             # latent factors
p1 <- 100                                        # number of true signals
strength <- 1                                    # signal strength
mu <- c(rep(strength, p1), rep(0, p - p1))       # mean vector under test
U <- rmvt(n, diag(p), 3)                         # heavy-tailed (t_3) noise
X <- rep(1, n) %*% t(mu) + FX %*% t(B) + U       # observed data matrix
Using the data generated above, let us call the main function
farm.test
with all default optional parameters, and then print the
outputs.
output <- farm.test(X)
output
One-sample FarmTest with unknown factors
n = 120, p = 400, nFactors = 5
FDR to be controlled at: 0.05
Alternative hypothesis: two.sided
Number of hypotheses rejected: 104
As shown in the snapshot above, the function farm.test correctly estimates the number of factors and rejects 104 hypotheses. Since the true signals lie in the first p1 = 100 coordinates, the empirical false discovery proportion (FDP) and power can be computed as follows.
FDP <- sum(output$reject > p1) / length(output$reject)
FDP
[1] 0.03846154
power <- sum(output$reject <= p1) / p1
power
[1] 1
All the outputs are collected in a list, which can be quickly examined with the names() function. See Table 1 for detailed descriptions of the outputs.
names(output)
[1] "means" "stdDev" "loadings" "eigenVal" "eigenRatio" "nFactors"
[7] "tStat" "pValues" "pAdjust" "significant" "reject" "type"
[13] "n" "p" "h0" "alpha" "alternative"
| Output | Implication | Data type | R class |
|---|---|---|---|
| means | estimated means | vector | matrix |
| stdDev | estimated standard deviations | vector | matrix |
| loadings | estimated loading matrix | matrix | matrix |
| eigenVal | eigenvalues of estimated covariance | vector | matrix |
| eigenRatio | eigenvalue ratios of estimated covariance | vector | matrix |
| nFactors | (estimated) number of factors | positive integer | integer |
| tStat | test statistics | vector | matrix |
| pValues | p-values | vector | matrix |
| pAdjust | adjusted p-values | vector | matrix |
| significant | indicators of significance | boolean vector | matrix |
| reject | indices of rejected hypotheses | vector | integer |
| type | whether factor is known | string | character |
| n | sample size | positive integer | integer |
| p | data dimension | positive integer | integer |
| h0 | null hypothesis | vector | numeric |
| alpha | nominal FDR level | numerical value | numeric |
| alternative | alternative hypothesis | string | character |
We can present the testing results using the accompanying summary function.
head(summary(output))
means p-values p-adjusted significance
1 1.0947056 1.768781e-18 8.936997e-17 1
2 0.8403608 3.131733e-09 1.157817e-08 1
3 0.8668348 1.292850e-11 6.532295e-11 1
4 0.9273998 2.182485e-12 1.350281e-11 1
5 0.7257105 7.699350e-08 2.593465e-07 1
6 0.9473088 1.180288e-13 1.192712e-12 1
To visualize the testing results, in Figure 1 we present several plots based on the outputs. From the histograms of estimated means and test statistics, we see that the data are generally categorized into two groups: one corresponds to the true signals, with estimated means centered around the signal strength 1, and the other to the nulls, centered around 0.
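Figure 1 is not reproduced here; histograms of this kind can be regenerated directly from the returned list (the package also provides its own plot function), for example:

# Histograms of estimated means and test statistics
par(mfrow = c(1, 2))
hist(as.numeric(output$means), breaks = 40, main = "Estimated means", xlab = "")
hist(as.numeric(output$tStat), breaks = 40, main = "Test statistics", xlab = "")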
In this section, we illustrate the farm.test function with other options that allow us to call it more flexibly. When the factors are observable, we can simply supply them through the argument fX, and the output is formatted the same as before. As a remark, among all the items listed in Table 1, eigenVal and eigenRatio, the eigenvalues and eigenvalue ratios of the estimated covariance matrix, are not available in this case; see Algorithm 1.
output <- farm.test(X, fX = FX)
output
One-sample FarmTest with known factors
n = 120, p = 400, nFactors = 5
FDR to be controlled at: 0.05
Alternative hypothesis: two.sided
Number of hypotheses rejected: 101
One-sided alternatives and a different nominal FDR level can be specified through the arguments alternative and alpha as follows:
output <- farm.test(X, alternative = "greater", alpha = 0.01)
output
One-sample FarmTest with unknown factors
n = 120, p = 400, nFactors = 5
FDR to be controlled at: 0.01
Alternative hypothesis: greater
Number of hypotheses rejected: 101
Users can specify the null hypotheses by passing any vector of length p to the argument h0. In the next example, we test whether all the means are equal to 1 at the nominal FDR level 0.01, so that the p - p1 = 300 coordinates with true mean 0 should now be rejected.
output <- farm.test(X, h0 = rep(1, p), alpha = 0.01)
output
One-sample FarmTest with unknown factors
n = 120, p = 400, nFactors = 5
FDR to be controlled at: 0.01
Alternative hypothesis: two.sided
Number of hypotheses rejected: 300
When the factors are unknown, users can also specify the number of factors on subjective grounds. In this case, Step 3 in Algorithm 2 is skipped. For example, we run the function with the number of factors chosen to be KX = 2, which is less than the true value K = 5.
output <- farm.test(X, KX = 2)
power <- sum(output$reject <= p1) / p1
power
[1] 0.98
As a special case, if we declare KX = 0
in the function, a robust
multiple test without factor-adjustment is conducted.
output <- farm.test(X, KX = 0)
output
One-sample robust multiple test without factor-adjustment
n = 120, p = 400
FDR to be controlled at: 0.05
Alternative hypothesis: two.sided
Number of hypotheses rejected: 95
Finally, we present an example of two-sample FarmTest. Using the same sampling distributions for the factor loading matrix, factors, and noise vectors, we generate an independent second sample of size m = 150 with zero means.
m <- 150
set.seed(200)
BY <- rustiefel(p, K) %*% diag(rep(sqrt(p), K))
FY <- rmvnorm(m, rep(0, K), diag(K))
uY <- rmvt(m, diag(p), 3)
Y <- FY %*% t(BY) + uY
The farm.test function can then be called with the additional argument Y.
output <- farm.test(X, Y = Y)
output
Two-sample FarmTest with unknown factors
X.n = 120, Y.n = 150, p = 400, X.nFactors = 5, Y.nFactors = 5
FDR to be controlled at: 0.05
Alternative hypothesis: two.sided
Number of hypotheses rejected: 105
The output is formatted similarly to Table 1, except that means, stdDev, loadings, eigenVal, eigenRatio, nFactors, and n now each consist of two items for the samples X and Y.
names(output$means)
[1] "X.mean" "Y.mean"
In this section, we assess and compare the performance of the farm.test function in the FarmTest package with the following methods: (1) the standard t-test, implemented by the t.test function; (2) the rank-based test (referred to as WMW-test below), implemented by the onesamp.marginal function in the mutoss package; and (3) the robust multiple test without factor adjustment (referred to as RmTest below), obtained by setting KX = 0 in the farm.test function.
For
In all the numerical experiments, we consider two-sided alternatives
with a nominal FDR level
For each model, we consider various combinations of sample size
Figures 2 and 3 depict the FDR and power curves
for either "fixed
In this section, we apply the FarmTest package to test the mean effects of stock returns. In capital asset pricing theory, a stock's risk-adjusted mean return, or "alpha", is a quantity of interest since it indicates the excess return arising from investing in a particular stock. If the efficient equity market hypothesis holds, we expect "alpha" to be zero. Hence, detecting non-zero alphas can help investors to identify market inefficiencies, that is, whether certain stocks exhibit an abnormal rate of return or are mispriced. As discussed in Cont (2001), both cross-sectional dependence and heavy-tailedness are salient features of stock returns.
In this study, we test the annual mean effects of stocks in the S&P500
index. The data are available from the COMPUSTAT and CRSP databases. We find that most of the stocks with continuous membership in the S&P500 index from 2008 to 2016 have positive excess kurtosis, indicating tails heavier than those of a normal distribution. Also, more than
The nominal FDR level is set as
| Method | Mean | Std. Dev. | Median | Min | Max |
|---|---|---|---|---|---|
| FarmTest | 14.477 | 11.070 | 12 | 0 | 52 |
| WMW-test | 10.991 | 1.005 | 11 | 8 | 12 |
| RmTest | 8.147 | 14.414 | 3 | 0 | 68 |
We provide an R package to implement FarmTest, a flexible large-scale multiple testing method that is robust against strongly dependent and heavy-tailed data. The factor-adjustment procedure helps to construct weakly dependent test statistics and also enhances statistical power by improving the signal-to-noise ratio. Moreover, by exploiting the idea of adaptive Huber regression, the testing procedure is robust against heavy-tailed noise. The efficacy of our package is demonstrated on both real and simulated datasets.
For attribution, please cite this work as
Fan, et al., "FarmTest: An R Package for Factor-Adjusted Robust Multiple Testing", The R Journal, 2021
BibTeX citation
@article{RJ-2021-023,
  author = {Bose, Koushiki and Fan, Jianqing and Ke, Yuan and Pan, Xiaoou and Zhou, Wen-Xin},
  title = {FarmTest: An R Package for Factor-Adjusted Robust Multiple Testing},
  journal = {The R Journal},
  year = {2021},
  note = {https://rjournal.github.io/},
  volume = {12},
  issue = {2},
  issn = {2073-4859},
  pages = {389-402}
}